Suggested Citation: "Appendix A: Tables of Computational Tools." National Academies of Sciences, Engineering, and Medicine. 2024. Charting a Future for Sequencing RNA and Its Modifications: A New Era for Biology and Medicine. Washington, DC: The National Academies Press. doi: 10.17226/27165.

Appendix A

Tables of Computational Tools

TABLE A-1 State-of-the-Art Nanopore Basecalling Tools with Principle of Analysis, Input, and Output Along with Resources

| Tool | Principle of Analysis | Tool Input | Tool Output | Resource |
|---|---|---|---|---|
| Albacore | RNN | FAST5 | FASTA/FASTQ | https://community.Nanoporetech.com |
| Guppy | RNN | FAST5 | FASTA/FASTQ | https://community.Nanoporetech.com |
| MinKNOW | | | | |
| Metrichor | RNN | FAST5 | FASTA/FASTQ | https://metrichor.com |
| Nanocall | HMM | FAST5 | FASTA | https://github.com/mateidavid/nanocall |
| DeepNano | RNN | FAST5 | FASTA | https://github.com/jeammimi/deepnano |
| Nanonet | RNN | FAST5 | FASTA | https://github.com/ProgramFiles/nanonet |
| basecRAWller | | | | https://www.osti.gov/biblio/1572483 |
| Chiron | CNNs, RNNs, and CTC decoder | FAST5 | FASTA/FASTQ | https://github.com/haotianteng/chiron |
| Scrappie | RNN | FAST5 | FASTA/SAM | https://github.com/Nanoporetech/scrappie |
| Bonito | QuartzNet13 (multiple TCSConv-BNReLU) | FAST5 | FASTQ, SAM, BAM, CRAM | https://github.com/Nanoporetech/bonito |
| URNano | Refined U-net model for signal segmentation and basecalling | FAST5 | FASTA/FASTQ | https://github.com/yaozhong/URnano |
| Causalcall | TCN | FAST5 | FASTA/FASTQ | https://github.com/scutbioinformatic/causalcall |
| Heron and Osprey | CNN with dynamic pooling | FAST5 | FASTA | https://github.com/fmfi-compbio/heron; https://github.com/fmfi-compbio/osprey |
| DeepNano-coral | Enhanced version of Bonito-CNN that factorizes a full convolution into smaller operations | FAST5 | FASTA | https://github.com/fmfi-compbio/coral-training |
| DeepNano-blitz | Bi-directional RNN | FAST5 | FASTA | https://github.com/fmfi-compbio/deepnano-blitz |
| BRAWL | Binarization of Chiron basecaller (CNN+RNN+CTC) | FAST5 | FASTA/FASTQ | https://github.com/haotianteng/Chiron |
| SACall | Convolution layers, transformer self-attention layers, and a CTC decoder | FAST5 | FASTA/FASTQ | https://github.com/huangnengCSU/SACall-basecaller |
| CATCaller | Convolution-augmented transformer architecture | FAST5 | FASTA/FASTQ | https://github.com/biomed-AI/CATCaller |
| Halcyon | CNN module and RNN-based encoder and decoder | FAST5 | FASTA | https://github.com/relastle/halcyon |
| Ravvent | Encoder-decoder architecture with attention mechanism and LSTMs as RNNs | FAST5 | FASTA | https://github.com/adamnapieralski/ravventbasecaller |
| Dorado | Based on libtorch, the C++ API for PyTorch | POD5 | HTS format | https://github.com/nanoporetech/dorado |

NOTE: CNN = convolutional neural network; CTC = connectionist temporal classification decoder; TCN = temporal convolutional network; LSTM = long short-term memory network; API = application programming interface. Other abbreviations are defined in the Front Matter.
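
Several basecallers in Table A-1 (for example, Chiron and SACall) list a CTC decoder as part of their principle of analysis. The following is a minimal sketch of the simplest form of CTC decoding, best-path (greedy) decoding, assuming the network has already produced one most-likely label per signal frame. The blank symbol `-` and the function name are illustrative assumptions and are not taken from any tool in the table; production basecallers typically use more elaborate decoders such as beam search.

```python
from itertools import groupby

BLANK = "-"  # hypothetical symbol standing in for the CTC "blank" label


def ctc_greedy_decode(frame_labels, blank=BLANK):
    """Collapse a per-frame label string into a base sequence.

    Best-path CTC decoding has two steps:
    1. merge consecutive repeated labels into one, and
    2. drop the blank labels that separate genuine repeats.
    """
    collapsed = (label for label, _ in groupby(frame_labels))
    return "".join(label for label in collapsed if label != blank)


# The blank between the two C runs is what preserves the repeated base.
print(ctc_greedy_decode("AA-CC--C-GGT"))  # ACCGT
```

The order of the two steps matters: removing blanks before merging repeats would collapse true homopolymers such as "CC" into a single base.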

TABLE A-2 Various Quality Control Tools Used with Nanopore Sequencing Data

| QC Tool | Principle of Analysis | Input | Output | Resource |
|---|---|---|---|---|
| Poretools | Open-source software written in Python | FAST5 files | Tables & Graphs (PNG & PDF) | https://github.com/arq5x/poretools |
| poRe | Cross-platform tool scripted in R | FAST5 files | Tables & Graphs | https://github.com/mw55309/poRe_docs |
| NanoOK | Alignment-based QC tool, written in Java with supporting R scripts | FAST5 files | PDF report | https://github.com/TGAC/NanoOK |
| NanoPack | Long-read QC tool written in Python | FASTQ, sorted BAM, & sequencing summary files | HTML, PDF, PNG & JPG | https://github.com/wdecoster/nanopack |
| ToulligQC | QC tool for Guppy basecalled files, written in Python 3 | Sequencing summary files | HTML | https://github.com/GenomicParisCentre/toulligQC |
| MinIONQC | Non-interactive tool written in R | Sequencing summary file | PNG & YAML format | https://github.com/roblanf/minion_qc |
| PycoQC | Interactive QC tool written in Python | Sequencing summary file | HTML | https://github.com/a-slide/pycoQC |
| NanoR | A cross-platform R package | FAST5 file or sequencing summary file | TSV file or HTML file | https://github.com/davidebolo1993/NanoR |
| PyPore | Open-source interactive software written in Python 2.7 | FAST5 files | HTML | https://github.com/rsemeraro/PyPore |
| LongQC | Automated QC tool written in Python 3 | FASTA, FASTQ, or a PacBio BAM file | JSON and HTML | https://github.com/yfukasawa/LongQC |
| BoardION | Web application based on R | FAST5 file or sequencing summary file | HTML | https://github.com/institut-degenomique/BoardION |
| IONiseR | R package | FAST5 | R plots | https://www.bioconductor.org/packages/release/bioc/html/IONiseR.html |
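
The QC tools in Table A-2 report summaries such as read-length distributions and per-read quality. As a rough illustration of what such summaries involve, the sketch below computes a read-length N50 and a mean Phred score per read from FASTQ text. It assumes four-line FASTQ records and Phred+33 quality encoding; the function names are hypothetical and do not correspond to the interface of any listed tool. The mean here is a plain arithmetic mean of Phred scores, which is a simplification; a tool may instead average the underlying error probabilities.

```python
import io


def read_fastq(handle):
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip()
        if not header:
            break
        seq = handle.readline().rstrip()
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip()
        yield header[1:], seq, qual


def n50(lengths):
    """Return the length L such that reads of length >= L hold half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


def mean_phred(qual, offset=33):
    """Arithmetic mean of the Phred scores in a quality string (Phred+33 assumed)."""
    return sum(ord(char) - offset for char in qual) / len(qual)


# Three toy reads of length 8, 4, and 6 with uniform quality strings.
example = io.StringIO(
    "@r1\nACGTACGT\n+\nIIIIIIII\n@r2\nACGT\n+\n!!!!\n@r3\nACGTAC\n+\n555555\n"
)
records = list(read_fastq(example))
print(n50([len(seq) for _, seq, _ in records]))    # 6
print([mean_phred(qual) for _, _, qual in records])  # [40.0, 0.0, 20.0]
```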

TABLE A-3 ONT Assemblers for Long-Reads

| Name of the tool | Principle of analysis | Input | Output | Resource |
|---|---|---|---|---|
| MaSuRCA | De Bruijn graph and overlap-layout-consensus | FASTA/FASTQ | FASTA | https://github.com/alekseyzimin/masurca |
| Canu | Detection of overlaps in high-noise sequences using MinHash Alignment Process | FASTA/FASTQ, uncompressed or compressed in gzip, bzip2, or xz | FASTA | https://github.com/marbl/canu |
| Unicycler | Miniasm+Racon pipeline through a short-read-first hybrid assembly | FASTQ | FASTA | https://github.com/rrwick/Unicycler |
| HINGE | Variation of the greedy algorithm, called hinging, without a preassembly or read correction step | FASTA | FASTA | https://github.com/HingeAssembler/HINGE |
| Flye | Repeat graphs from the analysis of random paths in an unknown repeat graph, called disjointigs | FASTQ or FASTA format, uncompressed or compressed with gzip | FASTA | https://github.com/fenderglass/Flye |
| Shasta | Run-length representation of the read sequence | FASTA/FASTQ | FASTA/GFA | https://github.com/chanzuckerberg/shasta |
| Wtdbg2 | FBG-based assembly of long noisy reads | FASTA/FASTQ | FASTA | https://github.com/ruanjue/wtdbg2 |

NOTE: FBG = fuzzy De Bruijn graph.
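
Several entries in Table A-3 rely on De Bruijn graphs. The following is a minimal sketch of the textbook construction, in which each k-mer in a read becomes an edge from its (k-1)-base prefix to its (k-1)-base suffix. It is intended only to illustrate the data structure; the function name is hypothetical, and the listed assemblers use considerably more involved variants (for example, the fuzzy De Bruijn graph of Wtdbg2) together with error handling that is omitted here.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a De Bruijn graph as {prefix (k-1)-mer: set of suffix (k-1)-mers}."""
    graph = defaultdict(set)
    for read in reads:
        for start in range(len(read) - k + 1):
            kmer = read[start:start + k]
            # Each k-mer contributes one edge: prefix -> suffix.
            graph[kmer[:-1]].add(kmer[1:])
    return graph


# Two overlapping toy reads; the shared k-mer CGT links them in the graph.
graph = de_bruijn_graph(["ACGT", "CGTA"], k=3)
for node in sorted(graph):
    print(node, "->", sorted(graph[node]))
# AC -> ['CG']
# CG -> ['GT']
# GT -> ['TA']
```

Walking the edges AC, CG, GT, TA in order spells out ACGTA, the sequence that both reads overlap to form.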

TABLE A-4 Aligners for RNA-Seq Data

| Tool | Principle of analysis | Input | Output | Source |
|---|---|---|---|---|
| HISAT | Mapping next-generation sequencing reads (both DNA and RNA) to a population of human genomes as well as to a single reference genome | FASTA/FASTQ/SEQ | SAM | https://github.com/DaehwanKimLab/hisat2 |
| Minimap2 | Overlapping read-to-read sequences with k-mers | FASTA/FASTQ | SAM/PAF | https://github.com/lh3/minimap2 |
| DART | Separation of sequence into segments to replace the seed extension step | FASTA/FASTQ | SAM/BAM | https://github.com/hsinnan75/Dart |
| Magic-BLAST | Selection of the best scoring alignment from sequences found in the NCBI | FASTA | SAM/BAM | https://github.com/ncbi/magicblast/tree/master/magicblast-tools |
| GraphMap2 | Subdivision of k-mer similarity | FASTA/FASTQ | SAM | https://github.com/lbcb-sci/graphmap2 |
| deSALT | De Bruijn graph-based spliced aligner for long reads | FASTA/FASTQ | SAM | https://github.com/ydLiuHIT/deSALT |
| uLTRA | Alignment of reads to a genome through guided annotation of exons | FASTA/FASTQ | SAM | https://github.com/ksahlin/ultra |
| mapAlign | Map long reads against a reference genome or a prebuilt index file | FASTA/FASTQ | SAM | https://github.com/yw575/mapAlign |

NOTE: NCBI = National Center for Biotechnology Information.
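
Many of the aligners in Table A-4 begin by finding k-mers shared between a read and the reference. The sketch below illustrates that seeding idea in its simplest form: index every reference k-mer by position, look up each read k-mer, and let each hit vote for the offset (reference position minus read position) at which the read would sit. This is an assumption-laden simplification with hypothetical function names; it ignores sequencing errors, splicing, and reverse-complement strands, and real aligners add steps such as seed subsampling, chaining, and base-level alignment.

```python
from collections import Counter, defaultdict


def build_kmer_index(reference, k):
    """Map each k-mer in the reference to the list of positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def best_offset(read, index, k):
    """Return (offset, votes) for the most supported placement, or None if no hits."""
    votes = Counter()
    for read_pos in range(len(read) - k + 1):
        for ref_pos in index.get(read[read_pos:read_pos + k], ()):
            votes[ref_pos - read_pos] += 1
    return votes.most_common(1)[0] if votes else None


reference = "TTGACCGTAGGCTA"
index = build_kmer_index(reference, k=4)
# All four k-mers of the read agree that it starts at reference position 4.
print(best_offset("CCGTAGG", index, k=4))  # (4, 4)
```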
