Troubleshooting
This page contains Dorado troubleshooting advice to help users resolve issues which are known to appear from time to time.
If you have an issue that cannot be resolved by following the advice below, please raise a new issue on the Dorado GitHub issues page, providing as much information as possible, and the Dorado team will aim to respond promptly.
You can also seek advice from the Nanopore Community.
Errors and warnings
Dorado will issue warnings and errors to stderr during runtime and may terminate if an
unrecoverable error occurs. Many errors stem from incorrect configuration of the
command line and the following are examples of common
issues reported on GitHub.
No supported chemistry found
[error] No supported chemistry found for flowcell_code: '__UNKNOWN_FLOWCELL__' sequencing_kit: '__UNKNOWN_KIT__' sample_rate: 5000
[error] This is typically seen when using prototype kits. Please download an appropriate model for your data and select it by model path
When using an automatic model selection complex, Dorado must be able to determine which model to use by inspecting the input data, which must be in POD5 format.
If your data doesn't contain a recognised flow cell (e.g. __UNKNOWN_FLOWCELL__) or
sequencing_kit (e.g. __UNKNOWN_KIT__), Dorado cannot find a suitable model for your data.
To basecall your data, you first need to download a basecalling model which is appropriate for your data, then specify the model using its filepath, as shown in the simplex basecalling quick-start or in the example below:
dorado download --model dna_r10.4.1_e8.2_400bps_fast@v5.2.0
dorado basecaller dna_r10.4.1_e8.2_400bps_fast@v5.2.0 reads/ > calls.bam
For details please check out the models introduction and the models list.
Incompatible modbase models
[error] Following are incompatible modbase models.
Please select only one of them to run: model_A and model_B have overlapping canonical motif: A
This error is shown when the user selects two modbase models which share a canonical base. This combination is invalid, as detailed here.
Common causes of this error are selecting these pairs of modbase models:
- 5mC_5hmC + 5hmCG_5hmCG
- 5mC_5hmC + 4mC_5mC
- m6A_DRACH + inosine_m6A
Runtime issues
CUDA Out Of Memory
Dorado supports multiple model architectures which can vary significantly in size (fast, hac, sup).
Multiple models are also used together when using features such as modification basecalling,
stereo duplex basecalling and hemi-methylation duplex basecalling. As such there are
cases where excessive GPU memory consumption can unexpectedly terminate Dorado.
Unless specified otherwise by the user, Dorado will attempt to calculate the optimal batch size using the auto batch size protocol. This algorithm tests multiple batch sizes for the models in use and selects the batch size which gives the best performance. However, many factors can cause this algorithm to select a batch size which exceeds the available GPU memory, especially when combined with modification or stereo duplex models.
These factors include but are not limited to:
- Other processes using GPU resources (including other instances of Dorado)
- Display devices
- GPU with insufficient memory (Dorado does not support GPUs with <8GB of memory)
To resolve CUDA out-of-memory issues, inspect the Dorado output from a previous run, which should report the batch size used, as shown in the example below:
dorado basecaller <model> <reads> ... > calls.bam
[info] > Creating basecall pipeline
[info] - set batch size to 480
This example shows a batch size of 480. We can use this as a guide for specifying the batch size
manually using the --batchsize argument in basecaller and duplex. Reduce the batch size
in steps of an even value such as 32, 48 or 64, starting with approximately 10%
of the original auto batch size estimate. Here 480 - 48 = 432, giving:
dorado basecaller <model> <reads> --batchsize 432 ... > calls.bam
Repeat the above until Dorado completes successfully without running out of GPU memory.
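The manual reduction described above can be sketched as a small shell helper. This is a minimal sketch and not part of Dorado: the function names next_batchsize and basecall_with_backoff are hypothetical, the rule of rounding the step down to a multiple of 16 is an assumption, and the wrapper retries after any non-zero exit status rather than detecting out-of-memory errors specifically.

```shell
# Hypothetical helper: reduce a batch size by roughly 10%.
# Assumption: the step is rounded down to a multiple of 16, with a minimum of 16.
next_batchsize() {
    nb_step=$(( $1 / 10 / 16 * 16 ))
    if [ "$nb_step" -lt 16 ]; then
        nb_step=16
    fi
    echo $(( $1 - nb_step ))
}

# Hypothetical wrapper: rerun basecalling with progressively smaller batch sizes.
# Usage: basecall_with_backoff <model> <reads> <output.bam> <starting_batchsize>
# Note: it retries after ANY failure, not only CUDA out-of-memory errors.
basecall_with_backoff() {
    bb_batch=$4
    while [ "$bb_batch" -gt 0 ]; do
        echo "Trying --batchsize ${bb_batch}" >&2
        if dorado basecaller "$1" "$2" --batchsize "$bb_batch" > "$3"; then
            return 0
        fi
        bb_batch=$(next_batchsize "$bb_batch")
    done
    return 1
}
```

Starting from the first manual estimate, `basecall_with_backoff <model> <reads> calls.bam 432` would try 432, then 400, and continue downwards until a run succeeds.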
Low GPU utilization
Low GPU utilization can lead to reduced basecalling speed. This problem can be identified using
tools such as nvidia-smi and nvtop. Low GPU utilization often stems from I/O bottlenecks
in basecalling.
Here are a few steps you can take to improve the situation:
- Use POD5 instead of .fast5:
  - POD5 has superior I/O performance and will enhance the basecall speed in I/O constrained environments.
- Transfer data to the local disk before basecalling:
  - Frequently network disks cannot supply Dorado with adequate I/O speeds. To mitigate this, make sure your data is as close to your host machine as possible.
- Choose SSD over HDD:
  - Particularly for duplex basecalling, using a local SSD can offer significant speed advantages. This is due to the duplex basecalling algorithm's reliance on heavy random access of data.
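Monitoring the GPU and staging data locally can be sketched in shell as follows. This is a minimal sketch under stated assumptions: the function names watch_gpu and stage_reads and the NVIDIA_SMI override variable are hypothetical, and the nvidia-smi query fields are assumed to be supported by your driver version.

```shell
# Sample GPU utilisation and memory use every N seconds (default 5) while Dorado runs.
# NVIDIA_SMI is a hypothetical override; it defaults to the nvidia-smi found on PATH.
watch_gpu() {
    "${NVIDIA_SMI:-nvidia-smi}" \
        --query-gpu=timestamp,utilization.gpu,memory.used,memory.total \
        --format=csv -l "${1:-5}"
}

# Copy a directory of reads to local scratch space before basecalling.
# Usage: stage_reads <source_dir> <local_dir>  (both paths are chosen by the caller)
stage_reads() {
    mkdir -p "$2" && cp -R "$1"/. "$2"/
}
```

Run watch_gpu in a second terminal during basecalling. If utilisation stays low while Dorado is running, staging the reads onto a local SSD with stage_reads and basecalling from there is a reasonable first thing to try.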
Library path errors
Dorado comes equipped with the necessary libraries (such as CUDA) for its execution.
However, on some operating systems, the system libraries might be chosen over Dorado's.
This discrepancy can result in various errors, for instance, CuBLAS error 8.
To resolve this issue, you need to set the LD_LIBRARY_PATH to point to Dorado's libraries.
Use a command like the following, changing the path as appropriate:
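A minimal sketch for Linux, assuming Dorado was unpacked to a hypothetical directory such as /opt/dorado and that its bundled libraries sit in the lib subdirectory of that location. Set DORADO_HOME to your own install path.

```shell
# DORADO_HOME is a hypothetical install location; change it to where Dorado was unpacked.
DORADO_HOME="${DORADO_HOME:-/opt/dorado}"

# Put Dorado's bundled libraries ahead of any system copies.
# The ${VAR:+...} form avoids a trailing colon when LD_LIBRARY_PATH was previously unset.
export LD_LIBRARY_PATH="${DORADO_HOME}/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
```

Run this in the same shell session before invoking dorado, or add it to your shell profile to make it persistent.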
Windows GPU performance
On Windows systems with Nvidia GPUs, open Nvidia Control Panel, navigate into “Manage 3D settings” and then set “CUDA - Sysmem Fallback Policy” to “Prefer No Sysmem Fallback”. This will provide a significant performance improvement.
Windows PowerShell encoding
When running in PowerShell on Windows, care must be taken, as the default encoding for application
output is typically UTF-16LE. This will cause file corruption if standard output is redirected to a file.
It is recommended to use the --output-dir argument to emit BAM files if PowerShell must be used.
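For example, the following command will create corrupt output which cannot be read by samtools:
dorado basecaller <model> <reads> > calls.bam
Instead, use:
dorado basecaller <model> <reads> --output-dir <output_dir>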
Warning
Using out-file with Ascii encoding will not produce well-formed BAM (binary) files.
For text-based output formats (SAM or FASTQ), it is possible to override the encoding on output using the out-file command.
This command will produce a well-formed ASCII SAM file:
dorado basecaller <model> <reads> --emit-sam | out-file -encoding Ascii calls.sam
Read more about PowerShell output encoding here.