SAM specification
Header
@HD VN:1.6 SO:unknown
@PG ID:basecaller PN:dorado VN:0.2.4+3fc2b0f CL:dorado basecaller hac pod5/ DS:gpu:Quadro GV100
Read Group Header
Tag | Description |
---|---|
ID | `<runid>_<basecalling_model>_<barcode_arrangement>` |
PU | `<flow_cell_id>` |
PM | `<device_id>` |
DT | `<exp_start_time>` |
PL | `ONT` |
DS | `basecall_model=<basecall_model_name> modbase_models=<modbase_model_names> runid=<run_id>` |
LB | `<sample_id>` |
SM | `<barcode_name>` (only if barcoding, and barcode is not "unclassified") |
al | `<barcode_alias>` (only if barcoding, same as SM tag if no alias) |
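
As an illustration of how the read group fields above might be consumed, the following is a minimal sketch that splits an `@RG` header line into its tags and then splits the `DS` value into `key=value` pairs. The function names `parse_rg_line` and `parse_ds` are hypothetical helpers, not part of dorado, and every value in the example line is made up. The sketch assumes the `DS` pairs are separated by spaces and that no value contains a space.

```python
def parse_rg_line(line):
    """Split one tab-separated @RG header line into a {tag: value} dict."""
    fields = line.rstrip("\n").split("\t")
    if fields[0] != "@RG":
        raise ValueError("not an @RG header line")
    tags = {}
    for field in fields[1:]:
        # Split on the first colon only: values such as DT contain colons.
        key, _, value = field.partition(":")
        tags[key] = value
    return tags


def parse_ds(ds):
    """Split the DS value into its space-separated key=value pairs."""
    pairs = {}
    for token in ds.split():
        key, sep, value = token.partition("=")
        if sep:  # ignore tokens that have no '='
            pairs[key] = value
    return pairs


# Hypothetical header line; every value here is made up for illustration.
example = "\t".join([
    "@RG",
    "ID:run1_model-a_barcode01",
    "PU:FC001",
    "PM:DEV01",
    "DT:2023-01-01T12:00:00+00:00",
    "PL:ONT",
    "DS:basecall_model=model-a runid=run1",
    "LB:sample1",
    "SM:barcode01",
])
rg = parse_rg_line(example)
ds = parse_ds(rg["DS"])
print(rg["PL"], ds["basecall_model"], ds["runid"])
```

Partitioning on the first separator only keeps timestamp values such as `DT` intact, since they contain further colons.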
Read Tags
Tag | Description |
---|---|
RG:Z: | `<runid>_<basecalling_model>_<barcode_arrangement>` |
qs:f: | mean basecall q-score |
ts:i: | the number of samples trimmed from the start of the signal |
ns:i: | the basecalled sequence corresponds to the signal interval `signal[ts : ns]`, and the move table maps to the same interval. Note that `ns` reflects trimming (if any) from the rear of the signal. |
mx:i: | read mux |
ch:i: | read channel |
rn:i: | read number |
st:Z: | read start time (in UTC) |
du:f: | duration of the read (in seconds) |
fn:Z: | file name |
sm:f: | scaling midpoint/mean/median (pA to ~0-mean/1-sd) |
sd:f: | scaling dispersion (pA to ~0-mean/1-sd) |
sv:Z: | scaling version |
mv:B:c | sequence to signal move table (optional) |
dx:i: | bool to signify duplex read (only in duplex mode) |
pi:Z: | parent read id for a split read |
sp:i: | start coordinate of split read in parent read signal |
bh:i: | number of detected bedfile hits (only if alignment was performed with a specified bed-file) |
MN:i: | length of sequence at the time MM and ML were produced |
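
To show how these tags might be read back, here is a minimal sketch that parses the optional fields of a single SAM text record (columns 12 onwards) and uses `ts` and `ns` to slice a signal array. The helpers `parse_tag` and `parse_read_tags` are hypothetical, the example record is invented, and `signal` is a stand-in list instead of real raw data. In practice a library such as pysam would normally handle this parsing.

```python
def parse_tag(field):
    """Convert one 'TAG:TYPE:VALUE' optional field to (tag, python_value)."""
    # Split at most twice: Z values such as timestamps contain colons.
    tag, typ, value = field.split(":", 2)
    if typ == "i":
        return tag, int(value)
    if typ == "f":
        return tag, float(value)
    if typ == "B":
        # B arrays look like 'c,1,0,1': a subtype letter, then the items.
        subtype, *items = value.split(",")
        convert = float if subtype == "f" else int
        return tag, [convert(item) for item in items]
    return tag, value  # Z, A and H values are kept as strings


def parse_read_tags(sam_line):
    """Return the optional fields of one SAM record as a dict."""
    columns = sam_line.rstrip("\n").split("\t")
    return dict(parse_tag(field) for field in columns[11:])


# Hypothetical unaligned record; all values are made up for illustration.
record = "\t".join([
    "read1", "4", "*", "0", "0", "*", "*", "0", "0", "ACGT", "!!!!",
    "qs:f:12.5", "ts:i:10", "ns:i:110", "ch:i:7",
    "st:Z:2023-01-01T12:00:00.000+00:00", "mv:B:c,5,1,0,1",
])
tags = parse_read_tags(record)

signal = list(range(200))  # stand-in for the raw signal samples
# The basecalled sequence corresponds to signal[ts : ns].
basecalled_signal = signal[tags["ts"]:tags["ns"]]
print(len(basecalled_signal), tags["qs"], tags["st"])
```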
Poly(A/T) Tags
When `dorado` is run with poly(A/T) estimation enabled, additional tags are added to each SAM record as follows:
- `pt:i` is the estimated poly(A/T) tail length in cDNA and dRNA reads
- `pa:B:i` is an array of signal positions related to the poly(A/T) estimation, in order:
  - The position in the signal used as the anchor for the poly(A/T) search
  - The start of the poly(A/T) region
  - The end of the poly(A/T) region
  - The start of a secondary poly(A/T) region in the case of plasmids (-1 otherwise or if not found)
  - The end of a secondary poly(A/T) region in the case of plasmids (-1 otherwise or if not found)
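
The ordering of the `pa` array can be made explicit by unpacking it into named fields. The sketch below assumes the array always holds exactly the five positions listed above, in that order. The names `PolyTailPositions`, `decode_pa` and `secondary_region` are hypothetical, and the example values are invented.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class PolyTailPositions(NamedTuple):
    anchor: int            # signal position used as the search anchor
    start: int             # start of the poly(A/T) region
    end: int               # end of the poly(A/T) region
    secondary_start: int   # -1 if there is no secondary region
    secondary_end: int     # -1 if there is no secondary region


def decode_pa(values: Sequence[int]) -> PolyTailPositions:
    """Name the five entries of a pa:B:i array (assumed length: 5)."""
    if len(values) != 5:
        raise ValueError(f"expected 5 positions, got {len(values)}")
    return PolyTailPositions(*values)


def secondary_region(pos: PolyTailPositions) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the secondary region, or None if absent."""
    if pos.secondary_start == -1 or pos.secondary_end == -1:
        return None
    return pos.secondary_start, pos.secondary_end


# Hypothetical pa values for a read with no secondary region.
positions = decode_pa([1200, 1250, 1900, -1, -1])
tail_samples = positions.end - positions.start  # tail extent in signal samples
print(tail_samples, secondary_region(positions))
```

Returning `None` for a missing secondary region keeps callers from treating the `-1` sentinel as a real signal coordinate.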