POD5 API Reader Reference
Tools for accessing POD5 data from PyArrow files
ArrowTableHandle
ArrowTableHandle(location: EmbeddedFileData, options: Optional[IpcReadOptions] = None)
Class for managing arrow file handles and memory view mapping of tables
Open a pod5 file at the given path and use the location data to load
an arrow table (e.g. signal table)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| location | EmbeddedFileData | Location data describing how a pod5 file should be split in memory to read a table. This is returned from the p5b.Pod5FileReader.get_file_X_location methods. | required |
| options | Optional[IpcReadOptions] | Serialization options for reading IPC format. | None |
Raises:
| Type | Description |
|---|---|
| Pod5ApiException | If the handle could not be opened. |
ReadRecord
ReadRecord(
reader: Reader,
batch: ReadRecordBatch,
row: int,
batch_signal_cache: Optional[List[NDArray[int16]]] = None,
selected_batch_index: Optional[int] = None,
)
Represents the data for a single read from a pod5 record.
calibration_digitisation
property
Get the digitisation value used by the sequencer.
Intended to assist workflows ported from legacy file formats.
calibration_range
property
Get the calibration range value.
Intended to assist workflows ported from legacy file formats.
end_reason_index
property
Get the dictionary index of the end reason data associated with the read. This property is the same as the EndReason enumeration value.
median_before
property
Get the median before level (in pico amps) for the read.
num_reads_since_mux_change
property
Number of selected reads since the last mux change on this read's channel.
open_pore_level
property
Get the open pore level for the read.
This is a float value representing the open pore level of the well prior to the read starting.
predicted_scaling
property
predicted_scaling: ShiftScalePair
Find the predicted scaling value in the read.
run_info_index
property
Get the dictionary index of the run info data associated with the read.
signal
property
Get the full signal for the read.
Returns:
| Type | Description |
|---|---|
| ndarray[int16] | A numpy array of signal data with int16 type. |
signal_pa
property
Get the full signal for the read, calibrated in pico amps.
Returns:
| Type | Description |
|---|---|
| ndarray[float32] | A numpy array of signal data in pico amps with float32 type. |
signal_rows
property
Get all signal rows for the read
Returns:
| Type | Description |
|---|---|
| list[SignalRowInfo] | A list of signal row data (as SignalRowInfo) in the read. |
time_since_mux_change
property
Time in seconds since the last mux change on this read's channel.
tracked_scaling
property
tracked_scaling: ShiftScalePair
Find the tracked scaling value in the read.
calibrate_signal_array
Transform an array of int16 signal data from ADC space to pA.
Returns:
| Type | Description |
|---|---|
| ndarray[float32] | A numpy array of signal data with float32 type. |
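As a rough illustration of what an ADC-to-pA transform involves, the sketch below applies the conventional legacy-format relationship pA = (ADC + offset) × (range / digitisation) in plain Python. The function name `adc_to_picoamps` and the `offset` argument are assumptions made for illustration; this section does not document them, and the library's own implementation may differ in detail.

```python
from typing import Iterable, List


def adc_to_picoamps(
    adc_samples: Iterable[int],
    offset: float,
    calibration_range: float,
    digitisation: float,
) -> List[float]:
    """Convert raw ADC samples to picoamps (illustrative sketch only)."""
    # Assumed relationship: scale is the pA value of one ADC step.
    scale = calibration_range / digitisation
    # Shift each sample by the offset, then scale it into pA.
    return [(sample + offset) * scale for sample in adc_samples]
```

In practice `calibration_range` and `calibration_digitisation` from a ReadRecord would supply the last two arguments, while the offset would have to come from the read's calibration data.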
signal_for_chunk
Get the signal for a given chunk of the read.
Returns:
| Type | Description |
|---|---|
| ndarray[int16] | A numpy array of signal data with int16 type for the specified chunk. |
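A minimal sketch of reading a signal piece by piece rather than all at once. It assumes that `signal_for_chunk` accepts a zero-based chunk index and that there is one chunk per entry in `signal_rows`; neither is stated in this section, so check both against the installed version. `iter_signal_chunks` is a hypothetical helper name.

```python
from typing import Any, Iterator


def iter_signal_chunks(read: Any) -> Iterator[Any]:
    """Yield the signal of a read one chunk at a time."""
    # Assumption: one chunk per signal row, addressed by zero-based index.
    for chunk_index in range(len(read.signal_rows)):
        yield read.signal_for_chunk(chunk_index)
```

Processing chunks lazily like this can keep peak memory lower than materialising the full `signal` array for very long reads.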
ReadRecordBatch
ReadRecordBatch(reader: Reader, batch: RecordBatch)
Read data for a batch of reads.
cached_sample_count_column
property
Get the sample counts from the cached signal data
cached_samples_column
property
Get the samples column from the cached signal data
columns
property
Return the data from this batch as a ReadRecordColumns instance
read_id_column
property
Get the column of read ids for this batch
read_number_column
property
Get the column of read numbers for this batch
reads
reads() -> Generator[ReadRecord, None, None]
Iterate all reads in this batch.
Yields:
| Type | Description |
|---|---|
| ReadRecord | ReadRecord instances in the file. |
set_cached_signal
set_cached_signal(signal_cache: Pod5SignalCacheBatch) -> None
Set the signal cache
Reader
The base reader for POD5 data
Open a pod5 filepath for reading
file_version
property
The version of pod5 that originally generated this file. This value is not changed when the file is updated.
file_version_pre_migration
property
The version of pod5 that is stored with the file on disk.
inner_file_reader
property
inner_file_reader: Pod5FileReader
Access the inner c_api Pod5FileReader - use with caution
read_ids
property
Return all read_ids as a list of strings.
For the most performant implementation consider Reader.read_ids_raw
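For example, `read_ids` can be used to check which requested reads are present before iterating. This is a sketch only: `split_present_and_missing` is a hypothetical helper, and `reader` is any object exposing a `read_ids` property that returns a list of strings as described above.

```python
from typing import Any, Iterable, List, Tuple


def split_present_and_missing(
    reader: Any, wanted: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Partition requested read ids into those in the file and those not."""
    # Build a set once so each membership test is O(1).
    available = set(reader.read_ids)
    present: List[str] = []
    missing: List[str] = []
    for read_id in wanted:
        if read_id in available:
            present.append(read_id)
        else:
            missing.append(read_id)
    return present, missing
```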
read_ids_raw
property
Return chunked arrow array of read ids.
To get read ids as string use Reader.read_ids
signal_table
property
Access the pod5 signal table - use with caution
get_batch
get_batch(index: int) -> ReadRecordBatch
Get a read batch in the file.
Returns:
| Type | Description |
|---|---|
| ReadRecordBatch | The requested batch as a ReadRecordBatch. |
read_batches
read_batches(
selection: Optional[List[str]] = None,
batch_selection: Optional[Iterable[int]] = None,
missing_ok: bool = False,
preload: Optional[Set[str]] = None,
) -> Generator[ReadRecordBatch, None, None]
Iterate batches in the file, optionally selecting certain rows.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| selection | iterable[str] | The read ids to walk in the file. | None |
| batch_selection | iterable[int] | The read batches to walk in the file. | None |
| missing_ok | bool | If False, an error is raised when selection contains entries not found in the file. If True, such entries are skipped. | False |
| preload | set[str] | Columns to preload. "samples" and "sample_count" are valid values. | None |
Returns:
| Type | Description |
|---|---|
| Generator[ReadRecordBatch, None, None] | A generator yielding ReadRecordBatch instances. |
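A short sketch of batch-wise iteration using `read_batches` together with `ReadRecordBatch.reads`. The helper name `count_reads_per_batch` is hypothetical, and `reader` is duck-typed: any object with the `read_batches` method documented above will do.

```python
from typing import Any, Dict


def count_reads_per_batch(reader: Any) -> Dict[int, int]:
    """Map each batch's position in iteration order to its read count."""
    counts: Dict[int, int] = {}
    # preload asks the reader to cache the named signal column up front.
    for position, batch in enumerate(reader.read_batches(preload={"samples"})):
        counts[position] = sum(1 for _ in batch.reads())
    return counts
```

Assuming the package is importable as `pod5`, a real call might look like `count_reads_per_batch(pod5.Reader(path))`, where `path` points at an existing pod5 file.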
reads
reads(
selection: Optional[Iterable[str]] = None,
missing_ok: bool = False,
preload: Optional[Set[str]] = None,
) -> Generator[ReadRecord, None, None]
Iterate reads in the file, optionally filtering for certain read ids.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| selection | iterable[str] | The read ids to walk in the file. | None |
| missing_ok | bool | If False, an error is raised when selection contains entries not found in the file. If True, such entries are skipped. | False |
| preload | set[str] | Columns to preload. "samples" and "sample_count" are valid values. | None |
Returns:
| Type | Description |
|---|---|
| Generator[ReadRecord, None, None] | A generator yielding ReadRecord instances. |
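The sketch below shows one way `reads` might be combined with `selection` and `missing_ok` to collect signal lengths for a set of read ids. `signal_lengths` is a hypothetical helper, and the `read_id` attribute on each record is an assumption, since it is not listed in this section.

```python
from typing import Any, Dict, Iterable


def signal_lengths(reader: Any, read_ids: Iterable[str]) -> Dict[str, int]:
    """Return the number of signal samples for each requested read found."""
    lengths: Dict[str, int] = {}
    # missing_ok=True: ids absent from the file do not raise an error.
    for read in reader.reads(selection=list(read_ids), missing_ok=True):
        # Assumption: each record exposes a read_id attribute.
        lengths[str(read.read_id)] = len(read.signal)
    return lengths
```

Ids that are not in the file simply do not appear in the returned dictionary, so comparing its keys with the requested ids reveals which reads were missing.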