view
Field
Bases: NamedTuple
Container class for storing the polars expression for a named field
assert_unique_acquisition_id
Check that the acquisition ids are unique, raising AssertionError otherwise
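A minimal sketch of the uniqueness check described above. It works on a plain iterable of ids rather than on the tables the real function presumably receives; the helper name `check_unique_ids`, its input type, and the error message are assumptions for illustration only.

```python
from collections import Counter
from typing import Iterable


def check_unique_ids(acquisition_ids: Iterable[str]) -> None:
    """Raise AssertionError if any acquisition id appears more than once."""
    counts = Counter(acquisition_ids)
    duplicates = sorted(acq_id for acq_id, n in counts.items() if n > 1)
    if duplicates:
        # Hypothetical message; the real wording may differ.
        raise AssertionError(f"Duplicate acquisition ids: {duplicates}")
```

Raising explicitly instead of using an `assert` statement keeps the check active even when Python runs with `-O`.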
format_view_table
Format the view table based on the selected fields
get_field_or_raise
get_field_or_raise(key: str) -> Field
Get the Field for this key or raise a KeyError
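The lookup-or-raise behaviour can be sketched as a dictionary access that re-raises with a clearer message. The `Field` attributes (`expr`, `docs`), the `FIELDS` registry, and its two entries are hypothetical stand-ins, not the module's actual definitions.

```python
from typing import Any, Dict, NamedTuple


class Field(NamedTuple):
    """Stand-in for the documented Field; attribute names are assumed."""

    expr: Any  # would hold a polars expression in the real class
    docs: str


# Hypothetical registry mapping field names to Field entries.
FIELDS: Dict[str, Field] = {
    "read_id": Field(expr="read_id", docs="Read UUID"),
    "channel": Field(expr="channel", docs="Channel number"),
}


def get_field_or_raise(key: str) -> Field:
    """Return the Field registered under key, or raise KeyError."""
    try:
        return FIELDS[key]
    except KeyError:
        raise KeyError(f"Unknown field: {key!r}") from None
```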
get_reads_tables
get_reads_tables(
path: Path, selected_fields: Set[str], threshold: int = 100000
) -> Generator[LazyFrame, None, None]
Generate lazy dataframes from pod5 records. If the number of records
is greater than threshold, yield chunks to limit memory consumption and
improve overall performance.
join_reads_to_run_info
Join the reads and run_info tables
join_workers
Poll workers checking for exceptions which will likely cause
parse_read_table_chunks
parse_read_table_chunks(
reader: Reader, approx_size: int = 99999
) -> Generator[LazyFrame, None, None]
Read record batches and yield polars lazyframes of approx_size records.
Records are yielded in units of whole batches of the underlying table
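To illustrate what "units of whole batches" means, the sketch below groups batch sizes without ever splitting a batch. The function name `chunk_whole_batches` and the rule of closing a group once it holds at least approx_size records are assumptions; the real function works on record batches from a Reader, not on integers.

```python
from typing import Generator, Iterable, List


def chunk_whole_batches(
    batch_sizes: Iterable[int], approx_size: int = 99999
) -> Generator[List[int], None, None]:
    """Group consecutive batches until a group holds at least approx_size records."""
    group: List[int] = []
    total = 0
    for size in batch_sizes:
        group.append(size)
        total += size
        if total >= approx_size:
            yield group
            group, total = [], 0
    if group:  # flush the final, possibly smaller, group
        yield group
```

Because batches are never split, a group can exceed approx_size by up to one batch, which is why the size is only approximate.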
parse_reads_table_all
parse_reads_table_all(reader: Reader) -> LazyFrame
Parse all records in the reads table returning a polars LazyFrame
parse_reads_table_batch
parse_reads_table_batch(reader: Reader, batch_index: int) -> Tuple[LazyFrame, int]
Parse the reads table record batch at batch_index from a pod5 file returning a
polars LazyFrame and the number of records in it
parse_run_info_table
parse_run_info_table(reader: Reader) -> LazyFrame
Parse the run_info table from a pod5 file returning a polars LazyFrame
resolve_output
Resolve the output path if necessary, checking against accidental overwrite and resolving to the default output if given a path
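A hedged sketch of this kind of output resolution using pathlib. The default file name, the rule that a directory resolves to a default file inside it, and the `force_overwrite` parameter on this helper are assumptions made for the example, not the documented signature.

```python
from pathlib import Path

DEFAULT_NAME = "output.tsv"  # hypothetical default file name


def resolve_output_sketch(output: Path, force_overwrite: bool = False) -> Path:
    """Return the file path to write, refusing to clobber an existing file."""
    # Assumption: a directory is resolved to a default file inside it.
    if output.is_dir():
        output = output / DEFAULT_NAME
    if output.exists() and not force_overwrite:
        raise FileExistsError(f"{output} exists; refusing to overwrite")
    return output
```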
select_fields
select_fields(
*,
group_read_id: bool = False,
include: Optional[str] = None,
exclude: Optional[str] = None
) -> Set[str]
Select fields to write
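One plausible reading of the include and exclude parameters is sketched below. It assumes both are comma-separated field names, that an empty include means all fields, and that unknown names raise KeyError; none of this is confirmed by the signature above, and `group_read_id` is left out. `ALL_FIELDS` is a hypothetical list.

```python
from typing import List, Optional, Set

ALL_FIELDS: List[str] = ["read_id", "channel", "well", "num_samples"]  # hypothetical


def _split(spec: Optional[str]) -> Set[str]:
    """Split a comma-separated string into a set of stripped names."""
    if not spec:
        return set()
    return {name.strip() for name in spec.split(",") if name.strip()}


def select_fields_sketch(
    *, include: Optional[str] = None, exclude: Optional[str] = None
) -> Set[str]:
    """Start from include (or all fields) and drop anything in exclude."""
    selected = _split(include) or set(ALL_FIELDS)
    excluded = _split(exclude)
    unknown = (selected | excluded) - set(ALL_FIELDS)
    if unknown:
        raise KeyError(f"Unknown fields: {sorted(unknown)}")
    return selected - excluded
```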
view_pod5
view_pod5(
inputs: List[Path],
output: Path,
separator: str = "\t",
recursive: bool = False,
force_overwrite: bool = False,
list_fields: bool = False,
no_header: bool = False,
threads: int = DEFAULT_THREADS,
**kwargs
) -> None
Given a list of POD5 files write a table to view their contents
worker_process
worker_process(
paths: JoinableQueue,
exceptions: JoinableQueue,
lock: Lock,
output: Path,
separator: bool,
selection: Set[str],
) -> None
Consume pod5 paths from the paths queue, parse the records, and write to output after
acquiring lock.
Returns None once the finish sentinel None is received from the paths queue.
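The consume-until-sentinel pattern with a lock around writes can be sketched as follows. To stay self-contained, this example swaps multiprocessing for threading and `queue.Queue`, writes to a shared list instead of a file, and replaces record parsing with a placeholder string; all names here are illustrative.

```python
import queue
import threading
from typing import List


def worker(paths: "queue.Queue", lock: threading.Lock, out: List[str]) -> None:
    """Consume items until the None sentinel arrives, writing under the lock."""
    while True:
        path = paths.get()
        try:
            if path is None:  # finish sentinel
                return
            line = f"parsed:{path}"  # placeholder for parsing pod5 records
            with lock:  # serialise writes to the shared output
                out.append(line)
        finally:
            paths.task_done()  # runs for sentinels too


paths: "queue.Queue" = queue.Queue()
lock = threading.Lock()
written: List[str] = []
threads = [
    threading.Thread(target=worker, args=(paths, lock, written)) for _ in range(2)
]
for t in threads:
    t.start()
for name in ["a.pod5", "b.pod5", "c.pod5"]:
    paths.put(name)
for _ in threads:
    paths.put(None)  # one sentinel per worker
for t in threads:
    t.join()
```

Queuing one sentinel per worker ensures every worker sees a None and exits, regardless of which worker picks up which path.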
write
Write the polars.LazyFrame