convert_from_fast5

Tool for converting fast5 files to the pod5 format

OutputHandler

OutputHandler(output_root: Path, one_to_one: Optional[Path], force_overwrite: bool)

Class for managing p5.Writer handles

close_all

close_all()

Close all open writers

get_writer

get_writer(input_path: Path) -> Optional[Writer]

Get a Pod5Writer to write data from the input_path

resolve_one_to_one_path staticmethod

resolve_one_to_one_path(path: Path, root: Path, relative_root: Path)

Find the relative path between the input path and the relative root

resolve_output_path staticmethod

resolve_output_path(path: Path, root: Path, relative_root: Optional[Path]) -> Path

Resolve the output path. If relative_root is a path, resolve the relative output path under root. Otherwise, the output is root itself, or a new file within root if root is a directory
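The resolution rules described above can be illustrated with pathlib. This is a minimal sketch under stated assumptions, not the library's implementation: the function name sketch_resolve_output_path, the default file name output.pod5, and the switch of the file extension to .pod5 are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Optional


def sketch_resolve_output_path(
    path: Path, root: Path, relative_root: Optional[Path]
) -> Path:
    """Hypothetical illustration of the output path rules, not pod5's code."""
    if relative_root is not None:
        # One-to-one mode: mirror the input's location relative to
        # relative_root underneath root. Swapping the suffix to .pod5
        # is an assumption of this sketch.
        relative = path.relative_to(relative_root)
        return (root / relative).with_suffix(".pod5")
    if root.is_dir():
        # The default file name used here is an assumption of this sketch.
        return root / "output.pod5"
    # root is treated as the output file itself.
    return root


one_to_one_output = sketch_resolve_output_path(
    Path("/data/run1/a/reads.fast5"), Path("/out"), Path("/data/run1")
)
```

In this sketch, Path.relative_to raises ValueError when the input does not sit under relative_root, which is one reasonable way to surface a misconfigured one_to_one path.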

set_input_complete

set_input_complete(input_path: Path, is_exception: bool) -> None

Close the Pod5Writer for associated input_path

QueueManager

QueueManager(
    context: SpawnContext, inputs: Collection[Path], threads: int, timeout: float
)

Manager for balancing work queues

await_data

await_data() -> Tuple[Optional[Path], Union[List[CompressedRead], int, None]]

Await compressed reads or the total count of reads compressed (file end) for an input filepath. Enqueues the next request if necessary

await_request

await_request() -> None

Await a request for data

enqueue_data

enqueue_data(
    path: Optional[Path], reads: Union[List[CompressedRead], int, None]
) -> None

Enqueues an input path and either a list of compressed reads to be written, or the total count of reads converted for that path. Otherwise, if path is None, mark the child process as being empty.
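The message convention described above (a path with a list of reads, a path with a total count, or no path at all) can be sketched with a plain queue. This is only an illustration: it uses the thread-level queue.Queue rather than the process queues a SpawnContext would provide, the Read alias stands in for CompressedRead, and the describe helper is hypothetical.

```python
import queue
from pathlib import Path
from typing import List, Optional, Tuple, Union

# Stand-in for CompressedRead; any object would do for this illustration.
Read = str
Message = Tuple[Optional[Path], Union[List[Read], int, None]]


def describe(message: Message) -> str:
    """Classify a data message following the convention described above."""
    path, payload = message
    if path is None:
        # No path: the producer has nothing more to send.
        return "worker empty"
    if isinstance(payload, list):
        return f"{len(payload)} reads to write for {path.name}"
    if isinstance(payload, int):
        return f"{path.name} complete with {payload} reads"
    return f"no payload for {path.name}"


data_queue: "queue.Queue[Message]" = queue.Queue()
data_queue.put((Path("a.fast5"), ["read1", "read2"]))  # reads to be written
data_queue.put((Path("a.fast5"), 2))  # end of file: total read count
data_queue.put((None, None))  # producer has no more work

summaries = [describe(data_queue.get()) for _ in range(3)]
```

Carrying both kinds of message on one queue lets a consumer detect the end of a file in order, after all of that file's reads have been received.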

enqueue_input

enqueue_input(path: Path) -> None

Enqueue a request

get_exception

get_exception() -> Optional[Tuple[Path, Exception, str]]

Promptly get an exception, if any

get_input

get_input() -> Optional[Path]

Promptly get an input, if any, returning None if the queue is empty
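A non-blocking "get or None" of this kind is commonly written with get_nowait. The sketch below assumes a thread-level queue.Queue for simplicity, and the helper name get_nowait_or_none is hypothetical. Multiprocessing queues raise the same queue.Empty exception, so the pattern carries over.

```python
import queue
from pathlib import Path
from typing import Optional


def get_nowait_or_none(inputs: "queue.Queue[Path]") -> Optional[Path]:
    """Return the next queued path immediately, or None if nothing is queued."""
    try:
        return inputs.get_nowait()
    except queue.Empty:
        return None


pending: "queue.Queue[Path]" = queue.Queue()
pending.put(Path("reads.fast5"))

first = get_nowait_or_none(pending)   # the queued path
second = get_nowait_or_none(pending)  # queue is now empty
```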

shutdown

shutdown() -> Tuple[int, int, int, int]

Shut down all queues, returning the counts of all remaining items

StatusMonitor

StatusMonitor(paths: Sequence[Path])

Class for monitoring the status of the conversion

close

close() -> None

Close the progress bar

increment_reads

increment_reads(n: int) -> None

Increment the reads status by n

update_reads_total

update_reads_total(path: Path, total: int) -> None

Set the total number of reads for path and update the overall total reads

write

write(msg: str, file: Any) -> None

Write a runtime message without clobbering the tqdm progress bar

convert_datetime_as_epoch_ms

convert_datetime_as_epoch_ms(time_str: Union[str, bytes, None]) -> datetime

Convert the fast5 time string to a timestamp
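As a rough illustration of this kind of conversion, the sketch below parses a time string into a timezone-aware datetime and derives milliseconds since the epoch. It assumes the time string is ISO 8601 with an optional trailing "Z" and that a missing offset means UTC. Both are assumptions, and sketch_parse_time is a hypothetical helper rather than the library function.

```python
from datetime import datetime, timezone
from typing import Union


def sketch_parse_time(time_str: Union[str, bytes]) -> datetime:
    """Parse an ISO 8601 time string (assumed format) into an aware datetime."""
    if isinstance(time_str, bytes):
        time_str = time_str.decode("utf-8")
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11 on,
    # so normalise it to an explicit UTC offset first.
    parsed = datetime.fromisoformat(time_str.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption of this sketch: naive times are treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


stamp = sketch_parse_time(b"2021-03-04T05:06:07Z")
epoch_ms = int(stamp.timestamp() * 1000)
```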

convert_fast5_end_reason

convert_fast5_end_reason(fast5_end_reason: int) -> EndReason

Return an EndReason instance from the given end_reason integer from a fast5 file. This will handle the difference between fast5 and pod5 values for this enumeration and set the default "forced" value for each fast5 enumeration value.

convert_fast5_file

convert_fast5_file(
    path: Path, queues: QueueManager, signal_chunk_size: int = DEFAULT_SIGNAL_CHUNK_SIZE
) -> int

Convert the reads in a fast5 file

convert_fast5_files

convert_fast5_files(
    queues: QueueManager, signal_chunk_size: int = DEFAULT_SIGNAL_CHUNK_SIZE
) -> None

Main function for converting fast5s available in queues. Collections of converted reads are emplaced on the data_queue for writing in the main process.

convert_fast5_read

convert_fast5_read(
    fast5_read: Group,
    run_info_cache: Dict[str, RunInfo],
    signal_chunk_size: int = DEFAULT_SIGNAL_CHUNK_SIZE,
) -> CompressedRead

Given a fast5 read parsed from a fast5 file, return a pod5 CompressedRead object.

convert_from_fast5

convert_from_fast5(
    inputs: List[Path],
    output: Path,
    recursive: bool = False,
    threads: int = DEFAULT_THREADS,
    one_to_one: Optional[Path] = None,
    force_overwrite: bool = False,
    signal_chunk_size: int = DEFAULT_SIGNAL_CHUNK_SIZE,
    strict: bool = False,
) -> None

Convert fast5 files found (optionally recursively) at the given input Paths into pod5 file(s). If one_to_one is a Path, the new pod5 files are created within output in a directory structure that mirrors each input's location relative to the one_to_one Path.
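The "optionally recursively" input search can be pictured with pathlib globbing. This is a sketch under assumptions, not the library's search logic: sketch_collect_fast5 is a hypothetical helper, and matching on the .fast5 suffix alone is an assumption.

```python
from pathlib import Path
from typing import Iterable, List


def sketch_collect_fast5(
    inputs: Iterable[Path], recursive: bool = False
) -> List[Path]:
    """Collect .fast5 files from a mix of files and directories."""
    found = set()
    for path in inputs:
        if path.is_dir():
            # "**/" also matches the directory itself, so top-level files
            # are included in the recursive search.
            pattern = "**/*.fast5" if recursive else "*.fast5"
            found.update(path.glob(pattern))
        elif path.is_file() and path.suffix == ".fast5":
            found.add(path)
    # Sort for a deterministic processing order.
    return sorted(found)
```

With recursive=False, only files directly inside each given directory are picked up. Collecting into a set avoids converting the same file twice when inputs overlap.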

convert_run_info

convert_run_info(
    acq_id: str,
    adc_max: int,
    adc_min: int,
    sample_rate: int,
    context_tags: Dict[str, Union[str, bytes]],
    device_type: str,
    tracking_id: Dict[str, Union[str, bytes]],
) -> RunInfo

Create a Pod5RunInfo instance from parsed fast5 data

decode_str

decode_str(value: Union[str, bytes]) -> str

Decode an h5py UTF-8 byte string to a Python string
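The behaviour implied by the signature (accept str or bytes, return str) can be sketched as follows. The helper name sketch_decode_str is hypothetical, and passing str values through unchanged is an assumption based on the Union[str, bytes] parameter type.

```python
from typing import Union


def sketch_decode_str(value: Union[str, bytes]) -> str:
    """Return value as a Python str, decoding UTF-8 bytes when needed."""
    if isinstance(value, bytes):
        return value.decode("utf-8")
    # Already a str: returned unchanged (an assumption of this sketch).
    return value
```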

get_read_from_fast5

get_read_from_fast5(group_name: str, h5_file: File) -> Optional[Group]

Read a group from an h5 file, ensuring that it is a read

is_multi_read_fast5

is_multi_read_fast5(path: Path) -> bool

Assert that the given path points to a multi-read fast5 file for which direct-to-pod5 conversion is supported.

main

main()

Main function for pod5_convert_from_fast5

process_conversion_tasks

process_conversion_tasks(
    queues: QueueManager,
    output_handler: OutputHandler,
    status: StatusMonitor,
    strict: bool,
    threads: int = DEFAULT_THREADS,
) -> None

Work through the queues of data until all work is done