utils

Utility functions for pod5 tools

assert_inputs_exist

assert_inputs_exist(inputs: Iterable[Path])

Asserts that all inputs exist. Raises FileExistsError otherwise.

assert_no_duplicate_filenames

assert_no_duplicate_filenames(inputs: Collection[Path]) -> None

Raises ValueError if there are duplicate filenames in the collection of Paths
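The check described above can be pictured with the standalone sketch below. It is not the pod5 implementation: the name `check_no_duplicate_filenames` is hypothetical, and it assumes "filename" means the final path component (`Path.name`).

```python
from collections import Counter
from pathlib import Path
from typing import Collection


def check_no_duplicate_filenames(inputs: Collection[Path]) -> None:
    """Hypothetical stand-in: raise ValueError if two paths share a filename."""
    # Assumption: only the final path component is compared, not the full path.
    counts = Counter(path.name for path in inputs)
    duplicates = sorted(name for name, count in counts.items() if count > 1)
    if duplicates:
        raise ValueError(f"Duplicate filenames found: {duplicates}")


# Different filenames in different directories pass silently.
check_no_duplicate_filenames([Path("a/x.pod5"), Path("b/y.pod5")])
```

Under this assumption, `a/x.pod5` and `b/x.pod5` would be rejected even though they live in different directories.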

collect_inputs

collect_inputs(
    paths: Iterable[Path],
    recursive: bool,
    pattern: Union[str, Collection[str]],
    threads: int = DEFAULT_THREADS,
) -> Set[Path]

Returns a set of paths that match any of the given glob-style patterns

If a path is a directory, it is globbed (optionally recursively). If a path is a file, it must also match at least one of the given patterns.

Raises FileExistsError if any inputs do not exist
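The behaviour described above can be approximated with `pathlib` alone. The sketch below is illustrative only: `collect_inputs_sketch` is a hypothetical name, it is single-threaded (the `threads` parameter is left out), and it assumes that only regular files are returned from a directory glob.

```python
import tempfile
from pathlib import Path
from typing import Collection, Iterable, Set, Union


def collect_inputs_sketch(
    paths: Iterable[Path],
    recursive: bool,
    pattern: Union[str, Collection[str]],
) -> Set[Path]:
    """Hypothetical, single-threaded approximation of the documented behaviour."""
    patterns = [pattern] if isinstance(pattern, str) else list(pattern)
    found: Set[Path] = set()
    for path in paths:
        if not path.exists():
            raise FileExistsError(f"Input does not exist: {path}")
        if path.is_dir():
            # Directories are globbed, recursively if requested.
            globber = path.rglob if recursive else path.glob
            for pat in patterns:
                found.update(p for p in globber(pat) if p.is_file())
        elif any(path.match(pat) for pat in patterns):
            # Files are kept only if they match at least one pattern.
            found.add(path)
    return found


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    for name in ("a.pod5", "b.txt", "sub/c.pod5"):
        (root / name).touch()
    flat = collect_inputs_sketch([root], recursive=False, pattern="*.pod5")
    deep = collect_inputs_sketch([root], recursive=True, pattern="*.pod5")
    print(sorted(p.name for p in flat))  # ['a.pod5']
    print(sorted(p.name for p in deep))  # ['a.pod5', 'c.pod5']
```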

init_logging

init_logging()

Initialise logging only if POD5_DEBUG is true

is_disable_pbar

is_disable_pbar() -> bool

Check whether POD5_PBAR is set, returning True if the progress bar should be disabled

is_pod5_debug

is_pod5_debug() -> bool

Check if POD5_DEBUG is set
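How the environment variable is interpreted is not stated here. The sketch below shows one common way such a flag is read, assuming (and this is only an assumption) that the value is an integer where `1` enables debugging and `0` or an unset variable disables it. The names `env_flag` and `is_debug_sketch` are hypothetical.

```python
import os
from typing import Mapping


def env_flag(name: str, default: int, environ: Mapping[str, str] = os.environ) -> bool:
    """Hypothetical helper: read an integer-valued environment flag as a bool."""
    # Assumption: the variable holds an integer such as "0" or "1".
    return bool(int(environ.get(name, default)))


def is_debug_sketch(environ: Mapping[str, str] = os.environ) -> bool:
    """Hypothetical stand-in: True when POD5_DEBUG is set to a non-zero integer."""
    return env_flag("POD5_DEBUG", 0, environ)


print(is_debug_sketch({"POD5_DEBUG": "1"}))  # True
print(is_debug_sketch({}))                   # False
```

Passing the environment as a mapping keeps the helper easy to exercise without mutating `os.environ`.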

limit_threads

limit_threads(requested: int) -> int

Sanitise and limit the number of requested threads to the number of logical cores
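A minimal sketch of this kind of clamping follows. The name `limit_threads_sketch` is hypothetical, and the treatment of zero or negative requests (raised to 1 here) is an assumption, since the description does not say how such values are handled.

```python
import os


def limit_threads_sketch(requested: int) -> int:
    """Hypothetical stand-in: clamp a thread count to [1, logical cores]."""
    # os.cpu_count() may return None, so fall back to a single core.
    logical_cores = os.cpu_count() or 1
    # Assumption: requests below 1 are raised to 1 rather than rejected.
    return max(1, min(requested, logical_cores))


print(limit_threads_sketch(0))  # 1
```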

logged

logged(log_return: bool = False, log_args: bool = False, log_time: bool = False)

Logging parameterised decorator
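A parameterised decorator is a function that takes options and returns the actual decorator. The sketch below illustrates the pattern with the same three flags. It is not the pod5 implementation: `logged_sketch`, the logger name, and the message formats are all assumptions.

```python
import functools
import logging
import time

logger = logging.getLogger("logged_sketch")  # hypothetical logger name


def logged_sketch(log_return: bool = False, log_args: bool = False, log_time: bool = False):
    """Hypothetical parameterised decorator: the outer call returns the decorator."""

    def decorator(func):
        @functools.wraps(func)  # preserve the wrapped function's name and docstring
        def wrapper(*args, **kwargs):
            if log_args:
                logger.debug("%s called with args=%r kwargs=%r", func.__name__, args, kwargs)
            else:
                logger.debug("%s called", func.__name__)
            start = time.perf_counter()
            result = func(*args, **kwargs)
            if log_time:
                logger.debug("%s took %.3fs", func.__name__, time.perf_counter() - start)
            if log_return:
                logger.debug("%s returned %r", func.__name__, result)
            return result

        return wrapper

    return decorator


@logged_sketch(log_return=True, log_args=True)
def add(a: int, b: int) -> int:
    return a + b


print(add(2, 3))  # 5
```

Note the parentheses in `@logged_sketch(...)`: with this pattern the decorator must be called even when every option is left at its default.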

search_path

search_path(path: Path, recursive: bool, patterns: Collection[str]) -> Set[Path]

Search path for files matching any of the patterns, searching directories recursively if requested

search_paths

search_paths(
    paths: Iterable[Path],
    recursive: bool,
    pattern: Union[str, Collection[str]],
    threads: int = DEFAULT_THREADS,
) -> Set[Path]

Search all paths for files matching any of the patterns, searching directories recursively if requested

terminate_processes

terminate_processes(processes: List[SpawnProcess]) -> None

Terminate all child processes

pl_format_empty_string

pl_format_empty_string(expr: Expr, subst: Optional[str]) -> Expr

Empty strings are read as a pair of double-quotes which need to be removed

pl_format_read_id

pl_format_read_id(read_id_col: Expr) -> Expr

Format read ids in UUID style
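For reference, "UUID style" is the hyphenated 8-4-4-4-12 hexadecimal layout. The function above operates on a polars expression; the plain-Python sketch below only illustrates the target format using the standard `uuid` module. The name `format_read_id_sketch` is hypothetical, and the 16-byte input is an assumption about how a read id might be stored.

```python
import uuid


def format_read_id_sketch(raw: bytes) -> str:
    """Hypothetical helper: render 16 raw bytes as a hyphenated UUID string."""
    # uuid.UUID raises ValueError if `raw` is not exactly 16 bytes long.
    return str(uuid.UUID(bytes=raw))


print(format_read_id_sketch(bytes(range(16))))
# 00010203-0405-0607-0809-0a0b0c0d0e0f
```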

pl_from_arrow

pl_from_arrow(table: Table, rechunk: bool) -> DataFrame

Work around a failure to read our Arrow extension type

pl_from_arrow_batch

pl_from_arrow_batch(record_batch: RecordBatch, rechunk: bool) -> DataFrame

Work around a failure to read our Arrow extension type