smftools.informatics.modkit_functions

Functions

extract_mods(thresholds, mod_tsv_dir, ...[, ...])

Takes all of the aligned, sorted, split modified BAM files and runs Nanopore modkit extract on them to write the modification data to zipped TSV files.

make_modbed(aligned_sorted_output, ...)

Generates positional methylation summaries for each barcoded sample, starting from the overall BAM file output directly by the dorado aligner.

modQC(aligned_sorted_output, thresholds)

Outputs the percentile of bases falling at a call threshold (a probability between 0 and 1) for the overall BAM file.

smftools.informatics.modkit_functions.extract_mods(thresholds, mod_tsv_dir, split_dir, bam_suffix, skip_unclassified=True, modkit_summary=False, threads=None, single_bam=None)

Takes all of the aligned, sorted, split modified BAM files and runs Nanopore modkit extract on them to write the modification data to zipped TSV files.

Parameters:
  • thresholds (list) -- A list of thresholds to use for marking each basecalled base as passing or failing on canonical and modification call status.

  • mod_tsv_dir (str) -- A string representing the file path to the directory to hold the modkit extract outputs.

  • split_dir (str) -- A string representing the file path to the directory containing the converted aligned_sorted_split BAM files.

  • bam_suffix (str) -- The suffix to use for the BAM file.

  • skip_unclassified (bool) -- Whether to skip the unclassified BAM file when running the modkit extract command.

  • modkit_summary (bool) -- Whether to run modkit summary and display its output.

  • threads (int) -- Number of threads to use.

  • single_bam (Path | None) -- When set, use this single BAM instead of iterating split_dir.

Returns:

None. Runs modkit extract on the input aligned_sorted_split modified BAM files and writes zipped TSVs containing the modification calls.
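The parameter descriptions above imply a file-selection step: either a single BAM is used, or split_dir is scanned for files ending in bam_suffix, optionally skipping the unclassified one. The sketch below illustrates that logic only. The helper name select_bams and the rule that an unclassified file is recognised by "unclassified" in its name are assumptions made for illustration, not the smftools implementation.

```python
from pathlib import Path
from typing import List, Optional, Union


def select_bams(
    split_dir: Union[str, Path],
    bam_suffix: str,
    skip_unclassified: bool = True,
    single_bam: Optional[Union[str, Path]] = None,
) -> List[Path]:
    """Hypothetical helper: choose which BAM files would be processed."""
    # A single BAM, when given, takes precedence over scanning the directory.
    if single_bam is not None:
        return [Path(single_bam)]

    # Keep only files whose name ends with the suffix (this excludes .bam.bai indexes).
    bams = sorted(
        p for p in Path(split_dir).iterdir()
        if p.is_file() and p.name.endswith(bam_suffix)
    )

    # Assumption: the unclassified reads live in a file with "unclassified" in its name.
    if skip_unclassified:
        bams = [p for p in bams if "unclassified" not in p.name]
    return bams
```

Sorting the result keeps the processing order deterministic across runs, which makes logs easier to compare.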

smftools.informatics.modkit_functions.make_modbed(aligned_sorted_output, thresholds, mod_bed_dir)

Generates positional methylation summaries for each barcoded sample, starting from the overall BAM file output directly by the dorado aligner.

Parameters:
  • aligned_sorted_output (str) -- A string representing the file path to the aligned_sorted non-split BAM file.

Returns:

None
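As a toy illustration of what a positional methylation summary contains, the sketch below aggregates per-read calls into per-position counts. The function name summarize_positions, the (position, probability) input format, and the single pass threshold are all assumptions chosen for the example; make_modbed itself works on BAM files and its internals are not shown in this page.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def summarize_positions(
    calls: Iterable[Tuple[int, float]],
    threshold: float,
) -> Dict[int, Tuple[int, int, float]]:
    """Hypothetical helper: map position -> (n_modified, n_total, fraction_modified)."""
    # counts[pos] holds [n_modified, n_total] for that reference position.
    counts = defaultdict(lambda: [0, 0])
    for pos, prob in calls:
        counts[pos][1] += 1
        # A call counts as modified when its probability reaches the threshold.
        if prob >= threshold:
            counts[pos][0] += 1
    return {pos: (m, n, m / n) for pos, (m, n) in sorted(counts.items())}


# Three reads cover position 10, one read covers position 25.
summary = summarize_positions([(10, 0.9), (10, 0.2), (25, 0.95), (10, 0.8)], threshold=0.7)
```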

smftools.informatics.modkit_functions.modQC(aligned_sorted_output, thresholds)

Outputs the percentile of bases falling at a call threshold (a probability between 0 and 1) for the overall BAM file. It is generally worth inspecting these values on positive and negative controls.

Parameters:
  • aligned_sorted_output (str) -- A string representing the file path of the aligned_sorted non-split BAM file output by the dorado aligner.

  • thresholds (list) -- A list of floats to pass for call thresholds.

Returns:

None
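To make the relationship between a call threshold and the share of bases it admits concrete, the sketch below computes, for a list of call probabilities, the fraction at or above each threshold. It is a plain-Python illustration under stated assumptions: the name fraction_passing and the in-memory list of probabilities are hypothetical, and modQC itself reads a BAM file and returns None.

```python
from typing import Dict, Sequence


def fraction_passing(
    probs: Sequence[float],
    thresholds: Sequence[float],
) -> Dict[float, float]:
    """Hypothetical helper: fraction of calls with probability >= each threshold."""
    if not probs:
        raise ValueError("probs must not be empty")
    for t in thresholds:
        # Thresholds are probabilities, so they must lie in [0, 1].
        if not 0.0 <= t <= 1.0:
            raise ValueError(f"threshold {t} is outside [0, 1]")
    n = len(probs)
    return {t: sum(p >= t for p in probs) / n for t in thresholds}


# Four calls: three reach 0.5, only one reaches 0.9.
result = fraction_passing([0.2, 0.6, 0.8, 0.95], [0.5, 0.9])
```

Comparing such fractions between a positive and a negative control is one way to judge whether a threshold separates the two.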