smftools.preprocessing.calculate_complexity_II
- smftools.preprocessing.calculate_complexity_II(adata, output_directory='', sample_col='Sample_names', ref_col='Reference_strand', cluster_col='sequence__merged_cluster_id', plot=True, save_plot=False, n_boot=30, n_depths=12, random_state=0, csv_summary=True, uns_flag='calculate_complexity_II_performed', force_redo=False, bypass=False)
Estimate and optionally plot library complexity.
If ref_col is None, the calculation is performed per sample. If provided, complexity is computed for each (sample, reference) pair.
- Parameters:
  - adata (AnnData) -- AnnData object containing read metadata.
  - output_directory (str | Path (default: '')) -- Directory for output plots/CSVs.
  - sample_col (str (default: 'Sample_names')) -- Obs column containing sample names.
  - ref_col (Optional[str] (default: 'Reference_strand')) -- Obs column with reference/strand categories, or None.
  - cluster_col (str (default: 'sequence__merged_cluster_id')) -- Obs column with merged cluster IDs.
  - plot (bool (default: True)) -- Whether to generate plots.
  - save_plot (bool (default: False)) -- Whether to save plots to disk.
  - n_boot (int (default: 30)) -- Number of bootstrap iterations per depth.
  - n_depths (int (default: 12)) -- Number of subsampling depths to evaluate.
  - random_state (int (default: 0)) -- Random seed for bootstrapping.
  - csv_summary (bool (default: True)) -- Whether to write CSV summary files.
  - uns_flag (str (default: 'calculate_complexity_II_performed')) -- Flag in adata.uns indicating prior completion.
  - force_redo (bool (default: False)) -- Whether to rerun even if uns_flag is present.
  - bypass (bool (default: False)) -- Whether to skip processing.
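The parameters n_depths, n_boot, random_state, and ref_col suggest a subsampling-based complexity curve computed per group. The sketch below shows one way such a curve could be built with the standard library only. It is not the smftools implementation: the helper names rarefaction_curve and complexity_by_group are hypothetical, the rows are toy data standing in for adata.obs, and sampling without replacement at evenly spaced depths is an assumption, since the page does not state how reads are resampled.

```python
import random
from collections import defaultdict


def rarefaction_curve(cluster_ids, n_depths=12, n_boot=30, random_state=0):
    """Mean number of distinct clusters observed at each subsampling depth.

    Hypothetical helper: subsamples reads without replacement, which is an
    assumption and may differ from what smftools does internally.
    """
    n_reads = len(cluster_ids)
    if n_reads == 0:
        return []
    rng = random.Random(random_state)
    # Evenly spaced depths ending at the full read count (duplicates removed).
    depths = sorted({max(1, round(n_reads * (i + 1) / n_depths)) for i in range(n_depths)})
    curve = []
    for depth in depths:
        # Repeat the subsampling n_boot times and count distinct clusters each time.
        counts = [len(set(rng.sample(cluster_ids, depth))) for _ in range(n_boot)]
        curve.append((depth, sum(counts) / n_boot))
    return curve


def complexity_by_group(rows, sample_col, cluster_col, ref_col=None, **kwargs):
    """Group reads per sample, or per (sample, reference) pair if ref_col is set."""
    groups = defaultdict(list)
    for row in rows:
        key = row[sample_col] if ref_col is None else (row[sample_col], row[ref_col])
        groups[key].append(row[cluster_col])
    return {key: rarefaction_curve(ids, **kwargs) for key, ids in groups.items()}


# Toy read metadata standing in for adata.obs (values invented for illustration).
rows = [
    {"Sample_names": "s1", "Reference_strand": "ref_top", "sequence__merged_cluster_id": c}
    for c in [0, 0, 1, 2, 2, 2, 3, 4]
] + [
    {"Sample_names": "s1", "Reference_strand": "ref_bottom", "sequence__merged_cluster_id": c}
    for c in [10, 10, 10, 11]
]

# ref_col=None: one curve per sample.
per_sample = complexity_by_group(
    rows, "Sample_names", "sequence__merged_cluster_id", n_depths=4, n_boot=5
)
# ref_col given: one curve per (sample, reference) pair.
per_pair = complexity_by_group(
    rows, "Sample_names", "sequence__merged_cluster_id",
    ref_col="Reference_strand", n_depths=4, n_boot=5,
)
```

In this sketch the last point of every curve is taken at full depth, so its value equals the number of distinct cluster IDs in that group. A curve that flattens well before full depth would suggest that additional sequencing is mostly recovering duplicates.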
- Return type: