prism.preprocess module

prism.preprocess.param_generator(fp, no_prefilter, full_pattern_proportion, error, bisulfite_conversion_rate, processivity, recruitment_efficiency)

Helper generator that reads the MET file and generates parameters for multiprocessed execution of in silico proofreading.

Parameters:
  • fp (string) – File path to MET file containing extracted epiloci.
  • no_prefilter (bool) – Whether to apply pre-filtering.
  • full_pattern_proportion (float) – Cutoff for the proportion of fully methylated and unmethylated patterns to be retained.
  • error (float) – Expected sequencing error rate.
  • bisulfite_conversion_rate (float) – Expected bisulfite conversion rate of the sequencing data.
  • processivity (float) – Expected processivity of DNMT1.
  • recruitment_efficiency (float) – Expected recruitment efficiency of DNMT1.
Returns:

Yields headers, pattern counters, and parameters for the HMM.
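
The yield pattern described above can be illustrated with a minimal sketch. It is not the actual PRISM implementation: the name `task_generator`, the `(header, counter)` record shape, the parameter dictionary, and all example values are assumptions made for illustration, and MET file parsing is left out entirely because the file format is not described here.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical record shape: (epilocus header, pattern counter).
Record = Tuple[str, Counter]


def task_generator(
    records: Iterable[Record],
    error: float,
    bisulfite_conversion_rate: float,
    processivity: float,
    recruitment_efficiency: float,
) -> Iterator[Tuple[str, Counter, Dict[str, float]]]:
    """Yield one (header, counter, params) task per epilocus."""
    params = {
        "error": error,
        "bisulfite_conversion_rate": bisulfite_conversion_rate,
        "processivity": processivity,
        "recruitment_efficiency": recruitment_efficiency,
    }
    for header, counter in records:
        # Copy the dict so a worker that mutates its params cannot affect other tasks.
        yield header, counter, dict(params)


# Illustrative data only; headers and counts are made up.
records = [
    ("locus_a", Counter({"1111": 8, "0000": 1})),
    ("locus_b", Counter({"1010": 3, "0101": 2})),
]
tasks = list(
    task_generator(
        records,
        error=0.01,
        bisulfite_conversion_rate=0.99,
        processivity=0.9,
        recruitment_efficiency=0.9,
    )
)
```

Because each yielded tuple carries everything a worker needs, a stream like this could be handed to something such as `multiprocessing.Pool.imap` without sharing state between tasks.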

prism.preprocess.prefiltered(pattern_counter_generator, full_pattern_proportion, no_prefilter)

Generator filter for pattern counters. Used to pre-filter patterns for efficient execution of in silico proofreading.

Parameters:
  • pattern_counter_generator (generator) – Generator for epiloci headers and their pattern counters.
  • full_pattern_proportion (float) – Cutoff for the proportion of fully methylated and unmethylated patterns to be retained.
  • no_prefilter (bool) – Whether to apply pre-filtering.
Returns:

Yields retained epiloci headers and their pattern counts, one by one.
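
As a rough illustration of a generator filter of this kind, the sketch below shows one possible reading of the parameters. It rests on several assumptions that the documentation does not confirm: that a pattern counter maps binary pattern strings to read counts, that an epilocus is retained when the fraction of fully methylated plus fully unmethylated reads is at least `full_pattern_proportion`, and that `no_prefilter=True` passes every epilocus through unchanged. The name `prefiltered_sketch` is hypothetical.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple

# Hypothetical record shape: (epilocus header, pattern counter).
Record = Tuple[str, Counter]


def prefiltered_sketch(
    records: Iterable[Record],
    full_pattern_proportion: float,
    no_prefilter: bool,
) -> Iterator[Record]:
    """Lazily yield the records that pass the (assumed) cutoff."""
    for header, counter in records:
        if no_prefilter:
            # Assumption: the flag disables filtering entirely.
            yield header, counter
            continue
        total = sum(counter.values())
        if total == 0:
            continue  # nothing to evaluate for an empty counter
        # A pattern made of one repeated symbol ("1111" or "0000") is
        # fully methylated or fully unmethylated.
        full = sum(
            count for pattern, count in counter.items() if len(set(pattern)) == 1
        )
        if full / total >= full_pattern_proportion:
            yield header, counter


# Illustrative data only; headers and counts are made up.
records = [
    ("locus_a", Counter({"1111": 6, "0000": 3, "1010": 1})),  # 9 of 10 reads are full patterns
    ("locus_b", Counter({"1010": 4, "1111": 1})),  # 1 of 5 reads is a full pattern
]
kept = [header for header, _ in prefiltered_sketch(records, 0.5, False)]
unfiltered = [header for header, _ in prefiltered_sketch(records, 0.5, True)]
```

Writing the filter as a generator keeps memory use flat: records are examined one at a time and never collected into a list unless the caller chooses to.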

prism.preprocess.run(input_fp, output_fp, no_prefilter, full_pattern_proportion, error, bisulfite_conversion_rate, processivity, recruitment_efficiency, threads, seed, verbose)