prism.deconvolute module

prism.deconvolute.common_intersection(headers_list)[source]

Given two or more lists of epiloci headers, returns the list of common headers.

Parameters: headers_list (list) – A list of epiloci headers from two or more samples.
Returns: List of common headers that appear in all of the samples.
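
As a rough illustration of the behaviour described above, the intersection can be sketched with plain Python sets. The function name `common_headers`, the header strings, and the choice to preserve the order of the first sample are assumptions made for this example, not the module's actual implementation.

```python
def common_headers(headers_list):
    """Return the headers present in every sample.

    Hypothetical sketch: the result keeps the order of the first sample.
    """
    if not headers_list:
        return []
    # Headers shared by the first sample and all remaining samples.
    shared = set(headers_list[0]).intersection(*headers_list[1:])
    return [header for header in headers_list[0] if header in shared]


# Illustrative header strings only; the real header format is not shown here.
sample_1 = ["locus_a", "locus_b", "locus_c"]
sample_2 = ["locus_b", "locus_c", "locus_d"]
shared_headers = common_headers([sample_1, sample_2])
```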
prism.deconvolute.get_chromosome(chr_string)[source]
prism.deconvolute.get_subclone_assignment(subclones, assignment, outlier_subclone_mask)[source]

Given a cluster assignment, returns the corresponding subclone assignment. If that subclone is an outlier, -1 is returned.

Parameters:
  • subclones (list) – List of subclones.
  • assignment (int) – Index of assigned cluster.
  • outlier_subclone_mask (list) – Boolean mask denoting outlier subclones.
Returns:

Index of assigned subclone.
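
The lookup described above might be sketched as follows. This assumes that `subclones` is a list of sets of cluster indices (as in merge_subclones) and that the mask is aligned with that list. The name `subclone_of` and the error raised for an unknown cluster are hypothetical.

```python
def subclone_of(subclones, assignment, outlier_subclone_mask):
    """Map a cluster index to its subclone index, or -1 for an outlier subclone."""
    for index, clusters in enumerate(subclones):
        if assignment in clusters:
            return -1 if outlier_subclone_mask[index] else index
    # Assumption: an unknown cluster index is treated as an error.
    raise ValueError("cluster %r does not belong to any subclone" % assignment)


subclones = [{0, 2}, {1}, {3}]
outlier_subclone_mask = [False, False, True]
```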

prism.deconvolute.identify_subclone(model, merge_cutoff, outlier_cluster_mask)[source]

Given a beta-binomial model fit, identifies mergeable clusters, merges them, and marks outliers.

Parameters:
  • model (BetaBinomialMixture) – Beta-binomial model fit.
  • merge_cutoff (float) – Cutoff on the distance from the midpoint of two clusters to (0.5, …, 0.5); a pair of clusters within this cutoff is merged.
  • outlier_cluster_mask (list) – A boolean mask denoting whether each cluster is an outlier.
Returns:

A list of identified subclones, and boolean mask marking outlier subclones.
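
The merge criterion can be illustrated with a small, self-contained sketch. It assumes that each cluster is summarised by a vector of per-sample mean methylation levels, that the distance is Euclidean, and that the comparison is strict. None of these details is confirmed by this page, and `is_reflected_pair` is a hypothetical name.

```python
import math


def is_reflected_pair(center_a, center_b, merge_cutoff):
    """Return True if the midpoint of two cluster centers lies close to (0.5, ..., 0.5)."""
    midpoint = [(a + b) / 2 for a, b in zip(center_a, center_b)]
    # Euclidean distance from the midpoint to (0.5, ..., 0.5) (an assumption).
    distance = math.sqrt(sum((m - 0.5) ** 2 for m in midpoint))
    return distance < merge_cutoff


# Centers mirrored around 0.5 give a midpoint near (0.5, 0.5).
mirrored = is_reflected_pair([0.2, 0.3], [0.8, 0.7], merge_cutoff=0.05)
# Centers on the same side of 0.5 give a midpoint of (0.3, 0.4), far from (0.5, 0.5).
same_side = is_reflected_pair([0.2, 0.3], [0.4, 0.5], merge_cutoff=0.05)
```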

prism.deconvolute.mark_outlier_clusters(model, outlier_dispersion_cutoff=0.2)[source]

Given a model fit, marks overdispersed clusters as outlier clusters.

Parameters:
  • model (BetaBinomialMixture) – Beta-binomial mixture model fit.
  • outlier_dispersion_cutoff (float) – Cutoff for dispersion to mark a cluster as an outlier.
Returns:

Boolean mask that denotes whether each cluster is an outlier.
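
A minimal sketch of the thresholding step, working on a plain list of per-cluster dispersion values rather than on a BetaBinomialMixture object. How the model exposes dispersions is not documented here, so the `dispersions` argument, the name `outlier_mask`, and the strict comparison are all assumptions.

```python
def outlier_mask(dispersions, outlier_dispersion_cutoff=0.2):
    """Flag clusters whose dispersion exceeds the cutoff."""
    return [d > outlier_dispersion_cutoff for d in dispersions]


mask = outlier_mask([0.05, 0.35, 0.2])
```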

prism.deconvolute.merge_met_files(met_files, full_pattern_proportion, intersection_method, jaccard_cutoff=0.5, copynumber=None, cn_prior='random')[source]

Given met files, return depths, fingerprint pattern counts and epiloci headers for each common fingerprint epilocus.

Parameters:
  • met_files (list) – List of file paths to (corrected) met files.
  • full_pattern_proportion (float) – Proportion of fully methylated and unmethylated patterns to be retained.
  • intersection_method (string) – One of ‘common’ or ‘jaccard’. ‘common’: only epiloci that appear identically in all of the samples are retained. ‘jaccard’: applies to two-sample analysis only; a pair of epiloci with Jaccard similarity greater than 0.5 (the default value of jaccard_cutoff) is retained.
Returns:

Depths, fingerprint pattern counts and epiloci headers for each common fingerprint epilocus.
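
For the ‘jaccard’ method, the similarity test might look like the sketch below. It assumes that an epilocus can be represented as the set of its CpG positions, which this page does not state. The function name and the example coordinates are hypothetical.

```python
def jaccard(cpgs_a, cpgs_b):
    """Jaccard similarity of two collections of CpG positions."""
    a, b = set(cpgs_a), set(cpgs_b)
    union = a | b
    # Two empty epiloci are treated as dissimilar (an assumption).
    return len(a & b) / len(union) if union else 0.0


# Three shared positions out of five distinct positions: similarity 0.6.
similarity = jaccard([100, 110, 120, 130], [110, 120, 130, 140])
retained = similarity > 0.5
```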

prism.deconvolute.merge_subclones(subclones, cluster_a, cluster_b)[source]

Merges the two subclones containing cluster a and cluster b. If a and b are already in the same subclone, the subclones are returned unchanged.

Parameters:
  • subclones (list) – List of sets of clusters (subclones).
  • cluster_a (int) – Cluster index to merge.
  • cluster_b (int) – Cluster index to merge.
Returns:

Merged subclones as a list.
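
The merge can be sketched on a list of sets as follows. The position of the merged subclone in the returned list is an assumption (it is appended at the end here), and `merge` is a hypothetical stand-in rather than the module's own code.

```python
def merge(subclones, cluster_a, cluster_b):
    """Merge the subclones containing cluster_a and cluster_b."""
    index_a = next(i for i, s in enumerate(subclones) if cluster_a in s)
    index_b = next(i for i, s in enumerate(subclones) if cluster_b in s)
    if index_a == index_b:
        # Both clusters already share a subclone: nothing to do.
        return subclones
    merged = subclones[index_a] | subclones[index_b]
    rest = [s for i, s in enumerate(subclones) if i not in (index_a, index_b)]
    return rest + [merged]


merged_subclones = merge([{0}, {1}, {2}], 0, 2)
```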

prism.deconvolute.methylated_pattern(p1, p2)[source]

Given two methylation patterns, returns the pattern with more methylated CpGs.

Parameters:
  • p1 (string) – Binarized methylation pattern (0: unmethylated, 1: methylated).
  • p2 (string) – Binarized methylation pattern (0: unmethylated, 1: methylated).
Returns:

The pattern with more methylated CpGs.
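
A short sketch of the comparison, counting the ‘1’ characters in each binarized pattern. Returning the first pattern on a tie is an assumption, since the tie-breaking rule is not documented here.

```python
def more_methylated(p1, p2):
    """Return the pattern with more methylated CpGs ('1' characters)."""
    return p1 if p1.count("1") >= p2.count("1") else p2
```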

prism.deconvolute.overlaps(chr_a, start_a, end_a, chr_b, start_b, end_b)[source]
prism.deconvolute.parse_cn_line(line)[source]
prism.deconvolute.parse_met_file(fp, full_pattern_proportion=0.8, copynumber=None, cn_prior='random')[source]

Parses entries in a MET file and yields depths, counts, and epiloci headers for postfiltered fingerprint epiloci.

Parameters:
  • fp (string) – File path to (corrected) met file.
  • full_pattern_proportion (float) – Proportion of fully methylated and unmethylated patterns to be retained.
Returns:

Yields arrays of depths, counts and headers of postfiltered fingerprint epiloci.

prism.deconvolute.postfiltered(pattern_counter_generator, full_pattern_proportion, copynumber=None, cn_prior='random')[source]

Generator filter for postfiltered patterns.

Parameters:
  • pattern_counter_generator (generator) – Generator emitting epiloci headers and pattern counters.
  • full_pattern_proportion (float) – Proportion of fully methylated and unmethylated patterns to be retained.
  • copynumber (str) – Path to called copynumber file.
  • cn_prior (str) – Priors for probability of methylation patterns on copy-number-gained segment.
Returns:

Yields retained header and pattern counter.
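
One possible reading of this filter is sketched below: an epilocus is kept when the fully methylated and fully unmethylated patterns together make up at least `full_pattern_proportion` of its reads. That interpretation, the use of `collections.Counter` for pattern counts, and the name `keep_fingerprint` are assumptions. The copy-number handling (`copynumber`, `cn_prior`) is left out entirely.

```python
from collections import Counter


def keep_fingerprint(pattern_counters, full_pattern_proportion=0.8):
    """Yield (header, counter) pairs dominated by all-0 or all-1 patterns."""
    for header, counter in pattern_counters:
        depth = sum(counter.values())
        if depth == 0:
            continue
        n_cpg = len(next(iter(counter)))
        # Reads that are fully unmethylated or fully methylated.
        full = counter["0" * n_cpg] + counter["1" * n_cpg]
        if full / depth >= full_pattern_proportion:
            yield header, counter


# Illustrative data only: locus_a has 9 of 10 full patterns, locus_b 3 of 8.
epiloci = [
    ("locus_a", Counter({"0000": 5, "1111": 4, "0101": 1})),
    ("locus_b", Counter({"0000": 2, "0110": 5, "1111": 1})),
]
kept = [header for header, _ in keep_fingerprint(epiloci, 0.8)]
```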

prism.deconvolute.posthoc_process(model, merge_cutoff, outlier_dispersion_cutoff)[source]

Post-hoc processing step. In this step, clusters are merged if they appear to be ‘reflected’ clusters. Overdispersed clusters are also marked so that they can be excluded from further analyses.

Parameters:
  • model (BetaBinomialMixture) – Beta-binomial model fit.
  • merge_cutoff (float) – Cutoff on the distance from the midpoint of two clusters to (0.5, …, 0.5); a pair of clusters within this cutoff is merged.
  • outlier_dispersion_cutoff (float) – Cutoff for dispersion to mark a cluster as an outlier.
Returns:

List of subclones, and a boolean mask representing whether each of them is an outlier.

prism.deconvolute.run(input_fps, full_pattern_proportion=0.8, merge_cutoff=0.05, outlier_dispersion_cutoff=0.2, num_max_cluster=15, seed=12345, intersection_method='common', copynumber=None, cn_prior='random', verbose=False, output_fp=None)[source]