hapsburg.postprocessing

Class for calling ROH from posterior data. Saves results as a .csv created by Pandas. Contains sub-classes as well as a factory method. Please always change parameters with the set_params method! @ Author: Harald Ringbauer, 2019

Attributes

folder

Classes

PostProcessing

Class that does PostProcessing of HAPSBURG output.

PostProcessingX

Class to post-process IBD on the X of two males.

MMR_PostProcessing

Class that does PostProcessing of HAPSBURG output.

Functions

load_Postprocessing([folder, method, output, save])

Factory Method for PostProcessing class

Module Contents

class hapsburg.postprocessing.PostProcessing(folder='', load=False, output=True, save=True)

Bases: object

Class that does PostProcessing of HAPSBURG output. Has Methods to save the output as

folder = ''
roh_df = []
cutoff_post = 0.999
roh_min_l_initial = 0.02
roh_min_l_final = 0.05
min_len1 = 0.02
min_len2 = 0.04
max_gap = 0.005
snps_extend = 0
merge = True
output = True
save = True
post = True
set_params(**kwargs)

Set the Parameters. Takes keyworded arguments
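A minimal sketch of the keyword-argument pattern, using a hypothetical stand-in class (ParamsDemo) rather than the real PostProcessing, whose implementation is not shown on this page. It assumes that set_params assigns each keyword to the attribute of the same name.

```python
class ParamsDemo:
    """Hypothetical stand-in; not the real PostProcessing class."""
    # Defaults copied from the attribute list above.
    cutoff_post = 0.999
    max_gap = 0.005
    merge = True

    def set_params(self, **kwargs):
        # Assumption: each keyword overwrites the attribute of the same name.
        for key, value in kwargs.items():
            setattr(self, key, value)


demo = ParamsDemo()
demo.set_params(cutoff_post=0.99, max_gap=0.01)
```

Going through one method keeps all parameter changes in a single place, which is presumably why the module docstring asks for it.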

load_data(folder='')

Load and return genetic Map [l], positions [l] and Posterior0 [l]

merge_called_blocks(df, max_gap=0)

Merge Blocks in Dataframe df and return merged Dataframe
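The merging idea can be sketched on plain (start, end) tuples. This is an illustration only: the function name merge_blocks and the tuple layout are hypothetical, and the sketch assumes that two neighbouring blocks are joined whenever the gap between them (in map units) is below max_gap. The real method works on a dataframe and may apply further conditions.

```python
def merge_blocks(blocks, max_gap=0.0):
    """Merge (start, end) intervals whose gap is below max_gap.

    blocks: list of (start, end) tuples in map units, a hypothetical
    stand-in for the rows of the ROH dataframe.
    """
    merged = []
    for start, end in sorted(blocks):
        if merged and start - merged[-1][1] < max_gap:
            # Gap to the previous block is small enough: extend that block.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


blocks = [(0.10, 0.20), (0.202, 0.30), (0.50, 0.60)]
result = merge_blocks(blocks, max_gap=0.005)
# The first two blocks are 0.002 apart and are joined; the third stays separate.
```

With the default max_gap=0, non-overlapping blocks are left untouched in this sketch.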

snp_extend(df, r_map, snps_extend=0)

Extend blocks in df by snps_extend SNPs. Returns a dataframe of the same size but with modified blocks.
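A rough sketch of extending blocks by a fixed number of SNPs, again on plain tuples instead of a dataframe. The function name extend_blocks, the inclusive (start, end) index convention, and the clipping at the ends of the map are assumptions made for illustration, not details taken from the library.

```python
def extend_blocks(blocks, r_map, snps_extend=0):
    """Widen each block by snps_extend SNPs on both sides.

    blocks: list of (start, end) SNP indices, both inclusive (assumed).
    r_map:  genetic map position of every SNP.
    Returns one (start, end, start_map, end_map) tuple per input block.
    """
    last = len(r_map) - 1
    extended = []
    for start, end in blocks:
        new_start = max(start - snps_extend, 0)   # clip at the first SNP
        new_end = min(end + snps_extend, last)    # clip at the last SNP
        extended.append((new_start, new_end, r_map[new_start], r_map[new_end]))
    return extended


r_map = [0.00, 0.01, 0.02, 0.03, 0.04, 0.05]
extended = extend_blocks([(2, 3), (0, 5)], r_map, snps_extend=1)
```

The number of blocks is unchanged; only their boundaries and map positions move.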

modify_posterior0(posterior0)

Load and return the posterior.

create_df(starts, ends, starts_map, ends_map, l, l_map, iid, ch, roh_min_l, starts_pos=[], ends_pos=[])

Create and return the hapROH dataframe.

call_roh(ch=0, iid='')

Call ROH from posterior data bigger than cutoff. log: Whether the posterior is given in log space.
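The thresholding step can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's implementation: roh_post is taken to be the per-SNP posterior probability of the ROH state in ordinary (non-log) space, blocks are maximal runs of SNPs with roh_post above cutoff_post, and blocks shorter than roh_min_l on the genetic map are dropped. The function name call_blocks is hypothetical.

```python
def call_blocks(roh_post, r_map, cutoff_post=0.999, roh_min_l=0.02):
    """Return (start, end, length) for runs where roh_post > cutoff_post.

    start is inclusive and end is exclusive (SNP indices); length is
    measured in the units of r_map.
    """
    runs = []
    start = None
    for i, p in enumerate(roh_post):
        if p > cutoff_post and start is None:
            start = i                       # a run begins at SNP i
        elif p <= cutoff_post and start is not None:
            runs.append((start, i))         # the run ends just before SNP i
            start = None
    if start is not None:
        runs.append((start, len(roh_post)))  # run reaches the last SNP

    called = []
    for s, e in runs:
        length = r_map[e - 1] - r_map[s]
        if length >= roh_min_l:             # keep only long enough blocks
            called.append((s, e, length))
    return called


roh_post = [0.1, 1.0, 1.0, 1.0, 0.2, 1.0, 0.3]
r_map = [0.00, 0.01, 0.03, 0.05, 0.06, 0.07, 0.08]
called = call_blocks(roh_post, r_map)
# Two runs exceed the cutoff, but the single-SNP run has length 0 and is dropped.
```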

clean_up(full=True)

Removes all additional Data other than the ROH Calls and the ROH Ground Truth (To save space)

class hapsburg.postprocessing.PostProcessingX(folder='', load=False, output=True, save=True)

Bases: PostProcessing

Class to post-process IBD on the X of two males. Only difference: Two iids, which will get stored separately.

create_df(starts, ends, starts_map, ends_map, l, l_map, iid, ch, roh_min_l, starts_pos=[], ends_pos=[])

Create and return the hapROH dataframe. Difference: Here it is an IBD, so two iids are saved.

class hapsburg.postprocessing.MMR_PostProcessing(folder='', load=False, output=True, save=True)

Bases: PostProcessing

Class that does PostProcessing of HAPSBURG output. Same as PostProcessing but loads the posterior differently.

modify_posterior0(posterior0)

Load and return the posterior unchanged.

hapsburg.postprocessing.load_Postprocessing(folder='', method='Standard', output=True, save=True)

Factory Method for PostProcessing class
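A factory of this kind is often written as a lookup from the method string to a class. The sketch below uses empty stand-in classes with the names listed on this page; only the "Standard" key is taken from the signature above, while the keys "MMR" and "IBD_X" and the function name load_postprocessing_sketch are hypothetical.

```python
class PostProcessing:
    """Stand-in with the constructor signature listed on this page."""
    def __init__(self, folder="", load=False, output=True, save=True):
        self.folder, self.output, self.save = folder, output, save


class PostProcessingX(PostProcessing):
    """Stand-in for the X-chromosome IBD variant."""


class MMR_PostProcessing(PostProcessing):
    """Stand-in for the MMR variant."""


# Hypothetical mapping; only "Standard" is confirmed by the default argument.
_METHODS = {
    "Standard": PostProcessing,
    "MMR": MMR_PostProcessing,
    "IBD_X": PostProcessingX,
}


def load_postprocessing_sketch(folder="", method="Standard", output=True, save=True):
    """Return an instance of the class registered for method."""
    try:
        cls = _METHODS[method]
    except KeyError:
        raise ValueError(f"Unknown method: {method!r}") from None
    return cls(folder=folder, output=output, save=save)


pp = load_postprocessing_sketch(folder="./output/", method="Standard")
```

Because every sub-class shares the base-class interface, calling code can stay the same whichever object the factory returns.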

hapsburg.postprocessing.folder = './Simulated/1000G_Mosaic/TSI/ch3_10cm/output/iid0/chr3/'