hapsburg.PackagesSupport.parallel_runs.helper_functions
Helper functions for notebook runs on a cluster. Author: Harald Ringbauer, 2019
Functions
| prepare_path | Prepare the path and pipe printing for one individual. |
| multi_run | Implementation of running in parallel. |
| split_up_roh_df | Splits up the ROH dataframe from base_path/file_in into file_out. |
| get_sep_from_extension | Get the separator for csv/tsv from the file extension. |
| combine_individual_data | Merge the data from one individual's analysis (all chromosomes). |
| move_X_to_parent_folder | Take the ROH result table from the X folder and move it to the parent folder. |
| create_folders | Create folders for ROH analysis with PLINK/BCFtools. |
| split_up_inferred_roh | Extract only the ROH of individual iid and save them to save_path. |
| postprocess_iid | Split up results into roh.csv and roh_gt.csv for each IID. |
Module Contents
- hapsburg.PackagesSupport.parallel_runs.helper_functions.prepare_path(base_path, iid, ch, prefix_out, logfile=True)
Prepare the output path and pipe printing for one individual. Creates the path if it does not already exist. logfile: Whether to pipe output to a log file.
- hapsburg.PackagesSupport.parallel_runs.helper_functions.multi_run(fun, prms, processes=4, output=False)
Implementation of running in parallel. fun: The function to run. prms: The parameters for the runs. processes: How many processes to use.
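The pattern described above can be sketched with the standard library's `multiprocessing.Pool`. This is only an illustration, not the module's actual code: the name `multi_run_sketch`, the assumption that `prms` is a list of positional-argument lists (one per job), the in-process shortcut for a single job, and the `ValueError` on empty input are all assumptions made for the example.

```python
import multiprocessing as mp


def multi_run_sketch(fun, prms, processes=4, output=False):
    """Hypothetical stand-in: run fun(*p) for every parameter list p in prms."""
    if len(prms) == 0:
        raise ValueError("Nothing to run: prms is empty.")
    if output:
        print(f"Running {len(prms)} jobs, up to {processes} in parallel.")
    if len(prms) == 1:
        # A single job does not need a worker pool; run it in this process.
        return [fun(*prms[0])]
    with mp.Pool(processes=processes) as pool:
        # starmap unpacks each parameter list into positional arguments.
        return pool.starmap(fun, prms)


def add(a, b):
    """Toy job; it must live at module level so worker processes can import it."""
    return a + b


if __name__ == "__main__":
    print(multi_run_sketch(add, [[1, 2], [3, 4], [5, 6]], processes=2))
```

The `if __name__ == "__main__":` guard matters on platforms that start workers by re-importing the main module; without it, each worker would try to launch its own pool.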
- hapsburg.PackagesSupport.parallel_runs.helper_functions.split_up_roh_df(base_path, path_out, iid, file_in='roh_info.csv', file_out='roh_gt.csv')
Splits up the ROH dataframe from base_path/file_in into file_out. Picks out individual iid. Done to pass on the “ground truth”. base_path: Where to find roh_info.csv. path_out: Where to save roh_gt.csv to. iid: Which individual to extract from roh_info.csv.
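A rough sketch of this filtering step is shown below. It uses the standard library's `csv` module instead of a dataframe so that it runs without extra dependencies; the function name `extract_individual` and the column name `iid` (exposed as the hypothetical `iid_col` parameter) are assumptions, since the layout of roh_info.csv is not documented here.

```python
import csv
import os


def extract_individual(base_path, path_out, iid, file_in="roh_info.csv",
                       file_out="roh_gt.csv", iid_col="iid"):
    """Hypothetical stand-in: copy the rows of one individual to a new file."""
    with open(os.path.join(base_path, file_in), newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames  # header row of the input table
        rows = [row for row in reader if row[iid_col] == iid]

    os.makedirs(path_out, exist_ok=True)  # create the target folder if needed
    with open(os.path.join(path_out, file_out), "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)  # number of rows written for this individual
```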
- hapsburg.PackagesSupport.parallel_runs.helper_functions.get_sep_from_extension(path)
Get the separator for csv/tsv from the file extension: either comma or tab. Returns the delimiter.
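As an illustration, the extension lookup could look like the sketch below. The name `sep_from_extension` is hypothetical, and both the case-insensitive matching and the `ValueError` for any other extension are assumptions; the documentation above does not say how the real function treats unknown extensions.

```python
def sep_from_extension(path):
    """Hypothetical stand-in: map a .csv/.tsv file name to its delimiter."""
    lower = path.lower()
    if lower.endswith(".csv"):
        return ","
    if lower.endswith(".tsv"):
        return "\t"
    # Assumed behavior for the sketch: refuse anything that is not csv/tsv.
    raise ValueError(f"Cannot infer separator from extension: {path}")
```

The returned value can be passed straight to a reader, for example as the `delimiter` argument of `csv.reader`.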
- hapsburg.PackagesSupport.parallel_runs.helper_functions.combine_individual_data(base_path, iid, delete=False, chs=range(1, 23), prefix_out='', file='roh.csv', file_result='_roh_full.csv')
Merge the data from one individual's analysis (all chromosomes). chs: Which chromosomes to combine. file: Which files to combine, either roh.csv or ibd.csv. delete: Whether to delete the individual's folder and its contents after combining.
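The merge step can be sketched as a concatenation of per-chromosome tables. Everything about the folder layout here is an assumption made for the example: per-chromosome results are taken to live at `base_path/iid/chr<ch>/file`, and the combined table is written to `base_path/<iid><file_result>`. The name `combine_chromosomes` is hypothetical, and the `delete` and `prefix_out` options are left out.

```python
import csv
import os


def combine_chromosomes(base_path, iid, chs=range(1, 23), file="roh.csv",
                        file_result="_roh_full.csv"):
    """Hypothetical stand-in: stack per-chromosome tables into one file."""
    header, rows = None, []
    for ch in chs:
        path = os.path.join(base_path, iid, f"chr{ch}", file)  # assumed layout
        if not os.path.exists(path):
            continue  # skip chromosomes without a result file
        with open(path, newline="") as f:
            reader = csv.reader(f)
            file_header = next(reader, None)
            if file_header is None:
                continue  # empty file, nothing to add
            header = header or file_header  # keep the first header seen
            rows.extend(reader)
    if header is None:
        raise FileNotFoundError(f"No {file} found for individual {iid}")

    out_path = os.path.join(base_path, iid + file_result)
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
    return out_path
```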
- hapsburg.PackagesSupport.parallel_runs.helper_functions.move_X_to_parent_folder(base_path, iid, delete=False, ch=23, prefix_out='', file_result='_roh_full.csv')
Take the ROH result table from the X folder and move it to the parent folder. Delete the original result folder.
- hapsburg.PackagesSupport.parallel_runs.helper_functions.create_folders(input_base_folder, outfolder='plink_out/')
Create folders for ROH analysis with PLINK/BCFtools. Operates within the HAPSBURG Mosaic data structure. Returns the h5 path, the vcf path, and the folder for intermediary output.
- hapsburg.PackagesSupport.parallel_runs.helper_functions.split_up_inferred_roh(df_t, iid, save_path)
Extract only the ROH of individual iid and save them to save_path.
- hapsburg.PackagesSupport.parallel_runs.helper_functions.postprocess_iid(df_plink, input_base_folder, iids, ch=3, prefix_out='')
Split up results into roh.csv and roh_gt.csv for each IID. df_plink: Data frame with PLINK results, formatted correctly.