hapsburg.preprocessing_lowmem
Classes
PreProcessingHDF5_lowmem | Class for PreProcessing the Data.
PreProcessingEigenstrat_lowmem | Class for PreProcessing Eigenstrat Files
PreProcessingEigenstratX_lowmem | Class for PreProcessing Eigenstrat Files
Functions
extract_snps_hdf5_lowmem | Extract genotypes from h5 on ids and markers.
load_preprocessing_lowmem | Factory method to load the Transition Model.
Module Contents
- hapsburg.preprocessing_lowmem.extract_snps_hdf5_lowmem(h5, ids_ref, markers, ch, meta_path_ref, verbose=True, diploid=True)
Extract genotypes from h5 on ids and markers. If diploid, concatenate haplotypes along axis 0. Extract individuals first, and then subset to SNPs. Return a 2D array [# haplotypes, # markers].
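The extraction order described above (individuals first, then SNPs, then stacking haplotypes along axis 0) can be sketched with a plain NumPy array standing in for the HDF5 genotype dataset. This is a minimal illustration, not the module's implementation: the helper name `extract_haplotypes`, the assumed genotype layout [markers, individuals, 2], the stacking order (all first haplotypes, then all second haplotypes), and the haploid fallback are all assumptions.

```python
import numpy as np


def extract_haplotypes(gt, ind_idx, marker_idx, diploid=True):
    """Hypothetical sketch. gt has the ASSUMED layout [markers, individuals, 2]."""
    # Step 1: subset to the requested individuals, keeping every marker.
    gt_ind = np.asarray(gt)[:, ind_idx, :]      # [markers, n_ind, 2]
    # Step 2: subset the already reduced array to the requested markers.
    gt_sub = gt_ind[marker_idx, :, :]           # [n_markers, n_ind, 2]
    if diploid:
        # Stack first haplotypes on top of second haplotypes along axis 0
        # (this stacking order is an assumption).
        return np.concatenate((gt_sub[:, :, 0].T, gt_sub[:, :, 1].T), axis=0)
    # Assumption: in the haploid case keep only the first haplotype.
    return gt_sub[:, :, 0].T


# Toy data: 4 markers, 3 individuals, 2 haplotypes each.
toy_gt = np.arange(4 * 3 * 2).reshape(4, 3, 2)
haps = extract_haplotypes(toy_gt, ind_idx=[0, 2], marker_idx=[1, 3])
print(haps.shape)  # (4, 2): [# haplotypes, # markers]
```

With two individuals and two markers, the diploid result has four rows (two haplotypes per individual) and two columns.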
- class hapsburg.preprocessing_lowmem.PreProcessingHDF5_lowmem(conPop=[], save=True, output=True)
Bases: hapsburg.preprocessing.PreProcessingHDF5
Class for PreProcessing the Data. Standard: intersect the reference data with the individual data and return the intersection dataset.
- load_data(iid='MA89', ch=6, start=-np.inf, end=np.inf)
Return Matrix of reference [k,l], Matrix of Individual Data [2,l], as well as linkage Map [l]
- optional_postprocessing(gts_ind, gts, r_map, pos, out_folder, pCon, read_counts=[])
Postprocessing steps of gts_ind, gts, r_map, and the folder, based on boolean fields of the class.
- class hapsburg.preprocessing_lowmem.PreProcessingEigenstrat_lowmem(save=True, output=True, packed=1, sep='\\s+')
Bases: hapsburg.preprocessing.PreProcessingEigenstrat
Class for PreProcessing Eigenstrat Files. Same as PreProcessingHDF5 for the reference, but with Eigenstrat code for the target.
- optional_postprocessing(gts_ind, gts, r_map, pos, out_folder, read_counts=[])
Postprocessing steps of gts_ind, gts, r_map, and the folder, based on boolean fields of the class.
- load_data(iid='MA89', ch=6)
Return Matrix of reference [k,l], Matrix of Individual Data [2,l], as well as linkage Map [l] and the output folder. Save the loaded data if self.save==True. Various modifiers are set via class fields (check also PreProcessingHDF5).
- class hapsburg.preprocessing_lowmem.PreProcessingEigenstratX_lowmem(save=True, output=True, packed=1, sep='\\s+')
Bases: PreProcessingEigenstrat_lowmem, hapsburg.preprocessing.PreProcessingEigenstratX
Class for PreProcessing Eigenstrat Files. Same as PreProcessingHDF5 for the reference, but with Eigenstrat code for the target.
- set_output_folder(iid, ch='X')
Set the output folder after folder_out. General Structure for HAPSBURG: folder_out/iid/chrX/
- get_1000G_path(h5_path1000g, ch='X')
Construct and return the path to the 1000 Genomes reference panel.
- es_get_index_iid(es, iid)
Get the index of the given IID in the Eigenstrat object.
- extract_snps_es(es, id, markers)
Use Eigenstrat object. Extract genotypes for the individual with index id (integer) for a list of markers. Convert from Eigenstrat GT to the format used here.
- load_data(iid='MA89', ch='X')
Return Matrix of reference [k,l], Matrix of Individual Data [2,l], as well as linkage Map [l] and the output folder. Save the loaded data if self.save==True. Various modifiers are set via class fields (check also PreProcessingHDF5).
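The folder layout that set_output_folder documents for this class (folder_out/iid/chrX/) can be illustrated with a small standalone helper. The function `build_output_folder` below is hypothetical and only mirrors the documented structure; how the real method stores or creates the folder is not shown in this page.

```python
import os


def build_output_folder(folder_out, iid, ch="X"):
    """Hypothetical helper: build folder_out/iid/chr<ch>/ as a path string."""
    # The empty last component keeps the trailing path separator.
    return os.path.join(folder_out, str(iid), f"chr{ch}", "")


example_path = build_output_folder("output", "MA89")
print(example_path)
```

Passing ch=6 instead of the default "X" would give a path ending in chr6/, matching the per-chromosome structure used by the other classes.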
- hapsburg.preprocessing_lowmem.load_preprocessing_lowmem(p_model='Eigenstrat', conPop=[], save=True, output=True)
Factory method to load the Transition Model. Return
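A factory of this kind typically maps the p_model string to a class and forwards the remaining arguments. The sketch below shows that pattern with stand-in classes. Everything in it is an assumption made for illustration: the class names, the keys "HDF5" and "EigenstratX" (only the default "Eigenstrat" appears in the signature above), and the error raised for unknown keys.

```python
class SketchBase:
    """Stand-in for a preprocessing class (hypothetical)."""

    def __init__(self, save=True, output=True):
        self.save = save
        self.output = output


class SketchHDF5(SketchBase):
    def __init__(self, conPop=None, save=True, output=True):
        super().__init__(save=save, output=output)
        # Use None instead of a mutable default list to avoid shared state.
        self.conPop = [] if conPop is None else list(conPop)


class SketchEigenstrat(SketchBase):
    pass


class SketchEigenstratX(SketchEigenstrat):
    pass


def load_preprocessing_sketch(p_model="Eigenstrat", conPop=None,
                              save=True, output=True):
    """Hypothetical factory: pick a class by key and construct it."""
    if p_model == "HDF5":  # assumed key
        return SketchHDF5(conPop=conPop, save=save, output=output)
    registry = {
        "Eigenstrat": SketchEigenstrat,
        "EigenstratX": SketchEigenstratX,  # assumed key
    }
    if p_model not in registry:
        raise NotImplementedError(f"Unknown preprocessing model: {p_model}")
    return registry[p_model](save=save, output=output)


pp = load_preprocessing_sketch("Eigenstrat", save=False)
print(type(pp).__name__, pp.save)
```

Only the HDF5 variant receives conPop here, mirroring the constructor signatures listed above, where the Eigenstrat classes take no conPop argument.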