Tools and Software
We develop analytic tools for preprocessing and harmonizing genomic data across different studies. All data deposited in the repository will be harmonized according to the NCI GDC guidelines. We further develop methodologies and computational tools to perform heritability estimation, genome-wide association studies, polygenic risk score modeling, risk stratification, risk prediction, trans-ethnic fine mapping, and eQTL analyses with SNP data. We also develop tools to perform gene-level exploratory analyses and differential expression analysis with bulk RNA-seq data.
Our tools are available at the GDC GitHub repository. Some of our genomic tools are listed below:
Adjusted heritability
A computationally efficient method-of-moments approach to estimating heritability using Haseman-Elston regression. The approach corrects for population stratification and therefore provides a robust estimate of heritability in multi-ethnic samples.
Reference: Lin, Z., Seal, S., & Basu, S. (2022). Estimating SNP heritability in presence of population substructure in biobank-scale datasets. Genetics, 220(4), iyac015.
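To make the idea concrete, here is a minimal sketch of plain Haseman-Elston regression: the cross-products of standardized phenotypes for each pair of individuals are regressed on the corresponding off-diagonal entries of the genetic relationship matrix, and the slope is taken as the heritability estimate. It does not include the population-stratification adjustment described in the reference above. The function name haseman_elston_h2 and the simulated data are hypothetical and are not taken from the released tool.

```python
import numpy as np

def haseman_elston_h2(genotypes, phenotype):
    """Plain Haseman-Elston (method-of-moments) estimate of SNP heritability.

    genotypes: (n individuals x m SNPs) allele counts, no monomorphic SNPs.
    phenotype: length-n quantitative trait.
    No stratification adjustment is applied in this sketch.
    """
    G = np.asarray(genotypes, dtype=float)
    y = np.asarray(phenotype, dtype=float)
    n, m = G.shape
    # Standardize each SNP, then build the genetic relationship matrix (GRM).
    Z = (G - G.mean(axis=0)) / G.std(axis=0)
    K = Z @ Z.T / m
    # Standardize the phenotype so the slope is on the heritability scale.
    y = (y - y.mean()) / y.std()
    # Use each pair of distinct individuals once (upper triangle, no diagonal).
    upper = np.triu_indices(n, k=1)
    k_off = K[upper]
    yy_off = np.outer(y, y)[upper]
    # Least-squares slope through the origin of y_i * y_j on K_ij.
    return float(k_off @ yy_off / (k_off @ k_off))

# Hypothetical simulated data: unrelated individuals, additive trait.
rng = np.random.default_rng(0)
n, m, true_h2 = 1000, 500, 0.5
freqs = rng.uniform(0.2, 0.8, size=m)
G = rng.binomial(2, freqs, size=(n, m))
Z = (G - G.mean(axis=0)) / G.std(axis=0)
beta = rng.normal(0.0, np.sqrt(true_h2 / m), size=m)
y = Z @ beta + rng.normal(0.0, np.sqrt(1.0 - true_h2), size=n)

h2_hat = haseman_elston_h2(G, y)
print(f"estimated h2: {h2_hat:.2f} (simulated truth: {true_h2})")
```

Because the estimate only needs sums over pairs, it avoids the iterative likelihood maximization used by variance-component methods, which is what makes this family of estimators attractive at biobank scale.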
Sparse Supervised PCA
A computationally efficient algorithm for sparse supervised PCA (SSPCA), used in pre-processing for dimension reduction and variable selection.
Reference: Sharifzadeh, S., Ghodsi, A., Clemmensen, L. H., & Ersbøll, B. K. (2017). Sparse supervised principal component analysis (SSPCA) for dimension reduction and variable selection. Engineering Applications of Artificial Intelligence, 65, 168-177.
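As a rough illustration of what a sparse supervised component looks like, the sketch below handles only the simplest case: a single quantitative response and a single component. It soft-thresholds the covariance between each centered feature and the centered response, then rescales to unit length, so features weakly related to the response receive a loading of exactly zero. This is a simplified stand-in and not the algorithm from the reference or from the released tool; the function names and the frac tuning parameter are assumptions made for this example.

```python
import numpy as np

def soft_threshold(a, delta):
    """Shrink each entry of a toward zero by delta, clipping at zero."""
    return np.sign(a) * np.maximum(np.abs(a) - delta, 0.0)

def sparse_supervised_direction(X, y, frac=0.5):
    """One sparse loading vector guided by a univariate response.

    frac (hypothetical tuning parameter, 0 <= frac < 1) sets the threshold
    as a fraction of the largest absolute feature-response covariance.
    """
    if not 0.0 <= frac < 1.0:
        raise ValueError("frac must be in [0, 1)")
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    # Unnormalized covariance of every feature with the response.
    a = Xc.T @ yc
    u = soft_threshold(a, frac * np.max(np.abs(a)))
    norm = np.linalg.norm(u)
    if norm == 0.0:
        raise ValueError("no feature covaries with the response")
    return u / norm

# Hypothetical data: only features 0 and 3 drive the response.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 50))
y = 3.0 * X[:, 0] - 3.0 * X[:, 3] + rng.normal(size=500)

u = sparse_supervised_direction(X, y, frac=0.5)
selected = np.flatnonzero(u).tolist()   # indices with nonzero loading
scores = (X - X.mean(axis=0)) @ u       # one supervised component per sample
print("selected features:", selected)
```

The nonzero entries of u give the variable selection, and scores gives the reduced representation that could be passed to a downstream model.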
DRAB
Differential Regulation Analysis by Bootstrapping (DRAB) is a gene-based method that tests whether patterns of genetic regulation differ significantly between tissues or other biological contexts.