Dedicated professional in translational research and bioinformatics/computational biology, two fields whose combination has expanded the scope of biological insight. Passionate about applying computational and statistical methods to analyze and interpret results. Motivated, adaptive team player who works and communicates effectively with teams and individuals to derive innovative solutions. Able to handle multiple projects simultaneously and meet deadlines.
Data Visualization, Data Mining, Data Wrangling and Cleaning
Associate Computational Biologist 2023-06 - Present
Led whole-exome sequencing (WES) data analysis of CLL drug trial response (>500 samples) using workflow pipelines written in WDL on Terra
Fetched files from Terra GCS buckets using FireCloud/FISS-enabled Dalmatian functions or gsutil for data transformation, cleaning, and analytics in Python/R
Implemented filtering criteria to mitigate germline contamination and improve the accuracy of somatic calls
Ran ABSOLUTE to estimate tumor purity and infer malignant cell ploidy to help quantify copy number alterations (CNAs)
Conducted mutation signature analysis with CoMut plot to discern underlying mutational processes.
Applied a binomial mixture model to classify mutations as somatic or germline in samples without a paired normal
Ran GISTIC2.0 to identify significant genomic alterations that drive cancer development and progression, helping researchers prioritize candidate genes for further experimental validation.
Drove data analysis of bulk and single cell RNAseq data on multiple CLL projects and communicated effectively with researchers
Provided technical support for other research scientists on Terra workflow issues
Bioinformatics 2022-07 - 2022-12
Jounce Therapeutics, Cambridge, MA
Developed highly maintainable scRNAseq data analysis pipelines and visualization tools that enabled the identification of major immune cell types based on canonical markers and discovery of cell composition changes under different treatments in immuno-oncology.
Automated Bash pipelines for bulk RNA-seq on 400+ in-house samples using open-source bioinformatics tools, and conducted differential gene expression and statistical analyses to identify signature gene targets for the early discovery phase.
Developed an R Shiny web portal for quick first-pass analysis by computational biologists, allowing modular access to tools and visualization methods.
Collaborated with team members from different disciplines through regular project updates.
Graduate Student Researcher 2020-09 - 2021-04
Feinstein Institute For Medical Research, Manhasset, USA
Prepared single cell suspensions with proper FMO compensations for multi-panel flow cytometry.
Stained both cell surface and intracellular targets to find T cell behavior patterns under different conditions. Performed analysis of high-dimensional flow cytometry data of immune cells from tumor samples in FlowJo to identify T cell lineages and functional subsets.
scRNA-seq cell type classification within the myeloid cluster of kidney cells
Deployed Seurat low resolution clustering and labeled all cell types based on canonical lineage markers. Analyzed myeloid cells only and compared global gene expression patterns with external reference samples.
Computed Pearson correlations and identified five cell types by their maximal correlation with reference clusters.
Staff Research Associate II, Dr. Averil Ma 2018-06 - 2020-06
UCSF, San Francisco, CA
Project #1: Investigation of the mechanism of intestinal epithelial cell death in A20 and ABIN-1 deficient mice. *Manuscript (co-author) published in JCI: "Microbial signals, MyD88, and lymphotoxin drive TNF-independent intestinal epithelial tissue damage".
Performed survival analysis of mice with multiple gene deletions mimicking IBD symptoms. Conducted appropriate statistical tests from cell assay experiments and histologic inflammation score. Discovered major genes involved in TNF-independent cell death.
Project #2: Use of dendritic cells (DCs) as a vaccine and therapy against colon cancer on multiple strains of mice.
Performed survival analysis on in vivo mouse experiments.
Proposed a new statistical analysis method for longitudinal measurement data. Improved consistency of trend analysis and achieved 20% treatment effectiveness.
scRNA analysis of PBMC from Merkel Cell Carcinoma Jan 2023
Integrated with Harmony to obtain batch-corrected count matrix.
Compared and evaluated different cell type identification methods.
Retrieved top marker genes within each cell type and conducted GO, KEGG, and Reactome enrichment analyses.
Represented distinct cell trajectories defined by expression profiles on CD14+ and CD16+ monocytes.
Stroke Prediction on Stroke Dataset (Python) Dec 2022 - Mar 2023
Implemented a decision tree algorithm to impute missing values.
Performed exploratory data analysis with a focus on association patterns between continuous features and stroke.
Addressed class imbalance by oversampling the minority class with SMOTE.
Conducted in-depth hyperparameter tuning using grid search to optimize model performance.
Selected recall as the most biomedically relevant model selection metric.
Measured feature impact and dependence using SHAP values and uncovered a surprising result.
Chronic Kidney Disease Prediction on Kidney Disease Dataset (Python) Dec 2021
Performed exploratory data analysis on patient records and engineered features to improve model efficiency.
Addressed class imbalance by oversampling the minority class.
Trained machine learning models including logistic regression, tree-based classifiers, and ensemble methods.
Evaluated classification performance (accuracy and AUC) via k-fold cross-validation. Applied grid search for hyperparameter tuning to find the optimal combination of parameters.
LinkedIn Badges: Python, R, Bash, Machine Learning