[Jan 2022 - Present] BCR CoreML Ranking Team, Meta
Research Scientist
- Main contributor to personalization modeling for the ad supply release project, which aims to personalize ad supply based on user sensitivity to increase revenue gain with minimal engagement loss.
- Designed and developed the user-level sensitivity model, which reduced engagement loss by 56% at the same level of revenue gain.
- Built the entire workflow for the user personalization model, which has since been generalized to multiple products: ML problem definition, offline data collection, feature engineering, model iteration and improvement, online model performance validation and testing, and model serving and maintenance.
[Aug 2016 - Oct 2021] Department of Computer Science and Engineering, Michigan State University
Research Assistant
- Designed and developed SERES, a new sequential resampling approach for biological sequences, which has been successfully applied to support estimation of multiple sequence alignments and improved the accuracy of the estimated support by 10%.
- Developed a local genealogy inference approach using SERES resampling.
- Developed RAWR, a non-parametric sequential resampling approach for support estimation of phylogenetic trees, which improved PR-AUC by 35%.
- Developed a survival prediction model for triple-negative breast cancer based on gene expression data.
- Developing a reinforcement learning approach for support estimation of multiple sequence alignments.
- Developing an algorithm for phylogenetic inference using RNA-Seq reads.
- Developed desktop software, a command-line tool, and an online service for phylogenetic support estimation using the SERES and RAWR resampling algorithms.
[May 2019 - Aug 2019] LinkedIn, Sunnyvale, CA
Summer Intern
- Worked on the Anti-Abuse AI team for 3 months.
- Developed a name ranking model using a character-based CNN for fake name detection.
- The name ranking model reached 99.35% ROC AUC on a test dataset of 30 million samples.
[Feb 2015 - Sep 2015] Department of Biomedical Engineering, University of Texas at Austin
Research Assistant
- Developed and refined clustering-based preprocessing for large RNA-Seq datasets, reducing the sequencing error rate to 0.003%.
- Built a human donor database to manage donors' basic information and sequencing data.
[Sep 2013 - Jun 2016] Key Laboratory of RNA Biology, Institute of Biophysics, Chinese Academy of Sciences
Research Assistant
- Built a classification model for mutation patterns in the antibody repertoires of HIV-infected patients based on large RNA-Seq data.
- Identified patterns associated with differing HIV disease progression in high-dimensional gene expression data using clustering methods.