CV
Updated Mar 2026
Basics
| Name | Yani Guan |
| Label | Ph.D. Candidate, ML for Science · GNN · LLM · Generative Models |
| Email | yaniguan@ucla.edu |
| Phone | 310-853-9631 |
| Url | https://yaniguan.github.io |
| Summary | Ph.D. candidate in Chemical Engineering at UCLA building production ML systems at the intersection of machine learning and molecular science. Research spans end-to-end ML pipelines, GNNs, transformer and diffusion-based generative models, and LLM-powered agentic platforms for scientific automation. |
Work
-
2024.01 - now Graduate Student Researcher — End-to-End ML Systems
University of California, Los Angeles
Model development and production ML pipelines for molecular science.
- Designed and deployed end-to-end ML pipelines spanning data cleaning, feature engineering, model training, evaluation, and iterative redeployment
- Built GNN and transformer-based models trained on large structured datasets to predict complex molecular properties
- Developed ML interatomic potentials as high-fidelity surrogates for physics-based simulations, with a focus on memory efficiency and OOD generalization
- Engineered physics-informed features and uncertainty quantification methods to improve model robustness
-
2023.09 - now Graduate Student Researcher — ChatDFT (LLM Agentic Platform)
University of California, Los Angeles
LLM-powered agentic simulation platform for quantum chemistry automation.
- Architected an AI agent system integrating LLMs with RAG, intent extraction, and HPC job scheduling to fully automate quantum chemistry workflows, achieving a >70% reduction in setup-to-results time
- Shipped production-grade agentic pipelines (Python, REST APIs, cloud infrastructure) with structured evaluation loops for continuous model monitoring
- Applied SFT and RLHF to fine-tune domain-specific LLMs for molecular modeling and data interpretation
-
2022.09 - now Graduate Student Researcher — Data-Driven Molecular Design
University of California, Los Angeles
Structure–activity relationships and generative approaches for molecular discovery.
- Applied data-driven ML methods to analyze structure–reactivity relationships across broad chemical libraries
- Designed generative model workflows combining diffusion-based and graph-based approaches to explore chemical space
- Contributed to multiple peer-reviewed publications on electrochemical amination and ML for heterogeneous catalysis
-
2022.06 - 2022.09 Algorithm Researcher — ML Infrastructure & Production Deployment
DP Technology
ML infrastructure and production deployment for neural network potentials.
- Built and shipped production ML pipelines integrating neural network potentials with molecular dynamics engines (Python, C++), deployed on cloud HPC infrastructure
- Developed robust, reusable software tools adopted by 200+ researchers across academia and industry
Education
Awards
- 2025.05.29
Dissertation Year Award
University of California, Los Angeles
- 2022.06.01
Second Prize, College Students Innovation and Entrepreneurship
Hebei Province, China
- 2021.11.01
- 2020.06.01
Youth Star
Hebei Province, China
Certificates
| Python for Everybody Specialization | University of Michigan | 2024-05-01 |
| Introduction to Generative AI Learning Path Specialization | Google Cloud Training | 2024-05-01 |
Skills
| ML Frameworks & Models | PyTorch, scikit-learn, HuggingFace Transformers, Graph Neural Networks (GNN), Transformer Architectures, Diffusion Models, Generative Models, ML Interatomic Potentials |
| LLM & Agentic Systems | RAG (Retrieval-Augmented Generation), SFT & RLHF Fine-tuning, Agentic Pipelines, Prompt Engineering, LangChain |
| MLOps & Engineering | End-to-End ML Pipelines, A/B Testing, REST APIs, Cloud HPC, Python, C++ |
| Scientific Computing | DFT (Density Functional Theory), Molecular Dynamics, Microkinetic Modeling, Uncertainty Quantification |
Languages
| Chinese | Native speaker |
| English | Professional |