CV

Updated Mar 2026

Basics

Name Yani Guan
Label Ph.D. Candidate | ML for Science · GNN · LLM · Generative Models
Email yaniguan@ucla.edu
Phone 310-853-9631
URL https://yaniguan.github.io
Summary Ph.D. candidate in Chemical Engineering at UCLA building production ML systems at the intersection of machine learning and molecular science. Research spans end-to-end ML pipelines, GNNs, transformer and diffusion-based generative models, and LLM-powered agentic platforms for scientific automation.

Work

  • 2024.01 - now
    Graduate Student Researcher — End-to-End ML Systems
    University of California, Los Angeles
    Model development and production ML pipelines for molecular science.
    • Designed and deployed end-to-end ML pipelines spanning data cleaning, feature engineering, model training, evaluation, and iterative redeployment
    • Built GNN and transformer-based models trained on large structured datasets to predict complex molecular properties
    • Developed ML interatomic potentials as high-fidelity surrogates for physics-based simulations, with a focus on memory efficiency and out-of-distribution (OOD) generalization
    • Engineered physics-informed features and uncertainty quantification methods to improve model robustness
  • 2023.09 - now
    Graduate Student Researcher — ChatDFT (LLM Agentic Platform)
    University of California, Los Angeles
    LLM-powered agentic simulation platform for quantum chemistry automation.
    • Architected an AI agent system integrating LLMs with RAG, intent extraction, and HPC job scheduling to fully automate quantum chemistry workflows, achieving a >70% reduction in setup-to-results time
    • Shipped production-grade agentic pipelines (Python, REST APIs, cloud infrastructure) with structured evaluation loops for continuous model monitoring
    • Applied SFT and RLHF to fine-tune domain-specific LLMs for molecular modeling and data interpretation
  • 2022.09 - now
    Graduate Student Researcher — Data-Driven Molecular Design
    University of California, Los Angeles
    Structure–activity relationships and generative approaches for molecular discovery.
    • Applied data-driven ML methods to analyze structure–reactivity relationships across broad chemical libraries
    • Designed generative model workflows combining diffusion-based and graph-based approaches to explore chemical space
    • Contributed to multiple peer-reviewed publications on electrochemical amination and ML for heterogeneous catalysis
  • 2022.06 - 2022.09
    Algorithm Researcher — ML Infrastructure & Production Deployment
    DP Technology
    ML infrastructure and production deployment for neural network potentials.
    • Built and shipped production ML pipelines integrating neural network potentials with molecular dynamics engines (Python, C++), deployed on cloud HPC infrastructure
    • Developed robust, reusable software tools adopted by 200+ researchers across academia and industry

Education

  • 2022.09 - 2024.03
    Los Angeles, CA
    M.S.
    University of California, Los Angeles
    Chemical Engineering
  • 2022.09 - 2027.06
    Los Angeles, CA
    Ph.D.
    University of California, Los Angeles
    Chemical Engineering
  • 2018.09 - 2022.06
    Tianjin, China
    B.S.
    Hebei University of Technology
    Chemical Engineering

Certificates

Python for Everybody Specialization
University of Michigan 2024-05-01

Skills

ML Frameworks & Models
  • PyTorch
  • scikit-learn
  • HuggingFace Transformers
  • Graph Neural Networks (GNN)
  • Transformer Architectures
  • Diffusion Models
  • Generative Models
  • ML Interatomic Potentials
LLM & Agentic Systems
  • RAG (Retrieval-Augmented Generation)
  • SFT & RLHF Fine-tuning
  • Agentic Pipelines
  • Prompt Engineering
  • LangChain
MLOps & Engineering
  • End-to-End ML Pipelines
  • A/B Testing
  • REST APIs
  • Cloud HPC
  • Python
  • C++
Scientific Computing
  • DFT (Density Functional Theory)
  • Molecular Dynamics
  • Microkinetic Modeling
  • Uncertainty Quantification

Languages

Chinese (native speaker)
English (professional)