About Me
I am a Ph.D. student in Computer Science & Engineering at UC San Diego, where I am fortunate to be advised by Prof. Hao Zhang. I previously obtained my B.S. in Computer Science from Zhejiang University.
My research spans distributed systems, machine learning systems, and efficient machine learning algorithms. Currently, I focus on designing optimized algorithms and systems for large language model (LLM) inference. My recent projects include DeepConf, Lookahead Decoding, and Dynasor🦖.
Education
- Ph.D. in Computer Science, UCSD
- B.S. in Computer Science, Zhejiang University, 2018-2022
Publications
Full list on Google Scholar.
Selected publications
Deep Think with Confidence
Scaling Speculative Decoding with Lookahead Reasoning
Efficiently Scaling LLM Reasoning with Certaindex
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
Efficient LLM Scheduling by Learning to Rank
Additional publications
FastKernels: Benchmarking GPU Kernel Generation in Production
Internalizing Agency from Reflective Experience
When Drafts Evolve: Speculative Decoding Meets Online Learning
Fast and Accurate Causal Parallel Decoding using Jacobi Forcing
Reasoning Without Self-Doubt: More Efficient Chain-of-Thought Through Certainty Probing
FoldMoE: Efficient Long Sequence MoE Training via Attention-MoE Pipelining
ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization
Neuron Sensitivity-Guided Test Case Selection
When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models
Internship
- Summer 2025: Research Scientist Intern
- Summer 2021: Game Engine Developer
  - Game Engine Group of Aurora Studios, Tencent
  - Duties included developing a fabric editor: implementing cloth rendering with GPU shaders and handling collision processing
Skills
- Programming Languages: C++, Python, CUDA, Java
- Tools: Linux, Git, LaTeX, PyTorch, JAX
Service
- NeurIPS’24, ICLR’25, ICML’25, NeurIPS’25 Reviewer
- OSDI’23, USENIX ATC’23 Artifact Evaluation Committee
