About Me
I am a Ph.D. student in Computer Science & Engineering at UC San Diego, where I am fortunate to be advised by Prof. Hao Zhang. I previously obtained my B.S. in Computer Science from Zhejiang University.
My research spans distributed systems, machine learning systems, and efficient machine learning algorithms. Currently, I design optimized algorithms and systems for large language model (LLM) inference. My recent projects include Lookahead Decoding, vllm-ltr, and Dynasor🦖.
Education
- Ph.D. in Computer Science, UCSD, 2029 (expected)
- B.S. in Computer Science, Zhejiang University, 2022
Publications
See a full list on Google Scholar
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
Yichao Fu, Peter Bailis, Ion Stoica, Hao Zhang ICML 2024 [paper] [blog] [code]
Efficient LLM Scheduling by Learning to Rank
Yichao Fu, Siqi Zhu, Runlong Su, Aurick Qiao, Ion Stoica, Hao Zhang NeurIPS 2024 [paper] [code]
ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization
Haoran You, Yipin Guo, Yichao Fu, Wei Zhou, Huihong Shi, Xiaofan Zhang, Souvik Kundu, Amir Yazdanbakhsh, Yingyan Celine Lin NeurIPS 2024 [paper] [code]
When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models
Haoran You, Yichao Fu, Zheng Wang, Amir Yazdanbakhsh, Yingyan Celine Lin ICML 2024 [paper] [code]
Internship
- Summer 2021: Game Engine Developer
- Game Engine Group of Aurora Studios, Tencent
- Duties included developing a fabric editor, which involved implementing cloth rendering with GPU shaders and handling collision processing
Skills
- Programming Languages: C++, Python, CUDA, Java
- Tools: Linux, Git, LaTeX, PyTorch, JAX
Service
- NeurIPS’24, ICLR’25, ICML’25, NeurIPS’25 Reviewer
- OSDI’23, USENIX ATC’23 Artifact Evaluation Committee