CV

Education

Internship

Publications

Break the Sequential Dependency of LLM Inference Using Lookahead Decoding

Yichao Fu, Peter Bailis, Ion Stoica, Hao Zhang. ICML 2024. [paper] [blog] [code]

Efficient LLM Scheduling by Learning to Rank

Yichao Fu, Siqi Zhu, Runlong Su, Aurick Qiao, Ion Stoica, Hao Zhang. NeurIPS 2024. [paper] [code]

Skills

Service