BS in Computer Science from Peking University (2019); PhD from UC Berkeley (2024), advised by Ion Stoica. Research at the intersection of machine learning and distributed systems.
Representative work:
- vLLM (PagedAttention): a widely adopted LLM inference engine
- Vicuna, AlpaServe, Alpa, TeraPipe
Now a Member of Technical Staff at OpenAI, working on inference infrastructure.