Teng Ma / 马腾
Staff Engineer & Research Leader, Alibaba Apsara Lab
I was born in Anhui, China. I am a Staff Engineer (P8) and Research Leader in the MLSys group of the Operating System Lab (OSLab) at Alibaba Cloud. I received my Ph.D. from Tsinghua University, advised by Prof. Yongwei Wu and Prof. Kang Chen and working closely with Dr. Mingxing Zhang and Prof. Xuehai Qian, followed by postdoctoral research at CASIA and Alibaba, advised by Dr. Zhengyu He and Prof. Zhaoxiang Zhang. I also spent half a year as a visiting student in Prof. Shan Lu's group at the University of Chicago. I build software systems that exploit new hardware and kernel features for emerging architectures and workloads such as memory disaggregation and LLMs. I maintain and contribute to open-source projects including Mooncake, SGLang, Dynamo, RBG, and AIGW.
News
- 2026.06: Invited talk at AICon Shanghai 2026: "Memory-Aware KVCache Optimization for Large Models".
- 2026.03: Received the OS2ATC Open Source Contribution Award.
- 2026: 3 papers at EuroSys 2026, 1 paper at VLDB 2026, 1 paper at ICDE 2026.
- 2025: Mooncake approved as a PyTorch Foundation project.
- 2025: Talks at AICon Shenzhen, CNCC, the CCF Open Source Conference, the SGLang x MUSA Meetup, and GDC 2025.
Publications
43+ papers at top systems venues: SOSP (2), ASPLOS (2), ATC (4), EuroSys (2), VLDB (2), ICDE, TPDS (3), ToN, ToS, DAC (2), CLUSTER (2), SC, INFOCOM. Google Scholar.
Open Source
Mooncake 5K+
KVCache-centric disaggregated architecture for LLM serving. 172 contributors. Adopted by Alibaba, Ant, JD, Tencent, iFLYTEK, Meituan.
SGLang
High-performance LLM serving. Co-designed PD/EPD Disaggregation, HiCache, Checkpoint Engine, Sparse KVCache.
Dynamo
NVIDIA's inference framework. Transfer Engine integration for training-inference heterogeneous communication.
RBG & AIGW
AI serving gateway and routing. Designed the full AI serving stack (SGLang + Mooncake + AIGW + RBG), recognized as "Most Influential Open Source Project" by InfoQ.
LMSys Projects
HiCache, PD Disaggregation, EPD Disaggregation, Chunked Pipeline Parallel, rfork, GB200 Deployment, Kimi K2.
Contributor Projects
End-to-end training frameworks, KV cache connectors, and inference engine integrations across the LLM ecosystem.
Talks & Presentations
Experience
Honors & Awards
Academic Service
- ICME 2025/2026 — Meta-Reviewer (Area Chair)
- IEEE ICPP 2026 — Program Committee
- IEEE Cluster 2025 — Program Committee
- PPoPP 2024 — Program Committee
- FAST 2023 — Artifact Evaluation Committee
- DASFAA 2023/24/25 — Program Committee
- ChinaSys 2023/24/25 — PC / Session Chair
- Reviewer — TPDS, TC, ICS, ToS, IEEE Network, ACM Survey
- CCF — Executive Committee: Big Data, Distributed Systems, Open Source
Patents
US Patent 10,613,992 · US Patent 11,237,925 · CN202410712069 · CN202311072436 · CN202310271211 · CN202211510303 · CN202210682464 · CN202210557119 · CN202310573810 · CN202010427533