Announcement_5
A collaboration with my friend Gao Bin, “Cost-Efficient Large Language Model Serving for Multi-turn Conversations with CachedAttention”, was accepted at ATC’24, which will be held in Santa Clara, USA on July 10-12, 2024. Bin will be presenting two papers at ATC!