Commit 3e6f37d6, authored by skejriwal44 and committed by GitHub (unverified)

Request to Add CacheCraft: A Relevant Work on Chunk-Aware KV Cache Reuse for RAG

Thanks for this great list! We'd love to add CacheCraft [PDF], a chunk-aware KV cache reuse approach for RAG that minimizes redundant computation while preserving generation quality. Our work is concurrent with CacheBlend; the key differences are chunk-level reuse, selective recompute planning, and optimizations designed for real-world production systems. CacheCraft has been accepted at SIGMOD 2025, and we plan to open-source a vLLM-based extension soon. Results on real RAG traces show strong efficiency gains in production.
parent 8a2cf646