Unverified commit 0faf3bf1, authored by skejriwal44, committed by GitHub

CacheCraft: A Relevant Work on Chunk-Aware KV Cache Reuse for RAG (#126)

Thanks for this great list! We’d love to add CacheCraft, a chunk-aware KV cache reuse approach for RAG that minimizes redundant computation while preserving generation quality. Our work is concurrent with CacheBlend; the key differences are chunk-level reuse, selective recompute planning, and optimizations designed for real-world production systems. CacheCraft has been accepted at SIGMOD 2025.

We also plan to open-source a vLLM-based extension soon. Results on real RAG traces from production show strong efficiency gains. Recent works such as CacheFocus and EPIC build on related ideas, highlighting the growing relevance of this research direction.
parent eb7e0d01