Tags
Tags mark specific points in the project's history as important.
This project is mirrored from https://github.com/DefTruth/Awesome-LLM-Inference.git. Pull mirroring last updated Dec 22, 2024.
v2.6.9 · 6ad7b307 · 🔥🔥 [HADACORE] HADACORE: TENSOR CORE ACCELERATED HADAMARD TRANSFORM KERNEL (#108) · Dec 22, 2024
v2.6.8 · 32fdb843 · 🔥 [BatchLLM] BatchLLM: Optimizing Large Batched LLM Inference with Global... · Dec 08, 2024
v2.6.7 · 9f548f61 · 🔥 [KV Cache Recomputation] Efficient LLM Inference with I/O-Aware Partial KV... · Dec 01, 2024
v2.6.6 · 40292d73 · 🔥 [SparseInfer] SparseInfer: Training-free Prediction of Activation Sparsity... · Nov 25, 2024
v2.6.5 · 06c76ad3 · 🔥🔥 [TP: Comm Compression] Communication Compression for Tensor Parallel LLM Inference (#94) · Nov 18, 2024
v2.6.4 · f3f27a73 · 🔥 [VL-CACHE] VL-CACHE: SPARSITY AND MODALITY-AWARE KV CACHE COMPRESSION FOR... · Nov 12, 2024
v2.6.3 · a854d6cd · 🔥 [Tensor Product] Acceleration of Tensor-Product Operations with Tensor Cores (#90) · Oct 31, 2024
v2.6.2 · 613300d7 · 🔥 [FastAttention] FastAttention: Extend FlashAttention2 to NPUs and... · Oct 28, 2024
v2.6.1 · 7ba03a64 · 🔥 [PARALLELSPEC] PARALLELSPEC: PARALLEL DRAFTER FOR EFFICIENT SPECULATIVE DECODING (#84) · Oct 10, 2024
v2.6 · c3f14099 · Bump up to v2.6 (#79) · Oct 03, 2024
v2.5 · 3e436471 · Bump up to v2.5 (#69) · Sep 26, 2024
v2.4 · 829da5ab · Bump up to v2.4 (#64) · Sep 18, 2024
v2.3 · f0860e84 · Bump up to v2.3 (#61) · Sep 09, 2024
v2.2 · 6d7e9f8a · Bump up to v2.2 (#58) · Sep 04, 2024
v2.1 · 74f887c1 · Bump up to v2.1 (#50) · Aug 28, 2024
v2.0 · 8c0b51da · Bump up to v2.0 (#39) · Aug 19, 2024
v1.9 · e6b8cf4f · Bump up to v1.9 (#32) · Aug 12, 2024
v1.8 · 6bb88189 · Bump up to v1.8 (#27) · Aug 05, 2024
v1.7 · a6c1528e · Bump up to v1.7 · Jul 29, 2024
v1.6 · a1863349 · 🔥🔥 [flute] Fast Matrix Multiplications for Lookup Table-Quantized LLMs (@mit.edu etc.) · Jul 20, 2024