Commit 32fdb843 (unverified), authored 4 months ago by DefTruth, committed by GitHub 4 months ago

[BatchLLM] BatchLLM: Optimizing Large Batched LLM Inference with Global Prefix Sharing and Throughput-oriented Token Batching (#104)

parent 9bb3f6a3
