Memory bandwidth
-
CXL-SpecKV: A Disaggregated FPGA Speculative KV-Cache for Datacenter LLM Serving
The system offloads key-value (KV) caches to remote FPGA memory over CXL interconnects, achieving a 3.2× throughput gain and a 2.8× reduction in memory cost for datacenter LLM serving.

