Latency (audio)

  1. Replay-as-a-Service reduces tail latency in storage-disaggregated databases
    Presents Replay-as-a-Service, a technique that reduces tail latency in storage-disaggregated OLTP databases by decoupling log replay from the storage engine; reports a 40% latency reduction.
  2. Liger+ dynamically balances latency and throughput in large model inference
    A distributed inference system that uses interleaved parallelism to dynamically balance the latency-throughput trade-off, through task-aware batch management and strategic kernel scheduling across multiple GPUs.
  3. MLOps optimizations for high-load recommendation systems
    Engineering optimizations to MLOps processes for high-load recommendation systems, integrating streaming features, parameter servers, and online training, with a focus on latency and quality at scale.