Advanced Neural Network Applications

External reference: https://openalex.org/T10036

  1. Liger+ dynamically balances latency and throughput in large model inference
    A distributed inference system that uses interleaved parallelism to balance the latency-throughput trade-off dynamically, combining task-aware batch management with strategic kernel scheduling across multiple GPUs.
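
    Illustrative sketch: the summary above mentions task-aware batch management, and the toy Python example below shows one way such a policy could look. It is not taken from Liger+; the Request type, the latency_sensitive flag, the form_batch function, and the batch sizes 2 and 8 are all hypothetical names and values chosen for illustration. The assumed policy is that latency-sensitive requests are dispatched in small batches as soon as any are waiting, while other requests are grouped into larger batches to favor throughput.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    req_id: int
    latency_sensitive: bool  # hypothetical task label, not from the source


def form_batch(queue: List[Request], small: int = 2, large: int = 8) -> List[Request]:
    """Remove and return the next batch from `queue` (the list is mutated).

    Assumed policy: if any latency-sensitive request is waiting, send up to
    `small` of them right away; otherwise send up to `large` requests in
    arrival order to improve throughput.
    """
    if not queue:
        return []
    urgent = [r for r in queue if r.latency_sensitive]
    batch = urgent[:small] if urgent else queue[:large]
    chosen = {r.req_id for r in batch}
    # Keep the remaining requests in their original arrival order.
    queue[:] = [r for r in queue if r.req_id not in chosen]
    return batch


if __name__ == "__main__":
    pending = [Request(i, latency_sensitive=(i in (3, 7))) for i in range(10)]
    while pending:
        print([r.req_id for r in form_batch(pending)])
```

    A real multi-GPU system would also have to decide how the kernels of each batch are scheduled and interleaved on the devices; this sketch covers only the batching decision.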