Alibaba Cloud claims its new Aegaeon GPU pooling system cuts Nvidia GPU use by 82%, letting 213 H20 accelerators handle workloads that previously required 1,192. The advancements have been detailed in a paper (PDF) at the 2025 ACM Symposium on Operating Systems (SOSP) in Seoul. Tom’s Hardware reports: Unlike training-time breakthroughs that chase model quality or speed, Aegaeon is an inference-time scheduler designed to maximize GPU utilization across many models with…

![[CITYPNG.COM]White Google Play PlayStore Logo – 1500×1500](https://startupnews.fyi/wp-content/uploads/2025/08/CITYPNG.COMWhite-Google-Play-PlayStore-Logo-1500x1500-1-630x630.png)