Microsoft has released a handful of details about a new project called Singularity. Under that name, the company is working on new cloud infrastructure tailored specifically to AI.
The platform, which carries the code name Singularity, is meant to go beyond a simple service on the Azure cloud. The cloud provider describes it as “a new AI platform service built from the ground up that will become a driving force for AI both within Microsoft and beyond.”
The company is not yet giving away many details. Some of the researchers and experts working on the project recently published a paper that offers a first look. The paper is titled Singularity: Planet-Scale, Preemptive and Elastic Scheduling of AI Workloads, which already shows the direction Microsoft wants to take.
Optimal utilization of all capacity
The researchers set out a number of design goals. For example, Singularity must not leave resources unused. All accelerators will be treated as a single shared logical cluster, with no static reservation of capacity. A scheduler will put free capacity to use as opportunistically as possible, anywhere in the world, regardless of cluster, region or workload type.
Although capacity is distributed, Singularity will handle workloads safely and keep them isolated from one another. When an inference workload needs more horsepower, the scheduler will free up capacity by preempting training jobs that are currently using those resources.
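To make the idea of a single shared pool with preemption concrete, here is a minimal sketch in Python. It is not Microsoft's scheduler: the `Pool` and `Job` classes, the `submit` method and the rule "inference may pause training" are all assumptions made for illustration, based only on the behaviour described above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Job:
    name: str
    kind: str      # "training" or "inference" (hypothetical labels)
    devices: int   # accelerators the job holds


@dataclass
class Pool:
    total: int                                   # accelerators in the shared pool
    running: List[Job] = field(default_factory=list)

    def free(self) -> int:
        """Accelerators not held by any running job."""
        return self.total - sum(j.devices for j in self.running)

    def submit(self, name: str, kind: str, devices: int) -> List[str]:
        """Place a job; return the names of training jobs paused to make room."""
        training = [j for j in self.running if j.kind == "training"]
        # Only inference is allowed to reclaim capacity from training.
        reclaimable = sum(j.devices for j in training) if kind == "inference" else 0
        if self.free() + reclaimable < devices:
            raise RuntimeError(f"not enough capacity for {name}")
        preempted = []
        for job in training:
            if self.free() >= devices:
                break
            self.running.remove(job)             # pause the training job
            preempted.append(job.name)
        self.running.append(Job(name, kind, devices))
        return preempted


pool = Pool(total=8)
pool.submit("train-a", "training", 4)
pool.submit("train-b", "training", 4)
paused = pool.submit("serve-x", "inference", 2)
print(paused)        # ['train-a']
print(pool.free())   # 2
```

In this toy version a paused job simply leaves the pool. A real system would also have to save its state and put it back in the queue.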
The researchers also point to the resilience of the system. Because training a neural network can take days or even weeks, restarting a job from scratch is not an option. If a job is stopped to free up resources, it is later resumed automatically with no loss of progress.
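The article does not say how Singularity preserves progress. One common way to get the same effect is checkpointing: the job writes its state to storage at regular intervals and reads it back when it starts again. The sketch below shows that general technique only. The `train` function, its `stop_after` parameter and the JSON checkpoint file are hypothetical, and the arithmetic stands in for real training work.

```python
import json
import os
import tempfile
from typing import Optional


def train(ckpt_path: str, total_steps: int, stop_after: Optional[int] = None) -> int:
    """Run until total_steps is reached; return the last completed step.

    stop_after mimics a preemption after that many steps in this run.
    """
    state = {"step": 0, "loss_sum": 0.0}
    if os.path.exists(ckpt_path):
        with open(ckpt_path) as f:
            state = json.load(f)                 # resume from the saved state
    done_this_run = 0
    while state["step"] < total_steps:
        if stop_after is not None and done_this_run == stop_after:
            break                                # resources are reclaimed
        state["step"] += 1
        state["loss_sum"] += 1.0 / state["step"]  # stand-in for real work
        done_this_run += 1
        with open(ckpt_path, "w") as f:
            json.dump(state, f)                  # checkpoint after every step
    return state["step"]


path = os.path.join(tempfile.mkdtemp(), "ckpt.json")
print(train(path, total_steps=10, stop_after=4))  # 4
print(train(path, total_steps=10))                # 10
```

The second call picks up at step 5 rather than step 1, which is the "no loss of progress" property in miniature.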
Modern scheduler on a global scale
These design goals suggest that Singularity revolves primarily around its scheduler, which must be able to manage a global pool of system resources. The scheduler relies on two key mechanisms. The first is the ability to pause training jobs in order to free up resources. The second is the ability to elastically scale workloads across a variable number of accelerators.
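The second mechanism, elastic scaling, can be pictured as keeping the number of logical workers in a job fixed while the number of physical accelerators underneath changes. The sketch below illustrates that idea with a simple round-robin placement. The `assign_workers` function and the round-robin rule are assumptions for illustration, not a description of how Singularity places work.

```python
from typing import Dict, List


def assign_workers(num_workers: int, num_devices: int) -> Dict[int, List[int]]:
    """Spread a fixed set of logical workers over the devices currently available."""
    if num_devices < 1:
        raise ValueError("at least one device is required")
    placement: Dict[int, List[int]] = {d: [] for d in range(num_devices)}
    for worker in range(num_workers):
        placement[worker % num_devices].append(worker)   # round-robin
    return placement


# The same 8-worker job on 4 devices, then scaled down to 2 devices.
print(assign_workers(8, 4))  # {0: [0, 4], 1: [1, 5], 2: [2, 6], 3: [3, 7]}
print(assign_workers(8, 2))  # {0: [0, 2, 4, 6], 1: [1, 3, 5, 7]}
```

When devices are taken away, each remaining device hosts more workers and the job runs more slowly, but it keeps running.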
When, how and for whom Singularity will become available is unclear at this stage. The idea behind it, though, is plain enough. Microsoft wants to build a large pool of infrastructure scattered around the world, containing a multitude of accelerators. These should be dynamically assigned to the most relevant workloads, so that the entire infrastructure is always optimally utilized and priority tasks always get the capacity they need. The setup resembles a kind of supercomputer on a global scale.