OpenAI is quietly launching a new developer platform that will allow customers to run the company’s more recent machine learning models, such as GPT-3.5, on dedicated capacity. In screenshots of documentation shared on Twitter by early access users, OpenAI describes the upcoming Foundry offering as “designed for cutting-edge customers running larger workloads.”
Foundry will provide service-level commitments such as uptime and on-demand engineering support. Rentals will be based on dedicated compute units with three-month or one-year commitments; each model instance will require a certain number of compute units (see the chart below). Instances will not be inexpensive. Running a lightweight version of GPT-3.5 will cost $78,000 over three months or $264,000 over one year. To put that in context, one of Nvidia’s most recent-generation supercomputers, the DGX Station, costs $149,000 per unit.