Industry Leading Academia in AI Research
Skyrocketing costs of training AI foundation models have made it difficult for academia and government to keep up with industry in AI research and development, according to the Stanford Institute for Human-Centered Artificial Intelligence (HAI). Google's top-level Gemini Ultra, for example, cost an estimated $191 million worth of compute to train, HAI reported in its latest AI Index report. And closed-source foundation models outperform their open source counterparts by a significant margin.
Despite the resource edge that industry holds over academia and government, the report noted a move toward open source foundation models.
"This past year, organizations released 149 foundation models, more than double the number released in 2022," HAI said. "Of these newly released models, 65.7% were open source (meaning they can be freely used and modified by anyone), compared with only 44.4% in 2022 and 33.3% in 2021."
That resource edge, though, pays off when foundation models are measured for performance.
Across 10 selected benchmarks, closed models achieved a median performance advantage of 24.2%, HAI said.
As the institute explained: "One of the reasons academia and government have been edged out of the AI race: the exponential increase in cost of training these giant models. Google's Gemini Ultra cost an estimated $191 million worth of compute to train, while OpenAI's GPT-4 cost an estimated $78 million. In comparison, in 2017, the original Transformer model, which introduced the architecture that underpins virtually every modern LLM, cost around $900."
Training Costs (source: HAI).
As for who is creating the most foundation models, cloud giants Google and Microsoft rank No. 1 and No. 3, respectively, sandwiching Meta.
Leading Players (source: HAI).
"Industry dominates AI, especially in building and releasing foundation models," HAI said. "This past year Google edged out other industry players in releasing the most models, including Gemini and RT-2. In fact, since 2019, Google has led in releasing the most foundation models, with a total of 40, followed by OpenAI with 20. Academia trails industry: This past year, UC Berkeley released three models and Stanford two."
This is the seventh edition of the AI Index, which dates back to 2017.
"The 2024 Index is our most comprehensive to date and arrives at an important moment when AI's influence on society has never been more pronounced," the report said. "This year, we have broadened our scope to more extensively cover essential trends such as technical advancements in AI, public perceptions of the technology, and the geopolitical dynamics surrounding its development. Featuring more original data than ever before, this edition introduces new estimates on AI training costs, detailed analyses of the responsible AI landscape, and an entirely new chapter dedicated to AI's impact on science and medicine."
The report summarized its voluminous findings in 10 top takeaways: