What is the AI Index?
Now in its seventh year, the AI Index tracks, collates, distills and visualizes data related to AI. The mission is to provide unbiased, rigorously vetted, broadly sourced data for policymakers, researchers, executives, journalists and the general public to develop a more thorough and nuanced understanding of the complex field of AI. The index is developed by HAI [the Stanford Institute for Human-Centered Artificial Intelligence] but led by a steering committee of leaders in academia, industry and government across multiple disciplines.
Science and medicine have their own chapter this year – can you tell us why that’s significant?
In previous years, information about AI progress in science and medicine was scattered across other chapters. For example, in the 2022 report, we highlighted scientific discoveries in the technical performance chapter, while health-related findings appeared in the economy and ethics chapters. This year we saw so many advances in medical and scientific AI that we knew the topic needed a chapter of its own.
What stood out for you?
2023 was an incredible year for AI. First, we saw bigger, more sophisticated models with broader multimodal capabilities. We also saw a continued trend of industry dominating the field: industry released 51 notable AI systems, academia released 15, and government was barely on the chart. A big reason for this could be model training costs. In 2017, it cost about $1,000 to train a transformer model [the transformer is the deep learning architecture that underpins most modern natural language processing systems]. In 2023, it cost $78 million to train GPT-4 and about $190 million to train Gemini Ultra. In only a handful of years, training costs have grown by several orders of magnitude, to the point where only a few organizations can afford to create new models.
What are some key takeaways for health and medicine from the AI Index’s chapter on responsible AI?
Much of the content in the responsible AI chapter is also applicable to health and medicine. For example, the AI Index highlights the lack of agreement on the benchmarks AI developers use to evaluate their models, including for truthfulness, potential bias, and generation of inappropriate or harmful content. Comparing model capabilities in a standardized way plays an important role in enhancing transparency. This is especially critical when we are applying AI to high-stakes applications such as health care.