LLMs Tested with Super Mario Gameplay. Who Won?

IMAGE CREDIT: NINTENDO

In a new study, researchers at the University of California San Diego’s Hao AI Lab put several artificial intelligence (AI) models to the test by having them play the classic 1985 game Super Mario Bros. The experiment is meant to help develop game-playing agents whose techniques could carry over to future AI applications.

The standout performer in this test was Claude 3.7, an AI model that excelled by employing straightforward strategies. In contrast, GPT-4o, another prominent AI model, did not perform as well. The researchers highlighted the value of games as testing grounds for AI, stating, “We believe games provide challenging and dynamic environments for testing LLM agents,” where LLM stands for large language model.

The team behind this initiative, known as LMGames, has launched the GamingAgent project and made its source code publicly available under an MIT license. This permissive license allows others to use, modify, and redistribute the code, provided they keep the original copyright and license notice. Currently, GamingAgent supports games like 2048, Tetris, and Super Mario, and is compatible with AI models from OpenAI, Anthropic, and Google (Gemini). The project could later be extended to additional games and AI models.
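To make the idea of an LLM playing a game more concrete, here is a minimal sketch of the observe, decide, act loop that such an agent typically runs. It is not GamingAgent's code or API: the `ToyLevel` environment, the `stub_model` function (a stand-in for a real call to a hosted model), and the text format of the observation are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical action vocabulary; a real game would expose its own controls.
ACTIONS = ["left", "right", "jump", "wait"]


@dataclass
class ToyLevel:
    """Hypothetical 1-D level: walk right to the goal and jump over gaps."""
    length: int
    gaps: frozenset
    position: int = 0
    alive: bool = True

    def describe(self) -> str:
        # Turn the game state into text, since a language model reads text.
        ahead = "gap" if (self.position + 1) in self.gaps else "ground"
        return f"position={self.position} goal={self.length} ahead={ahead}"

    def step(self, action: str) -> None:
        if action == "right":
            self.position += 1
        elif action == "jump":
            self.position += 2
        elif action == "left":
            self.position = max(0, self.position - 1)
        # Landing on a gap ends the episode.
        if self.position in self.gaps:
            self.alive = False

    @property
    def finished(self) -> bool:
        return self.position >= self.length


def stub_model(prompt: str) -> str:
    """Stand-in for an LLM call (assumption): jump when a gap is ahead."""
    return "Jump!" if "ahead=gap" in prompt else "I would go right."


def parse_action(reply: str) -> str:
    """Map a free-text reply onto a valid action, defaulting to 'wait'."""
    text = reply.lower()
    for action in ACTIONS:
        if action in text:
            return action
    return "wait"


def run_episode(
    level: ToyLevel, model: Callable[[str], str], max_steps: int = 50
) -> Tuple[bool, List[str]]:
    """Observe, ask the model, act; repeat until the level ends."""
    taken: List[str] = []
    while level.alive and not level.finished and len(taken) < max_steps:
        action = parse_action(model(level.describe()))
        level.step(action)
        taken.append(action)
    return level.alive and level.finished, taken


if __name__ == "__main__":
    won, actions = run_episode(ToyLevel(length=6, gaps=frozenset({2, 5})), stub_model)
    print(won, actions)
```

Swapping `stub_model` for a function that sends the prompt to a hosted model and returns its reply would leave the loop unchanged, which is one reason a single harness can compare models from several providers.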

The concept of training AI through gameplay isn’t new. In 2019, Greg Brockman, a co-founder of OpenAI, emphasized the importance of games in AI development, remarking, “Games have always been a benchmark for AI. If you can’t solve games, you can’t expect to solve anything else.”

This recent experiment underscores the potential of using classic video games as platforms for advancing AI capabilities. By challenging AI models with dynamic and complex environments, researchers can better understand and enhance their performance.

We’d love to hear your thoughts on this innovative approach to AI development. Please share your comments below.

For more insights and updates on AI advancements, sign up for our AI Newsletter.

Q&A

Q: Why are classic video games like Super Mario used to test AI models?

A: Classic video games offer dynamic and challenging environments that effectively test and enhance AI capabilities. They provide scenarios that require strategic thinking and adaptability, making them ideal for evaluating AI performance.

Q: How can I access the GamingAgent project’s source code?

A: The GamingAgent project’s source code is publicly available under an MIT license, allowing free use and modification. You can find the code on the LMGames team’s repository, where it’s accessible for further development and experimentation.

