AI researchers have built a Minecraft bot named Voyager that can explore the game’s vast open world and expand its own skills as it goes. What sets Voyager apart from other bots is that it essentially writes its own code through trial and error, with the help of numerous GPT-4 queries.
Voyager is an “embodied agent,” an AI that can move and act freely and purposefully within a simulated or real environment. Personal assistant AIs and chatbots do not need to act physically, but the household robots of the future will be expected to navigate complex worlds to accomplish tasks. That expectation is why researchers are studying how such agents can operate effectively.
Minecraft is an ideal testing ground for this research because it approximates the real world with simple yet comprehensive rules and physics. However, an AI dropped into Minecraft with no prior knowledge would understand little of the game’s mechanics. To address this, the researchers behind Voyager created MineDojo, a simulation framework centered around Minecraft. MineDojo includes YouTube videos, transcripts, wiki articles, and Reddit posts from r/minecraft, and users can fine-tune AI models on this data. The framework also supports objective evaluation of these models, such as assessing their ability to construct fences around llamas or locate and mine diamonds.
Voyager outperforms other models, notably Auto-GPT, in these tasks. Both Voyager and Auto-GPT use GPT-4 to generate code on the fly as they encounter various in-game situations.
While conventional training involves exposing models to Minecraft data and hoping they learn to fight skeletons at night, Voyager takes a more interactive approach. As it encounters challenges in the game, Voyager engages in an internal conversation with GPT-4, discussing what actions it should take. GPT-4 provides guidance on the best course of action, such as crafting and equipping a sword to fend off skeletons while avoiding damage. These general instructions are then translated into concrete goals, such as gathering resources, building a sword at a crafting table, and engaging in combat.
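The trial-and-error cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Voyager's actual implementation: `refine_until_success`, `fake_model`, and `fake_game` are hypothetical names, the model and the game are replaced by simple stand-in functions, and the "code" is just a string.

```python
from typing import Callable, Optional, Tuple


def refine_until_success(
    task: str,
    propose_code: Callable[[str, Optional[str]], str],
    run_in_game: Callable[[str], Tuple[bool, str]],
    max_attempts: int = 4,
) -> Optional[str]:
    """Request code, run it, and feed any failure message into the next request."""
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        code = propose_code(task, feedback)
        succeeded, message = run_in_game(code)
        if succeeded:
            return code
        feedback = message  # the next proposal can react to what went wrong
    return None  # gave up after max_attempts failures


# Hypothetical stand-in for a language model: it only improves its answer
# once it has seen feedback from a failed attempt.
def fake_model(task: str, feedback: Optional[str]) -> str:
    if feedback is None:
        return "attack(skeleton)"
    return "craft(sword); equip(sword); attack(skeleton)"


# Hypothetical stand-in for the game: the attempt succeeds only with a sword.
def fake_game(code: str) -> Tuple[bool, str]:
    if "equip(sword)" in code:
        return True, "skeleton defeated"
    return False, "took damage: no weapon equipped"


result = refine_until_success("fight skeletons", fake_model, fake_game)
print(result)
```

In this toy run the first attempt fails, the failure message is passed back, and the second proposal succeeds. The attempt limit is an assumption added so the loop always terminates.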
Once accomplished, these tasks are stored as skills in a library, so Voyager can build more complex actions on top of them without starting from scratch. Voyager still consults GPT for guidance when drawing on the library, but for this step it uses the faster and more cost-effective GPT-3.5, which returns the skills most relevant to the situation. This ensures that Voyager doesn’t attempt to mine skeletons or fight ore, for instance.
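One way to picture a skill library with situation-specific lookup is the minimal Python sketch below. Everything in it is an assumption made for illustration: the `SkillLibrary` class and its methods are hypothetical, and the word-overlap score is a deliberately simple stand-in for whatever relevance measure the real system obtains from GPT-3.5.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


def _words(text: str) -> Set[str]:
    """Lowercase a description and split it into a set of words."""
    return set(text.lower().split())


@dataclass
class SkillLibrary:
    # Maps a short description of a skill to the code that performs it.
    skills: Dict[str, str] = field(default_factory=dict)

    def add(self, description: str, code: str) -> None:
        """Store a skill once its task has been accomplished."""
        self.skills[description] = code

    def retrieve(self, situation: str, top_k: int = 1) -> List[str]:
        """Return the descriptions of the stored skills that best match the situation."""
        query = _words(situation)

        def score(description: str) -> float:
            # Jaccard similarity: shared words divided by all words.
            words = _words(description)
            union = query | words
            return len(query & words) / len(union) if union else 0.0

        ranked = sorted(self.skills, key=score, reverse=True)
        return ranked[:top_k]


library = SkillLibrary()
library.add("fight skeleton with sword", "craft(sword); equip(sword); attack(skeleton)")
library.add("mine iron ore with pickaxe", "equip(pickaxe); dig(iron_ore)")

print(library.retrieve("a skeleton is attacking at night"))
print(library.retrieve("found iron ore in a cave"))
```

With these two stored skills, the skeleton situation retrieves the sword skill and the ore situation retrieves the mining skill, which is the kind of matching that keeps an agent from mining skeletons or fighting ore.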
Compared to bots like Auto-GPT, which struggle when faced with unfamiliar interfaces, Voyager thrives in the deeper environment of Minecraft. It acquires more skills, explores a wider area, and discovers more resources. Notably, GPT-4 outshines GPT-3.5 (ChatGPT) in generating useful code. When GPT-3.5 was used for code generation instead, Voyager hit obstacles early and failed to progress. This underscores how much GPT-4 has advanced at coding.
The purpose of this research is not to render Minecraft players obsolete, but rather to investigate methods by which relatively simple AI models can improve themselves based on their experiences. As we envision robots assisting us in our homes, hospitals, and offices, it is crucial for them to learn from past actions and apply these lessons to future tasks.