This is a fascinating video about how AI can be used for more complex tasks. It isn't an LLM-style AI like ChatGPT, Google Gemini, or Apple’s Siri. The AI used for this project is closer to the models Boston Dynamics uses to build walking, task-focused robots, similar to what you see here: https://www.youtube.com/watch?v=L_4BPjLBF4E. That walking video is also strikingly similar to how humans learn to walk. The AI in it took almost 3,000 attempts to learn to walk like a 14-month-old toddler.

Here’s how it works in the Pokemon video. An AI is set up to play the original Pokemon Red game. The AI can see the game and can press all the button inputs for interacting with the game. But the problem Peter Whidden, the creator of the video, was trying to solve was to make the AI actually progress in the game. Pokemon is a complex game, but also one that children are able to play. Theoretically, an AI should be able to play the game and get to the end by defeating the Elite Four. After all, an insane hivemind was able to do it too.

This type of AI has its behavior (button pushes) encouraged by a set of rewards and punishments. The AI wants to get a “high score,” which is defined by whatever rewards Whidden sets up. Whidden had the AI play the game for a fixed number of simulated hours, many times over. At the conclusion of each batch of plays, the AI works out which strategies earned the highest score. The updated AI is then put back into the game to improve on the previous strategy.
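Here's a rough sketch of that play-a-batch-then-update loop. It is not Whidden's code, and it doesn't use a real game. The post doesn't say which learning algorithm was used, so this swaps in a crude "keep the best episodes and copy them" update. The `toy_score` function, the button list, and every number in it are made up for illustration.

```python
import random

BUTTONS = ["up", "down", "left", "right", "a", "b"]

def toy_score(presses):
    # Hypothetical stand-in for the game's reward: one point per "a" press.
    return sum(1 for p in presses if p == "a")

def play_episode(probs, steps, rng):
    # Sample one button per step according to the current policy.
    return rng.choices(BUTTONS, weights=[probs[b] for b in BUTTONS], k=steps)

def train(batches=20, episodes_per_batch=30, steps=50, keep=5, seed=0):
    rng = random.Random(seed)
    # Start with every button equally likely.
    probs = {b: 1 / len(BUTTONS) for b in BUTTONS}
    for _ in range(batches):
        # Play a whole batch with the current policy.
        episodes = [play_episode(probs, steps, rng) for _ in range(episodes_per_batch)]
        # Keep the highest scoring episodes from this batch.
        best = sorted(episodes, key=toy_score, reverse=True)[:keep]
        # Move the policy halfway toward the button frequencies in those episodes.
        total = keep * steps
        for b in BUTTONS:
            freq = sum(ep.count(b) for ep in best) / total
            probs[b] = 0.5 * probs[b] + 0.5 * freq
    return probs

if __name__ == "__main__":
    print(train())
```

Because the toy score only rewards the "a" button, the policy drifts toward pressing "a" more and more with each batch. Swap in a different score and the same loop chases that instead.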

Whidden started the AI off with a reward for finding new images on the screen. This was intended to encourage the AI to explore the game map. The first iteration got distracted by some pretty flowers, the ocean shore, and people watching. Whidden had to raise the threshold for what counted as a “new image” to prevent pointless sightseeing.
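To make the threshold idea concrete, here's a small sketch of a "new image" reward. It assumes frames are flat lists of pixel brightness values and that "different enough" means a sum of absolute pixel differences above a threshold. Both of those are my assumptions for illustration, not how Whidden's code necessarily compares screens.

```python
def frame_distance(a, b):
    # Sum of absolute pixel differences between two equally sized frames.
    return sum(abs(x - y) for x, y in zip(a, b))

class NoveltyReward:
    def __init__(self, threshold):
        self.threshold = threshold
        self.seen = []  # frames already counted as "new"

    def __call__(self, frame):
        # A frame is new only if it differs from every stored frame
        # by more than the threshold.
        if all(frame_distance(frame, old) > self.threshold for old in self.seen):
            self.seen.append(list(frame))
            return 1.0
        return 0.0

# Hypothetical 16-pixel frames: a path, the same path with one flower
# animating, and a completely different town screen.
path = [0] * 16
flower = [10] + [0] * 15
town = [200] * 16

low = NoveltyReward(threshold=5)
high = NoveltyReward(threshold=50)
low_total = sum(low(f) for f in (path, flower, town))
high_total = sum(high(f) for f in (path, flower, town))
```

With the low threshold the flower frame earns a reward just for animating. With the higher threshold only the path and the town count, which is the "no more pointless sightseeing" effect.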

Whidden had to introduce an additional reward next. The AI had no strong incentive to catch Pokemon or win battles. It just wanted to see new screens, so it learned to run from battles so that it could find new areas of the map. Whidden added an incentive to have a highly leveled team of Pokemon. This would encourage the AI to catch Pokemon and win battles to level the team up. Notably, this incentive was ranked higher than the new image incentive, so the AI would prefer battling over sightseeing when given the option.
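One simple way to picture "ranked higher" is a weighted sum, where a level gained is worth more than a new screen found. The weights and the two example runs below are invented for illustration. The video is the place to look for the real values.

```python
# Hypothetical weights: one Pokemon level is worth four new screens.
EXPLORE_WEIGHT = 1.0
LEVEL_WEIGHT = 4.0

def total_reward(new_screens, levels_gained):
    # Combine both incentives into the single score the AI tries to maximize.
    return EXPLORE_WEIGHT * new_screens + LEVEL_WEIGHT * levels_gained

# Running from every battle: lots of new screens, no levels.
sightseeing = total_reward(new_screens=6, levels_gained=0)
# Staying to fight: fewer new screens, but the team levels up.
battling = total_reward(new_screens=2, levels_gained=2)
```

Here the battling run scores 10 against 6 for sightseeing, so an AI chasing the total has a reason to stop running from fights.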

Whidden’s next change was to introduce a punishment for losing battles. The AI was rushing into difficult battles and had yet to learn how to heal its Pokemon at a Pokemon Center. Unfortunately, the punishment backfired. The AI continued the same strategy but when it was about to lose a battle it would just stop pressing buttons to avoid triggering the punishment. Not the intended effect, so Whidden removed the punishment.
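A toy version of that loophole looks like this. The battle here is imaginary: it's lost on the third button press, and waiting never advances it. The penalty value is made up too.

```python
LOSS_PENALTY = -10.0  # hypothetical punishment for losing a battle

def battle_return(actions, presses_until_loss=3):
    # The toy battle only advances when a button is pressed.
    presses = 0
    for action in actions:
        if action == "press":
            presses += 1
            if presses == presses_until_loss:
                # The battle is lost and the punishment triggers.
                return LOSS_PENALTY
    # The episode ran out before the battle ended, so no punishment.
    return 0.0

keep_playing = battle_return(["press"] * 5)
stalling = battle_return(["press", "press", "wait", "wait", "wait"])
```

Stalling scores 0 and finishing the battle scores -10, so an AI that only cares about the number has every reason to freeze.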

Next, Whidden investigated why the AI wasn’t going to Pokemon Centers. The AI had tried depositing a Pokemon into the PC inside a Pokemon Center, which lowered the total level of the Pokemon in its party. The AI perceived this as a punishment, so it avoided Pokemon Centers entirely afterwards to avoid making the same mistake. Whidden changed how the AI got rewards for Pokemon levels so that it only received rewards and never accidental punishments. Then the AI was sent off once more. It finally learned how to use Pokemon Centers and heal up the injured little monsters.
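One way to get "rewards only, no accidental punishments" is to pay out only when the party's total level beats its previous best. This is my guess at the general shape of the fix, not Whidden's actual code, and the party levels are invented.

```python
def naive_level_rewards(party_history):
    # Reward is the change in total party level, which can go negative.
    rewards, previous = [], 0
    for party in party_history:
        total = sum(party)
        rewards.append(total - previous)
        previous = total
    return rewards

def best_so_far_level_rewards(party_history):
    # Reward only the amount by which the total beats the best total seen.
    rewards, best = [], 0
    for party in party_history:
        total = sum(party)
        rewards.append(max(0, total - best))
        best = max(best, total)
    return rewards

# Hypothetical run: start with a level 5, catch a level 3,
# deposit it in the PC, train up to level 6, then withdraw it again.
history = [[5], [5, 3], [5], [6], [6, 3]]
naive = naive_level_rewards(history)
fixed = best_so_far_level_rewards(history)
```

In the naive version the deposit shows up as a -3, which is exactly the accidental punishment. In the second version the deposit is worth 0, and the AI only gets paid again once the party passes its old best.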

The AI took 300 simulated days of training to discover how to use super effective moves. This allowed it to defeat Brock with Squirtle. The AI made its way to Mt. Moon, but got stuck due to the visually similar areas there. The boring brown tunnels weren’t unique enough to trigger the exploration reward, so the AI stopped. Whidden called off the test there as his original goal was defeating Brock.

Whidden does an excellent job explaining all of this with great visual representations. The last third of the video explains how to create your own AI to play Pokemon. If you’d like to give it a try, Whidden linked a GitHub page with a bunch of tools and code to get you started: https://github.com/PWhiddy/PokemonRedExperiments
And if you’ve enjoyed his content you can give him a thank you with some tuna melt money: https://buymeacoffee.com/peterwhidden

I’m Isaac

Welcome to the GoCorral website! I’m Isaac Shaker and this is a place for me to write about D&D and occasionally other topics. I host a podcast called Setting the Stage that interviews different DMs about their campaigns. I’m currently focused on completing the Cimmeria campaign setting and turning it into a book.
