January 27, 2019 6 min read

DeepMind Played 200 Years to Defeat StarCraft Pros

Google’s DeepMind artificial intelligence (AI) project has accepted a new challenge – defeating world-class players at StarCraft 2. With the score going 10-1 in favor of the AI, it was a true sight to behold. What made it all tick, though?

DeepMind’s AI Beats Team Liquid at StarCraft 2

We’ve all played versus agents (PCs as we used to call them, or simply AI) back in StarCraft. The agents weren’t exactly a challenge, especially for those of us who’d hunker down behind a line of supply depots, deploying several tanks behind our bunkers and patiently waiting for Battlecruisers.

AlphaStar’s agents had 200 years to train and hone their skills, starting at a player-level and gradually progressing to explore new strategies.

The game’s AI just wasn’t good enough to devise a successful strategy to penetrate the defenses, instead opting for an almost ritual sacrifice of scores of units brought to our front entrance.

Liquid’s TLO faces off with DeepMind’s AlphaStar.

However, Google’s DeepMind project in London has changed that, training an artificial intelligence that can best the world’s finest. Enter AlphaStar, the conqueror of the StarCraft 2 universe.

DeepMind trained dozens, if not hundreds, of AlphaStar agents, 11 of which ended up playing against Team Liquid’s Dario “TLO” Wünsch and Grzegorz “MaNa” Komincz.

What Happened in London and Why Team Liquid Lost to DeepMind?

Coming in to play first was TL’s Dario “TLO” Wünsch. Known for his upbeat character, Mr. Wünsch had little hesitation sitting behind the computer to face his future overlord.

Caption: TLO meets his maker.

Five games later, TLO was reduced to incredulous pulp. DeepMind won all five of its games against TLO in what appeared to be an effortless series of victories. TLO commented on the non-standard builds he had seen from the computer and on his difficulty adapting to its strategies.

He noted the human-like responses of the agent and the machine’s ability to outmanoeuvre his units throughout the entire game. But how good was the agent, in fact? First, AlphaStar was not “a machine”. It was machines. DeepMind’s team had created hundreds of iterations, pitting 11 of them against Team Liquid’s finest.

200 Years of Gaming versus 20 years of Gaming

Why is AlphaStar so good at playing StarCraft 2? You might have assumed an unfair advantage, and to a point, that’s probably true. Let’s have a look at what it actually took to prepare the computer. To prepare for TLO (or any professional, for that matter), DeepMind’s research team worked for almost a year. When they finally got around to training the AI for the challenge ahead, they gave it about a week to learn the game.

Having accessed “the binary code of the game,” DeepMind’s team could make StarCraft 2 progress at a quicker pace, allowing AlphaStar to play the equivalent of 200 years of StarCraft in a single week.
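As a rough check on that figure, the arithmetic below works out the speed-up implied by fitting 200 years of play into one week. It is only a back-of-the-envelope sketch: the 365.25-day year and the names used are assumptions for illustration, and the ratio says nothing about whether it was reached by running the game faster, by running many copies at once, or both.

```python
# Back-of-the-envelope: how much faster than real time is "200 years in one week"?
DAYS_PER_YEAR = 365.25   # assumption: average calendar year
GAME_YEARS = 200         # figure quoted in the article
TRAINING_DAYS = 7        # "a single week"


def speedup(game_years: float, wall_clock_days: float) -> float:
    """Ratio of experienced game time to elapsed real time."""
    return game_years * DAYS_PER_YEAR / wall_clock_days


ratio = speedup(GAME_YEARS, TRAINING_DAYS)
print(f"Roughly {ratio:,.0f}x real time")
```

Under these assumptions the ratio comes out on the order of ten thousand times real time, which gives a sense of how far a human’s 20 years of play lag behind.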

AlphaStar League model.

AlphaStar’s agents trained in a dedicated league against each other.

But how did AlphaStar train and learn the game? Simple – it watched replays first (which got it to a Platinum level) and then moved on to playing against itself in the so-called AlphaStar League.

The AlphaStar League started simple, with a single agent learning the game. Gradually, new agents were introduced until there were dozens (if not more) playing against each other and learning the ins and outs of the game.

In the end, DeepMind’s team pulled the 10 best agents and pitted them against Team Liquid’s own gamers. The agent that played the 11th and final game came later, after a new development.
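The league idea described above can be illustrated with a toy simulation. This is a loose sketch, not DeepMind’s training code: the real agents are neural networks that learn from each game, whereas here every agent is reduced to a single hypothetical skill number, game outcomes are drawn from an Elo-style curve, and all names (`run_league`, `a_wins`, the skill values) are assumptions made for the example.

```python
import random


def a_wins(skill_a, skill_b, rng):
    """Decide one game; the stronger agent wins more often (Elo-style curve)."""
    p_a = 1.0 / (1.0 + 10 ** ((skill_b - skill_a) / 400.0))
    return rng.random() < p_a


def run_league(new_agents=50, top_k=10, seed=0):
    rng = random.Random(seed)
    # Hypothetical starting point: one agent, e.g. fresh from watching replays.
    league = [{"name": "agent-0", "skill": 1000.0, "wins": 0, "games": 0}]
    for i in range(1, new_agents + 1):
        # A newcomer branches from an existing agent with slightly different skill.
        parent = rng.choice(league)
        newcomer = {"name": f"agent-{i}",
                    "skill": parent["skill"] + rng.gauss(20.0, 50.0),
                    "wins": 0, "games": 0}
        # The newcomer plays every agent already in the league.
        for other in league:
            winner = newcomer if a_wins(newcomer["skill"], other["skill"], rng) else other
            winner["wins"] += 1
            newcomer["games"] += 1
            other["games"] += 1
        league.append(newcomer)
    # Rank by observed win rate and keep the best few.
    league.sort(key=lambda a: a["wins"] / max(a["games"], 1), reverse=True)
    return league[:top_k]


best = run_league()
for agent in best:
    print(agent["name"], f'{agent["wins"] / agent["games"]:.2f}')
```

The point of the sketch is the shape of the process: the pool grows one agent at a time, everyone is measured against everyone else, and only the top performers are picked to face the professionals.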

Now Open Your Mouth and Let’s Have a Look at That Brain

AlphaStar’s cognitive process

Many have asked whether DeepMind was cheating. DeepMind’s David Silver, though, explained that in many ways AlphaStar was much like a human player, with each agent not outpacing its human counterpart by much, but rather making better micro-decisions.

For example, the agent’s actions per minute (APM) were no greater than a human player’s, meaning that the machine made no more actions than its opponent. That’s easy to understand, because AlphaStar only looked at the game when it had to assess a situation or execute something, with the rest of the time spent unaware, or so Mr. Silver explained.

Play when you have to.

AlphaStar evidently also perceives about as many screens per minute as a human player does, i.e. about 30. This was another important metric to gauge, as at one point AlphaStar was “microing” (commanding) units on three screens simultaneously.

Mr. Silver offered another interesting insight, explaining that AlphaStar’s cognitive process, meaning how long it took AlphaStar to look at the game and decide what was happening, took 350 milliseconds. That is still low compared with its human counterparts, he added.

One thing people questioned was how AlphaStar perceived the game as a whole and whether that gave it an advantage. As it turned out, AlphaStar had a zoomed-out point of view of the entire battlefield, displaying units’ health bars and so forth. Was this unfair edge enough to defeat MaNa?

MaNa Has Played Since He Was 5

Grzegorz “MaNa” Komincz came into the game quite confident. He knew little about his opponent (or rather, opponents, as he was later told) other than the fact that TLO had lost. MaNa remained confident, knowing that TLO was a Zerg player and not a dedicated Protoss player, but he also knew that if TLO had lost, he could lose as well.

AlphaStar subdues another professional.

MaNa’s own track record as a Protoss player was quite impressive, with a 90% win rate in mirror matches. So, he sat down for a challenge and five games later was left with zero wins. MaNa did offer his insight into the AI, explaining that the machine was good enough that he felt he was learning new strategies from it.

A Little Tweak Upsets the Game

Finally, MaNa sat down for an 11th game, after DeepMind had trained their agents for another week. This time, the developers wanted the AI to perceive the game like a human: not through a zoomed-out view of the battlefield, but through a focused, camera-by-camera approach.

Admittedly, the agent struggled to grasp the game anew (DeepMind’s team had restarted the learning process, using a brand-new AlphaStar League to teach the agents). Though it took longer, the AI eventually caught up to the previous iteration, Mr. Silver pointed out.

When it came to the final game, MaNa saw himself losing scores of probes to persistent harassment, but in the end, he managed to out-tech AlphaStar, destroying the agent in one rather effortless attack.

AlphaStar did indeed struggle at the end, but future iterations will most likely prove lethal to even the world’s SC 2 elite.

Lead Author

With 4 years’ experience as an analyst, Julie (or ‘Jewels’, as we aptly refer to her in the office) is nothing short of marvel-worthy in her attention to the forex and cryptocurrency space, and she quickly became the first pick to co-pilot education to the masses with Mike.
