Recently I heard that DeepMind has turned its attention towards making a StarCraft II bot in much the same way it made AlphaGo, the bot that proved capable of playing the game of Go at a very high level. The SC2 bot, AlphaStar, turned out to be really good as well. It beat two excellent pro players decisively, each in a series of five games. StarCraft is a series that is very dear to me. I first picked up the original game in the late 90s when it came out and have at times played at a fairly high level. A lot has been said about the games, and I’d like to add my perspective on the performance of AlphaStar.
How does AlphaStar work?
I’m not an expert in machine learning, and the details are quite dense, so if you want an actual technical explanation, check out their whitepaper if you’re up for it. Otherwise, enjoy my very naive attempt at understanding how this works.
The bot plays programmatically through a headless interface that’s pretty much like the one a human would use. The domain of possible actions is pretty daunting considering how many places on the screen there are to click. However, it probably simplifies a lot at higher levels of reasoning. If the idea is “move the Stalker away from the Marines”, the exact angle at which that happens is probably not very important, and you really only have maybe three or four sensible actions in that case. Still, it seems like a pretty difficult technical challenge to overcome.
For the higher level gameplay, they broke the game into a few simple challenges.
- Mine as many minerals as you can in a certain time
- Build as many units as you can in a certain time
- Win a battle against enemy units
- Find things on the map to kill
These things are essentially what you do while you play the game. They created “agents” that do these things with complicated neural networks with a lot of different parameters to tweak and selected the best ones through a process of training. Then I think they glued these things together and ran them all at the same time and basically got something that plays StarCraft. The mining part mines minerals, the building part builds units, the finding part finds enemies, and the winning part wins the battles. Repeat until you win or lose.
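To make the "select the best ones" step a little more concrete, here is a toy sketch in Python. It is only my mental model, not DeepMind's actual training code: the single `aggression` parameter, the `score` function, and the target value of 0.7 are all made up for illustration. A real agent has a huge number of parameters and is scored by actually playing the game.

```python
import random

def make_agent(rng):
    # Hypothetical "agent": one tunable parameter instead of a neural network.
    return {"aggression": rng.random()}

def score(agent):
    # Hypothetical stand-in for playing a mini-game and measuring the result.
    # Here, agents score higher the closer their parameter is to 0.7.
    return 1.0 - abs(agent["aggression"] - 0.7)

def select_best(population):
    # Keep the agent that performed best on the challenge.
    return max(population, key=score)

rng = random.Random(0)  # fixed seed so the run is repeatable
population = [make_agent(rng) for _ in range(20)]
best = select_best(population)
print(best, score(best))
```

The point is just the shape of the process: make a bunch of candidates with different parameters, measure each one on a simple challenge, and keep whichever does best.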
This is a really fascinating way to think about the game. It seems so obvious, but in reality humans are thinking about things in a completely different way. Humans start with very high level plans and then think about execution afterwards, sort of like a basketball play. “I am going to open with this build order, then try to do a heavy Immortal push, and if I see X then I’ll do Y”, etc. How does a person come up with a crazy idea like this? Who knows. Machines can’t seem to think this way though.
Whatever high level plans you think the machine is thinking of just seem to emerge out of the details. It’s sort of like when you see a V formation of birds in the sky. They don’t all get together and decide to fly in that formation. It’s just the easiest thing to do because flying like that cuts down on wind resistance, and any bird who doesn’t do it won’t be able to keep up. The machine looks at the details of the situation, estimates how likely each available action is to lead to a win (looking all the way to the end of the game), and then picks the action with the best chance of winning. You can really see this at work in the bot’s play style.
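Here is that last step as a tiny Python sketch, assuming the win estimates already exist. The action names and the numbers are invented for illustration. In the real system those estimates would come out of a trained neural network, and how exactly it turns them into an action is more involved than this.

```python
def pick_action(win_probability):
    """Return the action whose estimated chance of winning is highest."""
    return max(win_probability, key=win_probability.get)

# Hypothetical estimates for one moment in a game.
estimates = {
    "retreat_stalker": 0.61,
    "attack_marines": 0.48,
    "hold_position": 0.55,
}

chosen = pick_action(estimates)
print(chosen)  # retreat_stalker
```

Notice there is no plan anywhere in that code. Whatever looks like a plan is just this choice being made over and over, thousands of times per game.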
How does AlphaStar play?
With such a different approach to the game, AlphaStar naturally has come up with some completely new strategies for playing. A lot has been made in particular about two of its behaviors.
For one, AlphaStar will almost always overbuild workers in the beginning part of the game. It builds about 20 to a human’s 16. This is definitely the most practical result I’ve seen come out of the project because it’s something that a human being can easily copy and test to see if it works. This actually makes a lot of sense because it’s an easy way to counter all the early harass that Protoss has that usually picks off about two to four probes anyway. I expect this to become a new standard on ladder. It would be interesting if we still saw the behavior in matchups with less early worker harass pressure.
Another strange thing AlphaStar is doing is not building a wall at the base entrance, which is considered to be a best practice among human players. It’s difficult to interpret this, however. The purpose of the wall is partly to address the very human problem of a slow reaction to an Adept harass, but also to block an Adept shade from getting into the base to begin with by building a Pylon for a full block. It would be understandable if the Pylon block strategy did not emerge quickly, because it takes quite a bit of high level thinking. So I think people will continue walling off. The machine is after all not perfect.
There are some other strategies that AlphaStar notably does not use. For instance, it does not use Sentries to block ramps and it doesn’t drop. These might also be a bit too complicated to emerge in the limited time the agents were trained.
AlphaStar does however have a very entertaining play style. It micromanages its units perfectly in every situation, sometimes even at multiple locations at once. At one point, it executed a perfect three-pronged Stalker ambush in the middle of the map. Each group of Stalkers almost seemed to be controlled by separate players. Much of human play optimizes for the limited attention of the person, but a machine has no such restrictions. Each Stalker can move out of the way of fire at exactly the right moment to avoid destruction. Seeing the game played perfectly was truly amazing.
This point did, however, receive some criticism. If AlphaStar is trying to teach us how StarCraft should properly be played and the answer is “just have perfect mechanics”, then that is not very interesting. It’s sort of like how it’s pretty trivial to create a chess AI that can beat a human opponent with only 500ms on the clock. On every tick, AlphaStar has the human equivalent of hours of pondering for each small move.
While AlphaStar did put on a very impressive show, I still found the play style to be very cold and mechanical. I didn’t feel what was described by people who watched the AlphaGo games and thought that agent played in a human-like way. AlphaStar did do some really insane stuff. But it seemed to almost completely ignore the unit composition of its opponent, and most of its decisions seemed to be predicated entirely on the assumption of perfect control. For instance, massing Disruptors, as it did in one game against TLO, is a strategy you could not possibly use without perfect control. It’s almost as if it is playing a completely different game than we are. It plays under an entirely different set of constraints, one that the game is simply not balanced for. The same isn’t true in a game like Go, which does not reward reaction time.
In fact, a lot of StarCraft II mechanics are specifically designed for the fact that humans do have a limited number of things they can focus on. For instance, Queens do not auto-inject Hatcheries precisely because that would make Zerg imbalanced in the early game. Human-scale focus is baked into many of the mechanics of the game.
What can we learn from this?
My primary takeaway is that machines like this just do random dumb shit until they find something that works. I really like what this company is doing though because I think overall these sorts of things can have a positive impact on our culture. They sort of remind me of Boston Dynamics, a company that seems to be in the business of making random cool things for YouTube videos.
I hope this project can influence game designers to make better competitive games. It puts into focus what machines do well versus what humans do well. I think game designers should take a cue from this to maximize game design for human skills by rewarding high-level thinking and creative problem solving over mechanical mastery. Now that computers are better than us at StarCraft II, the challenge should be to create a game that humans will always be better at. There must be some kind of game like that, right? I’m not sure anymore actually.
At least we know if the Zerg are real and they ever invade Earth, we should have a chance to defeat them now.