Building an ML Detector for Zerg Rushes

I wanted to share some of the behind-the-scenes notes from my first use of ML while building the rush detector :hugs:

first up, here’s the actual signal cheat sheet I built before writing any code. this is what the bot is scanning for:

12 Pool signals:

  • No natural expansion by ~1:20 to 1:30
  • Spawning pool already done when your scout arrives
  • Super low drone count in main (~12 to 14 instead of ~20+ by 2:00)
  • No gas taken or gas not being mined
  • First 6 lings show up around ~1:50, hitting by ~2:15
  • No queen started after pool finishes (full ling commitment)

Speedling signals:

  • Early gas geyser taken with the pool
  • Natural hatch timing feels “off” or fake
  • Zergling speed finishing at ~2:30 to 2:45 vs ~3:30 standard
  • Gas mining stops around 100 gas if it’s a pure speedling flood
  • Early Baneling Nest around ~2:00 to 2:30 for ling bane variant
  • Low drone count and delayed queens compared to standard
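
For anyone wondering what "turning signals into a score" can look like, here is a minimal sketch for the 12 pool list. The `ScoutReport` fields, the point values, and the cutoffs are all made up for illustration; they are not the numbers the bot actually uses.

```python
from dataclasses import dataclass


@dataclass
class ScoutReport:
    """Hypothetical snapshot of what the scout saw (times in game seconds)."""
    game_time: float
    natural_started: bool
    pool_done: bool
    drone_count: int
    gas_taken: bool
    queen_started: bool


def twelve_pool_score(r: ScoutReport) -> int:
    """Add points for each 12 pool signal that is present (weights are illustrative)."""
    score = 0
    if not r.natural_started and r.game_time >= 90:  # no natural by ~1:30
        score += 3
    if r.pool_done:  # pool already finished when the scout arrives
        score += 3
    if r.drone_count <= 14:  # super low drone count
        score += 2
    if not r.gas_taken:  # no gas taken
        score += 1
    if r.pool_done and not r.queen_started:  # full ling commitment
        score += 1
    return score


rushy = ScoutReport(95, False, True, 13, False, False)
macro = ScoutReport(95, True, False, 19, True, False)
print(twelve_pool_score(rushy), twelve_pool_score(macro))
```

A score like this is what ends up as the `rule_score` style feature: one number that already carries most of the human reasoning.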

after training the ML model I looked at the coefficients, basically how much each signal pushes the model toward saying “yep that’s a rush.” and my hand written rule scores? they were doing almost all the work.

the 12 pool rule score had a coefficient of +1.519. huge. the raw timings like when the natural started? +0.065. not too much

so the ML wasn’t replacing what I built by hand. it was just calibrating it. smoothing out the messy edge cases where the rules weren’t confident.
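
If you want to read coefficients the same way, this is roughly how it looks with scikit-learn. It's a sketch on synthetic data, not the real replay data: the label here is generated from `rule_score` alone and `t_nat_started` is pure noise, so the printed numbers only show the mechanics.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 400

# Synthetic features (assumptions for the demo, not real game data).
rule_score = rng.integers(0, 11, size=n).astype(float)
t_nat_started = rng.uniform(30, 120, size=n)

# The label depends only on the rule score plus some noise.
y = (rule_score + rng.normal(0, 1.5, size=n) > 5).astype(int)

X = np.column_stack([rule_score, t_nat_started])
model = LogisticRegression(max_iter=1000).fit(X, y)

# coef_ has one row per class boundary; for a binary problem that is row 0.
for name, coef in zip(["rule_score", "t_nat_started"], model.coef_[0]):
    print(f"{name:>14}: {coef:+.3f}")
```

One thing to keep in mind when comparing: a coefficient is per unit of its feature, so a timing measured in seconds will naturally get a smaller number than a score that only runs from 0 to 10. Scaling the features first makes the comparison fairer.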

I used scikit-learn for the model. very simple logistic regression. here’s the actual training code, under 20 lines:

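import joblib
import pandas as pd
from sklearn.linear_model import LogisticRegression

df = pd.read_csv("rush_data.csv")

# Missing timings become -1, the same way the bot fills them at runtime.
X = df[["rule_score", "auto_true", "t_ling_seen", "pool_start_est", "t_nat_started", "gas_mined"]].fillna(-1)
y = df["was_rushed"]

model = LogisticRegression()
model.fit(X, y)

# Save the fitted model so the bot can load it later.
joblib.dump(model, "rush_model.joblib")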

and at runtime the bot uses it like this:

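import joblib

# Load the model once at startup rather than on every call.
model = joblib.load("rush_model.joblib")

def fill(value):
    # Encode a missing value as -1. A bare `value or -1` would also turn a
    # real 0 (e.g. zero gas mined) into -1, so check for None explicitly.
    # The training data has to use the same encoding.
    return -1 if value is None else value

def is_rush(rule_score, auto_true, t_ling_seen, pool_start_est, t_nat_started, gas_mined):
    # A decisive signal skips the model entirely.
    if auto_true:
        return True

    features = [[
        rule_score,
        auto_true,
        fill(t_ling_seen),
        fill(pool_start_est),
        fill(t_nat_started),
        fill(gas_mined),
    ]]

    # Column 1 is the second entry of model.classes_, i.e. "was rushed" for 0/1 labels.
    p_rush = model.predict_proba(features)[0][1]

    return p_rush >= 0.5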

the lesson here, and this one’s for anyone thinking about adding ML to their bot. your rules don’t have to be perfect. they just have to be good enough for the model to build on top of.

I only had 4 speedling games in my training data :sweat_smile: so the model thinks it knows speedlings but it really doesn’t. yet. more games, more labels, better model. that’s the grind.

if you’re working on detecting anything in your bot, cannon rushes, proxy rax, cheese builds, the pattern is honestly the same:

  1. figure out the signals humans use to spot it
  2. turn those into a point system
  3. collect game data
  4. let a simple model learn the weights for you

anyone try this with other cheeses?

Using Python, I learned raycasting to find the natural choke. I’m not really using machine learning; the solution I have come up with comes from my own experience. If it’s important to any of you, I have also reached M1+GM with Zerg.

To get the natural expansion wall choke, I raycast from the natural expansion point out towards the map center in a 100 degree radial sweep, up to a distance of 40. The idea is to find the direction out of our natural that reaches the longest distance from our choke entry, which almost guarantees that we find our natural exit direction. Once we have that vector leading out of the natural, I step forward along it by 7 so that I start closer to the natural exit choke and don’t catch any side gases that could interfere with the next part. On a side note, don’t forget that some maps have mineral patch walls between the natural and the third, so each step of the raycast has to check for nearby mineral patches in order to recognise those walls.

I then use a split radial sweep: one sweep 90 degrees to the left of the vector and a separate sweep 90 degrees to the right. Each sweep gives me its closest point, 1 from the left sweep and 1 from the right, which ensures you don’t get 2 points from the same side. These 2 points should be the corners of the natural choke that you will build your wall on.
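
For anyone who wants to try the sweep idea, here is a rough sketch on a plain boolean grid. It is one reading of the description above, not the bot's actual code: `pathable` is assumed to be a NumPy array indexed `[y, x]`, the ray step of 0.5 and the 41 rays per sweep are arbitrary choices, ties for the longest ray are broken toward the map center, and the left/right sweeps are taken to cover 0 to 90 degrees on each side of the exit vector. Mineral walls and height filtering are left out.

```python
import math
import numpy as np


def cast_ray(pathable, origin, angle, max_dist, step=0.5):
    """Walk along `angle` from `origin`; return (distance, hit_point).

    hit_point is None if nothing blocked the ray within max_dist.
    Leaving the grid counts as a hit.
    """
    dx, dy = math.cos(angle), math.sin(angle)
    dist = step
    while dist <= max_dist:
        x, y = origin[0] + dx * dist, origin[1] + dy * dist
        ix, iy = math.floor(x), math.floor(y)
        inside = 0 <= iy < pathable.shape[0] and 0 <= ix < pathable.shape[1]
        if not inside or not pathable[iy, ix]:
            return dist, (x, y)
        dist += step
    return max_dist, None


def sweep(pathable, origin, center_angle, arc_deg, max_dist, n_rays=41):
    """Cast n_rays spread evenly over arc_deg, centered on center_angle."""
    half = math.radians(arc_deg) / 2
    rays = []
    for i in range(n_rays):
        angle = center_angle - half + 2 * half * i / (n_rays - 1)
        dist, hit = cast_ray(pathable, origin, angle, max_dist)
        rays.append((angle, dist, hit))
    return rays


def find_choke_corners(pathable, natural, map_center):
    toward_center = math.atan2(map_center[1] - natural[1], map_center[0] - natural[0])

    # 1) 100 degree sweep up to distance 40: the longest ray is the exit direction.
    rays = sweep(pathable, natural, toward_center, 100, 40)
    exit_angle = max(rays, key=lambda r: (r[1], -abs(r[0] - toward_center)))[0]

    # 2) Step 7 forward along the exit direction.
    start = (natural[0] + 7 * math.cos(exit_angle),
             natural[1] + 7 * math.sin(exit_angle))

    # 3) One 90 degree sweep per side; keep the closest hit from each.
    corners = []
    for side in (1, -1):
        side_rays = sweep(pathable, start, exit_angle + side * math.radians(45), 90, 40)
        hits = [r for r in side_rays if r[2] is not None]
        if not hits:
            raise ValueError("no wall found on one side of the exit")
        corners.append(min(hits, key=lambda r: r[1])[2])
    return exit_angle, corners


# Toy map: a wall at x = 30..31 with a 6 tile gap at y = 17..22.
grid = np.ones((40, 60), dtype=bool)
grid[:, 30:32] = False
grid[17:23, 30:32] = True
exit_angle, corners = find_choke_corners(grid, natural=(15.0, 20.0), map_center=(50.0, 20.0))
print(exit_angle, corners)
```

On the toy map the two points land next to the two ends of the gap, one per side. The hit points are only as precise as the ray step and ray spacing, so a real version would likely refine them afterwards.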

Ramp vs flat ground choke

This one was an interesting challenge. This needed to be a branching point that required 3 separate calculations: flat ground choke, wide ramp choke, and narrow ramp choke.

Flat ground was simpler as the terrain seems to be pretty much the same (despite some weird terrain height dips causing issues). For the most part this is simple: draw the line between the 2 choke corner points, step in by 1, then snap to the half grid point. That gives the 0.5 corner point where your gateway/core becomes placeable while still connecting to the corner properly, so there is no side gap.
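
The flat ground step can be sketched in a few lines. The function names are hypothetical and the corner coordinates in the usage lines are made up; the only thing taken from the post is "step in by 1 along the corner-to-corner line, then snap to the half grid".

```python
import math


def snap_to_half_grid(p):
    """Snap a point to the nearest x.5 / y.5 position (the center of its tile)."""
    return (math.floor(p[0]) + 0.5, math.floor(p[1]) + 0.5)


def step_in_from_corner(corner, other_corner, step=1.0):
    """Move `step` from one choke corner toward the other, then snap."""
    dx, dy = other_corner[0] - corner[0], other_corner[1] - corner[1]
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("corners must be two different points")
    moved = (corner[0] + dx / length * step, corner[1] + dy / length * step)
    return snap_to_half_grid(moved)


top = step_in_from_corner((30.0, 23.0), (30.0, 17.0))
bottom = step_in_from_corner((30.0, 17.0), (30.0, 23.0))
print(top, bottom)
```

The snap matters because 3x3 buildings are centered on half grid coordinates, so an unsnapped point would not be a valid placement.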

The narrow ramp was interesting but seemed to be solved with enough refinement, and it works on the multiple ramp maps I have tested. Make sure to use height filtering on the raycast steps as well. The main consistency I found is that the top of the narrow ramp has shoulders of terrain which have to be compensated for. I compensated for this by reversing the vector from the midpoint so that it points outward, stepping diagonally outward by 1, then stepping towards the base by 1, perpendicular to the line between the choke wall points, and finally snapping to the half grid.

Wide ramp was similar to the flat ground choke layout but with the height filtering required. The simple way to tell a narrow ramp from a wide one is the corner point to corner point distance: under 7.5 is narrow, over 7.5 is wide. Then you can use the inward vector and step diagonally inward by 1 with snap to half grid etc.
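
The narrow vs wide check is small enough to write out. The 7.5 threshold is from the post; the function name is hypothetical, and treating a distance of exactly 7.5 as narrow is an arbitrary choice since the post doesn't say.

```python
import math


def classify_ramp(corner_a, corner_b, threshold=7.5):
    """Label a ramp by the distance between its two choke corner points."""
    return "wide" if math.dist(corner_a, corner_b) > threshold else "narrow"


print(classify_ramp((30.0, 17.0), (30.0, 23.0)))   # corners 6 apart
print(classify_ramp((30.0, 15.0), (30.0, 25.0)))   # corners 10 apart
```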

This is all before finally finding the center points of where we would actually want to build our 3x3 buildings. From those central 3x3 points we can then triangulate using Pythagoras’ theorem to make sure the pylon powers the center point of each 3x3 building while being placed as far back as possible.
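
Here is one way the Pythagoras step could look for two buildings. This is a guess at the geometry, not the bot's code: it assumes a pylon power radius of 6.5 measured to the building center, and it puts the pylon on the perpendicular bisector of the two centers, on the side facing a `base` point. With half the distance between the centers as one leg and the power radius as the hypotenuse, the other leg is how far back the pylon can go.

```python
import math

PYLON_POWER_RADIUS = 6.5  # assumed value; check it against your game data


def pylon_behind(a, b, base, radius=PYLON_POWER_RADIUS):
    """Farthest point toward `base` that is still within `radius` of both a and b."""
    mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
    half = math.dist(a, b) / 2
    if half == 0:
        raise ValueError("building centers must differ")
    if half > radius:
        raise ValueError("buildings are too far apart for one pylon")

    # Pythagoras: radius^2 = half^2 + back^2
    back = math.sqrt(radius ** 2 - half ** 2)

    # Unit vector perpendicular to the a -> b line.
    px, py = -(b[1] - a[1]) / (2 * half), (b[0] - a[0]) / (2 * half)
    # Flip it if it points away from the base.
    if (base[0] - mid[0]) * px + (base[1] - mid[1]) * py < 0:
        px, py = -px, -py
    return (mid[0] + px * back, mid[1] + py * back)


a, b = (30.5, 23.5), (30.5, 17.5)
pylon = pylon_behind(a, b, base=(15.0, 20.5))
print(pylon)
```

The result sits exactly on the edge of the power range, so in practice you would probably pull it in slightly and snap it to a valid 2x2 placement.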

As for my scouting method, I send my scout out at the very start of the game to scout as early as possible. This lets me choose between nexus first if my opponent is going macro and gateway first if they are going for pressure.

My indicator for 12 pool is workers + morphed buildings (excluding the main hatchery) < 16, which triggers once I have seen all 8 mineral fields in the enemy main at around 40-45 seconds.
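
Written as code, that indicator is roughly the following. The function and parameter names are hypothetical; the threshold of 16 and the 8 mineral fields are from the post. It returns `None` until the whole mineral line has been seen, so "not sure yet" stays separate from "no".

```python
def looks_like_12_pool(workers_seen, morphed_buildings, mineral_fields_seen,
                       total_mineral_fields=8, threshold=16):
    """morphed_buildings excludes the main hatchery."""
    if mineral_fields_seen < total_mineral_fields:
        return None  # haven't seen the full mineral line yet
    return workers_seen + morphed_buildings < threshold


print(looks_like_12_pool(12, 1, 8))
print(looks_like_12_pool(15, 1, 8))
print(looks_like_12_pool(10, 1, 5))
```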

I was initially thinking I could hold it with a 3 gate zealot flood at the wall, and mathematically 3 zealots beat 8 zerglings. The main issue I ran into is that when the zerglings cover the surface area of a gateway at the front of the wall, the zealot spawns behind the wall instead, which means my 3 zealots turn into 2, they lose the fight, and it snowballs from there.

I had to switch to reacting with a forge into cannon, as I’m also compensating for a 6 ling + drone pull flood, which would completely cover the surface area of the front of the wall. This requires a full 3x3 building wall-off to defend efficiently.

Arpy, my latest bot, is a work in progress Protoss bot that currently sometimes beats the cheater 3 insane AI with the follow ups. Yet to be uploaded to ladder.

First off this is freaking fantastic

Are you generating this on initialization or are you pre-generating all these and storing the coordinates?
Will this work with any map style at this point or are there any that you haven’t been able to get working?

I have been able to get this working on every current season AI Arena ladder map with the same code block, for both the side 3x3 buildings and the central 3x3 building. The central one can either completely seal the wall if we are playing against something like a 12 pool, or leave a 1 tile gap on one side so we have our standard Protoss wall setup.

This is being generated about 1 second into the match as it does require half a second of time for it to fully complete its data gathering process. After that we have all of our possible wall off building placements.

Everything is reactive. I even cancel the first pylon and then fall back to building the 3 pylon wall at the top of the main ramp to deny those sneaky worker rushes.

Logic is beautiful.