Picking Counter-Builds With Machine Learning - SharkyML & LightGBM

@sharknice walked through how he connected Shark Bot to an ML.NET gradient-boosted decision tree model (LightGBM) that picks counter-builds by predicted win probability. The ML-powered build selector replaces the default build rotation and needs only two builds and a few games of data to get started. We also covered data shaping for ML input, opponent-specific vs race-fallback data strategies, replay-based training potential, and the limits of ML when you simply don’t have a build that wins.

Key Takeaways:

  • :brain: You can add ML-based build selection to your bot with just two lines of code using SharkyML: activate the module and enable machine learning, and it replaces the default build rotation
  • :bar_chart: Every minute the system records unit counts (yours and the enemy’s), detected enemy strategies, and build info; at game end it saves these together with the game result as flat JSON for training
  • :deciduous_tree: LightGBM (Light Gradient Boosting Machine) is the gradient-boosted decision tree model used: it runs on CPU, takes ~10 seconds to retrain after a game, and scores builds in under a second at game start
  • :prohibited: When you’re on a losing streak, the system excludes the builds that lost during the streak, then scores the remaining builds against historical data to pick the best counter
  • :bullseye: You only need two builds to get started: @sharknice proved the concept with rock-paper-scissors style bots (4-gate, 1-base carrier, 2-base void ray) and got a 100% win rate on the second game after any loss
  • :hourglass_not_done: Stale data is a real problem: bots update their builds, so purge older games (@sharknice keeps ~10-20 per opponent) to avoid making decisions on outdated patterns
  • :crystal_ball: Future potential: feeding replay data (via tools like Phantom’s SC2 reader fork) could let you train on any bot’s games, not just your own — identifying which builds beat specific opponents across the entire ladder
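
The data-shaping and stale-data points above can be sketched roughly as follows. This is a minimal Python illustration, not SharkyML's actual code or schema: the function names, the field names (`own_…`, `enemy_…`, `strategy_…`), and the default of 20 games are all assumptions made for the example.

```python
import json
from collections import defaultdict


def flatten_snapshot(game_id, opponent_id, build, minute,
                     own_units, enemy_units, enemy_strategies, won):
    """Turn one per-minute snapshot into a single flat row (no nested objects).

    All field names here are hypothetical, chosen only for illustration.
    """
    row = {
        "game_id": game_id,
        "opponent_id": opponent_id,
        "build": build,
        "minute": minute,
        "won": int(won),  # label, known only once the game has ended
    }
    for unit, count in own_units.items():
        row[f"own_{unit}"] = count
    for unit, count in enemy_units.items():
        row[f"enemy_{unit}"] = count
    for strategy in enemy_strategies:
        row[f"strategy_{strategy}"] = 1  # one flag column per detected strategy
    return row


def keep_recent_games(games, max_per_opponent=20):
    """Keep only the newest games per opponent.

    Assumes `games` is ordered oldest to newest and each entry has an
    "opponent_id" key.
    """
    by_opponent = defaultdict(list)
    for game in games:
        by_opponent[game["opponent_id"]].append(game)
    kept = []
    for opponent_games in by_opponent.values():
        kept.extend(opponent_games[-max_per_opponent:])
    return kept


# Example: one snapshot at minute 3 of a lost game.
example_row = flatten_snapshot(
    "game-1", "botA", "FourGate", 3,
    own_units={"Zealot": 4, "Stalker": 2},
    enemy_units={"Zergling": 12},
    enemy_strategies=["ZerglingRush"],
    won=False,
)
print(json.dumps(example_row))
```

Flat rows like this map directly onto the tabular input a tree model expects. Columns missing from a given row (units never seen in that game) would presumably need to be filled with 0 before training.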

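The selection flow from the takeaways (prefer opponent-specific data, fall back to race data, drop builds from the current losing streak, score what remains) might look something like the Python sketch below. It is an illustration under stated assumptions, not the talk's implementation: the scorer is a simple smoothed win rate standing in for the LightGBM model, and the record fields, the `min_games` threshold, and the "fall back to all builds" rule are hypothetical.

```python
from collections import defaultdict


def history_for(opponent_id, race, games, min_games=1):
    """Use opponent-specific games if there are enough, else all games vs that race.

    `min_games` is an assumed threshold, not a value from the talk.
    """
    own = [g for g in games if g["opponent_id"] == opponent_id]
    if len(own) >= min_games:
        return own
    return [g for g in games if g["race"] == race]


def losing_streak_builds(history):
    """Builds used in the current run of consecutive losses (newest game last)."""
    excluded = set()
    for game in reversed(history):
        if game["won"]:
            break
        excluded.add(game["build"])
    return excluded


def win_rate_scorer(history, prior_wins=1, prior_games=2):
    """Stand-in scorer: smoothed historical win rate per build (NOT LightGBM)."""
    wins = defaultdict(int)
    played = defaultdict(int)
    for game in history:
        played[game["build"]] += 1
        wins[game["build"]] += int(game["won"])
    return lambda build: (wins[build] + prior_wins) / (played[build] + prior_games)


def pick_build(available_builds, history, score=None):
    """Exclude losing-streak builds, then return the highest-scoring remaining build."""
    score = score or win_rate_scorer(history)
    excluded = losing_streak_builds(history)
    candidates = [b for b in available_builds if b not in excluded]
    if not candidates:
        # Assumption: if every build is on the losing streak, consider them all again.
        candidates = list(available_builds)
    return max(candidates, key=score)


# Hypothetical game records, ordered oldest to newest.
games = [
    {"opponent_id": "botA", "race": "Zerg", "build": "FourGate", "won": True},
    {"opponent_id": "botA", "race": "Zerg", "build": "FourGate", "won": False},
    {"opponent_id": "botB", "race": "Zerg", "build": "TwoBaseVoidRay", "won": True},
]
builds = ["FourGate", "OneBaseCarrier", "TwoBaseVoidRay"]

known_opponent_pick = pick_build(builds, history_for("botA", "Zerg", games))
new_opponent_pick = pick_build(builds, history_for("botC", "Zerg", games))
print(known_opponent_pick, new_opponent_pick)
```

Passing a different `score` callable (for example, one that wraps a trained model's predicted win probability) keeps the exclusion logic unchanged while swapping out the scorer.
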
References: