
The Great Stockfish NPS Debate

dkappe edited this page Aug 26, 2021 · 10 revisions

Fat Fish

Are Big Nets Hurting Stockfish?

With ever-growing NNUE net sizes, the NPS (nodes per second) of Stockfish is going down. This has to be a bad thing, right? Before we try to answer that question, let’s put some actual numbers to this debate. Using the avx2 builds, which are generally the fastest official Stockfish builds, we captured the NPS reported by bench on an idle system.
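For anyone who wants to repeat the measurement, here is a minimal sketch of pulling the NPS figure out of bench output and comparing two builds. It assumes the summary contains a line of the form `Nodes/second    : N`, as recent Stockfish builds print; the engine path passed to `run_bench` and the numbers in `SAMPLE_OUTPUT` are made up for illustration and are not the measurements behind this page.

```python
import re
import subprocess

# Matches the summary line printed at the end of a bench run.
NPS_PATTERN = re.compile(r"Nodes/second\s*:\s*(\d+)")


def parse_bench_nps(output):
    """Return the NPS figure found in bench output, or raise if it is absent."""
    match = NPS_PATTERN.search(output)
    if match is None:
        raise ValueError("no 'Nodes/second' line found in bench output")
    return int(match.group(1))


def run_bench(engine_path):
    """Run `<engine> bench` and return the NPS it reports.

    engine_path is a hypothetical path to a local Stockfish binary.
    """
    result = subprocess.run(
        [engine_path, "bench"], capture_output=True, text=True, check=True
    )
    # The summary usually goes to stderr; search both streams to be safe.
    return parse_bench_nps(result.stdout + result.stderr)


def relative_drop(old_nps, new_nps):
    """Fraction of speed lost going from old_nps to new_nps (0.2 means 20%)."""
    return 1.0 - new_nps / old_nps


# Illustrative output only; these numbers are invented.
SAMPLE_OUTPUT = """\
===========================
Total time (ms) : 2000
Nodes searched  : 4000000
Nodes/second    : 2000000
"""

if __name__ == "__main__":
    nps = parse_bench_nps(SAMPLE_OUTPUT)
    print(f"parsed NPS: {nps}")
    print(f"drop vs. a 1.6M NPS build: {relative_drop(nps, 1_600_000):.0%}")
```

A single bench run is noisy, so averaging several runs per build on an otherwise idle machine gives a fairer comparison.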

[Figure: SF NPS]

As you can see, while the NPS has gone down, the drop is nowhere near the 40%-50% that some have been talking about. Still, from SF11 to SF14 it is a not inconsequential 20%. Won’t this offset the increased strength of the net?

Well, one reason that might not be so is the nature of alpha-beta searches. The better the move ordering, the fewer nodes a search has to visit to reach the same depth. It stands to reason that a better evaluation might yield a better move ordering. But without a doubt, it does take longer for SF14 to get to the same depth as SF11. Might there not be a horizon effect, where at some point the bigger net is missing moves that its weaker, quicker rivals see? Could the big nets be outsearched?
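The move-ordering point can be illustrated with a toy experiment. The sketch below is not Stockfish's search; it is a plain negamax alpha-beta over a small random tree, and it counts the nodes visited under three child orderings. The "best first" ordering cheats by peeking at the exact value of each child, which stands in for a perfect evaluation; all names here are made up for the example.

```python
import random


def build_tree(depth, branching, rng):
    """Uniform game tree as nested lists; leaves are integer scores
    from the point of view of the side to move at the leaf."""
    if depth == 0:
        return rng.randint(-100, 100)
    return [build_tree(depth - 1, branching, rng) for _ in range(branching)]


def negamax(node):
    """Exact value for the side to move, with no pruning."""
    if isinstance(node, int):
        return node
    return max(-negamax(child) for child in node)


def alphabeta(node, alpha, beta, order, counter):
    """Negamax alpha-beta; counter[0] counts every node visited."""
    counter[0] += 1
    if isinstance(node, int):
        return node
    best = float("-inf")
    for child in order(node):
        best = max(best, -alphabeta(child, -beta, -alpha, order, counter))
        alpha = max(alpha, best)
        if alpha >= beta:
            break  # cutoff: the remaining children cannot change the result
    return best


# A child with a low value (for the opponent) is a good move for the parent,
# so sorting ascending by the child's exact value puts the best move first.
ORDERINGS = {
    "as_given": lambda node: node,
    "best_first": lambda node: sorted(node, key=negamax),
    "worst_first": lambda node: sorted(node, key=negamax, reverse=True),
}


def compare_orderings(depth=4, branching=4, seed=0):
    """Return {ordering name: (root value, nodes visited)} for one random tree."""
    tree = build_tree(depth, branching, random.Random(seed))
    results = {}
    for name, order in ORDERINGS.items():
        counter = [0]
        value = alphabeta(tree, float("-inf"), float("inf"), order, counter)
        results[name] = (value, counter[0])
    return results


if __name__ == "__main__":
    for name, (value, nodes) in compare_orderings().items():
        print(f"{name:12s} value={value:4d} nodes={nodes}")
```

All three orderings return the same root value; only the number of nodes visited differs, which is why raw NPS alone says little about how deep an engine effectively searches.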

A Simple Test

If the newer Stockfish versions are made vulnerable by their lower NPS, then we should be able to test this hypothesis by playing them against an opponent, increasing the opponent’s NPS while holding all other things equal. Eventually, it stands to reason, the performance of the big nets should fall off a cliff as they succumb to the horizon effect.

Our opponent is the reasonably strong SF10, and our technique for increasing its NPS is to give SF10 more time. We start at 10”+0.1” (10 seconds plus a 0.1 second increment per move), then go to 2x, 3x and so on. We use one of Stefan Pohl’s anti-draw opening suites to accentuate the rating difference and run until we have single-digit error bars. You can see the results below:

[Figure: SF time trial]
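As a rough sketch of the bookkeeping behind a test like this, the code below builds the time-odds ladder and converts a win/draw/loss record into an Elo difference with an approximate 95% interval. Two assumptions are baked in: that the increment is scaled along with the base time, and that error bars come from the usual normal approximation on the per-game score. The tool that produced the chart may compute its error bars differently, and the function names are invented for this example.

```python
import math


def time_control(multiplier, base=10.0, increment=0.1):
    """Time control string 'base+increment' in seconds for a given time-odds
    multiplier. Assumes base time and increment are both scaled."""
    return f"{base * multiplier:g}+{increment * multiplier:g}"


def score_to_elo(score):
    """Convert an expected score in (0, 1) to an Elo difference (logistic model)."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_with_error(wins, draws, losses, z=1.96):
    """Return (elo, low, high) for a match result.

    Uses the sample variance of the per-game score (1, 0.5 or 0) and a normal
    approximation; z=1.96 corresponds to roughly a 95% interval.
    """
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / games
    stderr = math.sqrt(variance / games)
    # Clamp the interval so the logistic conversion stays defined.
    eps = 1e-9
    low = max(eps, score - z * stderr)
    high = min(1.0 - eps, score + z * stderr)
    return score_to_elo(score), score_to_elo(low), score_to_elo(high)


if __name__ == "__main__":
    for k in range(1, 6):
        print(f"{k}x time odds for SF10: {time_control(k)}")
    # Hypothetical match record, not a result from this page.
    elo, low, high = elo_with_error(wins=300, draws=500, losses=200)
    print(f"Elo {elo:+.1f} (95% interval {low:+.1f} to {high:+.1f})")
```

Because the standard error shrinks with the square root of the number of games, getting from double-digit to single-digit error bars takes several times as many games, which is part of why longer time controls are expensive to test.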

Do we see the slow searchers like SF14 falling off a cliff? No. Quite the contrary. It is the NPS champ, SF11, that falls off more steeply. Even SF12 hangs around the 0 point.

Now this is not a definitive test. One could extend the test further past 5x, though I don’t have the patience. A stronger, faster NNUE engine might rough up SF14 or the even bigger nets coming down the pike. If you know of such an engine, point it my way.