Another hope is that this document can serve as a reference for anyone who wants to implement any subset of such rules themselves, including a rigorous nearly-Japanese ruleset that bots can use for self-play or for competitive matches with no need for outside adjudication, dispute resolution, or any other protocol besides the bots making ordinary plays.

I believe the nearly-Japanese rules should correctly handle a wide variety of details, so long as both players play to rationally maximize their score. For example:

Parameters

Rules

Basic Definitions

Pseudolegal moves

Additional Definitions

Main Phase

Starting with an empty grid, the players alternate turns, starting with Black. A turn in the main phase is either a pass or a legal move.

(if ScoringRule is Territory)
The game is NOT ended or scored and instead continues with two cleanup phases (see "Cleanup Phases" section below).

Cleanup Phases

These phases only occur if ScoringRule is Territory.

Cleanup is designed to try to match most of the ways that positions would be ruled and scored under normal Japanese rules, so long as players self-interestedly maximize their score during cleanup. Broadly, this is done by giving players 1 point of compensation per move during the (second) cleanup phase, such that the players can now capture dead stones and resolve all disputes without loss of points for filling in territory.

A variety of details are also managed to implement other quirks of Japanese rules, including there-are-no-points-in-seki and the Japanese conception of each position as "independent", such that ko threats in one part of the board do not affect the status of the rest of the board. For example, a bent-four-in-the-corner will still resolve as dead under optimal play with these rules even if there are unremovable ko threats on the rest of the board. A lot of the mechanism to do this is based on the Japanese rules themselves, attempting to formalize their spirit to try to make them rigorous enough for self-play.

We do not aim for a 100% perfect match, however. For example, under this ruleset, three-points-without-capturing will (usually) entirely naturally be three points without capturing with no need for any special ruling, matching the traditional Japanese ruling (and in effect, justifying it). But the modern Japanese rules instead regard it as a seki, in which black must concede down to two points to get anything. More exotic kinds of positions will also differ between these rules and Japanese rules.

Cleanup Phase Basics and Definitions

A ko-move for a player in a position is any pseudolegal move M where the opponent would have a pseudolegal move in response, the ko-reply, that would result in exactly the grid coloring prior to M.
In addition to the grid coloring, points on the grid may be marked as ko-recapture-blocked.
The state during cleanup phases consists of the grid coloring together with the ko-recapture-blocked status of all points and the color of the player next to take a turn.
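As a minimal sketch of the ko-move definition above, the following brute-force check plays the move M, tries every opponent reply, and tests whether any reply restores the grid coloring prior to M. The grid representation and all helper names (`from_rows`, `play`, `is_ko_move`) are assumptions made for illustration. In particular, `play` uses a simplified notion of pseudolegal move (capture opposing regions left without liberties, reject suicide), which may differ from the full definition used by these rules.

```python
from typing import Dict, List, Optional, Set, Tuple

Point = Tuple[int, int]
Grid = Dict[Point, str]  # each point maps to "B", "W", or "." (empty)


def from_rows(rows: List[str]) -> Grid:
    """Build a grid from strings such as ".BW..", one string per row."""
    return {(r, c): ch for r, row in enumerate(rows) for c, ch in enumerate(row)}


def neighbors(grid: Grid, p: Point) -> List[Point]:
    r, c = p
    return [q for q in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)) if q in grid]


def region(grid: Grid, start: Point) -> Set[Point]:
    """Connected set of same-colored points containing `start`."""
    seen, stack = {start}, [start]
    while stack:
        for q in neighbors(grid, stack.pop()):
            if q not in seen and grid[q] == grid[start]:
                seen.add(q)
                stack.append(q)
    return seen


def has_liberty(grid: Grid, stones: Set[Point]) -> bool:
    return any(grid[q] == "." for p in stones for q in neighbors(grid, p))


def play(grid: Grid, point: Point, color: str) -> Optional[Grid]:
    """Simplified pseudolegal move (assumption): capture, and reject suicide."""
    if grid[point] != ".":
        return None
    after = dict(grid)
    after[point] = color
    opponent = "W" if color == "B" else "B"
    for q in neighbors(after, point):
        if after[q] == opponent:
            stones = region(after, q)
            if not has_liberty(after, stones):
                for s in stones:
                    after[s] = "."
    if not has_liberty(after, region(after, point)):
        return None  # the move would be suicide
    return after


def is_ko_move(grid: Grid, point: Point, color: str) -> bool:
    """True if some opponent reply to the move restores the prior grid coloring."""
    after = play(grid, point, color)
    if after is None:
        return False
    opponent = "W" if color == "B" else "B"
    empties = [q for q in after if after[q] == "."]
    return any(play(after, q, opponent) == grid for q in empties)
```

For example, with `board = from_rows([".BW..", "BW.W.", ".BW..", ".....", "....."])`, the Black capture at `(1, 2)` is a ko-move under this sketch, while a Black move at `(4, 4)` is not.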

Cleanup Phase Play

Cleanup lasts for two phases[6]. In each phase, starting with the grid coloring from the end of the previous phase, the players alternate turns, starting with the opponent of the player who took the last turn of the previous phase. A turn in the cleanup is either a pass, a legal move, or an unblock-ko-recapture action.[7]

A pass cedes the turn with no effect (but may possibly end the phase, as described below).
A legal move by a player during a cleanup phase is any pseudolegal move that either...
- Is NOT a ko-move.
- Is a ko-move that both...
  - Does NOT capture any region containing a point marked as ko-recapture-blocked.
  - AND where that player did NOT on any earlier turn during the same cleanup phase make a legal move on exactly the same point with exactly the same grid coloring.[8]
  Then, followed by marking the point colored by the move as ko-recapture-blocked.
Then, followed by unmarking all ko-recapture-blocked points whose grid color is empty.
An unblock-ko-recapture action consists of a player choosing a single-point region of the opposing color that is in atari and marked as ko-recapture-blocked, and removing that mark.
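The legality test and the marking updates above can be sketched as follows. This is a hedged illustration, not KataGo's implementation: the function names, the `GridKey` string encoding of a grid coloring, and the choice to compare the grid coloring as it stands before the move are all assumptions, and the caller is assumed to have already established that the move is pseudolegal and whether it is a ko-move.

```python
from typing import Dict, Set, Tuple

Point = Tuple[int, int]
GridKey = str  # hypothetical hashable encoding of a grid coloring


def cleanup_move_is_legal(
    is_ko_move: bool,
    captured: Set[Point],
    blocked: Set[Point],
    point: Point,
    grid_before: GridKey,
    prior_moves: Set[Tuple[Point, GridKey]],
) -> bool:
    """Apply the cleanup-phase rule to a move already known to be pseudolegal.

    `prior_moves` holds (point, grid coloring) pairs for legal moves this
    player made earlier in the same cleanup phase.
    """
    if not is_ko_move:
        return True
    if captured & blocked:
        return False  # would capture a ko-recapture-blocked point
    return (point, grid_before) not in prior_moves


def update_blocks(
    blocked: Set[Point],
    point: Point,
    is_ko_move: bool,
    grid_after: Dict[Point, str],
) -> Set[Point]:
    """Mark the played point after a ko-move, then unmark every empty point."""
    marked = set(blocked)
    if is_ko_move:
        marked.add(point)
    return {p for p in marked if grid_after[p] != "."}
```

Keeping the legality test separate from the marking update mirrors the order in the rule text: the marks are only changed once the move is known to be legal.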

Cleanup Phase Ending and Scoring

A cleanup phase ends after any of:

There are two consecutive passes.
OR a player passes from a state that the player has already passed from once before during the same phase.[9]
OR at the start of a player's turn, the current state has already occurred twice before since the most recent pass by either player during this phase. In this case, not only does the phase end, but the entire game immediately ends as well, with a result of "no result".
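The three ending conditions can be tracked with a small amount of bookkeeping, sketched below. The class and method names are hypothetical, and `state` is assumed to be any hashable value that combines the grid coloring, the ko-recapture-blocked marks, and the player to move. One interpretation is also assumed: the state reached immediately after a pass counts as the first occurrence "since the most recent pass".

```python
from collections import Counter
from typing import Hashable, Optional, Set


class CleanupPhaseTracker:
    """Tracks the ending conditions for one cleanup phase (illustrative sketch)."""

    def __init__(self) -> None:
        self.passed_from: Set[Hashable] = set()   # states some player passed from
        self.seen_since_pass: Counter = Counter()  # turn-start states since last pass
        self.last_was_pass = False

    def start_turn(self, state: Hashable) -> Optional[str]:
        """Call at the start of each turn; returns "no result" on a long cycle."""
        if self.seen_since_pass[state] >= 2:
            return "no result"
        self.seen_since_pass[state] += 1
        return None

    def record_pass(self, state: Hashable) -> bool:
        """Record a pass from `state`; returns True if the pass ends the phase."""
        ends = self.last_was_pass or state in self.passed_from
        self.passed_from.add(state)
        self.seen_since_pass.clear()
        self.last_was_pass = True
        return ends

    def record_non_pass(self) -> None:
        """Record a legal move or an unblock-ko-recapture action."""
        self.last_was_pass = False
```

Because the state includes the player to move, a single set of passed-from states is enough to express "the player has already passed from this state": a state with Black to move can only have been passed from by Black.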

After the first cleanup phase ends, the second cleanup phase begins immediately with the same grid coloring but with all ko-recapture-blocks unmarked.

After the second cleanup phase ends, the game ends and is scored as follows:

(if SelfPlayOpts is Enabled): Before scoring, for each color, empty all points of that color within pass-alive-territory of the opposing color. Points emptied this way also add to the total number of captures of that point's color.
(if TaxRule is None): A player's score is the sum of:
- +1 for every point in empty regions bordered by their color and not by the opposing color.
- + The total number of captures of the opposing color.
- +1 for every move made by that player during the second cleanup phase.
- -1 for every point of their color not within independent-life-regions and that was not their color at the start of the second cleanup phase.
- If the player is white, Komi.
- (if Button is Used): +0.5 if this player was the first to pass during the main phase.
(if TaxRule is Seki): A player's score is the sum of:
- +1 for every empty point within independent-life-regions of their color.
- + The total number of captures of the opposing color.
- +1 for every move made by that player during the second cleanup phase.
- -1 for every point of their color not within independent-life-regions and that was not their color at the start of the second cleanup phase.
- If the player is white, Komi.
- (if Button is Used): +0.5 if this player was the first to pass during the main phase.
(if TaxRule is All): A player's score is the sum of:
- +1 for every empty point within independent-life-regions of their color.
- + The total number of captures of the opposing color.
- +1 for every move made by that player during the second cleanup phase.
- -1 for every point of their color not within independent-life-regions and that was not their color at the start of the second cleanup phase.
- -2 points for every independent-life-region of their color.
- If the player is white, Komi.
- (if Button is Used): +0.5 if this player was the first to pass during the main phase.
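The three scoring variants above differ only in which empty points are counted and in the extra group tax, which the following sketch makes explicit. It assumes all of the counts have already been computed from the final position; the `ScoreInputs` fields and the `territory_score` name are hypothetical, and the SelfPlayOpts preprocessing step is not included.

```python
from dataclasses import dataclass


@dataclass
class ScoreInputs:
    sole_bordered_empty_points: int           # used when TaxRule is None
    empty_points_in_independent_life: int     # used when TaxRule is Seki or All
    captures_of_opponent: int
    moves_in_second_cleanup: int
    new_stones_outside_independent_life: int  # own-color points not in independent-life-regions
                                              # and not own-color at the start of the second cleanup phase
    independent_life_regions: int             # used when TaxRule is All
    is_white: bool
    first_to_pass_in_main_phase: bool


def territory_score(s: ScoreInputs, tax_rule: str, komi: float, button_used: bool) -> float:
    """Sum the listed terms for one player under the given TaxRule."""
    if tax_rule == "None":
        score = float(s.sole_bordered_empty_points)
    elif tax_rule in ("Seki", "All"):
        score = float(s.empty_points_in_independent_life)
    else:
        raise ValueError(f"unknown TaxRule: {tax_rule}")
    score += s.captures_of_opponent
    score += s.moves_in_second_cleanup
    score -= s.new_stones_outside_independent_life
    if tax_rule == "All":
        score -= 2 * s.independent_life_regions
    if s.is_white:
        score += komi
    if button_used and s.first_to_pass_in_main_phase:
        score += 0.5
    return score
```

Calling `territory_score` once per player and comparing the two results gives the winner, with equal scores being a draw.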

See [10] for some remarks about the scoring.
Although handicap games are not a focus of these rules, see [11] for some notes about handicap game scoring.

The player with the higher score wins, or the game is a draw if equal score.

For computer AI training, the following equivalent formulation for a player's score could also be used if desired. This formulation is much more similar to area scoring, in that it factors over the board as simply a sum of +1/0/-1 for each point on the board, and moves within independent-life-regions by either player do not affect this "ownership" sum whatsoever (so long as dead stones are cleaned up and borders and dame are finished).

(if TaxRule is None): A player's score is the sum of:
- -1 for every move made by that player in the main phase OR first cleanup phase.
- +1 for every point in empty regions bordered by their color and not by the opposing color.
- +1 for every point of their color that is within independent-life-regions OR that was their color at the start of the second cleanup phase.
- If the player is white, Komi.
- (if Button is Used): +0.5 if this player was the first to pass during the main phase.
(if TaxRule is Seki): A player's score is the sum of:
- -1 for every move made by that player in the main phase OR first cleanup phase.
- +1 for every empty point within independent-life-regions of their color.
- +1 for every point of their color that is within independent-life-regions OR that was their color at the start of the second cleanup phase.
- If the player is white, Komi.
- (if Button is Used): +0.5 if this player was the first to pass during the main phase.
(if TaxRule is All): A player's score is the sum of:
- -1 for every move made by that player in the main phase OR first cleanup phase.
- +1 for every empty point within independent-life-regions of their color.
- +1 for every point of their color that is within independent-life-regions OR that was their color at the start of the second cleanup phase.
- -2 points for every independent-life-region of their color.
- If the player is white, Komi.
- (if Button is Used): +0.5 if this player was the first to pass during the main phase.
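To illustrate how this formulation factors over the board, here is a sketch of the TaxRule None variant written as a per-point sum. The `PointInfo` record and both function names are hypothetical, and the sketch assumes the caller has already determined, for each point, whether it lies within an independent-life-region and which single color (if any) borders its empty region.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class PointInfo:
    color: str                            # "B", "W", or "." (empty)
    color_at_second_cleanup_start: str    # "B", "W", or "."
    in_independent_life_region: bool
    sole_border_color: Optional[str]      # for empty points: the only bordering color, else None


def point_value(p: PointInfo, player: str) -> int:
    """+1 if this point counts for `player` under TaxRule None, else 0."""
    if p.color == ".":
        return 1 if p.sole_border_color == player else 0
    if p.color != player:
        return 0
    kept = p.in_independent_life_region or p.color_at_second_cleanup_start == player
    return 1 if kept else 0


def area_like_score(
    points: Iterable[PointInfo],
    player: str,
    moves_main_and_first_cleanup: int,
    komi: float = 0.0,
    button_bonus: float = 0.0,
) -> float:
    """Board sum minus moves made before the second cleanup phase, plus komi and button."""
    board_sum = sum(point_value(p, player) for p in points)
    return board_sum - moves_main_and_first_cleanup + komi + button_bonus
```

Subtracting one player's `point_value` from the other's gives the +1/0/-1 ownership of each point described above.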

[1] The intent is that "independent-life-regions" indicate regions that are not seki, so long as both players finish all borders and fill all dame. This is motivated by the way Japanese rules attempt to define "seki" using dame. Using the presence of dame to determine seki is actually a pretty clever solution: my original idea had only been to use ability-to-make-pass-alive-ness, but this is considerably more awkward in practice than using dame.

We also include the condition of "atari" to handle groups that have no dame but still survive without two eyes by virtue of having ko mouths. This handles double ko seki. (Back)

[2] Pass-aliveness can be computed by a straightforward algorithm: https://en.wikipedia.org/wiki/Benson%27s_algorithm_(Go). Note that a slight adjustment to the algorithm presented is technically needed if multi-stone suicide is allowed. (Back)
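As a rough sketch of the algorithm referenced in [2], the following function computes the pass-alive chains of one color on a square board given as a list of strings. It follows the usual statement of Benson's algorithm (repeatedly discard chains with fewer than two vital regions, and regions that border a discarded chain) and does NOT include the adjustment for multi-stone suicide mentioned above. The function names and board encoding are assumptions for illustration only.

```python
from typing import FrozenSet, List, Set, Tuple

Point = Tuple[int, int]


def neighbors(p: Point, size: int) -> List[Point]:
    r, c = p
    cand = ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1))
    return [(y, x) for y, x in cand if 0 <= y < size and 0 <= x < size]


def components(points: Set[Point], size: int) -> List[FrozenSet[Point]]:
    """Split a set of points into orthogonally connected components."""
    seen: Set[Point] = set()
    out: List[FrozenSet[Point]] = []
    for start in points:
        if start in seen:
            continue
        comp, stack = {start}, [start]
        while stack:
            for q in neighbors(stack.pop(), size):
                if q in points and q not in comp:
                    comp.add(q)
                    stack.append(q)
        seen |= comp
        out.append(frozenset(comp))
    return out


def pass_alive_chains(rows: List[str], color: str) -> Set[FrozenSet[Point]]:
    """Return the chains of `color` that survive Benson's elimination."""
    size = len(rows)
    every = {(r, c) for r in range(size) for c in range(size)}
    own = {p for p in every if rows[p[0]][p[1]] == color}
    chains = set(components(own, size))
    regions = set(components(every - own, size))  # empty and opposing points together

    def vital(region: FrozenSet[Point], chain: FrozenSet[Point]) -> bool:
        # Vital: every empty point of the region is a liberty of the chain.
        empties = [p for p in region if rows[p[0]][p[1]] == "."]
        return bool(empties) and all(
            any(q in chain for q in neighbors(p, size)) for p in empties
        )

    def bordering_stones(region: FrozenSet[Point]) -> Set[Point]:
        return {q for p in region for q in neighbors(p, size) if q in own}

    while True:
        alive_stones: Set[Point] = set().union(*chains)
        kept_regions = {reg for reg in regions if bordering_stones(reg) <= alive_stones}
        kept_chains = {
            ch for ch in chains if sum(vital(reg, ch) for reg in kept_regions) >= 2
        }
        if kept_regions == regions and kept_chains == chains:
            return chains
        regions, chains = kept_regions, kept_chains
```

For instance, on the 5x5 board `[".B.B.", "BBBB.", ".....", ".....", "....."]` the single Black chain has two one-point eyes and is reported as pass-alive, while the one-eyed shape `[".B...", "BB...", ".....", ".....", "....."]` is not.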

[3] Under this definition, it is possible that a region with one completely interior point is pass-alive-territory but the addition of a single stone on that interior point results in the region no longer being considered pass-alive-territory because the single stone is not a pass-alive-group. We ignore this minor "flaw" since it makes for a simpler definition and algorithmic implementation. (Back)

[4] This Spight-style termination condition ensures that sending-two-returning-one-like positions will terminate, even under area scoring where the cycle does not "cost" points. It also cuts it shorter under territory scoring, so that a badly behaving bot doesn't lose by ~infinity.

The approach taken by many Chinese tournaments is to simply prohibit sending-two-returning-one. (Chinese written rules appear to say positional superko, but this written rule is often not used for real tournaments.) This would also be easy to implement and KataGo could easily choose to support it in the future, since for all practical purposes a neural net trained under simple ko rules should work fine without modification in an engine that bans sending-two-returning-one.

However, there is also sending-three-returning-one, and perhaps there are other messy cases too that one would imagine professional players balking at allowing despite not having formally listed and prohibited them ahead of time in written rules. Spight's condition is a much cleaner way to handle them for now.

Some computer tournament rules handle this by simply declaring long cycles to be draws/wins/losses depending on number of stones captured. It would be trivial to support these too in the future if needed, and doing so probably would not in practice require retraining a neural net either. But for now, no actual human rulesets use this rule, and even in the computer world, positional or situational superko are often more popular. (Back)

[5] Under some real-life human rules, an unbounded cycle would not end the game in and of itself at exactly such a point, rather the game may be manually adjudicated as a no-result. But our goal here is to get a formalization of Japanese-like ko rules for computer self-play, so dictating a precise ending point is necessary. The requirement for no intervening passes makes absolutely sure that we do not no-resultify sending-two-returning-one style positions, even with weird unforeseen move orderings. (Back)

[6] Why have two phases instead of just one?

The intent is that the first phase introduces changes to the ko rules alone, allowing any positions destabilized by it to settle down. Then, the second phase additionally introduces a +1 point per move that allows players to actually begin capturing dead stones without loss of points. If both changes were introduced at once, in some cases, this leads to a highly non-intuitive "pass fight" that is absent from true Japanese rules. This can occur if a protective move becomes necessary once the ko rules change - then we may see players exchange ko threats to try to be not the second to pass and therefore to be first to play in cleanup, since being first to play in cleanup would enable making the protective move with +1 point instead of with +0 points.

Introducing the ko rule and score bonus changes in separate phases eliminates this issue. (Back)

[7] The unblock-ko-recapture action is effectively the Japanese rules' "pass for ko". We name it this way to avoid calling it a pass, since it shares little else in common with a pass with regard to the rules necessary to make cleanup work. Also, highly conveniently, an unblock-ko-recapture for a ko-move location is always mutually exclusive with a legal move for that location, which means we have no need to change the protocol for GTP or introduce new move encodings. We can continue to use the exact same 19x19 + 1 encodings in all existing protocols to represent moves. (Back)
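The encoding remark in [7] can be illustrated with a small decoder. This is a sketch under assumptions: the row-major index layout with the pass at index 361, the function names, and the idea of passing in precomputed sets of legal-move and unblockable locations are all hypothetical, and which location is used to encode an unblock action is left to the caller.

```python
from typing import Optional, Set, Tuple

BOARD_SIZE = 19
PASS_INDEX = BOARD_SIZE * BOARD_SIZE  # 361: the "+ 1" slot (assumed layout)

Point = Tuple[int, int]


def encode_point(row: int, col: int) -> int:
    """Row-major index of a board point (an assumed layout)."""
    return row * BOARD_SIZE + col


def decode(index: int) -> Optional[Point]:
    """Return the (row, col) for a point index, or None for the pass index."""
    if index == PASS_INDEX:
        return None
    return divmod(index, BOARD_SIZE)


def interpret(
    index: int, legal_moves: Set[Point], unblockable: Set[Point]
) -> Tuple[str, Optional[Point]]:
    """Resolve one encoded action to a pass, a legal move, or an unblock action."""
    # Relies on the two kinds of action never sharing a location.
    assert not (legal_moves & unblockable)
    point = decode(index)
    if point is None:
        return ("pass", None)
    if point in legal_moves:
        return ("move", point)
    if point in unblockable:
        return ("unblock", point)
    return ("illegal", point)
```

Because the two sets never overlap, a single point index is enough to identify the intended action without any extra flag.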

[8] This condition prevents a double ko seki from looping forever in the cleanup phase, at least in the simplest cases, in theory. It must depend on the exact grid coloring rather than be a general prohibition on continuing to unblock and recapture a ko or kos over and over because if the seki is temporary such that one side can capture a surrounding group to collapse it, we must make sure capturing into the ko is not prohibited at that point.

Unfortunately, as stated, this rule still allows quite a large amount of game-prolonging due to double-ko-seki, which makes it not ideal for selfplay. Is there a better formulation that is still clean to state and implement, that limits the ability of the attacker to fruitlessly cycle the double-ko-seki? (Back)

[9] This Spight-style termination condition ensures that sending-two-returning-one-type positions will terminate, even during the second cleanup phase when the cycle no longer "costs" points. (Back)

[10] The "color at the start of the second cleanup phase" condition prevents one-sided dame from granting points to the side able to fill the dame.

We go ahead and have an allowance for Button Go here too. This may seem odd, since normally the intent is as a way to obtain territory-scoring granularity with area-scoring, so if already using territory-scoring, why would one want such a rule? But at least when we came to implement it in KataGo, it seemed programmatically no more complex (simpler, even) to just have it as an option always, and formulating it this way makes for more natural extension to Coupon Go if desired, which *does* make sense in territory scoring. KataGo for now does NOT actually support territory + button though. (Back)

[11] A number of human rulesets allow for Black to begin with N >= 2 stones on the board when playing a game between two differently-skilled players, with these stones being placed in a ruleset-specified way, and with White making the first actual game move instead of Black. For various reasons, during handicap games many of these rulesets give White bonus points based on N, additionally on top of any komi or other settings. Some rulesets give 0 points, some give N-1 points, and some give N points. (Back)
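The three handicap conventions mentioned in [11] amount to a one-line choice, sketched here. The function name and the convention labels `"zero"`, `"n_minus_1"`, and `"n"` are hypothetical and do not correspond to any particular ruleset's or engine's option names; returning 0 when N is below 2 is also an assumption.

```python
def white_handicap_bonus(handicap_stones: int, convention: str) -> int:
    """Bonus points for White given N handicap stones, under one of three conventions."""
    if handicap_stones < 2:
        return 0  # assumption: fewer than 2 stones is not treated as a handicap game
    if convention == "zero":
        return 0
    if convention == "n_minus_1":
        return handicap_stones - 1
    if convention == "n":
        return handicap_stones
    raise ValueError(f"unknown convention: {convention}")
```

The bonus would be added to White's score on top of komi and any other settings.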

KataGo's Supported Go Rules (Version 2)