The strategy for solving a Sudoku puzzle may be regarded as
comprising a combination of three processes: scanning, marking up,
and analysing.
Sudoku Scanning
Scanning is performed at the outset and periodically throughout the
Sudoku solution. Scans may have to be performed several times in
between analysis periods. Scanning consists of two basic techniques:
1) Crosshatching: the scanning of rows (or columns) to
identify which line in a particular region may contain a certain
number by a process of elimination. This process is then repeated
with the columns (or rows). For fastest results, the numbers are
scanned in order of their frequency. It is important to perform this
process systematically, checking all of the digits 1–9.
2) Counting 1–9 in regions, rows, and columns to identify missing
numbers. Counting based upon the last number discovered may
speed up the search. It also can be the case (typically in tougher
puzzles) that the value of an individual cell can be determined by
counting in reverse—that is, scanning its region, row, and column
for values it cannot be to see which is left.
Advanced solvers look for "contingencies" while scanning—that is,
narrowing a number's location within a row, column, or region to two
or three cells. When those cells all lie within the same row (or
column) and region, they can be used for elimination purposes during
crosshatching and counting (Contingency example at Puzzle Japan).
Particularly challenging puzzles may require multiple contingencies
to be recognized, perhaps in multiple directions or even
intersecting—relegating most solvers to marking up (as described
below). Puzzles which can be solved by scanning alone without
requiring the detection of contingencies are classified as "easy"
puzzles; more difficult puzzles, by definition, cannot be solved by
basic scanning alone.
Marking up a Sudoku
Scanning comes to a halt when no further numbers can be discovered.
From this point, it is necessary to engage in some logical analysis.
Many find it useful to guide this analysis by marking candidate
numbers in the blank cells. There are two popular notations:
subscripts and dots.
In the subscript notation the candidate numbers are written in
subscript in the cells. The drawback to this is that original
puzzles printed in a newspaper usually are too small to accommodate
more than a few digits of normal handwriting. If using the subscript
notation, solvers often create a larger copy of the puzzle or employ
a sharp or mechanical pencil.
The second notation is a pattern of dots with a dot in the top left
hand corner representing a 1 and a dot in the bottom right hand
corner representing a 9. The dot notation has the advantage that it
can be used on the original puzzle. Dexterity is required in placing
the dots, since misplaced dots or inadvertent marks inevitably lead
to confusion and may not be easy to erase without adding to the
confusion. Using a pencil would then be recommended.
An alternative technique that some find easier is to mark up those
numbers that a cell cannot be. Thus a cell will start empty and as
more constraints become known it will slowly fill. When only one
marking is missing, that has to be the value of the cell.
Sudoku Analysis
The two main approaches to analysis are "candidate elimination" and
"whatif".
In elimination, progress is made by successively eliminating
candidate numbers from one or more cells to leave just one choice.
After each answer has been achieved, another scan may be performed—usually
checking to see the effect of the latest number. There are a number
of elimination tactics, all of which are based on the simple rules
given above, which have important and useful corollaries, including:
1) A given set of n cells in any particular block, row, or column
can only accommodate n different numbers. This is the basis for the
"unmatched candidate deletion" technique, discussed below.
2) Each set of candidate numbers, 1–9, must ultimately be in an
independently selfconsistent pattern. This is the basis for
advanced analysis techniques that require inspection of the entire
set of possibilities for a given candidate number. Only certain "closed
circuit" or "n×n grid" possibilities exist (which have acquired
peculiar names such as "Xwing" and "Swordfish", among others; see
List of Sudoku terms and jargon for more information). If these
patterns can be identified, elimination of candidate possibilities
external to the grid framework can sometimes be achieved.
One of the most common elimination tactics is "unmatched
candidate deletion". Cells with identical sets of candidate
numbers are said to be matched if the quantity of candidate numbers
in each is equal to the number of cells containing them; essentially,
these are perfectly coincident contingencies. For example, cells are
said to be matched within a particular row, column, or region (scope)
if two cells contain the same pair of candidate numbers (p,q) and no
others, or if three cells contain the same triplet of candidate
numbers (p,q,r) and no others. The placement of these numbers
anywhere else in the matching scope would make a solution for the
matched cells impossible; thus, the candidate numbers (p,q,r)
appearing in unmatched cells in the row, column or region scope can
be deleted. This principle also works with candidate number subsets—if
three cells have candidates (p,q,r), (p,q), and (q,r) or even just (p,r),
(q,r), and (p,q), all of the set (p,q,r) elsewhere in the scope can
be deleted. The principle is true for all quantities of candidate
numbers.
A second related principle is also true — if each cell within a set of
cells (in a row, column or region scope) contains the same set of
candidate numbers, and if the number of cells is equal to the
quantity of candidate numbers, the cells and numbers are matched and
only those numbers can appear in matched cells. Other candidates in
the matched cells can be eliminated. For example, if (p,q) can only
appear in 2 cells (within a specific row, column, region scope),
other candidates in the 2 cells can be eliminated.
The first principle is based on cells where only matched numbers
appear. The second is based on numbers that appear only in matched
cells. The validity of either principle is demonstrated by posing
the question 'Would entering the eliminated number prevent
completion of the other necessary placements?' Advanced techniques
carry these concepts further to include multiple rows, columns, and
blocks. (See "Xwing" and "Swordfish" above.)
In the whatif approach, a cell with only two candidate numbers is
selected, and a guess is made. The steps above are repeated unless a
duplication is found or a cell is left with no possible candidate,
in which case the alternative candidate is the solution. In logical
terms, this is known as reductio ad absurdum. Nishio is a limited
form of this approach: for each candidate for a cell, the question
is posed: will entering a particular number prevent completion of
the other placements of that number? If the answer is yes, then that
candidate can be eliminated. The whatif approach requires a pencil
and eraser. This approach may be frowned on by logical purists as
trial and error (and most published puzzles are built to ensure that
it will never be necessary to resort to this tactic,) but it can
arrive at solutions fairly rapidly.
Ideally one needs to find a combination of techniques which avoids
some of the drawbacks of the above elements. The counting of
regions, rows, and columns can feel boring. Writing candidate
numbers into empty cells can be timeconsuming. The whatif approach
can be confusing unless you are well organised. The proverbial Holy
Grail is to find a technique which minimises counting, marking up,
and rubbing out.
