Sudoku Programmers Forum Index

 
 FAQFAQ   SearchSearch   MemberlistMemberlist   UsergroupsUsergroups   RegisterRegister   ProfileProfile   Log inLog in          Games  Calendar

Log in to check your private messagesLog in to check your private messages   

Canonical sudoku algorithm?
Goto page Previous  1, 2, 3, 4, 5, 6  Next
 
Post new topic   Reply to topic    Sudoku Programmers Forum Index -> Setting sudoku
View previous topic :: View next topic  
Author Message
vidarino

Joined: 10 Feb 2006
Posts: 38
:
Location: Haugesund, Norway

Items
PostPosted: Mon Mar 27, 2006 8:48 pm    Post subject: Reply with quote

Lummox JR wrote:
Daj, my interest is in a canonical puzzle form, not canonical solutions. That is, to detemine if two puzzles are equivalent. Two non-equivalent puzzles--with the same canonical solution or without--will almost definitely need different solve methods. However two equivalent puzzles should require the exact same methods.


In that case, finding the solution grids for the two puzzles in question and then canonicalize them is enough. Repeating the respective transformations for the starting puzzles will give you the answer; if the transformed puzzles are identical, so are their solving paths, regardless of how they looked before. If the transformed puzzles differ in any way, you have two essentially different puzzles.
Back to top
View user's profile Send private message Visit poster's website
Lummox JR

Joined: 07 Sep 2005
Posts: 202
:

Items
PostPosted: Mon Mar 27, 2006 10:03 pm    Post subject: Reply with quote

vidarino wrote:
In that case, finding the solution grids for the two puzzles in question and then canonicalize them is enough. Repeating the respective transformations for the starting puzzles will give you the answer; if the transformed puzzles are identical, so are their solving paths, regardless of how they looked before. If the transformed puzzles differ in any way, you have two essentially different puzzles.

Well this is the part I'm not clear on. I don't get how canonicalizing the solution guarantees the same for the puzzle, unless there's no way the same canonical solution can have an equivalent puzzle for any set of givens. That is, there could theoretically be some sort of isomorphic transformation that would preserve the grid while rearranging the givens to a new configuration. If this is not the case, great, but I'd like to see a proof of that. Moreover, I'd like to know how even the solution grid is reduced to canonical form.
Back to top
View user's profile Send private message
vidarino

Joined: 10 Feb 2006
Posts: 38
:
Location: Haugesund, Norway

Items
PostPosted: Tue Mar 28, 2006 7:26 am    Post subject: Reply with quote

Ah, OK, I think it's slowly dawning on me what you're after. You want to find (or check that) two puzzles (that (may?) have the same solution) are to be considered equivalent, despite having different sets of givens.

I'm far from sure that such a comparison is possible, though. One way to compare two puzzles would be to compile some kind of solving paths / graphs for the puzzles and compare those.

In this thread on the Players' Forum a group of us are trying to find the hardest possible (and recently, the easiest possible) puzzle that requires nothing but singles. In that regard, we're using a solving path "breadth" measurement which may be similar to what you're looking for. But it would have to be extended beyond just counting singles, of course.

But as said, I'm not sure if even this will show that two different puzzles are equivalent, although the solver (human or not) might find them to be fairly close. As far as I can see, the only way to know for sure if they are equivalent is to canonicalize them by their solution grids, and see if they match up. If they do, they are definitely equivalent. If not, good luck. Wink
Back to top
View user's profile Send private message Visit poster's website
daj95376

Joined: 05 Feb 2006
Posts: 349
:

Items
PostPosted: Tue Mar 28, 2006 4:42 pm    Post subject: Reply with quote

Lummox JR wrote:
Quote:
Daj, my interest is in a canonical puzzle form, not canonical solutions. That is, to detemine if two puzzles are equivalent. Two non-equivalent puzzles--with the same canonical solution or without--will almost definitely need different solve methods. However two equivalent puzzles should require the exact same methods.


I disagree with your assertion that '... two equivalent puzzles should require the exact same methods'. Any method of solving a puzzle is going to be based on visiting the cells is some orderly manner. So, it seems to me that puzzles should exist that can be rotated 180 degrees and not be solved in the exact same method for any number of reasons. It all depends on the methodology of the solver you're using.

----- ----- -----

Here's what I've sifted from my readings in this and other threads.

1) Multiple CNFs exist for solution grids. Each simply guarantees that its results are identical when applied to an initial solution grid and any grids produced by legitimate transformations on it.

2) gsf has a solver that takes a puzzle, canonicalizes its solution, and lists the original puzzle after identical transformations are applied to it. You could run his solver on two puzzles and compare the transformed puzzles to see if they're identical.

This is as close as I can get to an answer to your query.
Back to top
View user's profile Send private message
Ruud
Site Admin
Joined: 17 Sep 2005
Posts: 708
:
Location: Netherlands

Items
PostPosted: Tue Mar 28, 2006 6:22 pm    Post subject: Reply with quote

Quote:
I disagree with your assertion that '... two equivalent puzzles should require the exact same methods'. Any method of solving a puzzle is going to be based on visiting the cells is some orderly manner. So, it seems to me that puzzles should exist that can be rotated 180 degrees and not be solved in the exact same method for any number of reasons. It all depends on the methodology of the solver you're using.

There should be room for some confusion, as there is no single definition of a canonical form.

According to Gordon Royle in this thread on the player's forum, a canonical form is only an individual grid that represents an equivalence class. The rules that define the equivalence classes are what matters. In stead of a canonical form you can also use a checksum as the class representative, as long as all members of the class can produce the same checksum.

I was thinking about an algorithm that could can produce a checksum for a solving path. With such an algorithm, I can create a database with one sudoku for each solving path. There may be no way to transform each member of such a class into each other member, but with a different way defining the classes, that is not required. Besides the checksum, I can keep the original puzzle as the classes representative.

For me, Lummox JR, and many sudoku players, the solving path is what defines the sudoku. It must be possible to classify a puzzle based on that.

For example:

Take the initial puzzle, and locate all singles. Now permutate the grid in such a way that it places those singles in the topmost-leftmost position. Solve the singles, rinse and find the next round. Rearrange the grid to place the newfound singles in the topmost-leftmost position, but without changing the shape of the first group. It is OK to swap 2 members of that group, but a member of the first group cannot be swapped with a non-member. Iterate until the puzzle is completely solved. Now you can calculate a checksum on the relative positions of the cells in the order they were solved. Special solving techniques can be part of the checksum. Relabel the resulting grid to use as the canonical grid.

Ruud.
Back to top
View user's profile Send private message Visit poster's website
Lummox JR

Joined: 07 Sep 2005
Posts: 202
:

Items
PostPosted: Tue Mar 28, 2006 9:29 pm    Post subject: Reply with quote

daj95376 wrote:
I disagree with your assertion that '... two equivalent puzzles should require the exact same methods'. Any method of solving a puzzle is going to be based on visiting the cells is some orderly manner. So, it seems to me that puzzles should exist that can be rotated 180 degrees and not be solved in the exact same method for any number of reasons. It all depends on the methodology of the solver you're using.

I think I conveyed this poorly. I mean that the exact same methods, merely transformed, should solve an equivalent puzzle. Whether a particular solver would succeed with different tests in a different order (as it surely would) is beside the point.
Ruud wrote:
According to Gordon Royle in this thread on the player's forum, a canonical form is only an individual grid that represents an equivalence class. The rules that define the equivalence classes are what matters. In stead of a canonical form you can also use a checksum as the class representative, as long as all members of the class can produce the same checksum.

Corollary: Members of different classes must produce different checksums. Otherwise the checksum is not canonical. So far I haven't found a checksum algorithm that can guarantee that.
Quote:
For me, Lummox JR, and many sudoku players, the solving path is what defines the sudoku. It must be possible to classify a puzzle based on that.

Hrm, to an extent I'd agree the solving path matters, except I'd phrase it such that the same solving path, with transformations, would solve any equivalent sudoku. I think, however, that classifying puzzles based solely on solve paths is a losing proposition, since there is no such thing as a quantifiable minimum solve path (except perhaps with techniques adapted from DLX, ignoring human-style logic techniques altogether). If we can't rate puzzles objectively, neither can we canonicalize them on that basis.
Back to top
View user's profile Send private message
Moschopulus

Joined: 12 Aug 2005
Posts: 39
:

Items
PostPosted: Tue Mar 28, 2006 10:56 pm    Post subject: Reply with quote

Soultaker wrote:

I'm not sure myself (maybe gsf can clear it up?) but from his comments it seems he defines the normal solution as the lexicographically smallest solution with the top-left box being 1,2,3/4,5,6/7,8,9. This additional requirement seems unneccessary and a bit strange to me; why not simply pick the smallest solution, regardless of the top-left box?


He orders the first row like this to get around the issue of relabelling the digits.

I think there is confusion caused in this thread over whether to allow permutations of the digits.

Some posters here seem to include just the 6^8 permutations of the cells only, when going to a canonical form. Others are also allowing the 9! permutations of the digits. Which do you want?
Back to top
View user's profile Send private message Visit poster's website
daj95376

Joined: 05 Feb 2006
Posts: 349
:

Items
PostPosted: Tue Mar 28, 2006 11:06 pm    Post subject: Reply with quote

Ruud wrote:
According to Gordon Royle in this thread on the player's forum, a canonical form is only an individual grid that represents an equivalence class. The rules that define the equivalence classes are what matters. Instead of a canonical form you can also use a checksum as the class representative, as long as all members of the class can produce the same checksum.


Please list any 'rules that define the equivalence classes' for puzzles.
Back to top
View user's profile Send private message
Lummox JR

Joined: 07 Sep 2005
Posts: 202
:

Items
PostPosted: Wed Mar 29, 2006 12:06 am    Post subject: Reply with quote

Well daj, there are just a few rules. Two puzzles (and therefore solutions) are equivalent if:

- The puzzle is transposed (flipped along a diagonal)
- The order of 9x3 and/or 3x9 bands is permuted
- The order of columns or rows within bands are permuted
- The digits are permuted

None of those operations will change a puzzle in any significant way. The same solve path can still be followed to the solution, provided you also transform the steps of the path.
Back to top
View user's profile Send private message
Lummox JR

Joined: 07 Sep 2005
Posts: 202
:

Items
PostPosted: Wed Mar 29, 2006 12:36 am    Post subject: Reply with quote

I think I understand gsf's algorithm from the source code, now, well enough to explain it here.

Let's start with terminology. The sudoku is NxN in size, and PQ=N, where P>=Q, but P and Q are as close to each other as possible. If N is a square number, like in a standard 9x9 sudoku, P=Q. The boxes are arranged such that each is P columns by Q rows.

Each column or row of entire boxes we'll call a band. The P-bands will be P boxes stacked to PxN in size, and the Q-bands are Q boxes stacked to QxN in size. If you prefer, think of these as towers and slabs, respectively.

Step 1: "Loop A". Loop through each box, and each orientation (original or transposed). This is 2N iterations if P=Q (that is, 18 for a 9x9 puzzle), or N iterations otherwise; ignore transposition if P=Q. Move the chosen box to the box 1 position by transposing the puzzle if necessary, then swapping P- and Q-bands till they reach the desired position.
Step 2: "Loop B". Loop through each permutation of the columns and rows in box 1. This is P!Q! iterations, which in a 9x9 puzzle is 3!3!=36.
Step 3: Remap all digits in box 1 in the order 1, 2, 3, 4, etc. going left to right from the top line down.
Step 4: Find the smallest digit in row 1 that is not in box 1. Rearrange the P-bands so this goes in box 2, then rearrange the columns in this band so this part of row 1 is in ascending order. Repeat for all remaining P-bands.
Step 5: You can now look at the puzzle as an NQ-digit number (27 digits in a 9x9 puzzle), with r1c1 as the leftmost digit. Compare numbers; if your result is higher than the current minimum, bail out and move to the next iteration of loop B (step 2). If it is equal or lower, continue to the next steps.
Step 6: Find the smallest digit in column 1 that is not in box 1. Rearrange the Q-bands so this goes in box Q+1, then rearrange the rows in this band so this part of column 1 is in ascending order. Repeat for all remaining Q-bands.
Step 7: At this point, compare N^2-digit number like in step 5. If this is lower, it is the new minimum and the best candidate so far for canonical form. If it is equal, then two different transforms have resulted in the same candidate canonical form, and this algorithm cannot be used to canonicalize a set of givens.

Now, consider the last part I added. If it is possible for two solution grids to have the same candidate canonical form, it means it could be possible for two puzzles to have the same actual canonical form, which means two different methods of transforming the puzzle will result in two equivalent puzzles with different givens but the same exact solution.

It is possible to test this, at least if N is known. I don't think it'd be possible to determine if this is true in general for all sudoku of any arbitrary size. The way I'd test this would be as follows:

Step 1: "Loop A". Set r1c1 to 1, and iterate through all permutations of box 1's digits where r1c1 through r1cP are in ascending order, and if P=Q then r1c2<r2c1.
Step 2: "Loop B". Iterate through all permutations of row 1 where each P-band segment of row 1 is in ascending order. Iterate further through the rest of the possibilities for the entire top Q-band.
Step 3: "Loop C". Iterate through all permutations of column 1 where each Q-band segment of column 1 is in ascending order. Iterate further through the possibilities for the rest of the puzzle.
Step 4: At this point we have a sort of organized form of solution grid, formed by rearranging only, not by changing digits. Reduce this to canonical form. If the same form can be reached more than one way by the above algorithm, then that algorithm will not suffice to find canonical versions of all puzzles.

Now, who wants to try it? Anyone?
Back to top
View user's profile Send private message
daj95376

Joined: 05 Feb 2006
Posts: 349
:

Items
PostPosted: Wed Mar 29, 2006 12:51 am    Post subject: Reply with quote

Okay. Below are two puzzles. The first is the easy1.ss puzzle supplied with Simple Sudoku. The second is the puzzle after two naked cells have been filled.

Code:
 *-----------*
 |.6.|1.4|.5.|
 |..8|3.5|6..|
 |2..|...|..1|
 |---+---+---|
 |8..|4.7|..6|
 |..6|...|3..|
 |7..|9.1|..4|
 |---+---+---|
 |5..|...|..2|
 |..7|2.6|9..|
 |.4.|5.8|.7.|
 *-----------*


 *-----------*
 |.6.|1.4|.5.|
 |..8|3.5|6..|
 |2..|...|..1|
 |---+---+---|
 |8..|4.7|..6|
 |..6|8.2|3..|
 |7..|9.1|..4|
 |---+---+---|
 |5..|...|..2|
 |..7|2.6|9..|
 |.4.|5.8|.7.|
 *-----------*



Now, none of the transformations you describe will ever make these two puzzles identical. Similarly, the distinction of two cells being 'given' vs. 'candidate cells' will prevent any solver from ever generating identical steps for their solutions.

Thus, I deduce from your rules that these two puzzles are not equivalent. Okay!
Back to top
View user's profile Send private message
Lummox JR

Joined: 07 Sep 2005
Posts: 202
:

Items
PostPosted: Wed Mar 29, 2006 1:30 am    Post subject: Reply with quote

Correct, they are not equivalent. The second one has extra givens, therefore it is impossible to follow the same solve path.
Back to top
View user's profile Send private message
Moschopulus

Joined: 12 Aug 2005
Posts: 39
:

Items
PostPosted: Wed Mar 29, 2006 9:20 am    Post subject: Reply with quote

Here is a grid:

124567893
378294516
659831742
987123465
231456978
546789321
863972154
495618237
712345689

If you rotate clockwise by 90 degrees, and relabel the digits by 1->3->9->7->1 and 2->6->8->4->2, you get the same grid.

Here is a puzzle whose solution is the above grid:

104000890
378200010
050830700
987000000
000050070
000009300
063070000
400008000
000000000


If you rotate clockwise by 90 degrees, and relabel the digits by 1->3->9->7->1 and 2->6->8->4->2, you don't get the same puzzle.

I must admit that I'm not sure if this is relevant to the discussion, but I hope so!
Back to top
View user's profile Send private message Visit poster's website
gsf

Joined: 18 Aug 2005
Posts: 411
:
Location: NJ USA

Items
PostPosted: Wed Mar 29, 2006 11:36 am    Post subject: Reply with quote

Lummox JR wrote:
I think I understand gsf's algorithm from the source code, now, well enough to explain it here.


thanks lummox

a few general points:
fixing the top left box is a technique used in sudoku counting
it drastically cuts down the number of cases that must be examined
for 9x9 its 18*3*3, with the inner loop computations ~O(100K)
the implementation has a bunch of loop unrolling which makes it unwieldy but efficient
it bails on the current permutation at the earliest point
another uwieldy but efficient part is that it never constructs the grid under test
it instead constructs the row,col,value permutations (27)
this cuts down on loop sentinal tests and memory writes

the main points

(1) it operates on a solution grid, not a partial puzzle
this means that to canonicalize a partial grid (using this method) the
grid must first be solved, and it must have exactly 1 solution
this also means that the original puzzle grid can be permuted using
a (see below for the signifcance of a) canonical solution grid permutation to yield a canonical puzzle
proof by construction:
the original puzzle grid has exactly one solution
the canonical puzzle grid is simply the original puzzle grid permuted by
operations that leave the grid unchanged

(2) the operations performed on the solution grid are permutations that leave the puzzle unchanged
there is no way (short of coding errors) that non-equivalent grid can slip in

(3) the permutations that leave the puzzle unchanged are
(a) swap any two bands (edit)
(b) swap any two rows within a band
(c) swap any two columns within a band
(d) swap any two cell values for all cells with those values
(e) rotate the grid 90 degrees
a combination of these permutations can be used to place any box in
the upper left in row or col order -- this is the basis of the canonicalization loop

Lummox JR wrote:

Step 7: At this point, compare N^2-digit number like in step 5. If this is lower, it is the new minimum and the best candidate so far for canonical form. If it is equal, then two different transforms have resulted in the same candidate canonical form, and this algorithm cannot be used to canonicalize a set of givens.


good observation, wrong conclusion
if two different permutations of the same solution grid yield the same canonical grid
then an automorphism has been uncovered
the canonical form is not invalid
instead a special property of the grid has been uncovered
this is related to a key part of counting the number of essentially different sudoku solutions

the algorithm would have to change substantially to count all non-trivial automorphisms,
but a simple change (counter clear/increment at the end of the inner loop)
allows it to count the number of non-trivial automorphisms of the canonical solution grid


Last edited by gsf on Wed Mar 29, 2006 4:55 pm; edited 1 time in total
Back to top
View user's profile Send private message Visit poster's website
Soultaker

Joined: 28 Feb 2006
Posts: 49
:

Items
PostPosted: Wed Mar 29, 2006 2:52 pm    Post subject: Reply with quote

This thread becomes more interesting by the day!

gsf wrote:

(3) the permutations that leave the puzzle unchanged are
(a) swap any two rows within a band
(b) swap any two columns within a band
(c) swap any two cell values for all cells with those values
(d) rotate the grid 90 degrees
a combination of these permutations can be used to place any box in
the upper left in row or col order -- this is the basis of the canonicalization loop

Didn't you forget to include swapping bands as permissable permutations?

Note that my brute-force code generates the entire permutation group for cell permutations (which excludes symbol permutations) from just three fixed permutations:
- rotate the grid by 90 degrees
- swap the top two rows
- swap the top two bands

(This surprised me a bit at first, because Ed Russel and Frazer Jarvis used ten generators for their calculation: http://www.afjarvis.staff.shef.ac.uk/sudoku/ but if you think about it makes sense; operations like reflections and swapping other rows, columns and bands can be created as a combination of the permutations above.)
Back to top
View user's profile Send private message MSN Messenger
Display posts from previous:   
Post new topic   Reply to topic    Sudoku Programmers Forum Index -> Setting sudoku All times are GMT
Goto page Previous  1, 2, 3, 4, 5, 6  Next
Page 2 of 6

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum
Sudoku Programmers topic RSS feed 


Powered by phpBB © 2001, 2005 phpBB Group

Igloo Theme Version 1.0 :: Created By: Andrew Charron