
View previous topic :: View next topic 
Author 
Message 
 Lummox JR
 Joined: 07 Sep 2005  Posts: 202  :   Items 

Posted: Sun Mar 26, 2006 2:49 am Post subject: Canonical sudoku algorithm? 


I'm interested in implementing canonicalization in my sudoku generator/solver. I checked out one thread, and another on the players' forums, but nowhere is the actual algorithm spelled out. The best I can find is source code, which is difficult to understand and therefore difficult to trust (even if it's known to work). Canonicalization is a fairly sticky problem.
I know of course the basic operations you can perform to keep an equivalent grid. What I don't get is how to navigate through these to find a canonical form. It's one thing to canonicalize a solution grid, but then that doesn't tell you how to rearrange the givens, whether by position and/or digit. The fact that both are mutable suggests that it might be possible for two different canonical solution grids to hold equivalent givens. That is, there might be two or more operative paths from a solution grid to its canonical form, and in order for the givens to also be canonical, the same path must apply to all equivalent puzzles. If this is actually the case, could anyone explain why? 

Back to top 


 Soultaker
 Joined: 28 Feb 2006  Posts: 49  :   Items 

Posted: Sun Mar 26, 2006 11:34 am Post subject: 


It's very simple, really. Given a puzzle (an empty grid with some hints) and a solution (a filled in grid), you figure out how to transform the solution into it's canonical form. This is transformation is simply a permutation of cell locations and a permutation of symbols. You can easily perform the same transformation to the original puzzle as well to bring it into normalized form.
This is where you're mistaken:
Quote:  It's one thing to canonicalize a solution grid, but then that doesn't tell you how to rearrange the givens, whether by position and/or digit. 
On the contrary: if you know how to canonicalize the solution, that tells you exactly how to rearrange the givens in the solution.
I might have misunderstood this part:
Quote:  The fact that both are mutable suggests that it might be possible for two different canonical solution grids to hold equivalent givens. That is, there might be two or more operative paths from a solution grid to its canonical form, and in order for the givens to also be canonical, the same path must apply to all equivalent puzzles. If this is actually the case, could anyone explain why? 
You seem to suggest that for a given puzzle there might be two different transformations that lead to the same canconical form. That's simply impossible; would you like a proof of that? 

Back to top 


 Lummox JR
 Joined: 07 Sep 2005  Posts: 202  :   Items 

Posted: Sun Mar 26, 2006 7:38 pm Post subject: 


Yes, that was what I was thinking about the canonical form, that two different transformations might result. I'd love to see a proof of that, and moreover some details of the actual algorithm involved so I could figure out how to implement it myself. 

Back to top 


 Soultaker
 Joined: 28 Feb 2006  Posts: 49  :   Items 

Posted: Sun Mar 26, 2006 9:15 pm Post subject: 


I think I didn't describe it properly. There might be different transformations on the solution grid that result in the same canonical solution, but these also transform the puzzle to a unique form. Does this answer your question? (I'm afraid not, but I'm not sure what to get into.) 

Back to top 


 Lummox JR
 Joined: 07 Sep 2005  Posts: 202  :   Items 

Posted: Sun Mar 26, 2006 11:41 pm Post subject: 


That's my point of confusion. I don't get how you could apply different transformations to the solution grid and still bring the givens together in the same unique form. If digits are permuted first and then columns and rows, then a form of the puzzle that changed columns and rows might escape canonicalization. If column and row realignment comes first, then a puzzle with different digits could invalidate such an algorithm. So to my mind it seems that depending on the preferred transformation order, some permutation of the original puzzle would lead to a different canonical form. It doesn't seem like permuting the solution grid is quite enough.
But then, I don't know the algorithm, so I don't know how it works or how it would avoid this problem. Just seeing source code doesn't help. What I'd like most is to find a link explaining the algorithm itself. 

Back to top 


 vidarino
 Joined: 10 Feb 2006  Posts: 38  :  Location: Haugesund, Norway  Items 

Posted: Mon Mar 27, 2006 7:26 am Post subject: 


Lummox JR wrote:  That's my point of confusion. I don't get how you could apply different transformations to the solution grid and still bring the givens together in the same unique form. 
You need to apply the same transformation to the puzzle.
Consider this 2x2 sample (for demonstration purposes only  probably not even solvable):
Code:  .3.. 4312
2..4 2134
...1 3241
.4.. 1423 
So, relabelling the top row, both the puzzle and the solution is transformed;
Code:  .2.. 1234
4..1 4321
...3 2413
.1.. 3142 
However, switching around columns 3 and 4, then relabelling again gives us this;
Code:  .2.. 1234
3.1. 3412
..4. 2341
.1.. 4123 
... which is numerically (considering the grid to be a 16digit number) lower than the one before. Next you could switch the two topmost rows with the bottom two, relabel and check again. Basically try all the permutations systematically, and pick the one that's numerically lowest. (Or highest. As long as all are puzzles are picked on the same criteria.)
That's how I do canonicalization. There are probably shortcuts, but I haven't found a pressing need for a speedy algorithm just yet. Suggestions welcome, though. 

Back to top 


 Soultaker
 Joined: 28 Feb 2006  Posts: 49  :   Items 

Posted: Mon Mar 27, 2006 2:44 pm Post subject: 


vidarino: what I worry about with that approach is that you might miss a chance to normalize the grid with a combination of operations. For example, swapping to columns and then swapping two rows may create an even 'smaller' grid, but if neither of those operations gives you an immediate improvement, your algorithm does neither and returns a nonoptimal solution.
In addition, as Lummox JR also said, you arbitrarily choose a permutation of symbols based on the first row of the initial solution. Why not swap a few rows and columns and then permute the symbols? It may result in an entirely different grid.
From my calculations (and those of others) there are about 3.3 million different valid permutations of cells (not symbols). After picking one of those, the best symbol permutation is fixed by the symbols in the top row. So, I think a correct algorithm tries all 3.3 million permutations and then picks the lowest. However, it's easy to apply the same permutation to the original puzzle.
This is ofcourse not very efficient (and for larger grids it's even worse) so the question is if there are any 'shortcuts' allowed. I'm not sure of this yet. 

Back to top 


 Lummox JR
 Joined: 07 Sep 2005  Posts: 202  :   Items 

Posted: Mon Mar 27, 2006 5:16 pm Post subject: 


Ah, but how is such an algorithm achieved, and why would it work on givens and not just the solution? This is the burning question. I know gsf has implemented an algorithm, but from his source code it's difficult to understand the algorithm itself, let alone why (or if?) it works on givens. 

Back to top 


 Soultaker
 Joined: 28 Feb 2006  Posts: 49  :   Items 

Posted: Mon Mar 27, 2006 5:28 pm Post subject: 


For clarity; the (naive) algorithm I proposed was: try each of the 3 million or so possible permutations on the solution grid and determine which one results in the lexicographically smallest solution. Apply this same transformation to the original puzzle to bring it into normal form.
(By definition, a puzzle is in normal form if no transformation of the puzzle existst that results in a lexicographically smaller solution. A permutation is permissable only if you can apply it to any valid puzzle to obtain another valid puzzle.)
Lummox JR wrote:  Ah, but how is such an algorithm achieved, and why would it work on givens and not just the solution? 
I still have trouble understanding this question. Why wouldn't it work on givens? 

Back to top 


 vidarino
 Joined: 10 Feb 2006  Posts: 38  :  Location: Haugesund, Norway  Items 

Posted: Mon Mar 27, 2006 5:52 pm Post subject: 


Soultaker wrote:  vidarino: what I worry about with that approach is that you might miss a chance to normalize the grid with a combination of operations. For example, swapping to columns and then swapping two rows may create an even 'smaller' grid, but if neither of those operations gives you an immediate improvement, your algorithm does neither and returns a nonoptimal solution. 
Ah, don't worry about me.
I was just using those transformations to explain the workings of such an algorithm. I have actually implemented the straightforward algorithm you proposed, trying all 3359232 (6^8*2) permutations and finding the lexicographically smallest of the bunch.
It takes a few seconds to run, though, so I should probably look into cleaning it up a bit.
Vidar 

Back to top 


 Lummox JR
 Joined: 07 Sep 2005  Posts: 202  :   Items 

Posted: Mon Mar 27, 2006 6:20 pm Post subject: 


Transformation from one solution grid to a canonical form could be done in a number of different ways via different steps. Givens comprise only part of the solution, so depending on how those transformations were done, two equivalent puzzles could conceivably have the same canonical solution grid while still having two different sets of givens.
If two isomorphic solution grids have the same canonical form (as they should, if it's truly canonical), there's no guarantee the transformations involved would provide a canonical set of givensmerely two isomorphic puzzles with the same solution. If the first step of canonicalization is to relabel the digits in box 1, for instance, then a puzzle that had been flipped horizontally would have to go through a different set of transformations (besides just adding a horizontal flip) to reach canonical form than one that had not. If the first step was rearranging columns and/or rows, then a puzzle that had had its digits permuted would go through a different set of rearrangements.
It does not necessarily follow that a process to find a canonical form of the solution grid also yields a canonical puzzle. You can compare this to sorting terminology, where a sort guarantees the results are in order, but only a stable sort guarantees that equivalent items remain in the same relative order they started. A sudoku puzzle may have three 9's, and the transformations will tell you which 3 givens correspond to them, but that result could be different for other forms of the same puzzle, if the transformation process is not stable.
I could probably figure out a way of canonicalizing the solution. That's a difficult problem but there are some resources to figure out out. What's not an easy problem is finding such an algorithm that also canonicalizes the givens.
The only idea I have contrary to this is that maybe the concept of equivalent puzzles with the same solution is bogus. (Though obviously, nonequivalent or merely similar puzzles could share a solution.) The only way to be sure is to answer the question: Is there any sequence of valid sudoku transforms that can be applied to a solution to produce the very same solution, that does not ultimately involve simply undoing the transforms? If the answer to that is no, then indeed any canonicalization of the solution will affect the givens also. 

Back to top 


 Soultaker
 Joined: 28 Feb 2006  Posts: 49  :   Items 

Posted: Mon Mar 27, 2006 6:25 pm Post subject: 


Lummox JR wrote:  I know gsf has implemented an algorithm, but from his source code it's difficult to understand the algorithm itself 
I'm not sure myself (maybe gsf can clear it up?) but from his comments it seems he defines the normal solution as the lexicographically smallest solution with the topleft box being 1,2,3/4,5,6/7,8,9. This additional requirement seems unneccessary and a bit strange to me; why not simply pick the smallest solution, regardless of the topleft box? Maybe it has something to do with his algoritm (i.e. it's easier to the normal form according to this definition). 

Back to top 


 daj95376
 Joined: 05 Feb 2006  Posts: 349  :   Items 

Posted: Mon Mar 27, 2006 7:45 pm Post subject: 


I may be a simple country boy, but I don't understand the reason for all the discussions on finding a canonical normal form (CNF) to compare solutions.
If two puzzles have identical CNFs for their solutions, that does not mean the puzzles are identical.
(an example)
If your puzzle requires multicoloring to solve it ...
and another puzzle only needs naked/hidden singles to solve it ...
then I don't care if their solutions have matching CNFs.
To me, Sudoku is all about 'solving the puzzles' and not about 'comparing the solutions'.
Last edited by daj95376 on Tue Mar 28, 2006 1:20 am; edited 1 time in total 

Back to top 


 Lummox JR
 Joined: 07 Sep 2005  Posts: 202  :   Items 

Posted: Mon Mar 27, 2006 7:56 pm Post subject: 


Daj, my interest is in a canonical puzzle form, not canonical solutions. That is, to detemine if two puzzles are equivalent. Two nonequivalent puzzleswith the same canonical solution or withoutwill almost definitely need different solve methods. However two equivalent puzzles should require the exact same methods. If you flip a puzzle around, if you exchange two of its rows within the same 9x3 stack, if you rearrange the 3x9 "tower" sections of blocks, the puzzle will be superficially different but is still solved the same way. 

Back to top 


 Soultaker
 Joined: 28 Feb 2006  Posts: 49  :   Items 

Posted: Mon Mar 27, 2006 8:05 pm Post subject: 


You can safely replace almost definitely with definitely. Two 'equivalent' puzzles have the same basic properties and require the same solution techniques; they are essentially the same puzzle cleverly disguised.
edit:
Never mind what I said here... I had a bug in my code. 

Back to top 




You cannot post new topics in this forum You cannot reply to topics in this forum You cannot edit your posts in this forum You cannot delete your posts in this forum You cannot vote in polls in this forum

Powered by phpBB © 2001, 2005 phpBB Group
Igloo Theme Version 1.0 :: Created By: Andrew Charron
