
View previous topic :: View next topic 
Author 
Message 
 dukuso
 Joined: 14 Jul 2005  Posts: 424  :  Location: germany  Items 

Posted: Tue Jul 26, 2005 9:23 am Post subject: generating random sudokugrids 


how do you generate a random sudokugrid ?
For 9*9 sudokugrids we could just enumerate equivalence classes,
choose one class with weighted probability according to
its size and then choose a random member in the class.
For larger sudokus this is infeasable.
For latin squares we have the nice JacobsenMatthews algorithm.
Is there something similar for sudokugrids ? 

Back to top 


 jaap
 Joined: 13 Jun 2005  Posts: 24  :  Location: NL  Items 

Posted: Tue Jul 26, 2005 10:27 am Post subject: Re: generating random sudokugrids 


dukuso wrote:  For latin squares we have the nice JacobsenMatthews algorithm.
Is there something similar for sudokugrids ? 
I hadn't heard of that algorithm before, so I had to look it up. Sadly, I don't think it can be generalised to Sudoku in a straightforward manner.
The algorithm considers a latin square as a function f(row,column,value), which is 1 if that value is in the cell at that row and column, and 0 otherwise. The constraints of the latin square (every value once in each column and row, every cell only one value) can then be given as conditions on f, namely that the sum of f() over each of its coordinates is exactly 1.
The algorithm then gives a way of making a small change to the function such that it still satisfies the conditions. One example of such a change is replacing the value of 4 cells on a rectangle from this:
1...2
2...1
to this:
2...1
1...2
(Note: The algorithm also allows f to temporarily take on value 1 at one point, which is then usually rectified by the next change. This occurs if the rectangle doesn't contain only 2 distinct values. I'll ignore this for now as it is not important.)
All this relies on the fact however that the coordinates are orthogonal. If we were to add the Sudoku boxes as an extra condition, that condition is no longer guaranteed to hold after the change is made, so that won't work.
In other words, if the four corners of the rectangle lie in 4 different boxes, then we break the box condition by such a swap.
The only question that remains now is this:
Suppose we start with a filled grid, and allow these rectangle swaps only when the corners do not lie in 4 boxes (but only in 2), but in all other respects follow the JM algorithm, can we reach every other possible Sudoku grid?
I suspect not. _________________ Jaap

Jaap's Puzzle Page:
http://www.geocities.com/jaapsch/puzzles 

Back to top 


 dukuso
 Joined: 14 Jul 2005  Posts: 424  :  Location: germany  Items 

Posted: Tue Jul 26, 2005 11:53 am Post subject: 


>dukuso wrote:
>For latin squares we have the nice JacobsenMatthews algorithm.
sorry, it's Jacobson Matthews with two "o"
>Is there something similar for sudokugrids ?
>
>
>I hadn't heard of that algorithm before, so I had to look it up.
>Sadly, I don't think it can be generalised to Sudoku in a
>straightforward manner.
>
>The algorithm considers a latin square as a function
>f(row,column,value), which is 1 if that value is in the cell
>at that row and column, and 0 otherwise. The constraints of
>the latin square (every value once in each column and row,
>every cell only one value) can then be given as conditions on f,
>namely that the sum of f() over each of its coordinates is exactly 1.
>
>The algorithm then gives a way of making a small
>change to the function such that it still satisfies
>the conditions. One example of such a change is replacing
>the value of 4 cells on a rectangle from this:
>
>1...2
>2...1
>
>to this:
>
>2...1
>1...2
>
>(Note: The algorithm also allows f to temporarily take on
>value 1 at one point, which is then usually rectified by
>the next change. This occurs if the rectangle doesn't contain
>only 2 distinct values. I'll ignore this for now as it is not important.)
very good, that you figured this out in this short time.
I didn't find such a description on the web. I implemented
the algo some time ago, but didn't quite understand it.
>All this relies on the fact however that the coordinates
>are orthogonal. If we were to add the Sudoku boxes as an
>extra condition, that condition is no longer guaranteed
>to hold after the change is made, so that won't work.
>
>In other words, if the four corners of the rectangle lie
>in 4 different boxes, then we break the box condition by such a swap.
>
>The only question that remains now is this:
>Suppose we start with a filled grid, and allow these rectangle
>swaps only when the corners do not lie in 4 boxes (but only in 2),
>but in all other respects follow the JM algorithm, can we reach
>every other possible Sudoku grid?
>
>I suspect not.
that's right. You can determine the cyclestructure for any pair
of symbols, that is the connected components in the graph whose
vertices are the 18cells holding the 2 symbols and there is an edge,
iff the cells are in the same row,column,block.
The cycle structure is not changed by your operation and when I
tried 450 sudokus, 420 of them had different cyclestructure.
But the same is true for latin squares, so I assume where this
1 conditions comes in to change the cyclestructure .
Maybe we could still generate a random latin square and then
make some small changes so the blockconstraint is met.
But that would probably no longer be random enough.
You could also allow the 1 as in the JM and then check
every k JMiterations whether the LS is still a sudoku,
if not then backtrack to the last known sudoku. 

Back to top 


 jaap
 Joined: 13 Jun 2005  Posts: 24  :  Location: NL  Items 

Posted: Tue Jul 26, 2005 1:23 pm Post subject: 


dukuso wrote:  >(Note: The algorithm also allows f to temporarily take on
>value 1 at one point, which is then usually rectified by
>the next change. This occurs if the rectangle doesn't contain
>only 2 distinct values. I'll ignore this for now as it is not important.)
very good, that you figured this out in this short time.
I didn't find such a description on the web. I implemented
the algo some time ago, but didn't quite understand it.

Obviously I haven't tried the algo yet, but I was wrong saying that the 1 gets rectified at the next change. The 1 does get fixed, but more often than not a 1 appears at another point. The larger the square, the longer it takes before the 1 dissappears. It is only then that you have reached another latin square.
dukuso wrote: 
But the same is true for latin squares, so I assume where this
1 conditions comes in to change the cyclestructure .

That is my understanding too. What is really happening with that 1 value for f is that you are allowed one cell containing several things, namely two normal values as well as a debt of another value.
As said before, the change the algo makes to a rectangle is something like this:
3..5
5..3
becomes
5..3
3..5
Think of this as removing the first numbers, and then placing the next ones. If this same change were done to this rectangle:
3..5
5..7
then afterwards the bottom right cell still contains a 7, but also a 5 and a debt of 3:
3..5
5..{3,5,7}
The next step must use the multivalued cell as the first corner of the rectangle, and have in the adjacent corners threes. For example:
4...3
3...{3,5,7}
Swapping threes and for example sevens on these corners, we get:
{7,4,3}...7
7.....{3,3,5}
The added 3 cancels the debt of 3 in the bottom right so that we get:
{7,4,3}...7
7.............5
Continue doing this until the fourth corner of the random rectangle happens to have a cancellation, and then we get a real latin square.
The problem is that you don't have much choice of rectangles when you have one of these multivalued cells. You must use the rectangle that has the debt value at the adjacent corners. The only choice seems to be which of the two other values you will use in the swap.
dukuso wrote: 
You could also allow the 1 as in the JM and then check
every k JMiterations whether the LS is still a sudoku,
if not then backtrack to the last known sudoku. 
Maybe. I have my doubts though, because the number of latin squares is so much higher than the number of sudoku's. Once it is no longer a sudoku, there is too little chance of it becoming a sudoku again. Therefore, this method will probably get stuck in a loop for a very long time. _________________ Jaap

Jaap's Puzzle Page:
http://www.geocities.com/jaapsch/puzzles 

Back to top 


 dukuso
 Joined: 14 Jul 2005  Posts: 424  :  Location: germany  Items 

Posted: Tue Jul 26, 2005 3:06 pm Post subject: 


>As said before, the change the algo makes to a rectangle is
>something like this:
>3..5
>5..3
>becomes
>5..3
>3..5
>
>Think of this as removing the first numbers, and then placing
>the next ones. If this same change were done to this
>rectangle:
>3..5
>5..7
>then afterwards the bottom right cell still contains a 7, but
>also a 5 and a debt of 3:
>3..5
>5..{3,5,7}
typo here ?
>The next step must use the multivalued cell as the first
>corner of the rectangle, and have in the adjacent corners
>threes. For example:
>
>4...3
>3...{3,5,7}
>
>Swapping threes and for example sevens on these corners, we
>get:
>
>{7,4,3}...7
>7.....{3,3,5}
>
>The added 3 cancels the debt of 3 in the bottom right so that
>we get:
>
>{7,4,3}...7
>7.............5
>
>Continue doing this until the fourth corner of the random
>rectangle happens to have a cancellation, and then we get a
>real latin square.
I have an idea how it works, still not exactly.
Sort of generalized chain
>The problem is that you don't have much choice of rectangles
>when you have one of these multivalued cells. You must use
>the rectangle that has the debt value at the adjacent corners.
>The only choice seems to be which of the two other values you
>will use in the swap.
maybe it can be extended to sudokus nevertheless ?
You have to do further adjustments for the blocks though.
>dukuso wrote:
>
>You could also allow the 1 as in the JM and then check
>every k JMiterations whether the LS is still a sudoku,
>if not then backtrack to the last known sudoku.
>
>
>Maybe. I have my doubts though, because the number of latin
>squares is so much higher than the number of sudoku's. Once it
>is no longer a sudoku, there is too little chance of it
>becoming a sudoku again. Therefore, this method will probably
>get stuck in a loop for a very long time.
just take care that the number of multivalued cells don't increase
and then hope that there will be a chance eventually to reduce them.
Doesn't matter, if this takes longer than with latin squares
Guenter. 

Back to top 


 jaap
 Joined: 13 Jun 2005  Posts: 24  :  Location: NL  Items 

Posted: Thu Jul 28, 2005 12:35 pm Post subject: 


dukuso wrote: 
>Think of this as removing the first numbers, and then placing
>the next ones. If this same change were done to this
>rectangle:
>3..5
>5..7
>then afterwards the bottom right cell still contains a 7, but
>also a 5 and a debt of 3:
>3..5
>5..{3,5,7}
typo here ?

I meant
5..3
3..{3,5,7}
Each row/column still contains the same set of values as before.
dukuso wrote:  maybe it can be extended to sudokus nevertheless ?
You have to do further adjustments for the blocks though. 
This is all interesting theory, but probably not practical until it is thoroughly researched and tested.
I am of the opinion it would be easier in practice to generate a full sudoku grid using a normal brute force sudoku solver. Just adjust it so that when it is at a branching point (e.g. choosing which value to try in the next cell) it tries each possibility in a random order, instead of numerical order. Then give this solver an empty grid and have it return the first solution it finds.
The only drawback with this kind of method is that you don't know for certain whether each grid has an equal chance of being generated. At some point one choice may still leave many possible grids to choose from, and another choice may force just one possible grid. On such a path the probabilities for the grids are obviously not the same. However, for every grid there are many paths for generating it, so I would hope that on average these biases would cancel each other out.
On second thought, this would probably be a little biased against sudoku grids containing one of those 2value rectangles we mentioned before, e.g.
1...3
3...1
because the choice between that and its opposite (with 1,3 swapped) can occur fairly late in the decision tree. The more minor variations a grid has, the less likely it is. _________________ Jaap

Jaap's Puzzle Page:
http://www.geocities.com/jaapsch/puzzles 

Back to top 


 dukuso
 Joined: 14 Jul 2005  Posts: 424  :  Location: germany  Items 

Posted: Thu Jul 28, 2005 1:46 pm Post subject: 


yes, the beauty of the JMalgo (I think I understand now how it works)
is, that they proved that
each LS is created with _exactly_ the same probability.
In practice you won't care, of course whether some grid
is a tiny bit more likely of being chosen than another one ;)
You can always just start at some sudokugrid and then just try to
find the lexicographically next one(s) with backtracking
in however order. Or backtrack through all possibilities
to permute entries containing digits say 2,5,7 while keeping
the other entries. And then apply the 6^8*9!*2 sudoku
transformations etc. 

Back to top 


 antony
 Joined: 22 Jul 2005  Posts: 13  :   Items 

Posted: Fri Jul 29, 2005 1:43 am Post subject: 2 methods for generating random grids 


As far as I am concerned, I only experimented with 2 ways to generate random grids. In both cases, the absence of need for guessing is not guaranted, so a humanlike solver should be used on the output grids.
The first way is mentioned on Wikipedia. It is quite fast with DLX, and produces a random (symmetric) pattern with a random number of hints, typically ranging from 21 to 27.
Code:  Start with an empty grid.
Repeat
If there are too many hints (30 for a 3x3), empty the grid.
Add one (or two symmetric) random hint(s).
Solve.
If no solution, remove the hint(s).
until the solution is unique.
For every hint (or pair of hints), in a random order,
Remove it.
Solve.
If multiple solutions, put the hint(s) back.

The second way can generate grids according to a given pattern. I used it quite successfully on an experimental samuraisudoku generator. Of course, one can still feed the algorithm with a random pattern. The number of givens can be fixed (it depends on the pattern.) For symmetric 3x3 grids, it generates 20 or 21hint grids much faster than the algorithm above, but is slower for 24 to 27hint grids.
There are 2 steps. The first one produces a random full grid, by using the solver with a few hints only. Then I apply the mask. The grid remains solvable, but with multiple solutions. The second step tries and reduces the number of solutions, down to 1 (not always, though) by randomly changing the digits in the disclosed cells.
Code:  Repeat
Start with an empty grid.
Repeat
Add one random hint.
Solve.
If no solution, remove the hint.
until there are a few hints. (e.g N=10)
(at this point, the grid has multiple solutions.)
Take hints from a solution according to a predefined pattern (i.e mask the first solution you find).
Until the solution is unique, repeat
Change the value of a hint randomly.
Solve.
If there is no solution, or more solutions than before*, restore the hint.
If too many iterations (e.g 1000) without improvement, break.
until the solution is unique.

(*) Here, I use my small DLX solver to count the number of solutions. Since it could take days on grids with few hints, it aborts after a fixed (high) number of iterations. This gives me a poor (but good enough) estimate for the number of solutions. But a complete search has priority over a partial search, even if the number of solutions is a bit worse.
There is certainly nothing new in all this. Just wanted to mention it as a starting point for some interesting discussion
Last edited by antony on Fri Jul 29, 2005 2:32 pm; edited 1 time in total 

Back to top 


 tilps
 Joined: 19 Jun 2005  Posts: 44  :   Items 

Posted: Fri Jul 29, 2005 9:26 am Post subject: Re: 2 methods for generating random grids 


antony wrote:  As far as I am concerned, I only experimented with 2 ways to generate random grids. In both cases, the absence of need for guessing is not guaranted, so a humanlike solver should be used on the output grids.
The first way is mentioned on Wikipedia. It is quite fast with DLX, and produces a random (symmetric) pattern with a random number of hints, typically ranging from 21 to 27.
Code:  Start with an empty grid.
Repeat
If there are too many hints (30 for a 3x3), empty the grid.
Add one (or two symmetric) random hint(s).
Solve.
If no solution, remove the hint(s).
until the solution is unique.
Repeat
For every hint (or pair of hints), in any order,
Remove it.
Solve.
If multiple solutions, put the hint(s) back.
until the pass removed no hint.


A couple of points, 'too many hints' is ofcourse arbitary. I have a loop counter which clears the board if the number of loops is > boardsize cubed.
In the final step, one can make this faster by not attempting to remove the same hint/pair of hints twice. There is no need to repeat the pass, as the removing of hints wont allow you to remove other hints. 

Back to top 


 antony
 Joined: 22 Jul 2005  Posts: 13  :   Items 

Posted: Fri Jul 29, 2005 2:31 pm Post subject: 


Oh, right! The last For loop does not need to be in a Repeat loop: if a pair can't be removed in the first pass, it can't be either in the second. I'm editing my previous message. 

Back to top 


 ethel
 Joined: 26 Aug 2005  Posts: 2  :   Items 

Posted: Fri Aug 26, 2005 7:09 pm Post subject: Re: generating random sudokugrids 


dukuso wrote:  how do you generate a random sudokugrid ? 
Easy. Use any backtracking algorithm you like to fill in an empty grid, but work through the list of choices at each level in random order.
For example: set a random 81long array of digits 08, called the offset list. Now build up from a blank sudoku so that at level r in the recursion, when we're trying to fill in the r^th cell of the sudoku grid, our k^th choice digit is 1 + (OffsetList[r]+k)%9.
Note that if you want to generate many independentlyrandom sudokus then you need to start from scratch (with a new offset list) each time. 

Back to top 


 dukuso
 Joined: 14 Jul 2005  Posts: 424  :  Location: germany  Items 

Posted: Fri Sep 09, 2005 4:31 am Post subject: Re: generating random sudokugrids 


ethel wrote:  dukuso wrote:  how do you generate a random sudokugrid ? 
Easy. Use any backtracking algorithm you like to fill in an empty grid, but work through the list of choices at each level in random order.
For example: set a random 81long array of digits 08, called the offset list. Now build up from a blank sudoku so that at level r in the recursion, when we're trying to fill in the r^th cell of the sudoku grid, our k^th choice digit is 1 + (OffsetList[r]+k)%9.
Note that if you want to generate many independentlyrandom sudokus then you need to start from scratch (with a new offset list) each time. 
but it won't generate each grid with same probability 

Back to top 


 ethel
 Joined: 26 Aug 2005  Posts: 2  :   Items 

Posted: Sat Sep 10, 2005 7:21 pm Post subject: Re: generating random sudokugrids 


dukuso wrote:  ethel wrote:  dukuso wrote:  how do you generate a random sudokugrid ?  Easy. Use any backtracking algorithm you like to fill in an empty grid, but work through the list of choices at each level in random order.
... 
but it won't generate each grid with same probability 
Yeah, okay, you got me. The algorithm assumes that each valid branch in the search tree has an equal number of grids below it, which might be rather far from the truth. I assume you're generating random grids in order to find 'nice' ones in some sense? In which case a fast biased algorithm might be better than a slow exact one. 

Back to top 


 mugnyte
 Joined: 07 Dec 2005  Posts: 13  :  Location: portland, or  Items 

Posted: Wed Jan 11, 2006 12:50 am Post subject: Generation Algorithm 


Just to add to the survey, here's mine:
(1) Place 9 randomized unique starts on the board, one in each box. This guarantees a solvable board. Or place a randomized pattern on the board (input).
(2) Complete the puzzle according to basic rules, accepting the first solution (there are several with only 9 starts)
(3) Remove cells until either S starts (S>=17, input) are left, or the board cannot remove any cells without making the solutions > 1 (count solutions using all logic possible for speed).
To ensure a specific difficulty level, I go a bit further:
(4) Solve the game left at (3) according to the logic modules chosen (input). Count the moves made, if >= required number of moves (input), keep. Otherwise, start over at (1).
The UI maps a simple slider to modules + moves, which is a "quick difficulty" chooser. Inputs: starts, modules, moves, [pattern], [start board]
This method is fairly efficient, except that step (4) can discard quite a few puzzles before attaining the required number of moves on tough puzzles. (max ~50) _________________ thanks 

Back to top 


 Soultaker
 Joined: 28 Feb 2006  Posts: 49  :   Items 

Posted: Tue Feb 28, 2006 9:54 pm Post subject: 


I 'invented' a different method of generating a random solution grid, which, I hope, does not suffer from the problem of not generating all classes of valid grids with equal probability.
What I do is initialize the grid with a random permutation of all symbols (so I start, for example, with nine 1's on the first row, nine 2's on the second, and so on, and then shuffle the entire grid). This, ofcourse, is not a valid grid yet because there are many cases of two symbols (eg. two 1's) on the same row, column or block. I count each extra symbol in a row, column or block as a conflict.
To reduce the number of conflicts I use a process that can be seen as a primitive form of simulated annealing: I pick two random positions in the grid and swap their symbols. If this increases the number of conflicts, I swap them back. If however, the number of conflicts either decreases or stays the same I leave it at that. This step is repeated until the number of conflicts is reduced to zero, at which point the grid is valid (by the definition of a conflict).
This method may sound a little flaky (like many probabilistic algorithms) but in practice it works very well. I would expect to run into problematic grids, where I need to increase the number of conflicts before I can ever reduce them again, but this situation nevers seems to arise in practice.
I can easily generate over a hundred 9x9 grids per second. I tried larger grids and altough they are slower, I can still generate a valid grid as large as 64x64 (!) in less than five minutes. So, I think, it is a very practical algorithm, especially for grid sizes for which no equivalence classes are known (i.e. everything except 9x9).
Now, I have two questions that I'd like to pose here (which I have not been able to solve myself):
1) Does this algorithm select every possible grid with equal probability? It is clear that it does not exclude any possible grids and I don't see why it would favor one class over the other, but that is not a complete proof. Does anyone have any thoughts about this?
2) Is it possible to arrive at a grid where it is impossible to reduce the number of conflicts, without first increasing the number of conflicts (which is, in my current algorithm, impossible)? If not, why not? 

Back to top 




You cannot post new topics in this forum You cannot reply to topics in this forum You cannot edit your posts in this forum You cannot delete your posts in this forum You cannot vote in polls in this forum

Powered by phpBB © 2001, 2005 phpBB Group
Igloo Theme Version 1.0 :: Created By: Andrew Charron
