
View previous topic :: View next topic 
Author 
Message 
 briturner
 Joined: 14 Oct 2008  Posts: 75  :   Items 

Posted: Sat Nov 29, 2008 11:58 pm Post subject: What is the best way to create a Sudoku? 


Greetings,
I got some time to build a Sudoku generator, but I am not very happy with my results. I set my solver to generate 1,000,000 puzzles, but the quality of the puzzles is very heavily slanted towards easy puzzles.
I judge my puzzles based on solving time, and rate very easy puzzles as puzzles that take under 20 usec to solve. My first pass generator created 98.36% easy puzzles, compared to Zerothbase's 35.59%. I added a test to remove any given not needed, but was still hitting 71.52% easy. So, for generating complex puzzles, Zerothbase's method looks twice as good as mine, if not better, though I do not know how quickly his runs.
I am thinking of 3 obvious methods to generate puzzles:
1) start with a given random completed sudoku, and remove cells randomly verifying it still produces a unique sudoku. At some point, check each remaining cell to see if it can be removed.
2) start with a desired random completed sudoku, and add givens based on the desired final solution, until a unique solution is achieved.
3) start randomly placing cells, and solve for uniqueness. If the puzzle has no solution, go back, if multiple solutions, add another cell.
I implemented method 3, and can generate many puzzles quickly (1,000,000 sudokus with unique solutions in 9.4 minutes). I then added a check to see if each given cell could be removed, which increased the time to 21 minutes per million.
So, my question is, what method do people use, and method works best for making difficult puzzles? Please feel free to add better methods.
Thanks
Brit 

Back to top 


 zerothbase
 Joined: 02 Nov 2008  Posts: 17  :   Items 

Posted: Sun Nov 30, 2008 2:16 am Post subject: 


I'll give you my method in detail:
1) Generate a filled solution grid.
I use a simple backtracking "solver" on a blank grid. I iterate over the squares 0 to 80, and at each square, I choose a random value of the possibilities to put in that square. If no possibilities, then backtrack. This is very fast  I can generate ~100,000 solutions per second this way.
Code:  int backTrackGen(int k)
{
if(k == GRID)
return 1;
int possib = GET_POSSIBLE(k);
while(possib)
{
int count = BITCOUNT(possib);
int try = possib;
int clearCnt = lrand48() % count;
int j;
for(j = 0; j < clearCnt; j++)
try &= ~(try & try);
try &= try;
set(k, try);
if(backTrackGen(k+1))
return 1;
clear(k);
possib &= ~try;
}
return 0;
} 
2) In a loop  generate a random subset of this solution (~30 squares), and check to verify 1 solution. This again is very fast for each puzzle. With a parameter of 30 values as the subset, it usually only has to check less than a dozen subsets before it finds one.
3) Randomly reduce the puzzle:
a) Choose a random square to start from, and iterate through all 81 squares (wrapping around)
b) If it is set to a value, set it to 0, check for 1 solution.
If 1 solution, repeat from step a to keep reducing further.
If multiple solutions, replace the value with the original and keep going.
c) Stop when there are no changes, after making 1 full pass.
This is still pretty fast, but starts to slow down as you perform more and more solve() steps, each time you reduce.
4) Hillclimbing:
Iterate through the puzzle and find squares that are empty. Try putting the value for that square from the solution back into this empty square and call the reduce function above to see if you can get it any smaller. If it is smaller, save this puzzle off and keep iterating until you have iterated over every empty square in the puzzle. Return the puzzle with the lowest number of clues.
Theoretically, you could call Hillclimbing repeatedly until it didn't reduce any further. The 600k puzzles I generated only had the hillclimbing function called once. Also you could try hill climbing by adding 2+ values at a time to get over bigger "humps" in the local minima.
Hillclimbing is by far the slowest part of the whole operation. Calling reduce() for each of the ~60 empty squares (which calls solve() 1 time for each clue removed) is rough.
5) Try to solve the puzzle with a completely stupid solver that only knows how to find naked/hidden singles (no backtracking/guessing). If it finds a solution, throw the whole puzzle out and start over. Otherwise, print the puzzle and start again at step 1, so that every puzzle has a completely different solution grid.
The ~600k puzzles I generated took around 10 hours I think. I actually generated 1 million solutions and step 5 took it down to ~600k (which is why it is such an odd number of puzzles).
This is certainly not the most efficient way to generate puzzles, but it was how I hacked one together a few months back. The only upgrade was really changing to use your solver (BB_Sudoku v0.5).
I used lrand48() for slightly better randomness than rand() or random(). I know that linear congruential algorithms aren't great, but at least it is 48 bit math (and therefore a 2^48 size cylce) rather than the 32 bit other ones. I'm sure there are many better libraries out there, but it was the best I found built in to the C library.
Zerothbase
"Anyone who uses arithmetic methods to produce random numbers is in a state of sin." John von Neumann 

Back to top 


 Lunatic
 Joined: 11 Mar 2007  Posts: 166  :  Location: Ghent  Belgium  Items 

Posted: Sun Nov 30, 2008 8:17 am Post subject: 


Step 1: Generate a random solution by simply using backtracking brute force on all cells. I only randomly change the order of the cells before starting the brute force.
Step 2: Remove as much cells as possible from the solution, while checking on unique solution for each cell removed. If multiple solutions, replace that cell and try next cell until all cells are tried. Again, i randomly change the order of the cells before removing.
Depending on the type of sudoku you want (horizontal mirroring, vertical mirroring, diagonal, antidiagonal, 90° rotational, 180° rotational, or whatever..., you can remove cells in pairs or quads (and replace them if multiple solutions).
Click here to view sample ccode.
On the other hand i've made a pattern based generator too. In that case, clues are randomly generated for a given pattern, and tested if the generated clues leads to a unique solution. _________________ Marc
~~~<><~~~<><~~~<><~~~<><~~~ 

Back to top 


 gsf
 Joined: 18 Aug 2005  Posts: 408  :  Location: NJ USA  Items 

Posted: Sun Nov 30, 2008 1:20 pm Post subject: 


zerothbase wrote: 
I used lrand48() for slightly better randomness than rand() or random(). I know that linear congruential algorithms aren't great, but at least it is 48 bit math (and therefore a 2^48 size cylce) rather than the 32 bit other ones. I'm sure there are many better libraries out there, but it was the best I found built in to the C library.

you can use a 64 bit FNV hash
its fast and has a 2^^64 cycle
here's what I use in my solver
Code:  #define HASH_ADD(h) (0x9c39c33dUL)
#if _WIN32
#define HASH_MPY(h) ((h)*0x00000100000001b3)
#else
#define HASH_MPY(h) ((h)*0x00000100000001b3ULL)
#endif
#define HASHPART(h,c) (h = HASH_MPY(h) + HASH_ADD(h) + (c))
#define RAND(m) ((HASHPART(seed,0)>>8)%(m))
#if _WIN32
typedef unsigned __int64 uint64_t;
#else
#include <inttypes.h>
#endif
uint64_t seed;

RAND(n) produces a pesudorandom integer in the range 0..n1
you can set the seed to a fixed value to enable reproducible runs
e.g., you can pass pesudorandom puzzle catalogs around with just 1 seed rather than a file with a million entries 

Back to top 


 briturner
 Joined: 14 Oct 2008  Posts: 75  :   Items 

Posted: Sun Nov 30, 2008 7:10 pm Post subject: 


I found this post by George Marsaglia at http://www.bobwheeler.com/statistics/Password/MarsagliaPost.txt. It provides a few different random number generators.
I use the combination multiplywithcarry, 3shiftregister, congruential generator.
Here is the main lines of the generator, but the link provides more information:
Code: 
#define UL unsigned long
#define znew ((z=36969*(z&65535)+(z>>16))<<16>>16))&65535)
#define MWC (znew+wnew)
#define SHR3 (jsr=(jsr=(jsr=jsr^(jsr<<17>>13))^(jsr<<5))
#define CONG (jcong=69069*jcong+1234567)
#define KISS ((MWC^CONG)+SHR3)
#define LFIB4 (t[c]=t[c]+t[c+58]+t[c+119]+t[++c+178])
#define SWB (t[c+237]=(x=t[c+15])(y=t[++c]+(x<y)))
#define UNI (KISS*2.328306e10)
#define VNI ((long) KISS)*4.656613e10
/* Global static variables: */
static UL z=362436069, w=521288629, jsr=123456789, jcong=380116160;
static UL t[256];
static UL x=0,y=0; static unsigned char c=0;
/* Random seeds must be used to reset z,w,jsr,jcong and
the table t[256] Here is an example procedure, using KISS: */
void settable(UL i1,UL i2,UL i3,UL i4)
{ int i; z=i1;w=i2,jsr=i3; jcong=i4;
for(i=0;i<256;i++) t[i]=KISS; }

brit 

Back to top 


 m_b_metcalf
 Joined: 13 Mar 2006  Posts: 210  :  Location: Berlin  Items 

Posted: Mon Dec 01, 2008 8:23 am Post subject: Re: What is the best way to create a Sudoku? 


briturner wrote:  So, my question is, what method do people use, and method works best for making difficult puzzles? 
You were given detailed algorithms for creating valid sudoku puzzles, but they all skirted around your actual question, which was how to create difficult puzzles. Over on the Sudoku Players' Forum you will find a thread dealing with just such puzzles. While there is no obvious recipe, there is some evidence that difficulty ratings can be enhanced if puzzles obey some of these rules:
1) between 20 and 24 clues;
2) clues arranged on diagonals;
3) clues arranged such that within a box no two clues share a row or column;
4) no two clues in a chute share the same value;
5) some symmetry in the values of the clues, as in:
Code: 
5 . . . . . . . 9
. 2 . 1 . . . 7 .
. . 8 . . . 3 . .
. 4 . 7 . 2 . . . SE 11.4
. . . . 5 . . . .
. . . . . 6 . 1 .
. . 3 . . . 8 . .
. 6 . . . 4 . 2 .
9 . . . . . . . 5

For extremely difficult puzzles, serendipity seems to play a rôle.
Of course, all that begs the question of how to rate difficuty. The Patterns Game, which you might care to join when you're ready, relies on Sudoku Explainer, but there are other measures. The Patterns Game incidentally demonstrates to just what a surprising extent it is possible to generate a wide range of difficulty for certain patterns.
HTH
Mike Metcalf 

Back to top 


 m_b_metcalf
 Joined: 13 Mar 2006  Posts: 210  :  Location: Berlin  Items 

Posted: Tue Jan 13, 2009 9:58 am Post subject: 


zerothbase wrote:  1) Generate a filled solution grid.
I use a simple backtracking "solver" on a blank grid. I iterate over the squares 0 to 80, and at each square, I choose a random value of the possibilities to put in that square. If no possibilities, then backtrack. This is very fast  I can generate ~100,000 solutions per second this way.

Very fast indeed, but is it unbiased? In other words, do you take the first 100,000 solutions to the blank grid, or do you 100,000 times take the first found solution to the blank grid?
Regards,
Mike Metcalf 

Back to top 


 zerothbase
 Joined: 02 Nov 2008  Posts: 17  :   Items 

Posted: Sat Jan 17, 2009 6:53 am Post subject: 


The later.
Each puzzle starts from a blank grid and uses the first solution found as its solution grid.
zerothbase 

Back to top 


 briturner
 Joined: 14 Oct 2008  Posts: 75  :   Items 

Posted: Sat Jan 17, 2009 2:30 pm Post subject: 


Greetings,
I have been trying to generate hard puzzles, but I have to admit, I and not in the same league as the big boys on the players forum yet. My top score so far is around 98423 Q2 rating.
I am on my 3rd generation method. Here are the 3 I tried:
1) pick a random cell, place a random number, then pick another random spot and place another random number. Continue until you have a unique solution, then minimize. I later altered my solver to indicate which cells were different in multiple solutions, so I would not randomly pick a cell that was already set.
Pros  fast, simple
Cons  while you could theoretically generate any puzzle, and thus hard ones, I did not generate a single difficult puzzle out of millions of puzzles.
2) started with a filled grid, added in 3 cells from each box such that no 2 were on the same row or column. If needed, added more to make the solution uniqu
Pros  still very fast, and generated harder puzzles then complete randomness
Cons  Still no real hard puzzles
3) started with a predefined set, in this case I set the corner 4 boxes with 123 123 on the diagonal and 456 456 in the boxes next to them. Then placed a single number the remaining boxes and completely filled in the last box. If I have a unique solution, I minimize it. I then iterated around variations on this puzzle.
Pros  Much harder puzzles generated (over 1000 with 95000+ Q2 rating)
Cons  very time consuming. been running for days now, and probably another week just to finish this variation.
So, puzzle generation is now running as a background task, and I just let it run. I already have new patterns as a baseline for the next run.
brit 

Back to top 


 gsf
 Joined: 18 Aug 2005  Posts: 408  :  Location: NJ USA  Items 

Posted: Sat Jan 17, 2009 4:44 pm Post subject: 


briturner wrote: 
I have been trying to generate hard puzzles, but I have to admit, I and not in the same league as the big boys on the players forum yet. My top score so far is around 98423 Q2 rating.

another method is to start with a hardish puzzle and do a tour around it
remove some clues, then add some clues, then check for validity
this is probably how many of the puzzles in the hardest list were generated
my solver's go option accepts a notation for doing a tour
a catenation of {n+m}xr terms: remove n clues, add m clues, repeat r times, then check
this is also a handy way of minimizing the number of clues (like in the search for 17's)
the patterns game (you should try your hand at that too) is another way to search for hard puzzles
only add/delete clues according to a fixed pattern (the diagonal patterns seem to be good seeds for hard puzzles)
see the gop option in my solver for an example of doing a tour on a pattern
you can also use the random clue drop/add, but only in pattern cells 

Back to top 


 lkSudoku
 Joined: 16 May 2009  Posts: 60  :   Items 

Posted: Thu May 21, 2009 4:31 pm Post subject: 


My generator chooses difficulty by trying to solve the puzzle with different methods without backtracking. If you create a solver that is powerfull and uses strong methods, then if your solver cannot solve a puzzle, this means it is a hard puzzle
If the puzzle is not hard enough, the puzzle creation process begins again 

Back to top 




You cannot post new topics in this forum You cannot reply to topics in this forum You cannot edit your posts in this forum You cannot delete your posts in this forum You cannot vote in polls in this forum

Powered by phpBB © 2001, 2005 phpBB Group
Igloo Theme Version 1.0 :: Created By: Andrew Charron
