Sudoku Programmers Forum Index

 
 FAQFAQ   SearchSearch   MemberlistMemberlist   UsergroupsUsergroups   RegisterRegister   ProfileProfile   Log inLog in          Games  Calendar

Log in to check your private messagesLog in to check your private messages   

New programmer, new effort, comments desired

 
Post new topic   Reply to topic    Sudoku Programmers Forum Index -> Solving sudoku
View previous topic :: View next topic  
Author Message
Billy Joe

Joined: 10 Nov 2005
Posts: 3
:
Location: Southern California

Items
PostPosted: Fri Nov 11, 2005 12:11 am    Post subject: New programmer, new effort, comments desired Reply with quote

<pre>
Total newcomer to sudoku puzzles, just discovered three weeks ago in
the NYTimes crossword area. I'm retired and merely a hobbyist, VB6
programmer.

Goals:

* Generate repeatable, random grids and seed patterns.
* Produce printable puzzles solvable by human players with pencil
and paper or on-screen.
* Provide user-options to influence the difficulty of (or length of
time to determine) a solution.

My problem:

Seed generation. Avoiding the seed patterns that would be
unsolvable in any grid? For example: one seed, or should seventeen
prove to be the low limit, not generating seeds that are all in
columns zero and eight.

Add to that, a grid should be reproducible (from the random seed,
for example) and so should the seed pattern chosen for it.

Here's what I've done so far:

Grid generation -

random generation of column 0 (1-9) followed by redistribution of
rows 3-5,6-8,0-2 to column 1 and 6-8,0-2,3-5 to column 2. Then
randomize each of the three cells in each group, followed by random
selection of remaining values in each column, one row at a time.

For lack of "look ahead" a good number of grids could not be
completed, tho most were accomplished in satisfactory time for a
user. However, observation suggested that I need only look ahead to
the last column to significantly reduce the failure rate.

As a player pacifier (and debugging tool), I displayed the
generation (and seeding, and solution) but eliminated this after
discovering how slowly the digital display responds compared with
memory!

I also quickly eliminated the redistribution method mentioned, as it
became too obvious to a player, switching to random-unused numbers
for all of the grid, after randomizing column 0. Needless to say,
this took noticeably longer at times, as a very large number of
grids could not be completely generated - despite a slightly more
aggressive look ahead.

I suppose it's safe to say here, that backtracking would probably
have been the cure for this malady. I chose differently.

Lacking algorithmic knowledge, I finally chose to generate a grid
template that would produce ~360,000 grids based on randomizing the
first row or column of the target grid. The template is saved to
disc and its deletion from disc causes the program to generate a new
template which should be randomly different from the last.

The template is based on building a grid by the above technique but
where the first row of 1-9 is not randomized. The first solution
discovered becomes the uniquely numbered template. Grids are
generated from the template by randomizing column 0 then using the
template's r0 thru r8 values as an index into c0 to fill the rest of
the columns, thusly:

Dim r As Integer, c As Integer
For r = 0 To 8
For c = 1 To 8
cols(r, c) = cols(mastCols(r, c) - 1, 0)
Next
Next

Using my kindergarten math skills, it appears that there are a huge
number of different grids wherein column or row 0 contain the
sequential values 1 thru 9. And that each of these can be used as a
template for the 362,880 combinations of randomizing column 0.

Somehow though, this seems the least of problems?

Seed generation -

This is the stickiest wicket, at the moment.
Presently I randomize the numbers 0 thru 80 in an array then,
beginning with the first element, seed the grid per parameters:
a) Symmetric (rotational & mirror), Semi-symmetric (diagonal),
Asymmetric
b) 40, 36, 32, 28, 24, or Other seeds
until the chosen number of seeds has been sown.

By the way, the user options, in order to reduce attempted
solutions on unlikely seed patterns, are: setting max row/column
seeds and max 3x3 seeds, and/or forcing at least one empty row or
column into the pattern.

Solve-

I use a three step approach to a process-of-elimination solution for
each seed pattern sown on the grid.

a) In an array of 81 structures, turn on "needs" 1 thru 9 for any
non-seeded cell.

b) Remove from the "needs" any revealed cells in scope.

c) Examine the remaining needs for solutions
When there is only one known need within the scope, the cell is
answered.

Reiterate b) and c) while cells with any needs > 3 and new cells
revealed > 0.

The number of iterations thru b) and c) are indicative of the time
required for a human solution, although the method used by a person
will indeed differ from this simplistic strategy. The solutions
found in this way are unique.

The program also offers the user an option of allowing guesses. A
guess may be made when an iteration reveals no new cells and that
iteration number >= min guess level assigned. A guess will only be
made in a cell having but 2 needs. The program merely chooses the
correct answer, in this case, and continues attempting to solve the
problem with the process-of-elimination strategy. If the program
exceeds max guesses (2 as programmed), the solution fails. It's
possible to produce non-unique puzzles in this manner, but limited -
at worst - to two matching pairs (4 cells).

Other difficulties:

1) Too may unsuccessful seed patterns.
My thought here is to seed/solve in reverse - i.e. eliminate
cells from the answer according to the selected pattern and if
that step produces an unanswerable puzzle from that point on,
back up and change the seed pattern at that point.

Frankly, I'm unsure of how I'd approach this tho.

Any comment as to being on track or way off track is welcome.

2) I've no idea how to measure (human) difficulty vs. number of
iterations. Clearly, some puzzles solved by the method employed in
4 iterations are harder for a human (me) to solve than others
produced in 10 iterations. Tho generally, the result is as
expected.

3) The program rarely finds a solution to seeds of 24. There seems
to be no likelihood at all that this method will ever find a
solution with fewer seeds. But I also don't know, of the solution
algorithms mentioned in relevant articles, whether humans can solve
these grids with a pencil?

4) My present methods of generating grids and seeds leads to a three
number identifier (template, grid, and seed) plus the necessity of
reinstating any other options which might influence the outcomes of
these IDs. With more than 2^60 grid possibilities, and I've no idea
how many solvable seed combinations, the user is confronted with a
rather large value, prone to entry errors, to regenerate the same
puzzle at another time.

5) The program supports "playing." In addition to the seeded grid,
which can be filled in by the "player," there is a scratch-pad grid,
which allows the user to make multiple entries per cell of possible
answers. These are displayed in 8 point, while answers are 18 point
font. Each cell has room for about 10 scratch numbers.

I don't know whether all keyboards are alike (though I'd hope so)
but I had a devil of a time implementing the numeric pad vs. the
numbered typing keys for this grid, as my intent was to allow the
Shift key to place an answer, while the unshifted number would be
a scratch answer.

It's easy enough, in VB6, to get the shift state and the key
pressed, but my Logitech cordless KB appears to change the key value
of pad keys and the shift state when the shift key is held. The
Ctrl key posed other issues and I've ended up - at the moment -
using Shift/Number on key press for the typing keys and Ctrl/number
on key release for the pad.

General observation:

As I solve one of these puzzles on paper (or the on-screen scratch
pad) I find myself noting all the combinations of needs I've seen
mentioned among the algorithms proposed for solutions. However, I'm
constantly reminding myself that the program originally solved the
problem by process of elimination only!

My source code -

http://users.adelphia.net/~bjb1939/VB6_SuDoKu_Source.zip

and, for any who need it, the VB6sp5 runtime

http://users.adelphia.net/~bjb1939/VBRun60sp5.exe
</pre>
Back to top
View user's profile Send private message
Lummox JR

Joined: 07 Sep 2005
Posts: 202
:

Items
PostPosted: Fri Nov 11, 2005 2:49 am    Post subject: Reply with quote

Most success in generating grids has come from using an algorithm called dancing links, which will fully solve the grid and tell you if it has multiple solutions.

My method of generating grids works like this:

- Add givens from the set of available choices, up to a predetermined random number (somewhere from about N^2/4 to N*sqrt(N)--I haven't come up with a good metric for that yet). At each stage mark off the clues used via the dancing links algorithm, to eliminate givens that are no good (like placing another 3 in the same row).
- Run the grid through dancing links at this point. If it is unsolvable, remove clues at random and keep retesting until it is solvable.
- If the puzzle has multiple solutions, add a new given from one of the solution grids, and test again; continue until the puzzle has only one solution.
- Looping through the clues at random, try removing one at a time and retesting; if the result has multiple solutions, return the clue and move on to the next one. If any clue could be removed, repeat this step again.

With symmetry basically any addition or deletion of clues is done with a set. (Actually I find it easier to start out just adding the original clues, then filling in with symmetrical ones from the solution grid. At that point I then add or remove clues symmetrically as needed.) In crossword-style symmetry that set could be 2 clues or it could be the center clue. You can also have quad symmetry and several other kinds. I'd recommend overall that you try a technique like this without symmetry to start, and add symmetry later.

Above all it's important to have a non-human solver algorithm just to verify puzzle integrity. Dancing links is fairly easy to implement and is the fastest known solve algorithm.
Back to top
View user's profile Send private message
dukuso

Joined: 14 Jul 2005
Posts: 424
:
Location: germany

Items
PostPosted: Fri Nov 11, 2005 7:00 am    Post subject: Reply with quote

you can also just solve the empty grid with a randomized solver
and then delete clues by a randomized permutation.
Whenever the resulting puzzle has multiple solutions put that clue in again.
Very simple. But maybe a bit biased towards easy puzzles ? (not much)
Back to top
View user's profile Send private message Send e-mail Visit poster's website
Billy Joe

Joined: 10 Nov 2005
Posts: 3
:
Location: Southern California

Items
PostPosted: Fri Nov 11, 2005 4:16 pm    Post subject: Reply with quote

Thanks Lummox JR & dukuso.

I intend to pursue DLX, as I've seen it mentioned so often in just a brief reading period. However, I'm looking for ideas on how to avoid randomly generating seeds that are completely hopeless - for example 18 seeds appearing only in columns 0 and 8. I've implemented some user parameters which set max seeds per column or row and max seeds per 3x3, which improves performance overall (as I don't then attempt a solution for patterns outside these params). I'm wondering if anyone else has done any testing along these lines and may have other ideas?

BJ
Back to top
View user's profile Send private message
Ruud
Site Admin
Joined: 17 Sep 2005
Posts: 708
:
Location: Netherlands

Items
PostPosted: Fri Nov 11, 2005 5:06 pm    Post subject: Reply with quote

I've done some experiments with this type of random grid generation. To avoid these "swarms", assign priorities to possible candidates.

Priority can be based on:

a. Number of empty cells in row,
b. Number of empty cells in column,
c. Number of empty cells in 3x3 box,
d. 9 - Number of times the digit already occurs in the grid.

Using these priorities, the generator will create an evenly distribution pattern.

To avoid a strong bias, and you playing the same game each time, allow a random number to rearrange the priorities by a small amount.

Ruud.
Back to top
View user's profile Send private message Visit poster's website
Billy Joe

Joined: 10 Nov 2005
Posts: 3
:
Location: Southern California

Items
PostPosted: Sun Nov 13, 2005 12:45 am    Post subject: Reply with quote

Good advice, Ruud, thanks.

I have user parameters for max col/row and max 3x3 now. These certainly help, along with 'forced blank' for one or more row/col.

I think that my approach may be entirely different from others here (in that I generate a filled grid first, then sprinkle random seeds, then solve the seeded grid back to the filled grid).

I had been generating the grid then, for each failed seed, generating a new seed pattern. It has just occurred to me that, limiting seed patters (as you've suggested +) really means that - having found a seed pattern that is acceptable, I need to mate it with a grid - thus I should be generating new grids. I've may have been doing this backwards.

Many many thanks for the answers from each of you.

Of course, I may try this and find no improvement at all:-(

I'll be back.

BJ
Back to top
View user's profile Send private message
gaby

Joined: 02 Jul 2005
Posts: 120
:

Items
PostPosted: Tue Nov 15, 2005 10:09 am    Post subject: Reply with quote

I rather like your idea on avoiding "swarms", as you put that. I find that my generator makes so many "easy" puzzles that I've started throwing them away, as I already have 300,000 in my DB. I've limited it to rating 2 and above.

Using a random method, what sort of difficulty distribution to do you find with this method? I presume it's still biased towards the easy puzzles?
_________________
Free daily sudoku - Online puzzle database
http://vanhegan.net/sudoku/
Back to top
View user's profile Send private message Visit poster's website
Ruud
Site Admin
Joined: 17 Sep 2005
Posts: 708
:
Location: Netherlands

Items
PostPosted: Tue Nov 15, 2005 10:42 am    Post subject: Reply with quote

gaby wrote:
Using a random method, what sort of difficulty distribution to do you find with this method? I presume it's still biased towards the easy puzzles?

It mainly generates 22-26 clue puzzles, that require no more than singles and line-box interactions. When optimized, these easies turn into little monsters, requiring tabling, with about 20 clues left.

Maybe 1 in 10-20 is a "medium", requiring x-wing, uniqueness, subsets, etc. But that's not too bad for my solver. I am not in the puzzle-selling business (yet). I think that most players consider what we call "easy" already more difficult than they find in the papers.

Ruud.
Back to top
View user's profile Send private message Visit poster's website
Lummox JR

Joined: 07 Sep 2005
Posts: 202
:

Items
PostPosted: Wed Nov 16, 2005 4:29 pm    Post subject: Reply with quote

One method I use is to randomize the columns used by DLX in the solver. The reason for this is that I also use DLX for adding clues. I have it bail out at the 2nd solution, and return a random clue from the set of non-givens. (This could use some optimization of course; I'd rather have it return a random clue from the 2nd solution that also doesn't match the 1st solution.)

All my human-style solve routines use the exact cover matrix to represent the possibilities, so before using those I unscramble the columns.

As to your observations on puzzle difficulty, gaby, my generator also produces about 50% easy puzzles. The other half is scattered very widely in difficulty, from near-easy all the way up to nightmarish. I think the upper half of the distribution also has a cluster to it, around puzzles that require the use of supercoloring but nothing worse.
Back to top
View user's profile Send private message
Display posts from previous:   
Post new topic   Reply to topic    Sudoku Programmers Forum Index -> Solving sudoku All times are GMT
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum
Sudoku Programmers topic RSS feed 


Powered by phpBB © 2001, 2005 phpBB Group

Igloo Theme Version 1.0 :: Created By: Andrew Charron