Sudoku Programmers Forum Index

 
 FAQFAQ   SearchSearch   MemberlistMemberlist   UsergroupsUsergroups   RegisterRegister   ProfileProfile   Log inLog in          Games  Calendar

Log in to check your private messagesLog in to check your private messages   

Barely avoidable sets and the Holy Grail

 
Post new topic   Reply to topic    Sudoku Programmers Forum Index -> Setting sudoku
View previous topic :: View next topic  
Author Message
Lummox JR

Joined: 07 Sep 2005
Posts: 202
:

Items
PostPosted: Wed May 31, 2006 8:11 am    Post subject: Barely avoidable sets and the Holy Grail Reply with quote

I've been intrigued lately by the idea of using unavoidable sets to generate givens from a filled grid vs. just adding clue by clue until a board is valid (then removing extras, not necessarily the best ones). I haven't developed any algorithm to find such sets yet, but I have a rough idea where to start.

While pondering the topic it occurred to me that in the more difficult puzzles, I frequently encounter situations where, say, a trio of cells can have values 14, 18, 148, only to discover that the 148 often reduces to 48. There is no logical rule that says this must happen; it's an artifact of such puzzles. In every case however the set can be broken because it is not self-contained; something else intersects it and forces a value, collapsing the set.

I call such sets barely avoidable, because they're a givenless segment of what could have been (and likely was) one of the key unavoidable sets in the solved grid.

It seemed to me, pondering this, that the number and complexity of the barely avoidable sets in a puzzle are the best possible indicators of its difficulty. The solve path is basically nothing more than one of the many possible iterations through the barely avoidable sets. Rather than trying to weight different iterative paths, then, why not assess the entire thing? By finding the barely avoidable sets, it might be possible to determine how their interactions with each other ultimately predict the puzzle's difficulty. (I suspect, for example, that in the toughest puzzles a group of several complex barely avoidable sets intersect each other, but hardly any simpler ones. The simpler the sets, the lower the difficulty for that portion of the puzzle. The interaction between such sets will however require more study.)

In short, I am proposing nothing less than a possible solution to the objective difficulty problem. While difficulty has many subjective factors, it's a fact that some puzzles are harder than others by all standards. This kind of set analysis might not just explain why, but tell us how objectively difficult a puzzle really is. Of course it's late, and I'm spitballing here with a brain running on fumes, so if anyone else has come up with an answer already, I'm all ears.
Back to top
View user's profile Send private message
Ruud
Site Admin
Joined: 17 Sep 2005
Posts: 708
:
Location: Netherlands

Items
PostPosted: Thu Jun 01, 2006 5:04 pm    Post subject: Reply with quote

Hi,

I've been thinking about that method of generating puzzles, but it seems like finding all unavoidable sets (or as many as possible) would take more time than just plugging in clues. I can imagine that the smaller sets (4 & 6) are easy to locate and can aid in selecting clues.

How did you plan to locate the larger sets?

cheers,
Ruud.
Back to top
View user's profile Send private message Visit poster's website
Lummox JR

Joined: 07 Sep 2005
Posts: 202
:

Items
PostPosted: Thu Jun 01, 2006 10:48 pm    Post subject: Reply with quote

Well, my plan to locate sets is to use an algorithm like this:

- For each pair/trio/quad of digits,
- Test each box, in turn, for which an unavoidable set was not found.
- Add the box to the list of boxes in the set and add its columns and rows (the ones containing the target digits) to the column/row lists.
- Proliferate columns, rows, and boxes by finding places where a digit in the target set is not covered by a marked box, column, and row.
- If a set was found covering less than all 9 boxes, continue with any unmarked boxes to find more sets for these digits.

Each set based on digits could be marked with bit flags: One for digits, one for boxes involved.

Every puzzle will have a minimum of 3618 unavoidable sets:
- In each band, any two columns or rows must contain at least one digit. (18)
- For any 2 digits, one of them must appear in the puzzle. (72+)
- For any 3 digits, two of them must appear in the puzzle. (504+)
- For any 4 digits, three of them must appear in the puzzle. (3024+)

It is sufficient for any one given to collapse the 3- and 4-digit sets, because the 2-digit sets should handle all other cases. For the digit-based sets I marked their number with a +, because one or more minimal unavoidable sets may comprise each. Each of the digit-based sets will have the following properties:

- It will cover D digits.
- It will cover S boxes, such that S>=D.
- It will cover exactly S columns and exactly S rows.

Unique loops are a 2-digit version of this concept, occurring when S>2. Unique rectangles appear when S=2.

(Edit: I believe that for D>2, any set such that D+S>N can be skipped, since that means S=9 and those sets are totally redundant for D>2. When D>2 we're only interested in the smaller sets. Therefore, the actual number of unavoidable sets may be significantly lower, only a minimum of 90.)

Because identifying a set based on its digits and either its boxes, columns, or rows will identify it uniquely, this means each set can be represented by two binary numbers. Only the 18 line-based sets work differently.

The only side effect I can think of to this generation method is that unless the generator is allowed to be a little bit random, it will always tend to fill in givens in a predictable pattern. Obviously you can make it randomly choose between two equally good candidates, but otherwise it'll follow the same path for every solved grid. And by good candidates, I mean for example that the cells touching the highest number of unavoidable sets will always be the most favored choice for finding the minimal puzzle for a filled grid. (Although I can see a flaw in that approach, in that multiple good choices might also touch the apparent best choice.)
Back to top
View user's profile Send private message
Lummox JR

Joined: 07 Sep 2005
Posts: 202
:

Items
PostPosted: Fri Jun 02, 2006 4:05 am    Post subject: Reply with quote

My previous calculations need serious updating. I forgot this works with triangular numbers, not factorials. Doh!

There are only 45+ 2-digit sets, 165+ 3-digit sets, and 495+ 4-digit sets. Those all include the trivial sets which take up all 9 boxes. Again the 3- and 4-digit sets don't need to be counted if they touch all 9 boxes because then they're redundant, so that leaves a minimum of only 63 unavoidable sets (including the 18 line sets) in any given puzzle.

Here's the output of my set finder without the line sets included, and excluding any trivial sets or the larger complements to small sets:

Code:
8 7 4|2 3 6|5 9 1
5 9 3|8 1 7|4 6 2
2 1 6|4 5 9|7 3 8
-----------------
4 6 2|7 8 3|1 5 9
3 8 7|5 9 1|2 4 6
1 5 9|6 2 4|8 7 3
-----------------
7 4 1|9 6 2|3 8 5
9 3 8|1 7 5|6 2 4
6 2 5|3 4 8|9 1 7

Unavoidable set #5
. . 4|2 . .|. . .
. . .|. . .|. . .
2 . .|4 . .|. . .
-----------------
4 . 2|. . .|. . .
. . .|. . .|. . .
. . .|. . .|. . .
-----------------
. . .|. . .|. . .
. . .|. . .|. . .
. . .|. . .|. . .

Unavoidable set #6
. . .|. . .|. . .
. . .|. . .|4 . 2
. . .|. . .|. . .
-----------------
. . .|. . .|. . .
. . .|. . .|2 4 .
. . .|. . .|. . .
-----------------
. . .|. . .|. . .
. . .|. . .|. 2 4
. . .|. . .|. . .

Unavoidable set #7
. . .|. . .|. . .
. . .|. . .|. . .
. . .|. . .|. . .
-----------------
. . .|. . .|. . .
. . .|. . .|. . .
. . .|. 2 4|. . .
-----------------
. 4 .|. . 2|. . .
. . .|. . .|. . .
. 2 .|. 4 .|. . .

Unavoidable set #9
. . .|. . .|. . .
5 . .|. 1 .|. . .
. 1 .|. 5 .|. . .
-----------------
. . .|. . .|. . .
. . .|. . .|. . .
1 5 .|. . .|. . .
-----------------
. . .|. . .|. . .
. . .|. . .|. . .
. . .|. . .|. . .

Unavoidable set #10
. . .|. . .|5 . 1
. . .|. . .|. . .
. . .|. . .|. . .
-----------------
. . .|. . .|1 5 .
. . .|. . .|. . .
. . .|. . .|. . .
-----------------
. . 1|. . .|. . 5
. . .|. . .|. . .
. . 5|. . .|. 1 .

Unavoidable set #11
. . .|. . .|. . .
. . .|. . .|. . .
. . .|. . .|. . .
-----------------
. . .|. . .|. . .
. . .|5 . 1|. . .
. . .|. . .|. . .
-----------------
. . .|. . .|. . .
. . .|1 . 5|. . .
. . .|. . .|. . .

Unavoidable set #16
. . .|. . .|. . .
. . .|. . .|. . .
2 . 6|. . .|. . .
-----------------
. 6 2|. . .|. . .
. . .|. . .|. . .
. . .|. . .|. . .
-----------------
. . .|. . .|. . .
. . .|. . .|. . .
6 2 .|. . .|. . .

Unavoidable set #17
. . .|2 . 6|. . .
. . .|. . .|. . .
. . .|. . .|. . .
-----------------
. . .|. . .|. . .
. . .|. . .|. . .
. . .|6 2 .|. . .
-----------------
. . .|. 6 2|. . .
. . .|. . .|. . .
. . .|. . .|. . .

Unavoidable set #18
. . .|. . .|. . .
. . .|. . .|. 6 2
. . .|. . .|. . .
-----------------
. . .|. . .|. . .
. . .|. . .|2 . 6
. . .|. . .|. . .
-----------------
. . .|. . .|. . .
. . .|. . .|6 2 .
. . .|. . .|. . .

Unavoidable set #19
. . .|. . .|. . .
. . 3|. . .|. 6 .
. . 6|. . .|. 3 .
-----------------
. . .|. . .|. . .
. . .|. . .|. . .
. . .|. . .|. . .
-----------------
. . .|. . .|. . .
. . .|. . .|. . .
. . .|. . .|. . .

Unavoidable set #21
. . 4|. . 6|. . .
. . .|. . .|. . .
. . 6|4 . .|. . .
-----------------
. . .|. . .|. . .
. . .|. . .|. . .
. . .|6 . 4|. . .
-----------------
. . .|. . .|. . .
. . .|. . .|. . .
. . .|. . .|. . .

Unavoidable set #22
. . .|. . .|. . .
. . .|. . .|4 6 .
. . .|. . .|. . .
-----------------
. . .|. . .|. . .
. . .|. . .|. 4 6
. . .|. . .|. . .
-----------------
. . .|. . .|. . .
. . .|. . .|6 . 4
. . .|. . .|. . .

Unavoidable set #23
. . .|. . .|. . .
. . .|. . .|. . .
. . .|. . .|. . .
-----------------
4 6 .|. . .|. . .
. . .|. . .|. . .
. . .|. . .|. . .
-----------------
. 4 .|. 6 .|. . .
. . .|. . .|. . .
6 . .|. 4 .|. . .

Unavoidable set #25
. . .|. . .|. . .
. . .|. . .|4 6 2
. . .|. . .|. . .
-----------------
. . .|. . .|. . .
. . .|. . .|2 4 6
. . .|. . .|. . .
-----------------
. . .|. . .|. . .
. . .|. . .|6 2 4
. . .|. . .|. . .
(large set)

Unavoidable set #31
. . .|. . .|. . .
. . .|. . .|. . .
. . .|. . .|. . .
-----------------
. . .|. . .|. . .
. . .|. . .|. . .
. . .|. . .|. . .
-----------------
. . .|. . .|. . .
. . .|. 7 .|. . 4
. . .|. 4 .|. . 7

Unavoidable set #35
8 . .|2 . .|. . .
. . .|8 . .|. . 2
2 . .|. . .|. . 8
-----------------
. . .|. . .|. . .
. . .|. . .|. . .
. . .|. . .|. . .
-----------------
. . .|. . .|. . .
. . .|. . .|. . .
. . .|. . .|. . .

Unavoidable set #38
. . .|. . .|. . .
. . .|. . .|. . .
. . .|. . .|. 3 8
-----------------
. . .|. . .|. . .
. . .|. . .|. . .
. . .|. . .|8 . 3
-----------------
. . .|. . .|3 8 .
. . .|. . .|. . .
. . .|. . .|. . .

Unavoidable set #41
. . .|. . .|. . .
. . .|. . .|. . .
. . .|. . .|. . .
-----------------
. . .|. . .|. . .
. . .|. . .|. . .
. . .|. . .|. . .
-----------------
. . .|. . .|. . .
. . 8|. . 5|. . .
. . 5|. . 8|. . .

Unavoidable set #44
. . .|. . .|. . .
. 9 .|. 1 .|. . .
. 1 .|. . 9|. . .
-----------------
. . .|. . .|. . .
. . .|. 9 1|. . .
. . .|. . .|. . .
-----------------
. . .|. . .|. . .
. . .|. . .|. . .
. . .|. . .|. . .

Unavoidable set #45
. . .|. . .|. 9 1
. . .|. . .|. . .
. . .|. . .|. . .
-----------------
. . .|. . .|1 . 9
. . .|. . .|. . .
. . .|. . .|. . .
-----------------
. . .|. . .|. . .
. . .|. . .|. . .
. . .|. . .|9 1 .

Unavoidable set #46
. . .|. . .|. . .
. . .|. . .|. . .
. . .|. . .|. . .
-----------------
. . .|. . .|. . .
. . .|. . .|. . .
1 . 9|. . .|. . .
-----------------
. . 1|9 . .|. . .
9 . .|1 . .|. . .
. . .|. . .|. . .

Unavoidable set #49
. . .|. . .|. . .
. . .|. . .|. . .
. . .|. . .|. . .
-----------------
. . .|. . .|. . .
. . .|. . .|. . .
. . .|. . .|. . .
-----------------
. . .|9 . .|3 . .
. . .|. . .|. . .
. . .|3 . .|9 . .

Unavoidable set #53
. . .|. . .|. . .
. . .|. . .|. . .
. . .|. . .|. . .
-----------------
. . .|. . .|. . .
. . .|. . .|. . .
. . .|. . .|. . .
-----------------
. . .|. . .|. . .
9 . .|. . .|6 . .
6 . .|. . .|9 . .

I deliberately chose a run with a nontrivial 3-digit unavoidable set to show that they could be found. They are somewhat uncommon, appearing in under 10% (closer to about 5%) of the random grids I tested.

To properly use unavoidable sets to set givens in a puzzle, you need:

- All 18 of the line sets
- All minimal 2-digit unavoidable sets
- All nontrivial 3- or 4-digit sets

In a limited time testing I have yet to find a 4-digit nontrivial set. This jibes with my analysis, since if 3-digit nontrivial sets are uncommon, 4-digit sets must be downright rare. In well over 1000 grids I didn't find one.
Back to top
View user's profile Send private message
Lummox JR

Joined: 07 Sep 2005
Posts: 202
:

Items
PostPosted: Sat Jun 03, 2006 3:53 pm    Post subject: Reply with quote

Okay, it appears I was way off with my estimation of unavoidable sets. It seems that such sets can be far more complicated than the simple type I've been looking for. I suppose it didn't help that there appear to be no good resources about the types of unavoidable sets that exist. I still think there has to be a better way to find them than by exhaustive search though. Right now the only tried-and-true method seems to be:

1) Blank out a few boxes.
2) Recurse with a backtracking solver like DLX to find cells that are the same in all solutions.
3) Look for ways to split the remaining set into smaller sets.

Mind you this comes from looking at the output of the unavoid program, not from actually studying its behavior. I noticed the sets seem to be clustered by boxes. If that program doesn't use the behavior I described, then I think it's possible to find such sets in a much simpler way. Presumably DLX would be very fast at finding the large non-minimal sets because most of the grid is filled in, unless you get into territory covering 5 boxes or more. It would also be a very fast way to split existing compound sets, since a set of size S would only require O(S) to find a potential split and O(S log S) to completely minimize the sets.

One question that arises from me here is whether the larger complex sets have complements, as the smaller simple ones do. A 4-cell set always has a 14-cell complement with the same digits.

One thing of interest that I did find was that it's possible to reduce a non-minimal unavoidable set to a minimal one by various tests, including the uniqueness tests. A uniqueness test will either collapse the set entirely by removing all solutions, or will reduce the number of solutions, or will simply remove unnecessary candidates and lead you closer to removing some cells from the set. It should not, as I've hypothesized before, reduce a set to a single solution; my results seem to back that up.
Back to top
View user's profile Send private message
Moschopulus

Joined: 12 Aug 2005
Posts: 39
:

Items
PostPosted: Fri Jun 09, 2006 9:59 am    Post subject: Re: Barely avoidable sets and the Holy Grail Reply with quote

Lummox JR wrote:
I've been intrigued lately by the idea of using unavoidable sets to generate givens from a filled grid vs. just adding clue by clue until a board is valid (then removing extras, not necessarily the best ones).


That is essentially what checker does. Say you want a puzzle with 20 clues. Checker will generate a maximal collection of disjoint unavoidable sets, choose one clue from each of these sets, and then add in as many clues as necessary to get up to 20 clues.
Typically there might be 11 disjoint unavoidable sets, so checker would choose one clue from each and then another 9.

I should say, checker was not really written to generate puzzles but to search a completed grid for a 16. Using it to generate puzzles of a certain difficulty is interesting, I never thought about that.

Some discussion here about classification of unavoidable sets.

The program unavoid was updated based on that thread, and on Red Ed's computations. Unavoid now finds ALL unavoidable sets of size up to 12, and some of size 13 and 14. This typically takes about 2 minutes. I figured that was reasonable, the time grows exponentially as you include larger sets. (The option -uquick finds only some sets but is much quicker)

You probably didn't want to know any of this. I don't know if I can say anything to help! Source code for checker and unavoid is all there, feel free to modify.
Back to top
View user's profile Send private message Visit poster's website
Lummox JR

Joined: 07 Sep 2005
Posts: 202
:

Items
PostPosted: Fri Jun 16, 2006 4:31 am    Post subject: Reply with quote

That's a pretty interesting thread, although I'm still left wondering how unavoid manages to find those sets in the first place. My own algorithm isn't optimal.
Back to top
View user's profile Send private message
Moschopulus

Joined: 12 Aug 2005
Posts: 39
:

Items
PostPosted: Fri Jun 16, 2006 12:37 pm    Post subject: Reply with quote

Lummox JR wrote:
That's a pretty interesting thread, although I'm still left wondering how unavoid manages to find those sets in the first place.



By a "prototype" I mean an unavoidable set up to equivalence. There is 1 prototype of size 4, 4 prototypes of size 6, 9 of size 8, 3 of size 9, and so on. The numbers are in that thread, found by Red Ed. (You can do up to size 9 by hand, they are listed in the thread.) We got the list of prototypes from Red Ed and put them into unavoid. Then unavoid does all possible equivalences on the prototypes to find all unavoidables.
Back to top
View user's profile Send private message Visit poster's website
Display posts from previous:   
Post new topic   Reply to topic    Sudoku Programmers Forum Index -> Setting sudoku All times are GMT
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum
Sudoku Programmers topic RSS feed 


Powered by phpBB © 2001, 2005 phpBB Group

Igloo Theme Version 1.0 :: Created By: Andrew Charron