Sudoku Programmers Forum Index

 
 FAQFAQ   SearchSearch   MemberlistMemberlist   UsergroupsUsergroups   RegisterRegister   ProfileProfile   Log inLog in          Games  Calendar

Log in to check your private messagesLog in to check your private messages   

Sudoku truths from Sudoku Assistant

 
Post new topic   Reply to topic    Sudoku Programmers Forum Index -> Solving sudoku
View previous topic :: View next topic  
Author Message
Bob Hanson

Joined: 05 Oct 2005
Posts: 187
:
Location: St. Olaf College

Items
PostPosted: Sat Nov 05, 2005 1:13 pm    Post subject: Sudoku truths from Sudoku Assistant Reply with quote

Sudoku Assistant

http://www.stolaf.edu/people/hansonr/sudoku

This web-based JavaScript application now clearly demonstrates that:

--There is absolutely no difference in using forward vs. reverse logic.

--There is no particular need for "bifurcation".

--There is no particular need for "uniqueness tests".

--There is no particular need for "block row/column exclusion".

--X-wings, swordfish, hidden and naked ntuples are simply
different manifestations of the same thing -- namely, 2D subsets.

--Subset analysis IS required for general Sudoku solution using either
forward-directed ("This cell cannot be eliminated.")
or reverse ("This cell can be eliminated.") logic and is, in fact,
a form of logical bifurcation.

--There is no such thing as a "magic cell" except in the context of
whatever algorithm you happen to be using.

--The key to transcending "force chain logic" is to recognize when all
except ONE of a particular row, column, block, or cell are accounted
for. In that case, in a forward logic sense, if N-1 are FALSE the last
can be assigned TRUE, and we are off and running again. This is the
key component that removes the need for bifurcation.

--When doing this analysis, one need not test every node ("mark").
One simply needs to focus on the weak nodes and their associated
strong chains.

--The "Medusa" idea (3D strong chains) is simply a method of speeding
the overall solution process so that only two nodes in any particular
strong chain has to be tested, and so that nodes not directly
associated with a strong chain can be identified and ignored.

--recursive programming techniques fit such analyses well.

While it has only solved about 1500 puzzles, I cannot seem to find a puzzle the Sudoku Assistant cannot solve and am reasonably confident that the full 3D strong chain+weakly associated node+hypothesis is
in fact a complete system for solving nxn Sudoku.

I would very much like to find a puzzle that the Sudoku Assistant cannot solve, but I'm a bit skeptical now of that possibility.

Thanks again for all the great input from this forum.
_________________
Bob Hanson
Professor of Chemistry
St. Olaf College
Northfield, MN
http://www.stolaf.edu/people/hansonr
Back to top
View user's profile Send private message Send e-mail Visit poster's website
dukuso

Joined: 14 Jul 2005
Posts: 424
:
Location: germany

Items
PostPosted: Sat Nov 05, 2005 1:36 pm    Post subject: Reply with quote

but if you want it to be taken serious, you'll have to
talk about its speed earlier or later.
I mean, without speed constraints I can do almost anything....
Back to top
View user's profile Send private message Send e-mail Visit poster's website
xyzzy

Joined: 24 Aug 2005
Posts: 80
:

Items
PostPosted: Sat Nov 05, 2005 4:18 pm    Post subject: Reply with quote

But we already know trial and error can find all solutions without need for other techniques....
Back to top
View user's profile Send private message Visit poster's website
Bob Hanson

Joined: 05 Oct 2005
Posts: 187
:
Location: St. Olaf College

Items
PostPosted: Sat Nov 05, 2005 8:22 pm    Post subject: Reply with quote

sure. That's for the speed demons to work on. If this is trial and error, then so be it. I like the "reverse" logic because it is trial but no "error". It hones in on a specific proof. As with all proofs, there are generally different ways to get to the same solution. What it does NOT do is randomly select nodes, run the puzzle, and see what happens. It's very systematic, and it specifically chooses (a) just the weak nodes and (b) the strong chains as just TWO possibilities -- I think that's probably new here -- the consideration of all members of a chain or linked strong chain system as one "object" for testing. I know that when I turned that off, it went dreadfully slowly. So I know that it's important for speed. I think anyone who is using forced chain logic should consider doing this.

But the value is not that it can solve the puzzles fast. The value is that it is UNDERSTANDABLE and APPLICABLE to essentially any explanation desired. Show me a neat human device for solving Sudoku, and I'll show you the logical equivalent that explains why it works. THAT I think is the value here -- in the explanation of various devices -- XY-wings, XYZ-wings, I suspect in uniqueness tests, but I haven't seen an actual implementation of that yet.

As for trial and error. This is much more fundamental than trial and error. Or, to put it another way, this provides the rules for trial and error.

My understanding of trial and error -- correct me if I am wrong -- is that you pick a value, run with it until you either get stuck or there is an error. (So far the same, in a way.) If you get an error, you know something. If you don't get an error, do you then now try another value starting at that interim point, branching out as needed to try this and that? If so, then this is NOT trial and error. Each "trial" is done once. There is no branching, no further testing. Explore the consequences and go on.

Speed. I'm not interested in testing this. My suspicion is that if I can do this in the slowest language ever written -- JavaScript -- then if someone implemented this in C++, it would really fly. I could be wrong. Maybe just guessing is the way to go there.

What I'm interested in now is how the technique can be used to explain truly human devices. I think it has lots of potential there as an interactive learning tool.


In addition, it's interesting that all this can be displayed in simple -- ok, maybe not so SIMPLE -- table format that anyone familiar with such can understand if it's not TOO complex an example. I've highlighted some of the monsters, but most only go to AaBbCcDd or so, and they are perfectly understandable when you see the table format and read through the few lines of logical output.
_________________
Bob Hanson
Professor of Chemistry
St. Olaf College
Northfield, MN
http://www.stolaf.edu/people/hansonr
Back to top
View user's profile Send private message Send e-mail Visit poster's website
xyzzy

Joined: 24 Aug 2005
Posts: 80
:

Items
PostPosted: Sun Nov 06, 2005 1:07 am    Post subject: Reply with quote

Bob Hanson wrote:
sure. That's for the speed demons to work on. If this is trial and error, then so be it. I like the "reverse" logic because it is trial but no "error". It hones in on a specific proof.

But that's just the words you're using.

forward: Make an assumption X, find possibilities to 'mark', if all possibilities in a grouping are marked you have an 'error'. The error tells you X is not true.

reverse: Make an assumption X, find (the same) possibilities to 'mark', if all possibilities in a grouping are marked you have a 'proof'. The 'proof' tells you that not X is true.

So one way you prove X is false and the other you prove not X is true. You prove the same thing by marking the same cells in the same order from the same assumption. The only difference is the word you're using to describe your marking.

Bob Hanson wrote:

My understanding of trial and error -- correct me if I am wrong -- is that you pick a value, run with it until you either get stuck or there is an error. (So far the same, in a way.) If you get an error, you know something. If you don't get an error, do you then now try another value starting at that interim point, branching out as needed to try this and that? If so, then this is NOT trial and error. Each "trial" is done once. There is no branching, no further testing. Explore the consequences and go on.


That is only the most obvious and naive way of implementing trial and error.

First off, you don't have to consider every single possibility. Just a set of possibilities where at least one must be correct. For instance, you could pick one cell and assume in turn that each possible digit for the cell was true. You don't have to make any assumptions about other cells, just that one. You would probably not pick a cell at random, but pick one that has just two possibilities, so they are fewer digits to try.

Another way to guess would be to assuming two possibilities in the same subset are false. You only have to look at two, as either one or both of them must in fact be false, so one (or more) of your guesses must be correct.

Another way would be to pick one possibility and assume it's true, if not, then assume it's false. It's either got to be true or false, so you know one of the two assumptions is correct.

You also don't have recurse if your assumption doesn't lead to the correct answer or a contradiction. If you do, that's called a depth first search. But you can do a breadth first search. You make an assumption and find what you can after making it until you get stuck. Then you remember the state you're at, or if you could say you remember the chain of implications you discovered after the assumption, it's the same thing. Now you go back to where you started and pick a new assumption, save the implications you can make, and so on.

When you have enough assumptions that one must be correct (and two is enough depending on how you choose them), then you can go back to those saved states and make more assumptions. You can also compare the saved states, or implication lists if you call them that, and any common implications must be true.

Bob Hanson wrote:

Speed. I'm not interested in testing this. My suspicion is that if I can do this in the slowest language ever written -- JavaScript -- then if someone implemented this in C++, it would really fly. I could be wrong. Maybe just guessing is the way to go there.

As the author of the fastest sudoku solving program of anyone on this board, I think I have skill in knowing what a fast algorithm is. This algorithm sounds very slow to me.
Back to top
View user's profile Send private message Visit poster's website
Bob Hanson

Joined: 05 Oct 2005
Posts: 187
:
Location: St. Olaf College

Items
PostPosted: Sun Nov 06, 2005 12:00 pm    Post subject: Reply with quote

Quote:


forward: Make an assumption X, find possibilities to 'mark', if all possibilities in a grouping are marked you have an 'error'. The error tells you X is not true.

reverse: Make an assumption X, find (the same) possibilities to 'mark', if all possibilities in a grouping are marked you have a 'proof'. The 'proof' tells you that not X is true.


This is not correct. It should read:

forward: Make an assumption X. Follow the logical consequences of this assumption. If a logical inconsistency is found, you know that X is not true.

reverse: Make an assumption X. Follow the logical consequences of this assumption, looking for proof that X is true.

I think that's a significant difference. In the reverse method, no logical inconsistencies can be found. It is strictly, "Assume X. Try to prove that X is true."

Trial and error: I think it's clear from all those examples that trial and error as generally implemented is based on forward logic. The whole idea is to follow the consequences of an assumption and either (a) find a logical inconsistency or (b) solve the puzzle (i.e. find a magic cell). Is that fair enough to say?

I think what Sudoku Assistant shows is that one could implement the current methods of trial and error in a different way, focusing on logical proof rather than logical disproof -- reverse rather than forward. But, it also shows that this different way is really identical, so why bother.

Quote:

You would probably not pick a cell at random, but pick one that has just two possibilities, so they are fewer digits to try.


Right, that is, choose one of the strong chain nodes of the type AB. (One could equally well start with a node of the A-----A type, only two occurances of the same number in the same row, cell, or block). These are equivalent from a solving sense, just two different dimensions of the same thing.

Quote:


Another way to guess would be to assuming two possibilities in the same subset are false. You only have to look at two, as either one or both of them must in fact be false, so one (or more) of your guesses must be correct.


This sounds pretty random. You don't know if you have selecting two strong nodes of the same chain (not productive), two of different chains (possibly inherently inconsistent already), one weak and one a strong node (possibly inherently inconsistent already), a nonassociated node (of no consequence in the first place), etc.

But, I'm sure in certain cases you can get lucky and solve the puzzle this way. And it's quite possible that identifying the node relationships is slower than just going after each one independently.

Quote:


Another way would be to pick one possibility and assume it's true, if not, then assume it's false. It's either got to be true or false, so you know one of the two assumptions is correct.

Again, it seems to me valuable to know what sort of node you are working with. Doing this with a strong node is what Sudoku Assistant does first. Maybe that's not fast, but why say it's slow?

But doing this with a weak node is just wasting time. Setting a weak node FALSE and then using forward logic gets you nowhere. Setting it TRUE can find a logical inconsistency if forward logic is used. Setting a nonassociated node FALSE does nothing in the forward sense; setting it TRUE could have an effect on its own, but more likely will just set a few weakly associated nodes FALSE (in the forward sense here) and that does nothing much.

Quote:


You also don't have recurse if your assumption doesn't lead to the correct answer or a contradiction. If you do, that's called a depth first search. But you can do a breadth first search.


Sounds reasonable. The Sudoku Assistant implementation is breadth only. I'm sure that reduces its speed factor.


Speed: Speed is a worthy goal. I am honored to be in the virtual presence of the maker of the fastest solver in the world. That is truly an accomplishment. I mean that. No sarcasm here. I am a teacher. I'm not the fastest teacher. Speed in teaching is, of course, not generally a value. So I'm just in general less concerned with speed, by my very nature.

Clearly there are very clever tricks that fast solvers can use to select specific nodes to start with. What is described, for example -- selecting cells with only two values -- is simply selecting nodes of strong chains first. That's precisely what Sudoku Assistant does as well. I suspect that the next step -- selecting nodes that are in cells involving more than just two values -- is a mix of weakly associated nodes and simply nonassociated nodes.

I would THINK -- but I don't know -- that there might be an advantage in first taking the time to find out that there are only, say, 20 independent sets of these nodes -- 20 strong chains, so that only 40 tests need to be made, not the 150 or so that would be required if EACH such node were tested. In addition, one learns from this quick(?) analysis that there are certain nodes that need not be tested -- in a breadth-based strategy-- because setting their state has no particular logical consequence in the current solution state (they aren't strong chain nodes or weakly associated nodes).

Now, it's possible that this cataloguing -- this identification of strong nodes, weakly associated nodes, and nonassociated nodes -- in the end, on average, is slower than what is currently implemented by the fastest solver. I wouldn't know. If the author of the fastest solver in the world says it won't help, it probably won't help. I'm not here to enter that race; but someone else might like to give it a try.





[/quote]
_________________
Bob Hanson
Professor of Chemistry
St. Olaf College
Northfield, MN
http://www.stolaf.edu/people/hansonr
Back to top
View user's profile Send private message Send e-mail Visit poster's website
dukuso

Joined: 14 Jul 2005
Posts: 424
:
Location: germany

Items
PostPosted: Sun Nov 06, 2005 1:29 pm    Post subject: Reply with quote

another indication of "trial and error" could be whether
you get the same timing, same technics when you
solve an equivalent transformation of the puzzle.
Back to top
View user's profile Send private message Send e-mail Visit poster's website
xyzzy

Joined: 24 Aug 2005
Posts: 80
:

Items
PostPosted: Sun Nov 06, 2005 2:00 pm    Post subject: Reply with quote

Bob Hanson wrote:

forward: Make an assumption X. Follow the logical consequences of this assumption. If a logical inconsistency is found, you know that X is not true.

reverse: Make an assumption X. Follow the logical consequences of this assumption, looking for proof that X is true.


But you're just playing word games. The assumption you try to "prove" in reverse logic is different than the one you are making in forward logic. The key thing to realize is that trying to prove the reverse assumption and following the implications of the forward assumption are the same thing. You found the same nodes to mark, using the same method. The only difference is the words you used to justify the same actions. Compare:

  • When a node is "assumed true", all other nodes in its subsets can be marked "eliminated".
  • When a node is "hypothesized false", all other nodes in its subsets can be marked as "proving the hypothesis".
  • When a node is marked "ESLAF", all other nodes in its substes can be marked "EURT".

It's the exact same thing, you just use different words. Since the rules the words follow are the exactly the same, it doesn't matter whether you say "eliminated", "proving the hypothesis" or "EURT".

Quote:

Trial and error: I think it's clear from all those examples that trial and error as generally implemented is based on forward logic. The whole idea is to follow the consequences of an assumption and either (a) find a logical inconsistency or (b) solve the puzzle (i.e. find a magic cell). Is that fair enough to say?

Put it like this: You pick a node, and by following certain rules this allows more nodes to be marked. If certain conditions are satisified we learn if the original node is present in the solution or not. Whether you call it forward or reverse makes no difference. It's still the same rules and same conditions.

Quote:

But, I'm sure in certain cases you can get lucky and solve the puzzle this way. And it's quite possible that identifying the node relationships is slower than just going after each one independently.

Think of it like this, what is the point of finding ALL the node relationships if you might not use some of them? Why not just find the relationships as you need them, so you don't waste time finding relationships you never use?

Quote:
Quote:

Another way would be to pick one possibility and assume it's true, if not, then assume it's false. It's either got to be true or false, so you know one of the two assumptions is correct.

Again, it seems to me valuable to know what sort of node you are working with. Doing this with a strong node is what Sudoku Assistant does first. Maybe that's not fast, but why say it's slow?

But how do you find that a node is a strong node? Your list of chains isn't free, you had to calculate it for every single node. If I choose a node and then figure out it doesn't help, isn't that the same thing as figuring out a node is't part of a chain and not choosing it?

Quote:

But doing this with a weak node is just wasting time. Setting a weak node FALSE and then using forward logic gets you nowhere.


Setting a weak node false could reveal a locked tuple, or a uniqueness test, or a box intersection. But why choose a weak node? If I have kept track of the possible nodes in each subset, all I have to do is find a subset of size 2 and then I know I've not a strong node.
Back to top
View user's profile Send private message Visit poster's website
xyzzy

Joined: 24 Aug 2005
Posts: 80
:

Items
PostPosted: Sun Nov 06, 2005 2:13 pm    Post subject: Reply with quote

dukuso wrote:
another indication of "trial and error" could be whether
you get the same timing, same technics when you
solve an equivalent transformation of the puzzle.


Maybe for "random trial" and error. But an intelligent implementation will have some heuristic for determining which trials to make. Since equivalent transformations have no effect on what the best trial is, a good heuristic for finding the best trial would be largely immune to such transformations.

On the other hand, a solver that just uses techniques like searching for singles, locked tuples and so on will usually just search in order. Most implementations of locked tuples just look in row 1, then row 2, then row 3 and so on. On a transformed sudoku they will effectively look in a totally different order, and find different locked tuples first.
Back to top
View user's profile Send private message Visit poster's website
Bob Hanson

Joined: 05 Oct 2005
Posts: 187
:
Location: St. Olaf College

Items
PostPosted: Sun Nov 06, 2005 2:18 pm    Post subject: Reply with quote

As I've said, the forward and reverse techniques lead to an identical result. I think we are in agreement there. It is simply interesting (to me at least) that you can work the logic both ways, and think of finding a logical inconsistency as finding a "verity" (I think it is called). I'm convinced the forward direction is easier to grasp and to see. When the reverse method was introduced here recently, in the context of bifurcation, it was thought it might be something new. My result of doing both in Sudoku Assistant demonstrates clearly that they produce identical results. The logic is precisely, as you say, the same.

forward logic ---> logical inconsistencies
forward logic + bifurcation --> verities

reverse logic ---> verities
reverse logic + bifurcation --> logical inconsistencies

That's all I'm saying.

Essentially, I'm done with the "reverse logic" idea now. I like the forward idea better EXCEPT for the minor play with words that you have to find "errors" that way. Then it "sounds" more like "trial and error", which some people seem to disparage. If someone says, "Well, your forcing chain idea is just trial and error," you can simply say that it can also be thought of strictly in terms of finding "verities" as well, using Nick70s "reverse logic" technique. Then there are "trials" but no "errors". As you say, it's all the same.
_________________
Bob Hanson
Professor of Chemistry
St. Olaf College
Northfield, MN
http://www.stolaf.edu/people/hansonr
Back to top
View user's profile Send private message Send e-mail Visit poster's website
Bob Hanson

Joined: 05 Oct 2005
Posts: 187
:
Location: St. Olaf College

Items
PostPosted: Sun Nov 06, 2005 5:40 pm    Post subject: Reply with quote

sorry about the double post -- I didn't have time earlier to finish.


Quote:
Think of it like this, what is the point of finding ALL the node relationships if you might not use some of them? Why not just find the relationships as you need them, so you don't waste time finding relationships you never use?


Well, that's an interesting and unaswered question. I could ask: Why bother checking all those nodes of a strong chain because you didn't figure out ahead of time that all those nodes were in the same set and all you had to check was one? Check one and you are done. It's not clear to me that on average there is anything gained or lost by cataloguing the node relationships. By doing so, you might avoid checking half the nodes.

But there's another reason: Doing so adds value to the solution by making it humanly interesting and ties in to other methods, such as "x-cycles" and "y-cycles" and such. The value is in understanding the solution, not just arriving at one. In addition, I suspect that one could use it as a basis for novel human-solving techniques.

Quote:
But how do you find that a node is a strong node? Your list of chains isn't free, you had to calculate it for every single node. If I choose a node and then figure out it doesn't help, isn't that the same thing as figuring out a node is't part of a chain and not choosing it?


OK, this relates to the basic Medusa strategy, not the logical forward/reverse business. Strong nodes are extremely easy and quick to spot. Putting them together into chains often finds an elimination without ever going any further. Weak nodes are a bit more involved, but nothing more than scanning rows, columns and blocks -- 20-24 checks in each case. But in both of these cases the construction of the catalogue usually finds an elimination, and no additional forward/reverse logic is involved. It's only in extraordinary cases where any logical implications have to actually be followed. This involves, for example, only about 1/3 of the top95 set.

Maybe you were thinking that all this strong/weak stuff was done first, and only then advancements are looked for. I'm sorry if I gave that impression. YES, that would be EXTREMELY slow. You can check that out by clicking the hypothesis "only" option at

http://www.stolaf.edu/people/hansonr/sudoku

Yes, that's VERY slow.

So there is a direct benefit of cataloging. I'm actually quite surprised you are not doing it already in your fast solver.

If you "pick a node and figure out it doesn't help" you have effectively mapped out its strong chain/weak link relationships. I guess if you are then using that map for the next step, you are essentially doing what I am doing. My "chains" are your information that you carry on to the next check, I suppose.

How do you "choose a node and then figure out it doesn't help?" Surely that takes some time. Again, I suggest it's an open question, and I'm not that interested in proving that one implementation is or is not faster than another. I'm not suggesting Sudokus should NOT be solved by trial and error or by whatever means you devise. Go for it, I say. I just find it interesting that once simple subset elimination is taken care of, either forward or reverse logical analysis does the trick in every case.

And I'm happy to report that there is a very fast web-based Sudoku solver now that anyone can use and probably understand. No, it's not the way to solve 200M puzzles. It's not that fast. But no one has shown me a puzzle yet it can't solve. I hypothesize (only) that if written in C++ this would be VERY fast. I could be wrong.

Quote:

Setting a weak node false could reveal a locked tuple, or a uniqueness test, or a box intersection.


Sure, and that could be a short cut. But the check for those costs time as well and isn't necessary in general. So everything's a tradeoff.
_________________
Bob Hanson
Professor of Chemistry
St. Olaf College
Northfield, MN
http://www.stolaf.edu/people/hansonr
Back to top
View user's profile Send private message Send e-mail Visit poster's website
Ruud
Site Admin
Joined: 17 Sep 2005
Posts: 708
:
Location: Netherlands

Items
PostPosted: Sun Nov 06, 2005 6:49 pm    Post subject: Reply with quote

Here's an approach to saving strong chains "on the fly":

Each time a candidate is eliminated, you probably want to check if that elimination caused a single. For this you need to maintain counters for # candidates left in the cell, and for # cells available for that digit in the row, column and box the cell belongs to.

When any of these counters reach 0, the puzzle has no solution.

When any of these counters reach 1, you found a single, and you can eliminate all competing candidates on the >1 counters.

When any of these counters reach 2, you found a strong link. If one of the 2 linked candidates already has a chain-ID, assign it to the other one. If none of them has a chain-ID, create a new one. If both have a different chain-ID, replace one of them, effectively combining them into one large chain.

The previous is probably something you already do in the Medusa routine, so it would not be to big a change to move that to "candidate elimination" routine.

It may look like all this maintenance would decrease your performance, but it is only executed when any counter reaches 2, which can happen only 256 times in a sudoku with 17 clues where every other cell has 2 candidates left, and each digit has 2 cells left in each house (I would be very interested to see that happen)

Linked lists in stead of counters are even better, but a little more complicated to implement.

Ruud.
Back to top
View user's profile Send private message Visit poster's website
Bob Hanson

Joined: 05 Oct 2005
Posts: 187
:
Location: St. Olaf College

Items
PostPosted: Sun Nov 06, 2005 7:16 pm    Post subject: Reply with quote

Ruud,

Quote:
When any of these counters reach 2, you found a strong link. If one of the 2 linked candidates already has a chain-ID, assign it to the other one. If none of them has a chain-ID, create a new one. If both have a different chain-ID, replace one of them, effectively combining them into one large chain.


OK, I'd like to not refer to "strong link". When that counter reaches 2, you have what I'd call a "strong edge" -- a pair of strong nodes. (One node per mark.) That only considers the "vertical stack" strong edges confined to a single cell. The other strong edges are in the "horizontal plane" of the board -- rows, columns, and blocks.

In my implementation I'm recording several binary arrays 2^0 -- 2^8. When the "bit count" in these arrays = 2, then you have a strong edge.

For example, if cell[0][0] = {1,2,3} then I set

Possible.xCell[0][0]= 2^0+2^1+2^2 = 7.
xNum(Possible.xCell[0][0])=3 here.

xNum(Possible.xCell[][])==2 -- your sort of strong edge
xNum(Possible.xBlock)==2 -- two of same candidate value in a block
xNum(Possible.xRow)==2 -- two of same candidate value in a row
xNum(Possible.xCol)==2 -- two of same candidate value in a column

Each strong edges has two nodes. So linking these edges into chains simply involves building essentially your linked lists.

It's the ORing, ANDing, and XORing of these arrays that identifies all hidden/naked subsets and all x-wings and such.

The "Medusa" idea simply allows a "12" type edge to be connected to a 1----1 type edge so that in, for example:

[12 23 123] [.....] [...]

we have

strong chain 2---1---1
strong chain 2---3---3
weak link "2" in that third cell.

Like this in "3D":

Code:

1-------1
|
2   2   2(w)
    |
    3---3


We can check 2 parities of 2 chains plus one test on 2(w) for a complete solution, instead of 7. These chains are short; real puzzles can have very long chains, for greater savings.

I have a feeling xyzzy is using something akin to this in his/her very fast solver. It would be interesting to compare, for sure.

xyzzy, what's the time to beat for top95 and "impossible 520"?

[/code]
_________________
Bob Hanson
Professor of Chemistry
St. Olaf College
Northfield, MN
http://www.stolaf.edu/people/hansonr
Back to top
View user's profile Send private message Send e-mail Visit poster's website
Bob Hanson

Joined: 05 Oct 2005
Posts: 187
:
Location: St. Olaf College

Items
PostPosted: Mon Nov 14, 2005 9:14 pm    Post subject: logical chain analysis, bifurcation, dead ends, breadth Reply with quote

Dead End, but not really: Demonstration that any logical chain analysis is just trial and error and that "bifurcation" is just the beginning

See also: http://www.stolaf.edu/people/hansonr/sudoku/deadend.htm

The purpose of this posting is to demonstrate that

a) logical chain analysis including a step for checking for hidden singles does solve *just about* any 9x9 Sudoku.

b) "bifurcation" tabling logic, as presented in http://www.setbb.com/phpbb/viewtopic.php?t=300&mforum=sudoku
is equivalent to a "second pass" depth-search trial and error strategy.


Shown below is the configuration part way through the Sudoku Assistant analysis
of a puzzle from the top95 collection:
Code:

920300008006408090830500000000200080709005000002006004008000010001000402200700809

92.|3..|..8
..6|4.8|.9.
83.|5..|...
---+---+---
...|2..|98.
7.9|1.5|...
..2|9.6|..4
---+---+---
4.8|6..|.1.
..1|85.|4.2
2..|7..|8.9


Hypothesis and Disproof -- Looking for a logical inconsistency -- one of the "dead ends":

   |---c1--|---c2--|---c3--||---c4--|---c5--|---c6--||---c7--|---c8--|---c9--
-----------------------------------------------------------------------------
r1 |     9 |     2 |     7 ||     3 |     6 |     1 ||     5 |     4 |     8
   |       |       |       ||       |       |       ||       |       |       
---+-------+-------+-------||-------+-------+-------||-------+-------+-------
r2 |    15 |    15 |     6 ||     4 |    27 |     8 ||   237 |     9 |    37
   |       |       |       ||       |       |       ||       |       |       
---+-------+-------+-------||-------+-------+-------||-------+-------+-------
r3 |     8 |     3 |     4 ||     5 |   279 |   279 ||  1267 |   267 |   167
   |       |       |       ||       |       |       ||       |       |       
===========================||=======================||=======================
r4 | 13456 |  1456 |    35 ||     2 |  3479 |  3479 || 13679 |     8 | 13567
   |       |       |       ||       |       |       ||       |       |       
---+-------+-------+-------||-------+-------+-------||-------+-------+-------
r5 |     7 |  1468 |     9 ||    18 |   348 |     5 ||  1236 |   236 |   136
   |       |       |       ||       |       |       ||       |       |       
---+-------+-------+-------||-------+-------+-------||-------+-------+-------
r6 |   135 |   158 |     2 ||   189 |  3789 |     6 ||  1379 |   357 |     4
   |       |       |       ||       |       |       ||       |       |       
===========================||=======================||=======================
r7 |   346 |    79 |     8 ||    69 |  2359 |  2349 ||   367 |     1 |  3567
   |       |       |       ||       |       |       ||       |       |       
---+-------+-------+-------||-------+-------+-------||-------+-------+-------
r8 |   356 |    79 |     1 ||   689 |  3589 |    39 ||     4 |  3567 |     2
   |       |       |       ||       |       |       ||       |       |       
---+-------+-------+-------||-------+-------+-------||-------+-------+-------
r9 |     2 |   456 |    35 ||     7 |     1 |    34 ||     8 |    36 |     9
   |       |       |       ||       |       |       ||       |       |       
-----------------------------------------------------------------------------


This configuration is one of the "dead ends" arising from logical "hypothesis and disproof." At this point, Sudoku Assistant moves on to another hypothesis. (To get this table from the original top95 puzzle, load that puzzle, set the 3D Medusa analysis level to "+weak", and press "Solve". The solver will get stuck. Change the 3D level to "+hypothesis and disproof", check the "details" link, and then press "Step" once. Look down the list for the dead end at "Logical analysis: Chain 2(0) Dead End at k." Click the "Table" link.)

How "Hypothesis and Disproof" Works

In this analysis, you pick a "node A" (a specific possible value in a certain cell) hypothesize, "A cannot be eliminated," and see if that leads to a logical inconsistency (an "error"). If it does, then you know that the hypothesis is false, and "A" CAN be eliminated. The cycle of finding a solution is restarted. This node can be either a strong-chain node (a candidate in a cell with only two possible values or a candidate that has exactly two possible locations in a row, column, or 3x3 block) or a weakly associated node (one of a strong node's "buddies" -- in the same row, column, or block as that strong node).

If this analysis does not lead to a logical inconsistency, two outcomes are possible: Either you have found a "Magic Cell" -- the puzzle is solved -- or you have reached a "dead end."

When I started in on this hypothesis and proof business with the Sudoku Assistant, it didn't solve some of the top95 and "impossible 520" puzzles. The problem, I discovered, was that I was "missing" the idea that if N-1 members of a set of 'buddies' are found to be eliminated due to the hypothesis, then the Nth can be "set TRUE," and the process can be continued. Addition of this consideration allows for the solving of all of these puzzles. Indeed, I have yet to find a 9x9 Sudoku that can't be solved this way. Some have referred to the analysis as "logical bifurcation" when this special step is included.

"Bifurcation Logic" is the Same

An alternative route through logical analysis amounts to hypothesizing that Node "a" CAN be eliminated and then working backward to what would lead to this hypothesis being true. Sort of a "trial" without the "error." You can see this sort of analysis in action by selecting "proof" instead of "disproof" in the Sudoku Assistant. You will see that it is absolutely identical logic, just conceptualized in a slightly different way. The "bifurcation" rule in that case becomes the following:

If any one of N-1 members of a set of 'buddies' not eliminated (TRUE) would prove the hypothesis, then the Nth can be set FALSE (elimination would prove the hypothesis) and the process can be continued. To each his/her own.

(This overall idea was introduced by Nick70 in the above-mentioned forum discussion. The starting point is actually his "a", not his "A", because that is a more general starting point, but other than that, it is the same. The rule as originally stated in that discussion can be summarized as "{?TTT...}-->{FTTT...}." But there is no particular need to think of this process "in reverse" as was done there.)

{?FFF...} --> {TFFF...} Is Just the Beginning

But it turns out that there isn't anything special about adding this particular consideration. Really all it is is just starting over with the human solving stage -- looking for hidden singles. ("If you can't put 5 anywhere else, then it must go HERE....") We could have continued past this "logical dead end" looking for block row/column exclusions ("locked" cells), subsets, X-wings, chain analysis --- the works.

So, for example, if you look at the "dead end" shown above, it isn't really a dead end. The 5 in r8c1 can be eliminated. That's because the 5 in this block is "locked" into being in row 9 by the absence of 5s in row 9, columns 4-9. If you load this configuration and try to solve it, you will see that several other eliminations continue from there, ultimately leading to a logical inconsistency. Thus, node r4c6#4 can be eliminated because 4 must be in the third column of block 8. Node r8c4#6 can be eliminated by a nifty XY-Wing involving the three duples at r8c1 (just 36, once the 5 is gone), r8c6, and r7c4. Etc., etc. Clearly this is only a dead end because we have chosen to give up at this point.

The Beginning is Just Trial and Error Depth Search

If this sounds a lot like just solving a Sudoku, well, it is. We're just "starting over" with this particular configuration arising from our trial hypothesis. Continuing all the way through block row/column exclusion, subset elimination, grid analsysis, and chain analysis to an additional round of hypothesis and (dis)proof amounts at some point to a "depth-based" trial and error search with added "human" short-cuts. (An efficient solver will just dispense with ALL the human stuff and do this sort of logical analysis directly.)

My conclusion

It is interesting to try to solve Sudoku puzzles using "human" methods. All of these methods amount to "short-cuts" to simple trial and error. The spotting of a hidden triple, an X-wing, a swordfish -- these are just quick tricks that will suffice for the sort of puzzle generally presented in print. But all of these tricks will ultimately fail for some Sudoku. (This, I believe, is a fundamental aspect of the "NP-complete" nature of Sudoku. There is no GENERAL non-[trial and error] solution to such problems.) At this point, one can employ more advanced techinques -- coloring, tabling, and other simiplified variants of general logical chain analysis. But, for SOME puzzles, in the end this will not be enough. Whether you refer to the next step as "depth search" or just "continuing to look for logical inconsistencies", the process would continue by cycling through whatever method you have chosen to greater and greater depth until the job is done.


Sudoku Assistant:
http://www.stolaf.edu/people/hansonr/sudoku

this top95 Puzzle:
http://www.stolaf.edu/people/hansonr/sudoku?920300008006408090830500000000200080709005000002006004008000010001000402200700809

this configuration and logical analysis:
see http://www.stolaf.edu/people/hansonr/sudoku/deadend.htm

additional thread:
http://www.setbb.com/phpbb/viewtopic.php?p=3025&mforum=sudoku#3025
_________________
Bob Hanson
Professor of Chemistry
St. Olaf College
Northfield, MN
http://www.stolaf.edu/people/hansonr
Back to top
View user's profile Send private message Send e-mail Visit poster's website
Display posts from previous:   
Post new topic   Reply to topic    Sudoku Programmers Forum Index -> Solving sudoku All times are GMT
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum
Sudoku Programmers topic RSS feed 


Powered by phpBB © 2001, 2005 phpBB Group

Igloo Theme Version 1.0 :: Created By: Andrew Charron