Sudoku Programmers Forum Index

 
 FAQFAQ   SearchSearch   MemberlistMemberlist   UsergroupsUsergroups   RegisterRegister   ProfileProfile   Log inLog in          Games  Calendar

Log in to check your private messagesLog in to check your private messages   

Pseudocode for forcing chain/net discovery

 
Post new topic   Reply to topic    Sudoku Programmers Forum Index -> Programming sudoku
View previous topic :: View next topic  
Author Message
Draco

Joined: 27 Jun 2008
Posts: 3
:

Items
PostPosted: Thu Jul 24, 2008 6:39 am    Post subject: Pseudocode for forcing chain/net discovery Reply with quote

Fulfilling my offer to Ronk to post this when the code stabilized…

The goal of this algorithm is to find forcing chains or nets that originate from a b/b value (either a square with exactly 2 possible values or a house with only 2 squares containing a given value). The code will create two puzzle copies, setting one b/b value in copy 1 and the second in copy 2.

The code will then look for any changes to the PM’s that this assignment produces and record said changes. Then singles are found in multiple passes or iterations. After each iteration to find the available singles, record the changes to the PM’s. To be more precise, track which potential values from the original puzzle grid are removed from consideration as each singleton is assigned. I call these “cancellations.” We are looking for one of a few conditions:

1. Values in a square that are cancelled in both puzzle copies (common cancellation). Common cancellations must be false.
2. Contradiction Type 1: the path leads to 2 of the same value in one house. This means that b/b value/location originating the path must be false (recorded in CommonCancelPoints).
3. Contradiction Type 2: the path leads to a square with no viable candidate values remaining. This means that b/b value/location originating the path must be false (recorded in CommonCancelPoints).

As soon as any one of these 3 events is triggered (which is signaled by the appearance of one or more rows in CommonCancelPoints), the search stops and returns the chain/net that it found. The chain/net is returned for each puzzle copy as a list of the singles found in each iteration along with data to help in turning that into a path through the puzzle. The search also stops if there are no further singles to be found for either path. If no cancellations are found, the next b/b pair is evaluated. This continues until cancellations are found or all b/b pairs for the current puzzle grid state have been exhausted.

No attempt is made to apply any other logic (e.g. not looking for patterns, locked sets, ALS, etc.) insofar as chain propagation goes. In this regard, any single path is one which a human can follow (at least anybody who can solve puzzles that have been reduced to singles, both hidden and open) even though it utilizes all singletons, not just forces involving b/b locations. No claim is made that any person would be likely to try all of these variations. Smile

There are optimization parameters that are applied that can limit the search based on:

1. Starting square
2. Cancellation of a value in a square
3. Depth of the chain (# of iterations)
4. Length of chain
5. # of values cancelled
6. Degree of puzzle simplification produced by the chain

1 places limits on the b/b pair(s) to be considered in the outer FOR loop. 2 places limits on which common cancellation points will be accepted for termination of the search. 1 and/or 2 allow the user to focus the search on desired starting or ending points. 3 and 4 are used to look for simpler chains first (I call this throttling). 5 and 6 are applied for “smart searching” which scores every chain found for the grid at different throttle settings, returning the chain with the best score. This implies that one has a good scoring algorithm. It can also take a long time. With proper tuning, however, it is always as good as a simple, throttled search (which is pretty quick, usually under 100 Ms). It is also always slower (sometimes 10 – 100x slower), and often cracks the puzzle faster. I currently tune the “smart search” to take any solution that reduces the puzzle to singles over all other solutions, regardless of other considerations (such as length, depth, etc). I could just as easily tune it to do likewise for puzzles cracked to STSS or any other level of granularity offered by my scoring algorithm. That is more a matter of taste than method… though I should note that coming up with a good (i.e. relatively fast, repeatable, reliable/consistent in terms of how one chooses to rate puzzles) algorithm was not as easy a task as I first thought it might be.

This post does not show the throttling techniques, scoring techniques, finding b/b pairs, how to find singles (FindSinglesForIteration) or how to set a value and track the effect it has on PM's (SetSquareValueAndTrackCancels). I am happy to go into those details in a separate thread should anyone want to discuss things to that level of detail. My assumption is that readers already know how to perform these tasks. The particulars of their implementation would be very closely tied to the internal puzzle grid/PM representation used by the baseline solver (at least in my code they are).

This post also does not describe the work involved in turning the chain/net into a displayable form (GUI-based or text suitable for forum posts). If there is interest, I can describe the current state of my code for doing that as well. I found this to be a thornier problem in many ways than finding the chain/net itself (at least once I settled on the design below). That is, in part, due to the broad swath I cover for a single “iteration” (1 iteration includes all the open and hidden singles present for any single grid state… which means initial order depends heavily on the order you choose for calling out those singles). The code I have works well enough, but could be optimized in several edge conditions to draw more direct chains.

Finally, keep in mind that if the PM’s going in are bad, then the cancellations produced by this algorithm may be bad (as bad PM's often lead to an invalid solution). GIGO.

With that intro, here’s the pseudo-C code to find any single chain/net (before optimization is applied). Thanks go out to Danny for showing me how to post more readable code.

Code:
int puzzle[9][9];              // Current puzzle grid (provided by caller), with PM’s
int p1[9][9];                  // Copy #1 of puzzle (init in loop)
int p2[9][9];                  // Copy #2 of puzzle (init in loop)
int bbCount;                   // count of all b/b locations (2 entries per b/b pair)
int p1Going, p2Going;          // Set to -1 if p1/p2 have not “run out” yet, 0 if they have
int bbLocs[bbCount *2][3];     // fill with row, col, value for b/b locations (2 per pair) before
                               // loop starts.
int CommonCancelPoints[][3];   // will be filled with row, col, value of common cancel points.
int ccCount;                   // Count of common cancel points
int p1ChainPoints[][4]         // returns singletons found for p1: row, col, value, iteration. Note
                               // that my code returns 7 or 8 columns to help with user-readable
                               // view construction, but those are not needed just to find the chain
                               // or net itself.
int p2ChainPoints[][4];        // Likewise for p2
int p1CpCount, p2CpCount;      // Counts for p1ChainPoints and p2ChainPoints

/  ****************************************************/
// FIRST fill bbLocs with b/b locations and set bbCount
// b/b init is not shown in this pseudocode
/  ****************************************************/

ccCount = 0;

// using LT in lieu of less-than symbol so that the code isn’t messed up by
// BBCODE parsing…

for (int ibvIndex = 0; ibvIndex*2 LT bbCount && ccCount == 0; ibvIndex ++)
{
   p1CpCount = p2CpCount = 0;
   p1 = CopyOf (puzzle);      // Could implement CopyOf with memcpy() …
   p2 = CopyOf (puzzle);
   SetSquareValueAndTrackCancels(p1, CommonCancelPoints, &ccCount, bbLocs[ibvIndex*2]);
   SetSquareValueAndTrackCancels(p2, CommonCancelPoints, &ccCount, bbLocs[ibvIndex*2 + 1]);
   p1Going = p2Going = -1;
   for (int iIteration = 1; (p1Going || p2Going) && ccCount == 0; iIteration++)
   {
      if (p1Going)
         p1Going = FindSinglesForIteration (p1, iIteration, CommonCancelPoints, &ccCount,
             p1ChainPoints, &p1CpCount);
      if (p2Going)
         p2Going = FindSinglesForIteration (p2, iIteration, CommonCancelPoints, &ccCount,
             p2ChainPoints, &p2CpCount);
   }   // end of FOR on iIteration
}   // end of FOR on ibvIndex
____________________________________________________________________________________________________


On exit from FOR’s, if ccCount > 0 then we have a forcing chain (or net) that is useful. If not, then we have no forcing chains for this puzzle grid in its current state. p1ChainPoints has the singletons created for each iteration for the first path (even bbLocs indices) and p2ChainPoints has the singletons for the second path. CommonCancelPoints tells us which PM’s are cancelled and which square they’re in), and ccCount tells us how many there are. The TRICK that is left as an exercise is turning a list of singles with iteration data into a view users will grok. Regardless, the CommonCancelPoints may be safely removed from the PM’s as progress towards a solution.

Cheers…

- drac

[EDIT: replaced code section with Danny's much cleaner copy]


Last edited by Draco on Thu Jul 24, 2008 4:43 pm; edited 1 time in total
Back to top
View user's profile Send private message
daj95376

Joined: 05 Feb 2006
Posts: 349
:

Items
PostPosted: Thu Jul 24, 2008 3:02 pm    Post subject: Reply with quote

I will have to study your algorithm in more depth. A quick pass left me a couple of questions. As for making your code listing readable, please note the additional line (of continuous underscore characters) that I added below.

Code:
int puzzle[9][9];              // Current puzzle grid (provided by caller), with PM’s
int p1[9][9];                  // Copy #1 of puzzle (init in loop)
int p2[9][9];                  // Copy #2 of puzzle (init in loop)
int bbCount;                   // count of all b/b locations (2 entries per b/b pair)
int p1Going, p2Going;          // Set to -1 if p1/p2 have not “run out” yet, 0 if they have
int bbLocs[bbCount *2][3];     // fill with row, col, value for b/b locations (2 per pair) before
                               // loop starts.
int CommonCancelPoints[][3];   // will be filled with row, col, value of common cancel points.
int ccCount;                   // Count of common cancel points
int p1ChainPoints[][4]         // returns singletons found for p1: row, col, value, iteration. Note
                               // that my code returns 7 or 8 columns to help with user-readable
                               // view construction, but those are not needed just to find the chain
                               // or net itself.
int p2ChainPoints[][4];        // Likewise for p2
int p1CpCount, p2CpCount;      // Counts for p1ChainPoints and p2ChainPoints

/  ****************************************************/
// FIRST fill bbLocs with b/b locations and set bbCount
// b/b init is not shown in this pseudocode
/  ****************************************************/

ccCount = 0;

// using LT in lieu of less-than symbol so that the code isn’t messed up by
// BBCODE parsing…

for (int ibvIndex = 0; ibvIndex*2 LT bbCount && ccCount == 0; ibvIndex ++)
{
   p1CpCount = p2CpCount = 0;
   p1 = CopyOf (puzzle);      // Could implement CopyOf with memcpy() …
   p2 = CopyOf (puzzle);
   SetSquareValueAndTrackCancels(p1, CommonCancelPoints, &ccCount, bbLocs[ibvIndex*2]);
   SetSquareValueAndTrackCancels(p2, CommonCancelPoints, &ccCount, bbLocs[ibvIndex*2 + 1]);
   p1Going = p2Going = -1;
   for (int iIteration = 1; (p1Going || p2Going) && ccCount == 0; iIteration++)
   {
      if (p1Going)
         p1Going = FindSinglesForIteration (p1, iIteration, CommonCancelPoints, &ccCount,
             p1ChainPoints, &p1CpCount);
      if (p2Going)
         p2Going = FindSinglesForIteration (p2, iIteration, CommonCancelPoints, &ccCount,
             p2ChainPoints, &p2CpCount);
   }   // end of FOR on iIteration
}   // end of FOR on ibvIndex
____________________________________________________________________________________________________
Back to top
View user's profile Send private message
daj95376

Joined: 05 Feb 2006
Posts: 349
:

Items
PostPosted: Fri Jul 25, 2008 1:31 am    Post subject: Reply with quote

A (forcing) chain is a single stream of linked relationships between candidates and cells in the original PM. It's tricky and inefficient to find chains by actually performing assignments and eliminations because you need to continually return the PM to its original state. I suspect that most solvers build a table of relationships and then use it to logically link entries together into a chain. By finding multiple singles and applying them across the PM in one operation, your results are based on multiple streams and are forcing nets except for the very rare case when there is only one single found for p1 (and p2) on each iteration through your FOR loop.

Note: I actually generate chains with my new solver by performing assignments and eliminations and carefully restoring the state of the original PM when necessary. I had plans to expand my logic that never materialized. Oh Well!

Contradiction Type 3: All occurrences of a value are eliminated from a house/unit.

Code:
 assignment in [r1c8] forces CT_3 in [b1]
 +-----------------------------------+
 | 12 12 12  |  .  .  .  |  .  3  .  |
 |  4  5  6  |  .  .  .  |  .  .  .  |
 |  7  8  9  |  .  .  .  |  .  .  .  |
 |-----------+-----------+-----------|
 |  .  .  .  |  .  .  .  |  .  .  .  |
 |  .  .  .  |  .  .  .  |  .  .  .  |
 |  .  .  .  |  .  .  .  |  .  .  .  |
 |-----------+-----------+-----------|
 |  .  .  .  |  .  .  .  |  .  .  .  |
 |  .  .  .  |  .  .  .  |  .  .  .  |
 |  .  .  .  |  .  .  .  |  .  .  .  |
 +-----------------------------------+
Back to top
View user's profile Send private message
Draco

Joined: 27 Jun 2008
Posts: 3
:

Items
PostPosted: Fri Jul 25, 2008 2:26 am    Post subject: Reply with quote

Yes, chains and nets are found without preference for one over the other. That was a design decision; I have nothing against nets, though I know some (perhaps most) people feel otherwise.

I understand the strict chain strictures to not modify the PMs. Understanding that does not prohibit one from creating algorithms that don't conform to that restriction. And... I did say this code finds chains/nets. Smile

What drove that decision was the realization that propagation of singles is something everyone (or at least everyone who would understand how forcing chains or nets work) knows how to do. And, at least for the search/find part, it yields a very simple algorithm. The optimizations once I got the first implementation working... now those are another matter! Still, the core algorithm remains.

Part of what you do not see in my post is the simplification that occurs during the creation of the user-viewable result (left that off as it is not part of actually finding the chain or net). Branches are trimmed from a net when they do not contribute to the cancellations; this often teases a chain out of a network (but not always). There is also UI implemented to allow removal of one or more cancellation points to further simplify the net (or chain) as desired.

FWIW, becaue of the internals of my solver, I already have a lot of the code in place for "backtracking" pretty easily (it is also easy to create copies of the current grid, though that was an enhancement I added once I decided on this chain/net implementation). Once one has the list of b/b pairs for a grid, it is not difficult to walk its paths to find chains. On that basis I started designing a strict chain finder eariler this year, then stopped. I decided I did not want to spend time creating a subset of the chain/net solutions I already have by restricting what my solver finds to the strict definition of chains. We could have a long, philosophical discussion about this decision, but in the end it is a matter of taste. Everyone decides which techniques they'd like to employ, and I would rather not open the "which techniques are acceptable" can of worms in a thread about an algorithm (not saying you tried to do so, just stating my desire to avoid said worms).

Not sure how your code for CT_3 fits into the rest of your post...?

Thanks for your feedback (and help)!

Cheers...

- drac
Back to top
View user's profile Send private message
Display posts from previous:   
Post new topic   Reply to topic    Sudoku Programmers Forum Index -> Programming sudoku All times are GMT
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum
Sudoku Programmers topic RSS feed 


Powered by phpBB © 2001, 2005 phpBB Group

Igloo Theme Version 1.0 :: Created By: Andrew Charron