Sudoku Programmers Forum Index

 
 FAQFAQ   SearchSearch   MemberlistMemberlist   UsergroupsUsergroups   RegisterRegister   ProfileProfile   Log inLog in          Games  Calendar

Log in to check your private messagesLog in to check your private messages   

Constraint subsets
Goto page 1, 2  Next
 
Post new topic   Reply to topic    Sudoku Programmers Forum Index -> Solving sudoku
View previous topic :: View next topic  
Author Message
Ruud
Site Admin
Joined: 17 Sep 2005
Posts: 708
:
Location: Netherlands

Items
PostPosted: Wed Dec 14, 2005 10:02 pm    Post subject: Constraint subsets Reply with quote

On the player's forum, a very interesting topic on Constraint subsets was launched by Lummox JR.
A very interesting discussion developed, until beginning this month, when the topic suddenly died.

I'd like to revive the topic here with a more technical audience, so here is a brief summary:

Introduction (skip this part if it bores you)

A candidate is a digit that could be placed in a cell. With an empty grid, 81 x 9 = 729 candidates are available.

Of course, there can only be one digit in a cell. There just isn't enought room for more. This gives us our first constraint type:

Only 1 of 9 possible candidates can be placed in a cell. Since we have 81 cells, this provides us with 81 constraints of this type.

The rules of the game give us the next 3 constraint types:

Only 1 of each digit is allowed in each row.
Only 1 of each digit is allowed in each column.
Only 1 of each digit is allowed in each box.

There are 9 rows, columns and boxes, and 9 digits, so there are an additional 9 x 9 x 3 = 243, giving us a total of 324 constraints to satisfy.

Each of the 729 candidates competes in 4 of these 324 constraints. Each constraint has 9 candidates competing in it.

There are 2 operations that we can perform with this model:

1. Eliminate a candidate, by removing it from the 4 constraints it competes in.
2. Place a candidate. This is usually done by eliminating it's remaining competitors in the constraints it competes in.

Singles (basic usage of constraints)

When the number of candidates for a constraint drops to 1, a single is discovered. The constraint type determines whether we call this a naked single or a hidden single, but the principle is the same in both cases.

When a single has been discovered, it must be placed to process the effects in the remaining 3 constraints. Then the program can check whether another constraint has dropped to size 1. Easy Sudokus can be solved by simply repeating this process.

Constraint Subset Rule (limited version)

When a set of N constraints can be formed (set A) in which no duplicate candidates exist, and a set of N constraints (set B) can be formed in which all the candidates of set A are included, all candidates of set B can be eliminated except those that appear in set A.

Constraint Sets of size 1.

The processing of singles is the most basic application of the Constraint Subset Rule. When the size 1 is taken into account, the rule can be simplified to:

When a pair of constraints A and B exist, where all the candidates of A are included in B, all candidates of B can be eliminated except those that appear in A.

Now most pairs of constraints have only 1 shared candidate, so there is no need to apply the Rule, unless there is only one candidate left in the constraint.

However, when a row and a box intersect, or a column and a box, the intersection has upto 3 candidates. The same Constraint Subset Rule can be applied. When all candidates of constraint A are in the intersection, eliminate all candidates from B, except those in the intersection. Sounds familiar?

Constraint Sets of size 2

First, when 2 row-constraints form the A set, and 2 boxes form the B-set, the complement of the line-box intersection is found. Since it is easier to handle this in the size 1 code, we skip this type of constraint subset.

When 2 row-constraints for the A set and 2 columns form the B set, the constraint subset performs the eliminations like an X-Wing. As with X-Wing we can also use columns in the A set and rows in the B set.

When 2 cell-constraints form the A set and constraints covering 2 digits of the same house form the B set, we are handling a naked pair.

When constraints covering 2 digits of the same house form the A set and 2 cell-constraints form the B set, we are handling a hidden pair.

Larger Sizes

Following the previous contraint-type combinations, we can see how Swordfish, Jellyfish, naked & hidden N-Tuplets can be handled with constraints subsets.

For now, I'll leave it at this, otherwise this post will grow too long. I'll add a second post defining more advanced applications of the constraint subsets. However, when studying this, I was very surprised to see how easily this simple rule subsumes one technique after the other.

When there are any comments, please shoot.

(to be continued)

Ruud.
_________________
Meet me at sudocue.net
Back to top
View user's profile Send private message Visit poster's website
Bob Hanson

Joined: 05 Oct 2005
Posts: 187
:
Location: St. Olaf College

Items
PostPosted: Mon Dec 19, 2005 11:52 pm    Post subject: Reply with quote

Ruud, I just thought I'd respond to this, seeing as it was sitting there with a 0 in that column.....

Would you be willing to edit this to include the following?

1) Examples of the exact wording of some of those specific 324 constraints.

I think I see that if there are "729 candidates" then

"r1c1 is #3"

is a "candidate" not a "constraint", right?

(I would have thought there are just "9 candidates", so I'm not quite
seeing this, how there are 729 candidates, unless candidates are defined
as I show here.)

Are the following examples of "constraints"?

There is only one 9 in block 4. (81 like this)
There is only one 9 in row 3. (81 like this)
There is only one 6 in column 1. (81 like this)
There is only one possible candidate in r1c3. (81 like this)

I have a feeling my

A.XBlock[C.block][k]|=Pwr2[C.bptr]
A.XRow[i][k]|=Pwr2[j] //(row vector)
A.XCol[j][k]|=Pwr2[i] //(column vector)
A.XCell[i][j]|=Pwr2[k]//allowed associated with this cell

constitute this set. "k" here is a candidate number. "i" is a row,
"j" is a column, C.block and C.bptr refer to specific blocks and specific
cells in those blocks. Pwr2[k] means 2^k and probably correspond to
your histogram graphs.

When each of these quantities = 2^n for some n, then all the constraints
are satisfied. Yes?

Can you then give us an example of what "a set of N constraints where no
duplicate candidates exist" sounds like? I can't follow that.

Also, a concrete example of "N constraints can be formed in which all
the candidates of set A are included...." I would appreciate a specific example in this exact language.

Thanks,

Bob
_________________
Bob Hanson
Professor of Chemistry
St. Olaf College
Northfield, MN
http://www.stolaf.edu/people/hansonr
Back to top
View user's profile Send private message Send e-mail Visit poster's website
dukuso

Joined: 14 Jul 2005
Posts: 424
:
Location: germany

Items
PostPosted: Tue Dec 20, 2005 12:48 am    Post subject: Reply with quote

as I said before, I think this can be better done in exact-cover-speak.
Candidates=rows
Constraints=columns
singles : |N(c)|=1
rules Jaap1,Jaap2. subsets,etc. I posted these.
It's a bit more formal and more general.
Some people might not like it, because you can't see the sudoku
directly ?!
Back to top
View user's profile Send private message Send e-mail Visit poster's website
Bob Hanson

Joined: 05 Oct 2005
Posts: 187
:
Location: St. Olaf College

Items
PostPosted: Tue Dec 20, 2005 3:16 am    Post subject: Reply with quote

perhaps, but I'm interested in the "constraints and links" language here.
In that case, candidates are definitely not "rows" I think. Still, what does it mean to say that |N(c)|=1. Is "c" constraints" or "candidates"? Sorry, I keep struggling with learning the lingo, and with different speaks using the same words for different things, it sure doesn't help.

It would be good for some of those who understand this to put up a web site or two that really explain it in simple terms, with specific language, if that is possible.
_________________
Bob Hanson
Professor of Chemistry
St. Olaf College
Northfield, MN
http://www.stolaf.edu/people/hansonr
Back to top
View user's profile Send private message Send e-mail Visit poster's website
dukuso

Joined: 14 Jul 2005
Posts: 424
:
Location: germany

Items
PostPosted: Tue Dec 20, 2005 3:55 am    Post subject: Reply with quote

here is a link for the terminology:
http://www.setbb.com/phpbb/viewtopic.php?t=395&mforum=sudoku

yes, double use of rows,columns is not so good, but how else to name
them ? I'd thought about Rows and rows.

c= constraints or Columns
r=rows or candidates
you have the 729*324 binary matrix which can be viewed as a bipartite graph. N(c) is the set of neighbors of c or the set of candidates
which match constraint c.
Back to top
View user's profile Send private message Send e-mail Visit poster's website
Bob Hanson

Joined: 05 Oct 2005
Posts: 187
:
Location: St. Olaf College

Items
PostPosted: Tue Dec 20, 2005 4:08 am    Post subject: Reply with quote

I'm sorry. Not being a mathematician, the notation of that posting just whizzes past me in a blur. A totally different language than anything I know.

But ok. So part of the problem is that we are talking about a MATRIX with rows and columns that are not the Sudoku rows and columns. Guaranteed frustration!

Can you give an example of how a "set of candidates matches a constraint c"?

For example, just take the 3-valued cell:

r3c1={357}

What do you say about this in "exact cover speak"? What do you say about it in terms of "constraint subsets"? Are these languages compatible or conflicting in terminology? Only you and Ruud know for sure, I think.

Do the 3, 5, and 7 (of nine possible, 1-9) "candidates match the constraint that cell r3c1 only have one candidate"? Or do the candidates r3c1#3, r3c1#5, r3c1#7 (of 729 possible candidates) match some more global set of constraints? Or am I totally misunderstanding "candidate" and "constraint" in this context.

Exact cover speak may be unambiguous, but it is also unintelligible (to me) unless defined in a context that maps what I know about Sudoku.
_________________
Bob Hanson
Professor of Chemistry
St. Olaf College
Northfield, MN
http://www.stolaf.edu/people/hansonr
Back to top
View user's profile Send private message Send e-mail Visit poster's website
dukuso

Joined: 14 Jul 2005
Posts: 424
:
Location: germany

Items
PostPosted: Tue Dec 20, 2005 7:04 am    Post subject: Reply with quote

>I'm sorry. Not being a mathematician, the notation of that posting
>just whizzes past me in a blur. A totally different language than
>anything I know.

well, I think the language is quite similar to what you usually
find in math-texts.

>But ok. So part of the problem is that we are talking about
>a MATRIX with rows and columns that are not the Sudoku rows
>and columns. Guaranteed frustration!

the whole exact-cover thing is independent of sudoku. It works for all
sorts of other problems : polyomino-tiling,n-queens,scheduling, etc.
Sudoku ist just another example. So the matrix rows and columns
existed before sudoku was invented.

>Can you give an example of how a "set of candidates matches
>a constraint c"?

always 9 candidates per constraint in empty sudoku

>For example, just take the 3-valued cell:
>
>r3c1={357}

what is it ? Candidates 3,5,7 are allowed for cell (3,1) ?

>What do you say about this in "exact cover speak"?

c=r3c1 , N(c)={313,315,317}

>What do you say about it in terms of "constraint subsets"?

given C1,C2<C_ , |C1|=|C2|, assume also N(C1)<N(C2) and |N(r)/\C1|<2 for all r in R_.
if r2 in N(C2)-N(c1), then r2 can be removed.

<:subset
R_ : set of all candidates = matrix-rows
C_ : set of all constraints = matrix-columns
/\ : intersection
|x| : number of elements in the set x
A-B : set of all elements in A which are not in B


>Are these languages compatible or conflicting in terminology?

terminology is a bit different

>Only you and Ruud know for sure, I think.

Ruud has a nice graphics to visualize the 729*324 matrix
and how it changes while solving

>Do the 3, 5, and 7 (of nine possible, 1-9) "candidates match
>the constraint that cell r3c1 only have one candidate"?

?

>Or do the candidates r3c1#3, r3c1#5, r3c1#7 (of 729 possible candidates)

just call it r3c1s3,r3c1s5,r3c1s7 or short 313,315,317

>match some more global set of constraints?

each candidate matches 4 constraints. This is true even, when
the board is already partially filled and some matrix-rows and
columns are deleted.
e.g. 313 matches : r3c1,r3s3,c1s3,b1s3. So N(313)={r3c1,r3s3,c1s3,b1s3}

>Or am I totally misunderstanding "candidate" and "constraint"
>in this context.
>
>Exact cover speak may be unambiguous, but it is also
>unintelligible (to me) unless defined in a context
>that maps what I know about Sudoku.

the nice thing is that it works for any binary matrix.
It should be superior when you are trying to develope or describe
more complicated solving strategies.
But yes, my "survey" was probably didactically not very understandable
for non-mathematicians.
There should be more examples etc. Feel free to improve on this !



-Guenter
Back to top
View user's profile Send private message Send e-mail Visit poster's website
Moschopulus

Joined: 12 Aug 2005
Posts: 39
:

Items
PostPosted: Tue Dec 20, 2005 10:30 am    Post subject: Reply with quote

Here's a message I posted somewhere else, I'm not sure if it will help but there's a chance so I'll post it again. Probably you know all this.
------------

We want to put sudoku into the framework of a "covering" problem: Given a matrix of 0's and 1's, we want to find some rows that together have exactly one 1 in each column.
In other words, their sum is the vector of all 1's.

Here is how to construct the matrix for sudoku:
The rows are indexed by pairs (d,(x,y)), where d is a digit and (x,y) are the coordinates of a cell. We have one row for each (d,(x,y)).

The columns are in four groups, each group having 81 columns.
The first group have labels (d,x), where d is a digit and x labels a row of the puzzle.
The second group have labels (d,y), where d is a digit and y labels a column of the puzzle.
The third group have labels (d,b), where d is a digit and b labels a box of the puzzle.
The fourth group have labels (x,y). where (x,y) are the coordinates of a cell.

In row (d,(x,y)) we place a 1 in the columns labelled (d,x), (d,y), (d,b) where b is the box containing the cell (x,y), and (x,y). Each row has four 1's.

This gives a 729x324 matrix, which I think deserves the name The Sudoku Matrix. (Of course we can do this for n^2 x n^2 sudoku.)

Here's the point:
A valid completed sudoku grid corresponds to a set of 81 rows which "cover" in the sense that they collectively have exactly one 1 in each column.

A puzzle may be described by a set of rows which extends uniquely to a set of 81 rows which cover.
Solving the puzzle means finding the rest of the rows in the unique completion.

When we have one row of the matrix that is definitely in the solution, we can delete all rows that have any 1's "in common" with this row.

(If we drop the 81 columns corresponding to the boxes then we have the
ordinary Latin squares completion problem.)


-----------------
In Ruud's original message a candidate is a row of this matrix, i.e. a possible (d,(x,y)).
The four constraint types are the four groups of columns.

If we know r3c1={357} then this corresponds to three possible rows of the matrix, (3,(3,1)) and (5,(3,1)) and (7,(3,1)).

A candidate (row) matches a constraint (column) if there is a 1 in the place where that row and column meet.

So a "set of candidates matches a constraint c" if that set of rows of the Sudoku Matrix all have a 1 in the column for the constraint c.

Only one of these can be in the solution, since the rows/candidates in the solution have no 1's in common.

e.g. The set of candidates (rows) (3,(3,1)) and (5,(3,1)) and (7,(3,1)) matches the constraint (3,1) (i.e. those three rows all have a 1 in the column labelled (3,1))

Code:

.........    ...    (3,1)

(3,(3,1))    ...      1
(4,(3,1))    ...      1
(5,(3,1))    ...      1
(6,(3,1))    ...      1
(7,(3,1))    ...      1
.........       


Last edited by Moschopulus on Tue Dec 20, 2005 1:27 pm; edited 1 time in total
Back to top
View user's profile Send private message Visit poster's website
soduko

Joined: 10 Oct 2005
Posts: 50
:

Items
PostPosted: Tue Dec 20, 2005 12:57 pm    Post subject: Reply with quote

About terminology

I myself prefer the following terminology

Instead of Rows use Placements
Instead of Collumns use Tests
Instead of digits use Symbols

Placements (Rows, options) are best described at:
R1C2S3 (placemant of a 3(symbol) at Row(1) column (2)
or for more mathematical use
P(R1C2S3)

Test (collumns, constraints)are described as
T(C3S2) =1 <==> there where a symbol 2 is placed in collumn 3
T(B9S2) =1 <==> there where a symbol 2 is placed in box 9
T(R8C3) =1 <==> there where a symbol (any) is placed on row 8 collum 3

keeping talking in rows and collums will just be to confusing with the soduko rows and collums.

And use C R S and B' s in abundance, better to much than not enough.

N(Cj) is maybe better be described as:

P(Tj)
(The placements that have T(j) = 1 )


N(Pi) is maybe better be described as:

T(Pi)
The tests that have P(i) =1

ect ect.



Quote:


This gives a 729x324 matrix, which I think deserves the name The Sudoku Matrix. (Of course we can do this for n^2 x n^2 sudoku.)


No please call it the (soduko) placement matrix, (soduko)-exact-cover-matrix or anything else but not sudoku matrix.

Matrix and grid are almost synonims and will be used interchangebly.
(even while soduko grids are always square)

PS
Placements can also be complete templates (with symbol) and then you can get the same confucion again ( was a T standing for template of for test) but I think this is a minor problems. (dancing with templates is not a big hit)




That we use different terminology than mathematicians is not such a big problem. It is much more a problem if we do not have a clear view what a row is.
(the queens problem suffers from the same dual use of terms)

Hope I made my standpoint clear.
Back to top
View user's profile Send private message
dukuso

Joined: 14 Jul 2005
Posts: 424
:
Location: germany

Items
PostPosted: Tue Dec 20, 2005 1:46 pm    Post subject: Reply with quote

thanks for elaborating.

But I don't like your notations
d(x,y) ,
tests instead of constraints
T(c2s3)
P(r1,c2,s3)
T(Pi)

it's not exactly defined and not necessary.
Back to top
View user's profile Send private message Send e-mail Visit poster's website
soduko

Joined: 10 Oct 2005
Posts: 50
:

Items
PostPosted: Tue Dec 20, 2005 3:59 pm    Post subject: Reply with quote

dukuso wrote:
thanks for elaborating.

But I don't like your notations
d(x,y) ,
tests instead of constraints
T(c2s3)
P(r1,c2,s3)
T(Pi)

it's not exactly defined and not necessary.



Some remarks:

I do not use d(x,y) (it is to confusing)

In my terminology it would be P(RxCySd) or RxCySd
(I do not like comma's and believe in as much C,R,B and S as helpfull)
the P is mainly to highlight that it is a placement.

Tests instead of constraints is to get rid of the "C", if you have an other term that doesn't start with a "C" I would be willing to change.

T(C2S3)
the T is mainly to highlight that it is a Test
In principle C2S3 would be enough, but it is hardly as informative, (all tests have two terms, all placements have three so they are distinghuisable) and sets consists of Placements (exclusive) or Tests not of a combination of both,

P(r1,c2,s3)

See the above and I do not use comma's
They are superflous and confusing, I implement the whole matrix as a two dimensional array
R1C2S3 is just a readable expression for placementnumber 81x3 + 9x1 + 2 - 90 = placement number 173

T(Pi)

Is just a tiny bit more informative than
N(ri)


Quote:


given C1,C2<C_ , |C1|=|C2|, assume also N(C1)<N(C2) and |N(r)/\C1|<2 for all r in R_.
if r2 in N(C2)-N(c1), then r2 can be removed.

<:subset
R_ : set of all candidates = matrix-rows
C_ : set of all constraints = matrix-columns
/\ : intersection
|x| : number of elements in the set x
A-B : set of all elements in A which are not in B




becomes:

given T1,T2<T_ , |T1|=|T2|, assume also P(C1)<P(C2) and |T(p)/\T1|<2 for all p in P_.
if p2 in P(T2)-P(T1), then p2 can be removed.

<:subset
P_ : set of all Placements = matrix-rows
T_ : set of all Tests = matrix-columns
/\ : intersection
|x| : number of elements in the set x
A-B : set of all elements in A which are not in B

P(Tx) = All Placements that met by ALL Tests in set Tx (please see below, AFAIK)
T(Pi) = All Tests that are satisfied by ALL Placements in set Pi (please see below, AFAIK)



pn : individual placement n : |pn| = |P(R1C2S3)| = 1
tm : individual Test m |tm| = 1



No need to define N(x) anymore
it is replaced by P(x) or T(x)

It is really no character longer than your decription and I hope a little bit more informative.

Final remark
I think it still needs some explanation that
T1 and T2 are groups of tests (not nessesary single tests)
and that |T1| is the number of Tests in T1
Or am i wrong in this deduction?

And I am also still a bit unsure about the description of P(Tx) and T(Pi)
but you didn't describe N(C1) either. (especially in the case that C1 is more than one Collumn.)
But maybe C1 is only one collumn in the first place.
But then |C1| becomes meaningless.

Please reply so that we can enlighten eachother
Back to top
View user's profile Send private message
dukuso

Joined: 14 Jul 2005
Posts: 424
:
Location: germany

Items
PostPosted: Tue Dec 20, 2005 5:16 pm    Post subject: Reply with quote

I think, I defined N(C1) somewhere :
union of all N(c) with c in C1

your P(),T() are mappings from..to ?
also, placement,test only applies to sudoku, while exact-cover
just only requires any binary matrix.
Sudoku is a special case, so it's sudoku's problem when row,column
is double-used ;-)
Back to top
View user's profile Send private message Send e-mail Visit poster's website
soduko

Joined: 10 Oct 2005
Posts: 50
:

Items
PostPosted: Tue Dec 20, 2005 7:17 pm    Post subject: Reply with quote

dukuso wrote:
I think, I defined N(C1) somewhere :
union of all N(c) with c in C1

your P(),T() are mappings from..to ?
also, placement,test only applies to sudoku, while exact-cover
just only requires any binary matrix.
Sudoku is a special case, so it's sudoku's problem when row,column
is double-used Wink



Ahh, then my decription was wrong Embarassed

P(Tx) = ALL Placements that met by ANY Tests in set Tx
T(Pi) = ALL Tests that are satisfied by ANY Placements in set Pi



Quote:
I think, I defined N(C1) somewhere :
union of all N(c) with c in C1

This definition is circular or incomplete.

But clear to me. Wink
(you also have to define N(c) as well, but I know that N(c) is the union of all Neightbours of c, and that a neightbour of C is any r were A(c,r) = 1 or was it A(r,c) = 1 )



P(...) has two meanings:
P(R1C2S3) is a straightforward placement
P(Ti) is a mapping of ALL Placements that met by ANY Tests in set Ti

Maybe it is better to use the lowercase "p" for the first meaning.


Mapping of Union ?

I think i am missing something.
I think mapping can be replaced by union without any difference in meaning. (union is a more mathematical term while mapping is more a programmers term, as long as the definition of a mapping is not reduced to a 1 to 1 function / mapping)

Quote:
so it's sudoku's problem when row,column
is double-used


Not At All ...
The nqueens problem does have the same problem Cool
Back to top
View user's profile Send private message
Bob Hanson

Joined: 05 Oct 2005
Posts: 187
:
Location: St. Olaf College

Items
PostPosted: Tue Dec 20, 2005 8:32 pm    Post subject: Reply with quote

This is all very helpful. So is the "729x324 matrix" at

http://www.stolaf.edu/people/hansonr/sudoku/

about right?
_________________
Bob Hanson
Professor of Chemistry
St. Olaf College
Northfield, MN
http://www.stolaf.edu/people/hansonr
Back to top
View user's profile Send private message Send e-mail Visit poster's website
dukuso

Joined: 14 Jul 2005
Posts: 424
:
Location: germany

Items
PostPosted: Wed Dec 21, 2005 8:26 am    Post subject: Reply with quote

Bob Hanson wrote:
This is all very helpful. So is the "729x324 matrix" at

http://www.stolaf.edu/people/hansonr/sudoku/

about right?


grr... That page crashes my Browser.
Back to top
View user's profile Send private message Send e-mail Visit poster's website
Display posts from previous:   
Post new topic   Reply to topic    Sudoku Programmers Forum Index -> Solving sudoku All times are GMT
Goto page 1, 2  Next
Page 1 of 2

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum
Sudoku Programmers topic RSS feed 


Powered by phpBB © 2001, 2005 phpBB Group

Igloo Theme Version 1.0 :: Created By: Andrew Charron