Sudoku Programmers Forum Index

 
 FAQFAQ   SearchSearch   MemberlistMemberlist   UsergroupsUsergroups   RegisterRegister   ProfileProfile   Log inLog in          Games  Calendar

Log in to check your private messagesLog in to check your private messages   

Question about canonical equvalency

 
Post new topic   Reply to topic    Sudoku Programmers Forum Index -> Programming sudoku
View previous topic :: View next topic  
Author Message
enxio27

Joined: 10 Feb 2007
Posts: 12
:

Items
PostPosted: Tue May 20, 2008 2:37 pm    Post subject: Question about canonical equvalency Reply with quote

Ruud's Sudocue will canonicalize a collection of 81-character puzzle strings. Do I understand correctly that if two puzzle strings (as contrasted with solution strings) canonicalize to the same string, then they MUST be equivalent puzzles? If so, why would two canonically equivalent puzzles have different difficulty scores (as rated by Sudocue)?
Back to top
View user's profile Send private message
gsf

Joined: 18 Aug 2005
Posts: 411
:
Location: NJ USA

Items
PostPosted: Tue May 20, 2008 4:47 pm    Post subject: Re: Question about canonical equvalency Reply with quote

enxio27 wrote:
Ruud's Sudocue will canonicalize a collection of 81-character puzzle strings. Do I understand correctly that if two puzzle strings (as contrasted with solution strings) canonicalize to the same string, then they MUST be equivalent puzzles? If so, why would two canonically equivalent puzzles have different difficulty scores (as rated by Sudocue)?

my solver will too
it has two different minlex canonicalizations, -f%c soution grid row order minlex, and %#mc puzzle row order minlex
to process a file of puzzles
Code:
-f'%c %#0v

and sort/uniq on the first field, or if the number of puzzles is not too large (< ~1M) then
Code:
-qFN -e 'uniq((%#mc))'

and the solver will uniq on the row-order minlex string

the basis for canonicalization is an algorithm that is not subject to order in the initial puzzle
not so with ratings based on methods applied unless move batching is enabled, and even then some ordering bias slips in
Back to top
View user's profile Send private message Visit poster's website
enxio27

Joined: 10 Feb 2007
Posts: 12
:

Items
PostPosted: Tue May 20, 2008 5:30 pm    Post subject: Re: Question about canonical equvalency Reply with quote

gsf wrote:

my solver will too
it has two different minlex canonicalizations, -f%c soution grid row order minlex, and %#mc puzzle row order minlex
to process a file of puzzles
Code:
-f'%c %#0v

and sort/uniq on the first field, or if the number of puzzles is not too large (< ~1M) then
Code:
-qFN -e 'uniq((%#mc))'

and the solver will uniq on the row-order minlex string

the basis for canonicalization is an algorithm that is not subject to order in the initial puzzle
not so with ratings based on methods applied unless move batching is enabled, and even then some ordering bias slips in


Shocked Can you repeat that, in English, please? Smile (I'm trying to learn some of this canonicalization stuff, but I'm having trouble with some of the jargon.)
Back to top
View user's profile Send private message
gsf

Joined: 18 Aug 2005
Posts: 411
:
Location: NJ USA

Items
PostPosted: Tue May 20, 2008 6:00 pm    Post subject: Re: Question about canonical equvalency Reply with quote

enxio27 wrote:

Shocked Can you repeat that, in English, please? Smile (I'm trying to learn some of this canonicalization stuff, but I'm having trouble with some of the jargon.)

my solver is a command line program
the --man option (tersely) documents the others

most of the sudoku canonicalization algorithms are row-order minlex based on some property of the 81-char grid representation
row-order means that the grid is read by row, top to bottom
minlex means minimum lexical value of the 81-char string
valid sudoku transformations (row/col with in band/stack, stack<->stack, band<->band, clues values)
will produce different 81-char strings
sort all possible transformations for the 81-char string for one puzzle
and select the lexicographic smallest, and you have the
row-order minlex canonical representation
for the puzzle

the two canonicalization forms provided by my solver are
%c row-order minlex based on the solution grid (benefit: fast, drawback: puzzle must have one solution)
%#mc row-order minlex of puzzle grid (benefit: works on pseudo puzzles, drawback: slower as #cluse increases)

rudd also has a pattern based minlex canonicalization that has two stages:
treat the clues as value 1 and determine the minlex pattern
then add in the clue values and determine the minlex string that has the same clue positions as the minlex pattern
I have not implemented this yet
Back to top
View user's profile Send private message Visit poster's website
daj95376

Joined: 05 Feb 2006
Posts: 349
:

Items
PostPosted: Tue May 20, 2008 6:07 pm    Post subject: Reply with quote

Simple Answer:

There are different ways to canonicalize a puzzle. Awhile back, a number of experienced users agreed on one way. There are current discussions in another forum where they're trying to fold symmetry into the mix. They want it to be canonical and pretty!

gsf has a very powerful solver that runs in command-line mode. He also has a command line parser that would give most programmers nightmares. The most common activity with his solver is people asking how to format the command line to get a specific result.

I don't have any idea where you'd find a precise description on how to implement the canonicalization algorithm.
Back to top
View user's profile Send private message
Jean-Christophe

Joined: 19 Mar 2006
Posts: 126
:
Location: Belgium

Items
PostPosted: Wed May 21, 2008 6:45 am    Post subject: Re: Question about canonical equvalency Reply with quote

enxio27 wrote:
why would two canonically equivalent puzzles have different difficulty scores (as rated by Sudocue)?

Only Ruud could give a definitive answer.

Two canonically equivalent puzzles should have the same "logical" difficulty, but could have different "human perceived" difficulties. Ruud may have tuned its rating system for humans, by assigning different scores to logically equivalent moves.

The digit ordering may also explain the difference. For example, at some point, there may be several hidden singles or X-Wings. If SudoCue always search from 1 to 9 and chooses the first one found, it could be more efficient for one puzzle than for the other, revealing more naked or hidden singles. Similarly for harder techniques, it can take longer before it hits a key move, which unlocks the puzzle.

Could you post an example of two such equivalent puzzles? You could also compare their log in SudoCue and look where they diverge, considering the kind of moves made, not their particular digits or location.
_________________
Jean-Christophe
"When you have eliminated the impossible, whatever remains, however improbable, must be the truth." Sherlock Holmes.
Back to top
View user's profile Send private message Visit poster's website
tarek

Joined: 31 Dec 2005
Posts: 153
:
Location: London, UK

Items
PostPosted: Wed May 21, 2008 9:39 am    Post subject: Reply with quote

Ruud's solver is non-batch breadth 1st solver .........

This means that it will be subject to bias based on puzzle configuration & method heirarchy.

to make the solving path consistant in a non-batch solver, the puzzle should be canonicalized & solved in the canonicalized form. the cells/digits can be reassigned later according to the original configuration.

My guess is that ruud's solver doesn't do that & therfore the solving path can differ with each isomorph.

tarek
Back to top
View user's profile Send private message
serbo

Joined: 22 Dec 2008
Posts: 6
:

Items
PostPosted: Sun Dec 28, 2008 1:44 am    Post subject: Reply with quote

daj95376 wrote:
Simple Answer:
gsf has a very powerful solver that runs in command-line mode. He also has a command line parser that would give most programmers nightmares. The most common activity with his solver is people asking how to format the command line to get a specific result.

Is there some dedicated thread to gsf's solver? I have some questions too! Maybe they where already answered, or should I start a new thread?
Back to top
View user's profile Send private message
daj95376

Joined: 05 Feb 2006
Posts: 349
:

Items
PostPosted: Wed Dec 31, 2008 6:39 pm    Post subject: Reply with quote

serbo wrote:
daj95376 wrote:
Simple Answer:
gsf has a very powerful solver that runs in command-line mode. He also has a command line parser that would give most programmers nightmares. The most common activity with his solver is people asking how to format the command line to get a specific result.

Is there some dedicated thread to gsf's solver? I have some questions too! Maybe they where already answered, or should I start a new thread?

gsf's solver is discusses in a zillion places because I'm not the only one who doesn't understand how its command parser works.

I now realize that I failed to answer your core question about why canonically equivalent puzzles may have different difficulty ratings. If I remember correctly, here's how the answer goes.

Unless you have a solver that performs all possible eliminations on a puzzle at every step, then your solver may end up choosing different solution paths for two canonically equivalent puzzles. Thus, two potentially different difficulty ratings.
Back to top
View user's profile Send private message
gsf

Joined: 18 Aug 2005
Posts: 411
:
Location: NJ USA

Items
PostPosted: Thu Jan 01, 2009 2:38 am    Post subject: Reply with quote

serbo wrote:
daj95376 wrote:
Simple Answer:
gsf has a very powerful solver that runs in command-line mode. He also has a command line parser that would give most programmers nightmares. The most common activity with his solver is people asking how to format the command line to get a specific result.

Is there some dedicated thread to gsf's solver? I have some questions too! Maybe they where already answered, or should I start a new thread?

post here or on another thread
Back to top
View user's profile Send private message Visit poster's website
Display posts from previous:   
Post new topic   Reply to topic    Sudoku Programmers Forum Index -> Programming sudoku All times are GMT
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum
Sudoku Programmers topic RSS feed 


Powered by phpBB © 2001, 2005 phpBB Group

Igloo Theme Version 1.0 :: Created By: Andrew Charron