Sudoku Programmers Forum Index

 
 FAQFAQ   SearchSearch   MemberlistMemberlist   UsergroupsUsergroups   RegisterRegister   ProfileProfile   Log inLog in          Games  Calendar

Log in to check your private messagesLog in to check your private messages   

Rating Difficulty
Goto page 1, 2  Next
 
Post new topic   Reply to topic    Sudoku Programmers Forum Index -> Setting sudoku
View previous topic :: View next topic  
Author Message
Howard

Joined: 31 Jul 2005
Posts: 7
:
Location: Keele, UK

Items
PostPosted: Sun Jul 31, 2005 2:47 pm    Post subject: Rating Difficulty Reply with quote

Hi all,

I'm helping to write a handheld (Palm,Pocket PC) Sudoku player/helper/generator.

I've written an engine that's pretty good at generating puzzles, and can solve them using the majority of logical techniques. (Up to XW/Swordfish(n)/Forcing Chains, etc.) I've had to put in quite a bit of optimisation so that it will generate good Sudoku in a reasonable time on a handheld device, which are much less powerful than current PCs!

I'll discuss my puzzle generation methods elsewhere, but I wanted to share the method I've been using to determine difficulty. Since I'm only concerned with puzzles that don't require guessing, then each puzzle can be solved with a sequence of logical techniques.

As others have done, at each stage I try each of the techniques until one is successful, and then go through again and again until either the puzzle is complete, or none of the techniques I've implemented makes any further progress. Clearly the simple puzzles (requiring only single candidate or single position tests) should be rated much easier than those requiring harder techniques.

The newspapers seem to rank their puzzle difficulties mostly on how many seed cells there are. Given that they usually require only the trivial tests, this is reasonable.

What I did was to give a "cost" to each technique, and calculate a total based on how many times each technique was used. I also gave a weighting to the first time a technique was used, to give it more of a human feel. (If you had to use Disjoint Subsets once, that was hard, but using it several times isn't so much harder.)

I realise that the terms I use for the techniques may not be the same as others have used, but here's what I've been using for the cost for each technique, for its first use, and then each subsequent use.

Code:

Method name,     First   Subsequent
                 cost    cost
SINGLECANDIDATE  100,    100
SINGLEPOSITION   110,    110
CANDIDATELINE    350,    200
DOUBLEPAIR       500,    250
MULTILINE        700,    400

DJSS2           1000,    750  (naked pairs)
USS2            1400,   1000 (hidden pairs)

DJSS3           2000,   1400 (naked triples)
USS3            2400,   1600 (hidden triples)

XWING           2500,   1800
SWORDFISH(3)    5000,   2000

FORCINGCHAINS   4200,   2100

DJSS4           7000,   5000
USS4            8000,   6000

SWORDFISH(4)    8000,   6000


I decided on these numbers based on how hard I felt each of these techniques was, compared to "100" for a single candidate test (which is of course trivial if you're using a computer solver that shows pencilmarks!). I'm sure many of you would choose different numbers (What would you change?)


I ran through a good set of Times puzzles (a couple of books worth!) to look at what kind of ranges each of the puzzles came out with:
Easy: 4500-4800 (average 4600)
Mild: 5000-5500 (average 5200)
Difficult: 5300-6000 (average 5600)
Fiendish: 5800-9000 (average 6500)

Only a few of the Times puzzles required Disjoint Subsets or anything that wasn't simple to use, but this did at least give me a range to work with. Clearly there's a lot of overlap between the ranges.

From this, I decided on some ranges so that I could classify my own generated puzzles into ranges that broadly fell across these. I wanted so that if someone entered a Times puzzle into my solver, it would reasonably give it a difficulty rating which was similar to that it came designated as.

Using this technique to classify puzzles, it becomes pretty clear that the four difficulty bands commonly used really don't do justice to many of the harder puzzles that require the much harder techniques.

Personally, beyond Fiendish, I added "Diabolical" (8000-15000) and "Obscene" (15000+).

I'm finding some lovely puzzles being generated at some higher difficulty levels, that even though they have a relatively high number of seeds (26-29), they are remarkably challenging to solve. The ones that my system rates as the hardest are those that require many different techniques to be used to solve them.

I'll probably continue to tweak the technique ratings with feedback from users, and hence move some of the difficulty boundaries, but at least for me it seems to give a continuous rating which is reasonably consistent with how hard the puzzle feels to play. I'm fairly sure this isn't formal enough for some of the mathematicians on this board, but for the purposes I wanted (to be able to categorise puzzles for users, and for them to ask to generate a puzzle of appropriate rating), it seems to hold out reasonably well!

I'll appreciate any comments or thoughts any of you have!

Howard.
_________________
Howard Tomlinson, Developer, Astraware. (howard at astraware dot com)
Back to top
View user's profile Send private message Visit poster's website
Simes

Joined: 08 Apr 2005
Posts: 71
:
Location: North Yorkshire, UK

Items
PostPosted: Sun Jul 31, 2005 5:11 pm    Post subject: Reply with quote

Howard wrote:
I also gave a weighting to the first time a technique was used, to give it more of a human feel.
Hey, good idea! Mind if I borrow it?

How do you generate good puzzles then? I find that most of mine are easy - which makes for a lot of discarded puzzles if you're looking for a hard one.
_________________
Simes
www.sadmansoftware.com/sudoku
Back to top
View user's profile Send private message Visit poster's website
Howard

Joined: 31 Jul 2005
Posts: 7
:
Location: Keele, UK

Items
PostPosted: Tue Aug 02, 2005 8:55 pm    Post subject: Reply with quote

Simes wrote:
Hey, good idea! Mind if I borrow it?

Feel free!
Turning abstract values into something meaningful for people - and back - is quite an interesting exercise. In this case coming up with something that rates puzzles with more of a human feel was the intention.

I'm actually seeing some limitations as I continue working - for instance, in that the naked triples {246} {246} {246} are not necessarily the same difficulty as the group {24} {26} {46}. Perhaps I should rate them differently?

Another factor of difficulty is: from each step along in the solution, how many "valid" moves are there?
Even if the system always chooses the simplest one available, I'm sure it ought to rate the difficulty as harder if that was the *only* solution, than if there were dozens of potential moves.
That way, the very hardest puzzles would not only contain many techniques, but also have very few possible solve-routes to completion.

Hope that makes sense also Smile


As for generating "good" puzzles, I came up with a few ideas. I'll discuss these directly with you by email first. I certainly found that the shotgun approach of "make a grid, solve it, discard it, make a grid, solve it..." is quite slow - especially if you're trying to make a grid with particular properties.

Howard.
_________________
Howard Tomlinson, Developer, Astraware. (howard at astraware dot com)
Back to top
View user's profile Send private message Visit poster's website
droid42

Joined: 29 Jul 2005
Posts: 20
:

Items
PostPosted: Wed Aug 03, 2005 9:15 am    Post subject: Reply with quote

I've also been dabbling with generating puzzles over the last couple of days and have also been grappling with the challenge of deciding which puzzles are easy and which ones are hard.

To start off with, similar to your approach, I grouped solving methods together in difficulty groups. My arbitrary first attempt is as follows..

0 - Simple logic (e.g. only "possible" in cell/row/column/box etc.)
1 - Pointing pairs/triples
1 - Box/line reduction
2 - Naked pairs
2 - Hidden pairs
3 - Naked triplets
3 - Hidden triplets
3 - X-Wing
4 - Naked quads/pentuples/n-tuples
4 - Hidden quads/pentuples/n-tuples
4 - Swordfish
5 - Bigger fish
6 - Super-colouring

I generate the puzzle and then apply the solver to it using graded levels of difficult. For example, the first pass uses only simple logic. If this succeeds, the puzzle is graded 0. The next pass uses methods only in levels 0 + 1 and subsequent passes are 0 + 1 + 2 etc. etc. This (relatively) fast method gives a basic grading. For the very hardest puzzles I only let them through this first step if they need at least level 6.

For a deeper understanding of how hard a puzzle is, I run a test similar to the above but much more involved. Instead of running the 7 combinations of difficulty levels, i.e.
0
01
012
0123
01234
012345
0123456
01234567

I run every possible combination of levels, e.g.
0
01
02
03 etc.
012
013
014 etc.
056
2356
2456 etc.etc.etc.

Obviously this takes a while per puzzle (around 2-3 seconds) but is worthwhile in terms of the information you get back, namely:

- Most efficient combination of solving methods
- Least efficient combination of solving methods
- Essential solving methods
- Methods that are of no use for this puzzle
- Methods that can be used to solve it but aren't essential

Once you have all this information, it takes no more than a glance to see how interesting or hard a particular puzzle is.

As a quick aside, before I embarked on writing a sudoku generator, I always assumed that building the puzzle would be the hardest bit. In fact, it's exactly the above that's taken up most of my time.

Building the completed solution was easy (throw some random cells on the board ... does it solve? no? try again etc.) ... doesn't take long to find a completed grid.

Desiging the puzzle means taking a completed grid and using the following algorithm...

remove random cell -> will it solve? if yes, repeat by removing another cell ... if no, replace the cell and try another.

Repeat the above until the required number of cells have been removed. To my surprise, this approach spits out a very very very hard puzzle (i.e. requires at least super-colouring or jellyfish) every couple of seconds. I even found a puzzle that has a genuine naked sextuplet in it depending on the solving route taken.

Ian.
Back to top
View user's profile Send private message
angusj
Site Admin
Joined: 18 Jun 2005
Posts: 406
:

Items
PostPosted: Wed Aug 03, 2005 10:04 am    Post subject: Reply with quote

droid42 wrote:
I even found a puzzle that has a genuine naked sextuplet in it depending on the solving route taken.

Hi Ian.
Firstly, thanks for your interesting post.
Anyhow, there's never any need to look for quintuplets, sextuplet etc because if there is one there's also a hidden or naked triple or quad in the same group (unless you're solving puzzles that are larger than the standard 9x9 grid).
Back to top
View user's profile Send private message Visit poster's website
Simes

Joined: 08 Apr 2005
Posts: 71
:
Location: North Yorkshire, UK

Items
PostPosted: Wed Aug 03, 2005 10:05 am    Post subject: Reply with quote

droid42 wrote:
...this approach spits out a very very very hard puzzle ... every couple of seconds.
Does it do this even if you have a constraint that the placement of the clues must exhibit some symmetry?

droid42 wrote:
I even found a puzzle that has a genuine naked sextuplet in it depending on the solving route taken.

Isn't that the same as a hidden triplet? (Perhaps I'm wrong (again) but I though naked x and hidden y were opposite sides of the same coin.)
_________________
Simes
www.sadmansoftware.com/sudoku
Back to top
View user's profile Send private message Visit poster's website
droid42

Joined: 29 Jul 2005
Posts: 20
:

Items
PostPosted: Wed Aug 03, 2005 11:31 am    Post subject: Reply with quote

angusj wrote:
droid42 wrote:
I even found a puzzle that has a genuine naked sextuplet in it depending on the solving route taken.

Hi Ian.
Firstly, thanks for your interesting post.
Anyhow, there's never any need to look for quintuplets, sextuplet etc because if there is one there's also a hidden or naked triple or quad in the same group (unless you're solving puzzles that are larger than the standard 9x9 grid).


Yes, interesting point. I suspect the only reasons my program finds them are a) it can and b) it was probably running the full difficulty analysis as described above with solitary-pair/triple detection switched off.

I ran it against a 25x25 earlier on and it uncovered a solitary 14-tuplet, yikes.

Wouldn't 12 23 34 45 15 in a 3x3 box be a valid quintuplet? (or would such a combination of possibles be impossible for some other reason?)

Ian.
Back to top
View user's profile Send private message
angusj
Site Admin
Joined: 18 Jun 2005
Posts: 406
:

Items
PostPosted: Wed Aug 03, 2005 12:07 pm    Post subject: Reply with quote

droid42 wrote:
Wouldn't 12 23 34 45 15 in a 3x3 box be a valid quintuplet?

Yes, perfectly valid. However, that leaves 6,7,8 & 9 in the remaining cells, so worst case (assuming no cells assigned in that box) there'll be a naked or hidden quad.
Back to top
View user's profile Send private message Visit poster's website
droid42

Joined: 29 Jul 2005
Posts: 20
:

Items
PostPosted: Wed Aug 03, 2005 3:57 pm    Post subject: Reply with quote

angusj wrote:
droid42 wrote:
Wouldn't 12 23 34 45 15 in a 3x3 box be a valid quintuplet?

Yes, perfectly valid. However, that leaves 6,7,8 & 9 in the remaining cells, so worst case (assuming no cells assigned in that box) there'll be a naked or hidden quad.


Doh! ... yes, of course...

Ian.
Back to top
View user's profile Send private message
upsidedownface

Joined: 27 May 2005
Posts: 5
:
Location: Leeds

Items
PostPosted: Mon Sep 12, 2005 9:57 pm    Post subject: Classifying Difficulty Reply with quote

I have started trying to classify the difficulty, but I think I am only looking at what looks difficult to old low perceptionists like me.

At any "level" there is a number of candidates, often with more than one reason for being a candidate.
The first level (=0) has the given problem with the "seeds".
Having identified the candidates, I then add them all to the grid, increase the level by 1 and see how many new candidates there are.

When there are no candidates, if the number of unfilled cells is zero the problem is "solved" or if greater than zero the program is "stuck".

This is "Easy"

My program produces a log file of the steps it took, including the number of candidates at each level.

Guardian no. 108 Easy
level:- 1 2 3 4 5 6 7
count:- 2 3 5 8 12 14 11 "solved"

Guardian 095 Medium
level:- 1 2 3 "stuck"4 5 6 7 8 9 10 11 12
count:- 3 2 1 1 1 1 3 6 8 12 10 6 "solved"

For a "stuck" program/problem I display all the possibilities in the unfilled cells. My program then looks for disjoint sets, but it isn't yet very sophisticated, and I don't know what most of them look like, so I can't program it!

This is "Difficult"

As a last resort I choose a setting for some cell which has only 2 possibilities and start the solving process again. Sometimes I have to try this more than once, but I haven't yet had to do it more than 3 times.

This is "Cheated"

Les.
Back to top
View user's profile Send private message
Agent Allen

Joined: 01 Oct 2005
Posts: 34
:

Items
PostPosted: Sat Oct 01, 2005 4:44 pm    Post subject: This is in the territory Reply with quote

This topic is interesting.
Is there a really good catalogue defining all the methods Howard listed?
The first couple are of course very obvious, but the more complex the method the less illuminating the name.

The topic [Setting Sudoku for Human Solvers (Sudokorists)] I've started is in the same sort of area as Howard.
I'm proposing the standardise 'cost' on (notional) expert minutes and call them AESMs.

Actually the program I'm writing works in expert seconds but that seems too much precision to report in credibly.
I also suspect that care needs to be taken to go back and do it badly again to get the correct rating if you use a good guess.
I've noticed some listings where its possible that X says his tool solves problem A 10x faster than Ys for no reason other than Xs tool works well on problem A. It might be 10x slower on problem B, but how would we know? In the difficult problem topic someone points out his program thinks a problem is many times easy when rotated. He seems to realise that isn't sensible.
He's not rating the problem anymore but subjective luck of the draw in his method.
If you implement any of the 'manouvres' without rotational symmetry I certainly think your weightings won't be very objective.
_________________
Agent A
Back to top
View user's profile Send private message
Lummox JR

Joined: 07 Sep 2005
Posts: 202
:

Items
PostPosted: Sat Oct 01, 2005 8:39 pm    Post subject: Reply with quote

I believe the DJSS stands for disjoint subsets, and I'm not sure what USS is but that also is a subset rule. Notice he also said those are hidden and naked pairs. The most commonly accepted term for each of these rules is naked and hidden subsets.

The X-wing and swordfish patterns are basically also subset rules, but positional rather than relating to the placements of different digits in a house (block/column/row). They're quite rare, and even astoundingly rarer is the 4-row/col swordfish, which is also sometimes called jellyfish.

Forcing chains is implemented in various ways, but the most common way is to find a chain of cells which each have only 2 candidates, such that one might be 18, another 38, another 36, then 16, etc. If you have an even number of such cells linked and the first and last ones intersect at a target cell that shares their common "endpoint candidate" (in this case the 1), you can remove the candidate from that cell. (The logic is that if the target cell is 1, the 18 must be 8, the 38 must be 3, the 36 must be 6, the 16 must be 1, and there's a contradiction.) I'm not fond of this approach, though, since it takes a long time to find anything and even then will miss things. Coloring is considerably easier, and advanced coloring while difficult will find anything that forcing chains can.

I like your idea of representing score in seconds, though that would be difficult to quantify, and I don't think it can be standardized. You're also quite right in noticing that some methods work better for some problems than others. My current approach to assessing difficulty is a lot like Howard's, but without the first-try rule (I'll probably implement something of the sort eventually) and different weights, and a lot more possible tests it can run. It always adds the score of the first test that finds anything. I'd get more accurate results, however, if I implemented this in a recursive algorithm that could try harder tests as well, or 2nd or 3rd possibilities for the same tests, to try to find techniques that will crack the puzzle faster. It might be that coloring on the 2's reveals something of little value while coloring the 8's will find a key insight to the solution. I'd have to somewhat revamp my solver to do that.
Back to top
View user's profile Send private message
Agent Allen

Joined: 01 Oct 2005
Posts: 34
:

Items
PostPosted: Sat Oct 01, 2005 9:38 pm    Post subject: Yes we can standardise Reply with quote

I'm saying we would like to measure how hard a puzzle is by telling someone how long we think an expert sudokorist would take on average.
The measure I'm proposing is at least it is defined and meaningful which is quite a lot. Difficult to measure accurately at this time, but that can be worked on.
Then you can start to ask questions about puzzles - how long does this one take?
Questions about sudokorists - how long do you take? What's your multiplier? If you take on average 0.95*, you rock.
Questions about methods - how long does it take and does it payoff?
Is it worth the fag? Is it worth trying early or only if stuck?

It seems to me that if you measure cost but cost isn't in a physical commodity its largely meaningless.
Howards numbers look about right relative to each other, but only because I'm rescaling by /200 to get minutes.
There in the ball-park anyway looking at the rankings from the Times.
_________________
Agent A
Back to top
View user's profile Send private message
Agent Allen

Joined: 01 Oct 2005
Posts: 34
:

Items
PostPosted: Sat Oct 01, 2005 9:50 pm    Post subject: Thanks also for the tips on methods Reply with quote

Also thanks for the methods tips.
I found a great catalogue on

http://users.pandora.be/vandenberghe.jef/sudoku/index.html?how2solve
That lists nearly everything talked about in these forums.
Its also very well explained.
Which, err is better than some of the posts round here, err sorry guys, but err some of the posting and certainly most of the C programs are completely unreadable Embarassed Rolling Eyes .
I'm an ex-C programmer and in my day we used local variables.
Why? So we could laugh at the COBOL progammers for one thing.
Well actually because structured programming showed that made programs both safer more widely usable and readable.
Can't be bad.

There's an old C saying "if was hard to write it should be hard to read".
It is a joke, is everyone clear about that?

---
Off rant, I agree that considering time cost, I'm not convinced all of the methods catalogued round the place deserve house room.
Some of them are awfully time consuming to do and only so successful.
A bit of brute force seems to be worth a try in preference to some.
That said, without hard and fast cost measures who can say...
_________________
Agent A
Back to top
View user's profile Send private message
Ruud
Site Admin
Joined: 17 Sep 2005
Posts: 708
:
Location: Netherlands

Items
PostPosted: Sun Oct 02, 2005 12:12 am    Post subject: Reply with quote

Pump up the score for those @#$ naked singles.

It's easier to spot a locked pair than a naked single.
Back to top
View user's profile Send private message Visit poster's website
Display posts from previous:   
Post new topic   Reply to topic    Sudoku Programmers Forum Index -> Setting sudoku All times are GMT
Goto page 1, 2  Next
Page 1 of 2

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum
Sudoku Programmers topic RSS feed 


Powered by phpBB © 2001, 2005 phpBB Group

Igloo Theme Version 1.0 :: Created By: Andrew Charron