Sudoku Programmers Forum Index

 
 FAQFAQ   SearchSearch   MemberlistMemberlist   UsergroupsUsergroups   RegisterRegister   ProfileProfile   Log inLog in          Games  Calendar

Log in to check your private messagesLog in to check your private messages   

BB Sudoku V1.0
Goto page Previous  1, 2, 3
 
Post new topic   Reply to topic    Sudoku Programmers Forum Index -> Programming sudoku
View previous topic :: View next topic  
Author Message
dobrichev

Joined: 18 Dec 2009
Posts: 28
:

Items
PostPosted: Thu Jan 07, 2010 12:59 am    Post subject: improvements Reply with quote

briturner wrote:
There is always a bit of a problem with tweaking code for efficiency, since it is very compiler and environment (os / processor / hardware) dependent. I compiled and tested the improvements you recommended (compiled directly from the file you emailed me), and for me, it actually slightly increased the amount of time needed.

I agree.
Here are the timimgs i did
Code:
                  48826/17  1465   50000   all
Original binary     1.118  0.0765  1.445  2.636
Orig. src+MS        1.090  0.0763  1.430  2.595
Touched src+MS      1.100  0.0793  1.360  2.538
Orig. src+Intel     0.875  0.0628  1.087  2.023
Touched src+Intel   0.874  0.0630  1.090  2.021
where:
- the puzzle sets are those I found a link to.
- the 17-clue set consist of 48826 puzzles. Hope they are increased in time.
- "all" set is a simple concatenation of the other three sets. It is a measurement, not sum of the other measurements.
- times are in seconds. I run each executable 3 - 4 times and recorded the minimal time.

Conclusions:
- for the hardest 1465 set the modified code runs slower.
- the performance is significantly dependent on compiler.
- the benefit from the ChangedGroup cache remains questionable. Removing it was the most major change suggested.
Back to top
View user's profile Send private message
dobrichev

Joined: 18 Dec 2009
Posts: 28
:

Items
PostPosted: Thu Jan 07, 2010 10:15 am    Post subject: apples to apples Reply with quote

briturner wrote:
Greetings again,

It is going to take a bit more for me to get an apples to apples timing comparison here. 2 things I noticed are:

a) you buffer the entire file IO before starting the timing, and

b) more importantly, it looks like you are attempting to run on multiple processors (I see things like "#pragma omp parallel for schedule(static, 1) reduction(+:nSolved)").

While these are legitimate speedups, it does make the timing of the actual solving algorithm more difficult.

I think I will try putting both solvers in the same timing stub, which will help give me better timings.

Yes, I explicitly noticed there is buffering and parallel processing.

Leaving the actual solver untouched, I modified the calling code in FSSS attempting to obtain some comparable time measuring. The combined set of about 100K puzzles described in the previous post was used. Here are the results:
Code:
                          1st sol  2nd sol   write
BB Original binary         2.636    3.090    5.705
BB Orig. src+MS            2.595    3.044    6.032
BB Touched src+MS          2.538    3.055    5.960
BB Orig. src+Intel         2.023    2.408    4.030
BB Touched src+Intel       2.021    2.407    4.048
FSSS SP unbuff MS          2.250    2.766    2.516
FSSS SP unbuff Intel       1.953    2.438    2.188
FSSS SP chunks MS          2.281     n/a     2.422
FSSS SP chunks Intel       1.985     n/a     2.141
FSSS MP chunks Intel       1.078    1.313    1.219
FSSS MP full buff Intel    0.922    1.156    0.938
where:
- colums are for "stop on first solution", "stop on second solution", and "stop on first solution and output the result" respectively.
- rows are for different compilations of BB and FSSS solver. Differences between BB Original Source and BB Touched Source are not matter of this analysis.
- all BB results and the top 4 FSSS results are single threaded. Last 2 results are multithreaded and measured on dual core Pentium D - no hyperthreading, one thread per core = total 2 threads.
- MS and Intel are the compilers used for measuring the exactly same code. No multithreading is measured with MS compiler.
- unbuffered I/O means that the timer is started and the puzzles are read and solved one by one. It is colser to BB code but does not benefit from parallelization and respective limitations.
- chunks consist of 256 buffered puzzles. There is some overhead in reading, but writing in chunks instead of switching between read and write seems to compensate this overhead.
- in multithreaded compilations, the puzzles from the buffer are solved in parallel - first puzzle by the first logical processor, second by the second processor, etc. On a four processor machine the fifth puzzle will wait for first processor to finish the first puzzle and then its processing will start.
- the latest method buffers the whole I/O and the R/W time is excluded from the measurement. This is the most precise "pure" measurement, unfortunately inapplicable to BB solver. The time difference between this method and the rest is an estimation for the whole file I/O process for the FSSS solver.

Comments:
- FSSS is working measurable faster on the same puzzle set when compiled with the same compiler.
- BB has avdanced logic in the input parsing, which may (or may not) result to the observed the time differences.
- BB writes the results significantly slower. This is not measured by others. Actually this is not important for me.
- It is not possible to decide which solver is faster without simplifying the input parsing in the BB solver. Nevertheless the possible difference is maximum 20 - 30%.
- Parallelization, when done on homogenous and large enough chunks shows scalability close to the theoretical.

Note: In FSSS Solver there is code taken from BB Solver, but it is not one of the BB mods.
Back to top
View user's profile Send private message
JasonLion

Joined: 16 Nov 2008
Posts: 61
:
Location: Silver Spring, MD

Items
PostPosted: Thu Jan 07, 2010 3:11 pm    Post subject: Reply with quote

You might want to try my solver, http://www.enjoysudoku.com/JSolve10.zip. It is 10% to 20% faster than either fsss or BB Sudoku (at least on my machine/compiler) My focus was on the speed of determining if there was a unique solution and to make the solver as portable as possible. I started with a few of the algorithms used by BB Sudoku, and rewrote them from scratch into a cleaner, more portable, form. Along the way I discovered several improvements.

I handle the stack differently, using a single global active state, and copying saved states to and from the stack. That gives some speed advantages when accessing the current state which more than make up for the extra copy back from the saved state stack when unwinding.

I changed the handling of clearing bits in the queue processing for the case where there are many items in the queue. Instead of using flags on each house and then checking the marked houses, I do one check of the entire board at the end of singles processing. When there are many items in the queue this is significantly faster. With many items in the queue, essentially all of the houses get marked anyway. By checking the entire board I avoid the overhead of marking the houses and checking to see if they have been marked.

Note: The main routine is fairly primitive and won't work as written on Windows. The solver should be fine on Windows. I would love it if someone feels like contributing Windows support for the main routine.
Back to top
View user's profile Send private message
dobrichev

Joined: 18 Dec 2009
Posts: 28
:

Items
PostPosted: Fri Jan 08, 2010 9:15 pm    Post subject: JSolver Timing Reply with quote

JasonLion wrote:
You might want to try my solver... It is 10% to 20% faster than either fsss or BB Sudoku (at least on my machine/compiler).

Thank you for sharing your code.

Here are the measurements I did on my machine.
Code:
      1st sol  2nd sol
BB     2.015    2.399
FSSS   1.938    2.422
JS     2.062    2.468

Times are in seconds, about 100K puzzles.

I didn't played too much with compiler's optimization options, but tried the standard ones and those which work best for FSSS and BB code.

Could you share your measurements?
Back to top
View user's profile Send private message
dobrichev

Joined: 18 Dec 2009
Posts: 28
:

Items
PostPosted: Fri Jan 08, 2010 9:48 pm    Post subject: Reply with quote

JasonLion wrote:
You might want to try my solver, http://www.enjoysudoku.com/JSolve10.zip....
Note: The main routine is fairly primitive and won't work as written on Windows. The solver should be fine on Windows. I would love it if someone feels like contributing Windows support for the main routine.

This may help you.
Code:
...
#include <ctype>
#ifdef _WINDOWS
#include <time>
#else
#include <sys>
#endif //_WINDOWS

#include "JSolve.h"

#ifdef _WINDOWS
static unsigned long MilliTime(void) {return clock() / (double)CLOCKS_PER_SEC * 1000;}
#else
static unsigned long
MilliTime(void)
{...}
#endif //_WINDOWS
...

Although it is not a must _WINDOWS to be defined for windows compilations, some IDE do it. Much people are checking for WIN32 which presumes Win API usage but it is not the case in your application.

Note: In the code above, in the #include preprocessor read "ctype" as "ctype.h", "time" as "time.h", and "sys" as "sys/resource.h".
Back to top
View user's profile Send private message
JasonLion

Joined: 16 Nov 2008
Posts: 61
:
Location: Silver Spring, MD

Items
PostPosted: Fri Jan 08, 2010 10:14 pm    Post subject: Reply with quote

I am testing on MacOS on a 2.66 GHz Core 2 Duo. I am compiling all three programs with gcc -O3 in 64 bit mode. fsss is version 8.1 and always uses the '*' command line option. BB Sudoku is version 1.0 and always uses the -D0 -S2 command line options.

Code:

Twenty copies of top1465
fsss V8.1   1.693
JSolve      1.303
BBV1.0      1.563

A version of sudoku17 with 48826 puzzles
fsss V8.1   0.577
JSolve      0.569
BB V1.0      0.727

multipuzz
fsss V8.1   9.491
JSolve      9.579
BB V1.0      12.571

top50000
fsss V8.1   0.917
JSolve      0.841
BB V1.0      1.115
Back to top
View user's profile Send private message
Display posts from previous:   
Post new topic   Reply to topic    Sudoku Programmers Forum Index -> Programming sudoku All times are GMT
Goto page Previous  1, 2, 3
Page 3 of 3

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum
Sudoku Programmers topic RSS feed 


Powered by phpBB © 2001, 2005 phpBB Group

Igloo Theme Version 1.0 :: Created By: Andrew Charron