| briturner
| Joined: 14 Oct 2008 | Posts: 75 | : | | Items |
|
Posted: Sun Oct 11, 2009 5:53 pm Post subject: BB Sudoku V1.0 |
|
|
Greetings,
Here is the latest BB Sudoku release. It was mainly driven by a request to clarify the copyright so that others can use the code in their own programs. The notice now allows the code to be freely used for personal, educational, or non-commercial purposes (provided the notice is kept).
There are also a few minor changes, and gfs suggested a change to the MakeGuess routine that provided around a 10% improvement.
The files are available at http://sites.google.com/site/bbsudokufiles (thanks to lkSudoku for setting up the easier download site)
Code: | SPEED COMPARISONS
------------------------------------------------------------------------------
Here are some timings I did on a 1.83 GHz PC with different solvers:

                     Top 50000   Top 1465   Pearly6000   Sudoku 17
  Sudocoo                2.656      0.578        2.031     123.609
  fast_solv_9r2          5.000      0.421        3.203       4.593
  Sudocoup               4.281      0.406        9.640       2.953
  zerodoku v2            2.515      0.375        3.015       2.703
  BB_Sudoku v0.1         1.982      0.224        2.108       1.856
  BB_Sudoku v0.6         1.355      0.073        1.221       0.903
  BB_Sudoku v1.0         1.094      0.059        0.861       0.820

                        zeroth   600k Gen   500k bb puzzles
  Sudocoo               48.859        dnf             1.453
  fast_solv_9r2         55.859     28.921             1.515
  Sudocoup              43.515     57.032             6.219
  zerodoku v2           27.937    677.187             1.671
  BB_Sudoku v0.1        22.263    528.890             1.206
  BB_Sudoku v0.6        13.335      2.024             0.746
  BB_Sudoku v1.0        11.340      1.912             0.519

Note: With BB_Sudoku v1.0, I have moved to running Win 7 RC and VC 2008.  I
believe this added a couple percent improvement.
|
brit
Last edited by briturner on Tue Oct 13, 2009 7:18 pm; edited 1 time in total |
| wapati
| Joined: 12 Jun 2007 | Posts: 622 | : | Location: Canada | Items |
|
Posted: Sun Oct 11, 2009 9:38 pm Post subject: |
|
|
For big changes I get hardware. Niggles, who cares? |
| Adak
| Joined: 27 Feb 2008 | Posts: 87 | : | | Items |
|
Posted: Mon Oct 12, 2009 6:38 am Post subject: |
|
|
I have version 0.5 of your solver (with just a few tweaks), running with my grid searcher program. It may not be the fastest version, but it does a fine job, running 24/7 on a Win 7 pc.
Congratulations Brit, on improving your already awesome solver, yet again. |
| PIsaacson
| Joined: 17 Jan 2006 | Posts: 47 | : | Location: Campbell, CA | Items |
|
Posted: Tue Oct 13, 2009 7:09 am Post subject: |
|
|
Brian,
Thanks for releasing the code to the general public. The first adaptation will probably be for producing a minimal puzzle generator with controlled bias. I posted the following link on our Sudoku Players' Forum, containing mods that make your solver code a callable library for one of our generic generators: http://pisaacson.fileave.com/Sudoku/suexg-cb.zip
If you want to follow the thread to see how your code is being used, it's http://www.sudoku.com/boards/viewtopic.php?t=14615 |
| sudoking
| Joined: 20 Oct 2009 | Posts: 40 | : | | Items |
|
Posted: Tue Oct 20, 2009 11:28 am Post subject: |
|
|
An interesting algorithm. However, when I loaded it into a .net 1.1 project it wouldn't compile. Then I saw this line:
Code: | #include "bb_sudoku_solver.cpp" // Include the solver |
This isn't how the #include directive is meant to be used; #include is intended for header files. The line is needed here because both bb_sudoku.cpp and bb_sudoku_solver.cpp need access to variables declared in the bb_sudoku_tables.h file. However, this is not the way to achieve that.
The variables should be defined in one source file, such as bb_sudoku.cpp, declared with the extern keyword in the header file, and then both source files should include that header. Here's a simple example:
Code: |
**** START OF FILE main.c ****
#include "header.h"
int i;
int main(int argc, char *argv[]) {
i = 1;
print_i();
}
**** END OF FILE main.c ****
**** START OF FILE func.c ****
#include <stdio.h>
#include "header.h"
void print_i(void) {
printf("%d\n", i);
}
**** END OF FILE func.c ****
**** START OF FILE header.h ****
#ifndef _HEADER_H
#define _HEADER_H
void print_i(void);
extern int i;
#endif
**** END OF FILE header.h ****
|
Here, the variable "i" is defined in main.c and declared extern in header.h, so any source file that includes the header can see it. As such, func.c can access "i". I hope this makes sense. |
| sudoking
| Joined: 20 Oct 2009 | Posts: 40 | : | | Items |
|
Posted: Tue Oct 20, 2009 12:37 pm Post subject: |
|
|
I also get the following warning when I compile:
bb_sudoku_solver.cpp(702) : warning C4146: unary minus operator applied to unsigned type, result still unsigned
This occurs in the "MakeGuess" function. Here's the line in question: v = v & -v;
"v" is declared as an unsigned integer, and the compiler is saying that applying the unary minus operator to an unsigned value still yields an unsigned result. Without analysing the routine, which I don't have time for ATM, there's no way for me to tell whether this could cause a bug. That's really something for the author to determine. |
| humble_programmer
| Joined: 27 Jun 2006 | Posts: 69 | : | Location: Colorado Springs, USA | Items |
|
Posted: Tue Oct 20, 2009 12:38 pm Post subject: |
|
|
sudoking wrote: | This isn't how the #include statement is supposed to be used. |
I beg to differ: using the #include statement to embed one source file inside another actually does have legitimate value. Programmers tend to organize source code into multiple files for convenience--it's easier to work with several smaller files than one monolithic file--but most compilers will only optimize within a single source file. In situations where performance is the most important criterion (such as a command-line sudoku solver) you can actually eke out a performance gain by combining all of the source files into a single file and letting the compiler optimize the whole thing at once. Several open source chess engines do this as well: see Crafty for a mainstream example.
For an example of why this works, imagine that a method in source file A is only called once in the entire application, and that call is from source file B. When compiling files A and B independently, the compiler has no idea how many times the method is called, and leaves it intact. However, if both files are combined, the compiler will now recognize that the method is only called once, and (can/may/will) inline it. At the very least, this eliminates the call set-up and tear-down, and may also lead to other optimizations, such as hoisting complex expressions or register sharing.
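Here is a toy sketch of that situation (purely illustrative, nothing to do with the solver's actual code):
Code: |
/* score.c -- compiled on its own, the compiler must emit a real, callable function */
int score(int v) { return v * v + 1; }

/* main.c -- compiled on its own, the compiler only sees the declaration */
extern int score(int v);
int main(void) { return score(3); }

/* If main.c instead does  #include "score.c"  both functions land in one
   translation unit, and the single call can be inlined -- here the whole
   program can collapse to "return 10;". */
|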
Bottom line: if it's stupid but it works...then it's not stupid. _________________ Cheers!
Humble Programmer
,,,^..^,,,
www.humble-programmer.com |
| sudoking
| Joined: 20 Oct 2009 | Posts: 40 | : | | Items |
|
Posted: Tue Oct 20, 2009 1:31 pm Post subject: |
|
|
humble_programmer wrote: | Programmers tend to organize source code into multiple files for convenience--it's easier to work with several smaller files than one monolithic file--but most compilers will only optimize within a single source file. |
I agree, but with modern compilers and development environments this really isn't necessary any more. Placing all the code in a single file doesn't make things that much more complicated. Modern IDEs, à la the MS .NET range, which I believe is where this code was originally cut, allow blocks of code to be hidden, fast lookup of function definitions and declarations, etc. There's no longer a great need to split code across separate source files for simplicity's sake, especially for a project as simple as this.
In the current example, there really isn't much optimisation to be had at all from a single source file. In fact, in the comments it said this was done to make the project easy to compile.
Quote: | For an example of why this works, imagine that a method in source file A is only called once in the entire application, and that call is from source file B. When compiling files A and B independently, the compiler has no idea how many times the method is called, and leaves it intact. However, if both files are combined, the compiler will now recognize that the method is only called once, and (can/may/will) inline it. At the very least, this eliminates the call set-up and tear-down, and may also lead to other optimizations, such as hoisting complex expressions or register sharing. |
Agreed. (Assuming you meant "called many times from a single location". If it were called only once, then there's not a lot of optimisation to be had there.) However, the only candidate for such a call in this case appears to be the solver routine, which isn't declared with the inline keyword.
Anyway, each to their own. |
| lkSudoku
| Joined: 16 May 2009 | Posts: 60 | : | | Items |
|
Posted: Tue Oct 20, 2009 5:22 pm Post subject: |
|
|
Including C source files is not something I encounter often.
While it is true that in some cases optimization is done on single files (such as tail-call or inline optimizations), modern compilers and linkers have an option to perform optimization across multiple files.
In Microsoft's MSVC this is the Whole Program Optimization option, together with the LTCG linker flag.
Other compilers and linkers have their own specific flags.
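For reference, assuming the solver were split into separately compiled files, typical invocations would look roughly like this (I have not tested these against this project):
Code: |
MSVC (whole program optimization + link-time code generation):
  cl /O2 /GL bb_sudoku.cpp bb_sudoku_solver.cpp /link /LTCG

gcc (link-time optimization across translation units):
  gcc -O3 -flto -xc bb_sudoku.cpp bb_sudoku_solver.cpp -lrt -o bb_sudoku.out
|
|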
| briturner
| Joined: 14 Oct 2008 | Posts: 75 | : | | Items |
|
Posted: Tue Oct 20, 2009 9:27 pm Post subject: |
|
|
Greetings,
I can address the 2 issues raised:
1) The include of the .cpp file: This was on purpose and, while it may or may not help with the optimization, it was really for simplicity's sake. I did not want to include a makefile, so this method allowed me to still compile with a simple command line like "gcc -O3 -lrt -xc -obb_sudoku.out bb_sudoku.cpp". I thought I made this clear in the comments, right after the file list, but I should have made it more clear at the include point.
2) v = v & -v; is correct. It is a bit-twiddling method for isolating the lowest set bit. The '-' is really being used to take the two's complement of the binary value, not the arithmetic negative of the value. Therefore, you can ignore the warning.
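For anyone curious, here is a minimal standalone sketch of the trick with an arbitrary value (not a line taken from the solver):
Code: |
#include <stdio.h>

int main(void)
{
    unsigned int v = 0x2C;            /* binary 101100 */
    unsigned int lowest = v & -v;     /* MSVC warns C4146 here, but the two's
                                         complement isolates the lowest set bit */
    printf("v = 0x%X, lowest set bit = 0x%X\n", v, lowest);   /* prints 0x2C and 0x4 */
    return 0;
}
|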
brit |
| sudoking
| Joined: 20 Oct 2009 | Posts: 40 | : | | Items |
|
Posted: Wed Oct 21, 2009 10:37 am Post subject: |
|
|
briturner wrote: | I did not want to include a make file, so this method allowed me to still compile with a simple command line like "gcc -O3 -lrt -xc -obb_sudoku.out bb_sudoku.cpp". |
It's been a while since I compiled anything with gcc; however, I'm pretty sure you can compile multiple source files from the command line with something like:
gcc -O3 -lrt -xc -obb_sudoku.out bb_sudoku.cpp bb_sudoku_solver.cpp
Quote: | I thought I made this clear in the comments, right after the file list, but I should have made it more clear at the include point. |
Like most eager programmers, I jumped straight into the code without first reading the comments.
briturner wrote: | 2) v = v & -v; is correct. It is a bit-twiddling method for isolating the lowest set bit. The '-' is really being used to take the two's complement of the binary value, not the arithmetic negative of the value. Therefore, you can ignore the warning. |
Is there any reason that v is an unsigned int? I changed it to an int, and it didn't seem to make any difference, aside from the fact that the compiler warning is gone.
Also, I managed to get about a 10% increase in performance by making some simple changes. Firstly, I cut out all the code that's no longer needed. Most of the code in the DecodePuzzleString routine is redundant: the puzzles I'm feeding the solver are all 81 chars of either 1-9 or '.', so all the code in that routine for handling other formatting characters was unnecessary.
But the biggest performance increase came from creating a pointer (Gp) to G[PIdx] and updating it whenever PIdx changes. Then every instance of G[PIdx].xxx can be replaced with Gp->xxx. I tried this on 20,000,000 puzzles from my generator (many of which had multiple solutions), and while the original code took 1211s, the new version took only 1086s. I haven't had time to look for any other instances where a similar optimisation could be made, but I will. I'll keep you posted.
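In case anyone wants to apply the same pattern elsewhere, here is a rough sketch of the idea using a made-up stand-in for the solver's grid structure (the real type in the code is different):
Code: |
/* Hypothetical stand-in for the solver's per-position state -- not the real type. */
typedef struct { unsigned Grid[81]; unsigned InSets[27]; } State;

State G[64];
int   PIdx = 0;

void Example(void)
{
    State *Gp = &G[PIdx];       /* cache the address once...                  */
    Gp->Grid[0]   = 1;          /* ...then use Gp->xxx instead of G[PIdx].xxx */
    Gp->InSets[0] = 0;

    PIdx++;                     /* whenever PIdx changes...                   */
    Gp = &G[PIdx];              /* ...re-point Gp                             */
}
|
|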
| sudoking
| Joined: 20 Oct 2009 | Posts: 40 | : | | Items |
|
Posted: Thu Oct 22, 2009 1:12 pm Post subject: Another performance improvement. |
|
|
I changed the following macro, which yielded another performance increase of about 7%.
From:
Code: | #define MarkChanged(x) { ChangedGroup[C2Grp[x][0]] = ChangedGroup[C2Grp[x][1]] = ChangedGroup[C2Grp[x][2]] = Changed = 1; } |
To:
Code: | #define MarkChanged(x) {register unsigned char const *ucp = C2Grp[x]; ChangedGroup[*(ucp++)] = ChangedGroup[*(ucp++)] = ChangedGroup[*(ucp++)] = Changed = 1; } |
| JasonLion
| Joined: 16 Nov 2008 | Posts: 62 | : | Location: Silver Spring, MD | Items |
|
Posted: Thu Oct 22, 2009 7:04 pm Post subject: |
|
|
That code is not portable. There is no guarantee of when the post-increments will happen; they could all happen after the assignments, which would cause incorrect results. |
| sudoking
| Joined: 20 Oct 2009 | Posts: 40 | : | | Items |
|
Posted: Thu Oct 22, 2009 8:26 pm Post subject: |
|
|
JasonLion wrote: | That code is not portable. |
That's not really an issue; the original code is no more portable than the code I've added, as it uses non-standard calls.
Quote: | There is no guarantee of when the post increments will happen, they could all happen after the assignments, which would cause incorrect results. |
I've always thought that a post-increment was applied immediately after the assignment. I've coded that way for years, across a broad range of compilers, and I've used similar code many times without issue. I tested this code on MSVC and gcc without a problem.
Can you give an example of a compiler that would spit the dummy at this code?
An interesting note is that I recompiled with this:
Code: | #define MarkChanged(x) {register unsigned char const *ucp = C2Grp[x]; ChangedGroup[*(ucp++)] = 1; ChangedGroup[*(ucp++)] = 1; ChangedGroup[*ucp] = Changed = 1; }
|
and the performance increase disappeared, but only on the MSVC version. (NOTE: The last ++ wasn't needed in this version, because I could guarantee the order in which the elements of ChangedGroup were assigned.) |
| JasonLion
| Joined: 16 Nov 2008 | Posts: 62 | : | Location: Silver Spring, MD | Items |
|
Posted: Fri Oct 23, 2009 1:58 am Post subject: |
|
|
The order of execution of post-increments relative to the expression they are in is explicitly not defined and the compiler is allowed to do it in a number of different ways. The old Borland C compiler would delay all post increments until the very end of the expression (after the assignments).
As a practical matter, most modern x86 compilers will do what you expect them to do. On other architectures, the results are less likely to be suitable. But there are no guarantees.
Writing code like this tends to cause trouble. It might work for now, but it could stop working at any time in the future. Some minor change to the compiler's optimizer could change the results. If that happened, it would take quite a while before you could pin down what had gone wrong with the code.
I haven't seen anything in briturner's code that has a similar problem. There are portability issues with some of the includes and the timing routines, but they are obvious and easily fixed. Introducing an undefined order of execution issue could cause much trickier problems that would be difficult to pin down without completely understanding what the code was intended to do.
I am fairly sure that there is another way to write it that gets the speedup without introducing order-of-execution ambiguity, but it will take some experimenting with the compiler to find it.
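For example, something along these lines might keep the benefit of the cached pointer with no post-increments inside the expression (a sketch only; it would need timing to see whether the speedup survives):
Code: | #define MarkChanged(x) { register unsigned char const *ucp = C2Grp[x]; ChangedGroup[ucp[0]] = ChangedGroup[ucp[1]] = ChangedGroup[ucp[2]] = Changed = 1; } |
|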