Sudoku Programmers Forum Index

 
 FAQFAQ   SearchSearch   MemberlistMemberlist   UsergroupsUsergroups   RegisterRegister   ProfileProfile   Log inLog in          Games  Calendar

Log in to check your private messagesLog in to check your private messages   

Dancing Links - Can it go even faster?

 
Post new topic   Reply to topic    Sudoku Programmers Forum Index -> Programming sudoku
View previous topic :: View next topic  
Author Message
Ruud
Site Admin
Joined: 17 Sep 2005
Posts: 708
:
Location: Netherlands

Items
PostPosted: Tue Nov 15, 2005 9:44 pm    Post subject: Dancing Links - Can it go even faster? Reply with quote

Hi,

I would like to open a new discussion here, on the topic of dancing links. I have used this code posted by antony as the basis for the DLX implementation in Sudo Cue, but I did make a few modifications to speed it up a little more.

Maybe other people have found different ways to speed up the process, and I am very interested in your opinions.

These are the parts I optimized:

1. The "header" nodes (columns) are linked according to their size. For each size (1 through 9) there is a root node. These root nodes are linked with the Up/Down links. Finding the best column is now done in only a few steps. No need to maintain size, just relink a node to the up/down peer of its root node.

2. Each row has 4 associated nodes. The structure may be simpler connecting them with left-right links, but performance is better when each node has direct links to the other 3 nodes. No left-right loops with end-of-loop test are required.

3. When the clues are "loaded", the program will already "cover" the affected columns. DLX will not gain much, since the search is already optimized. Still, a few lines of code are squeezed out this way.

Ruud.
Back to top
View user's profile Send private message Visit poster's website
dukuso

Joined: 14 Jul 2005
Posts: 424
:
Location: germany

Items
PostPosted: Wed Nov 16, 2005 7:06 am    Post subject: Reply with quote

how much speed do you gain with 1),2),3) ?

The problem is, what to do on ties, when there are several columns
of size 2, but none of smaller size.

Have you tried 16*16 ? I think 9*9 is too small to
estimate the effect on algorithm improvements.

I implemented the Row/Column <--> Block interaction
yesterday. (rule JAAP1, N(c1)<N(c2), r in N(c2)-N(C1)
then r can be removed)

but it was slower than before. Even if I could test this in zero time,
it would only give 15% more speed on 16*16.
I only did it for columns of size 2, though.

I got 6fold better speed on 16*16 with a _bug_ recently, I still don't
really understand why. I attach the C-code below.
The idea is to remember which choices were better on ties
and then reuse this info.



Code:


// speed-optimized for hard 16*16s , new trick: remember how good
// column-choices were ! (see lines with a ~)

#include <stdio.h>
#include <time.h>
#define N 4 // 16*16-sudoku
#define N2 N*N
#define N4 N2*N2
#define MWC ( (zr=36969*(zr&65535)+(zr>>16)) ^ (wr=18000*(wr&65535)+(wr>>16)) )
 int A[N2+9][N2+9],Ur[N2*N4+1],Uc[4*N4+1],V[N2*N4+1];
 int Rows[4*N4+1],Cols[N2*N4+1],Row[4*N4+1][N2+1],Col[N2*N4+1][5];
 int C[N4+1],I[N4+1],Node[N4+1],Two[N4*4+9],C0[N4*4+9];
 int M1[N4+9][4*N4+9],N1[N4+9][4*N4+9],Z1[N4+9][4*N4+9]; //~
 int t1,i4,c0,m0,m1,try,i,j,k,l,r,r1,c,c1,c2,n=N4*N2,m=4*N4,x,y,s;
 int tries,smax,min,clues,nodes,guesses,solutions;
 unsigned zr=362436069, wr=521288629;
 char L[18]=".123456789ABCDEFG";
 FILE *file;

 int solve(int);
 int reduce(int);
 int unreduce(int);

int main(int argc,char*argv[]){
  if(argc<2){m5:printf("\nusage:suexco file [tries] \n\n");
  printf("prints the number of (<smax)solutions of the sudokus in file\n");
  printf("empty cells are -.* or 0 , other nondigit-characters are");
  printf(" ignored\n");exit(1);}
  smax=999;//if(argc>2)sscanf(argv[2],"%i",&smax);
  tries=1;if(argc>2)sscanf(argv[2],"%i",&tries);

for(i=1;i<=m;i++){j=1;while(j<i)j+=j;Two[i]=j-1;}

r=0;k=N;l=N2;for(x=1;x<=N2;x++)for(y=1;y<=N2;y++)for(s=1;s<=N2;s++){
       r++;Cols[r]=4;
       Col[r][1]=x*N2-N2+y;
       Col[r][2]=(k*((x-1)/k)+(y-1)/k)*l+s+N4;
       Col[r][3]=x*N2-N2+s+N4*2;
       Col[r][4]=y*N2-N2+s+N4*3;}
for(i=1;i<=m;i++)Rows[i]=0;
for(r=1;r<=n;r++)for(i=1;i<=Cols[r];i++){
  c=Col[r][i];Rows[c]++;Row[c][Rows[c]]=r;}

 if((file=fopen(argv[1],"rb"))==NULL)
    {fclose(file);printf("\nfile-error\n\n");goto m5;}
m6:t1=clock();i=0;for(x=1;x<=N2;x++)for(y=1;y<=N2;y++){
   m1:if(feof(file))exit(1);
   c=fgetc(file);j=0;if(c=='-' || c=='.'|| c=='0' || c=='*')goto m7;
   while(L[j]!=c && j<=N2)j++;if(j>N2)goto m1;
   m7:A[x][y]=j;i++;};

   for(i=0;i<=N4;i++)Node[i]=0;nodes=0;

for(try=1;try<=tries;try++){ // you can do multiple tries for benchmarking here

solve(smax);
printf("%i sol.  %8i nodes  %8i guesses  %i/91sec\n",solutions,nodes,guesses,clock());
//for(i=1;i<=N4;i++)printf("%i ",Node[i]);printf("\n\n");
}
goto m6; }







int solve(int smax){
 for(i=0;i<=N4>>4;i++)for(j=1;j<=m;j++){N1[i][j]=0;Z1[i][j]=0;} //~
 for(i=0;i<=n;i++)Ur[i]=0;for(i=0;i<=m;i++)Uc[i]=0;
 clues=0;for(x=1;x<=N2;x++)for(y=1;y<=N2;y++)
   if(A[x][y]){clues++;r=x*N4-N4+y*N2-N2+A[x][y];
     for(j=1;j<=Cols[r];j++){c1=Col[r][j];if(Uc[c1])return -1;Uc[c1]++;
       for(k=1;k<=Rows[c1];k++){r1=Row[c1][k];Ur[r1]++;}}}
 for(c=1;c<=m;c++){V[c]=0;for(r=1;r<=Rows[c];r++)if(Ur[Row[c][r]]==0)V[c]++;}

   m0=0;m1=0;guesses=0;i=clues;solutions=0;
m2:i++;I[i]=0;min=n+1;if(i>N4 || m0)goto m4;if(m1){C[i]=m1;goto m3;}

   c0=0;for(c=1;c<=m;c++)if(!Uc[c]){if(V[c]<=min){c0++;C0[c0]=c;}
   if(V[c]<min){min=V[c];c0=1;C[i]=c;C0[c0]=c;if(min<2)goto m3;}}
   if(min>1)guesses++;

   m2a:c1=MWC&Two[c0];if(c1>=c0)goto m2a;c1++;C[i]=C0[c1];
   min=999999;i4=i>>4;
   for(j=1;j<=c0;j++){c=C0[j];if(N1[i4][c]<min*Z1[i4][c]){ //~
      min=N1[i4][c]/Z1[i4][c];C[i]=c;}} //~

m3:c=C[i];I[i]++;if(I[i]>Rows[c])goto m4;
   r=Row[c][I[i]];if(Ur[r])goto m3;m0=0;m1=0;
   for(j=1;j<=Cols[r];j++){c1=Col[r][j];Uc[c1]++;}
   for(j=1;j<=Cols[r];j++)reduce(Col[r][j]);
   Node[i]++;nodes++;
   if(i==N4)solutions++;if(solutions==smax)return smax;
   M1[i][c]=nodes; //~ remember nodes
   goto m2;
m4:c=C[i];
   N1[i>>4][c]+=nodes-M1[i][c];Z1[i>>4][c]++; //~ column-statistics
   i--;c=C[i];r=Row[c][I[i]];
   for(j=1;j<=Cols[r];j++)unreduce(Col[r][j]);
   if(i>clues)goto m3;
return solutions;}


int reduce(int c){ // deletes c and N[c], updates V[],m0,m1
  int r,c2,k,l;
  for(k=1;k<=Rows[c];k++){r=Row[c][k];Ur[r]++;if(Ur[r]==1)
     for(l=1;l<=Cols[r];l++){c2=Col[r][l];V[c2]--;
        if(Uc[c2]+V[c2]<1)m0=c2;if(Uc[c2]==0 && V[c2]<2)m1=c2;}}}

int unreduce(int c){
  int r,c2,k,l;
  Uc[c]--;
  for(k=1;k<=Rows[c];k++){r=Row[c][k];Ur[r]--;
     if(Ur[r]==0)for(l=1;l<=Cols[r];l++){c2=Col[r][l];V[c2]++;}}}

Back to top
View user's profile Send private message Send e-mail Visit poster's website
Nick70

Joined: 08 Jun 2005
Posts: 160
:

Items
PostPosted: Wed Nov 16, 2005 11:02 pm    Post subject: Re: Dancing Links - Can it go even faster? Reply with quote

[quote="Ruud"]1. The "header" nodes (columns) are linked according to their size. For each size (1 through 9) there is a root node. These root nodes are linked with the Up/Down links. Finding the best column is now done in only a few steps. No need to maintain size, just relink a node to the up/down peer of its root node.[/quote}
This is an interesting suggestion and I tested it, but the speed decreased instead of increasing.

I think for higher order sudokus it might be an improvement, because you are removing the scan of the available columns to find the shorter one. For 9x9 sudoku, however, doing that scan seems to be less expensive than unlinkking/relinking the columns every time you unlink/relink a row.
Back to top
View user's profile Send private message
soduko

Joined: 10 Oct 2005
Posts: 50
:

Items
PostPosted: Tue Nov 22, 2005 2:18 pm    Post subject: Reply with quote

I was thinking about the following speed up:

Before the trail and error fase.

If a test (Constrait) can only be satisfied by two (valid) placements :
(call them P1 and P2)
All placements that make P1 false will be followed by P2
All placements that make P2 false will be followed by P1
(for explainations sake this are the eliminating placements)
P1 and P2 are no options to start with in the T&E fase.


An example
the Test C8V9 can only be satisfied by
P1 R1C8V9 and
P2 R2C8V9

P1 is made false by

R1C1V9 till R1C9V9
R1C8V9 till R9C8V9
ect
(the squares of this are always the same for every square and can be stored in an array before starting the solving proces)


All these placements will have the followup placement of p2 R2C8V9.

the same applies for the other placement.

P1 and P2 can be eliminated from the trail and error fase.
(they are now implied in all the tests resulting from the eliminating placements)


You can also test the eliminating placements them selves

If 2 followups from the same placements are contradictionary the placement is false and can be removed.

ect, ect.



Hopes this makes it clear.
Now i must find time to change it in a working program...
Back to top
View user's profile Send private message
dukuso

Joined: 14 Jul 2005
Posts: 424
:
Location: germany

Items
PostPosted: Wed Nov 23, 2005 5:37 am    Post subject: Reply with quote

>was thinking about the following speed up:
>
>Before the trail and error fase.

where is it

>If a test (Constrait) can only be satisfied by two (valid) placements :
>(call them P1 and P2)
>All placements that make P1 false will be followed by P2

a plavement Q "makes P1 false" iff A-P1-Q has no solution ?
(A is our exact-cover-matrix)
isn't it equivalent to "Q in N(N(P1))" and to "P1 in N(N(Q))" ?

but what means : "will be followed by P2" ?

>All placements that make P2 false will be followed by P1
>(for explainations sake this are the eliminating placements)
>P1 and P2 are no options to start with in the T&E fase.

?

>An example
>the Test C8V9 can only be satisfied by

what's V ?

>P1 R1C8V9 and
>P2 R2C8V9

OK, V must be symbol = digit

>P1 is made false by
>
>R1C1V9 till R1C9V9
>R1C8V9 till R9C8V9
>ect

OK. It's N(N(P1))

>(the squares of this are always the same for every square

squares=placements=rows or squares=cells ?

> and can be stored in an array before starting the solving proces)

empty sudoku has |N(N(r))|=29 for each r.
They can always be easily addressed, even without initially storing them.
When you store them, there could be problems when the links start dancing.

>All these placements will have the followup placement of p2 R2C8V9.

what you mean with followup placement ?

>the same applies for the other placement.
>
>P1 and P2 can be eliminated from the trail and error fase.
>(they are now implied in all the tests resulting from the
> eliminating placements)

whenever placement p1 is chosen, we delete p1,N(p1),N(N(p1)).
You can calculate N(N(p1)) for each p1 in advance, but I can't see
why this should help. Just walking through N(p1) and then N(N(p2))
should have almost same speed

>You can also test the eliminating placements them selves
>
>If 2 followups from the same placements are contradictionary
> the placement is false and can be removed.
>
>ect, ect.
>
>
>
>Hopes this makes it clear.
>Now i must find time to change it in a working program...

better spend time on thinking what to do on ties, when
several columns c have #c=|N(c)|=2 ...

We might consider |N(N(c))| or |N(N(N(c)))| ,... although
I would only check them occasionally , so to concentrate on regions.
E.g. in samurai you would concentrate on one of the 9*9 and not
swap the 9*9s on each move and concentrate on another 9*9 at each move.

Dancing links usually doesn't know about these global connections,
it only sees the next neighbors N(c),N(r).

If one of the 9*9s is already half-full and another one is just
only half-full in one column, a human would probably first try
to complete the first one.
But how can we tell this to the computer ?

Or in larger sudokus, when clues are gathered in certain regions,
a human can spot this immediately.
But it's hard to implement, so most programs will presumably fail
to recognize this.
Back to top
View user's profile Send private message Send e-mail Visit poster's website
soduko

Joined: 10 Oct 2005
Posts: 50
:

Items
PostPosted: Thu Nov 24, 2005 10:49 pm    Post subject: Reply with quote

Sorry

I realised that my method has nothing to do with DLX

It is more a way of seeing sudoku as an exact cover problem.

But still to reply to your reply (maybe it is still usefull)


Quote:
>If a test (Constrait) can only be satisfied by two (valid) placements :
>(call them P1 and P2)
>All placements that make P1 false will be followed by P2

a plavement Q "makes P1 false" iff A-P1-Q has no solution ?
(A is our exact-cover-matrix)
isn't it equivalent to "Q in N(N(P1))" and to "P1 in N(N(Q))" ?



Yes, more precise the first one, the search starts at P1 not at Q
becaurse P1 is in N(c) with #c=2

Quote:

but what means : "will be followed by P2" ?

attach to placement Q the placement P2 (so that if Q is "placed" P2 is "placed" as well (if P2 isn't "placed " before )


Quote:

>All placements that make P2 false will be followed by P1
>(for explainations sake this are the eliminating placements)
>P1 and P2 are no options to start with in the T&E fase.

?


Here i missed the crux of DLX
But I mean that you can almost ignore the collumn altogether, just give it a high size (the S value in Knuths paper, I finaly found his article in pdf format)

Quote:

OK, V must be symbol = digit

>P1 is made false by
>
>R1C1V9 till R1C9V9
>R1C8V9 till R9C8V9
>ect

OK. It's N(N(P1))

>(the squares of this are always the same for every square

squares=placements=rows or squares=cells ?

> and can be stored in an array before starting the solving proces)

empty sudoku has |N(N(r))|=29 for each r.
They can always be easily addressed, even without initially storing them.
When you store them, there could be problems when the links start dancing.



Yes V is S ( I prefer V, but that is a private matter, I hope)
Squares = cells
And also Yes you can also get them from the A array.




Quote:
>All these placements will have the followup placement of p2 R2C8V9.

what you mean with followup placement ?




explaned earlier I hope.



Quote:
whenever placement p1 is chosen, we delete p1,N(p1),N(N(p1)).
You can calculate N(N(p1)) for each p1 in advance, but I can't see
why this should help. Just walking through N(p1) and then N(N(p2))
should have almost same speed


But now you already know what the "following placements" are you do not have to search for collums with one candidate at all.
you can even skip testing between Q and P2 if you like.

Quote:
better spend time on thinking what to do on ties, when
several columns c have #c=|N(c)|=2 ...



If followed this method there are almost no collums with
#c=|N(c)|=2
to test anymore.

(they are mostly reduced to following placements.)


The problem is that there can exsist "following placement cycles" ( P2 follows from Q follows from ... follows from P2)
If you have an algoritm that causes them not to arise then you can really delete the collumn. I am still puzzling with this but it doesn't seem to difficult.

I came to the conclusion that this is much more exact cover programming than DLX , but still i hope it is helpfull

DLX sees the A array as a lot of connected nodes (every 1 is a node)

I see A more as 5 real seperate arrays
a Tests-statusarray
a Placement-statusarray
a placement -> tests array
a tests -> placement array
and
a placements with followups test array
(and I am even tempted to fill A not binary but with values from 0-9)


The problem with using my method within DLX loops is the backtracking.
(to undo the followups if they are by backtracking not valid anymore)
So you can only use them where backtracking will not happen.
And that is in the beginning just after "placing the clues"

It is a bit an excact cover replacement for a lot of logic.
I think I Christen this method Exact Cover Chains


The big question is does this method takes less time to erase the #c=2 collumns than that DLX takes to do the #c=2 collumns


Ps I hope that you can agree with the use of Test for Constraits I was just looking for something that doesn't start with a C
(To not confuse it with collum)
Back to top
View user's profile Send private message
dukuso

Joined: 14 Jul 2005
Posts: 424
:
Location: germany

Items
PostPosted: Thu Nov 24, 2005 11:38 pm    Post subject: Reply with quote

doesn't it result in the same algo, just some local
deep-first replaced by some breadth first, so to say ?

it could be faster if we can use 128-bit masks to
mark N(N(r)) as used

(using marked rows and columns instead of dancing links)
Back to top
View user's profile Send private message Send e-mail Visit poster's website
soduko

Joined: 10 Oct 2005
Posts: 50
:

Items
PostPosted: Fri Nov 25, 2005 10:56 am    Post subject: Reply with quote

Quote:

doesn't it result in the same algo, just some local
deep-first replaced by some breadth first, so to say ?


If you wish, but i see it more as adding "logic" and "chains" to DLX


Quote:

it could be faster if we can use 128-bit masks to
mark N(N(r)) as used


It is possible with 3 128 bit masks

there are 28 nodes in N(N(r)) (forgetting the node N(N(r))=r )

per node 2 bits for indicating the subarray (place, row, column, box)
and 8 bits for (Row,Collumn) (Col, symbol) (row symbol) or (box symbol)
(every value instead of 0-9 expanded to 0-F)
so 280 bits in total

maybe it is more efficient if you keep a nibble (old word for a half byte) per info (so also use 4 bits (mibble) per subarray indicator)
Back to top
View user's profile Send private message
Ruud
Site Admin
Joined: 17 Sep 2005
Posts: 708
:
Location: Netherlands

Items
PostPosted: Sat Nov 26, 2005 3:36 pm    Post subject: Reply with quote

Nick70 wrote:
(Linking headers according to their size) This is an interesting suggestion and I tested it, but the speed decreased instead of increasing.

I have not tested it separately, but you may be right. I did optimize the code a bit further so that "selection" of a row does not re-link the 4 headers 8 times each, but links them straight to size=1 root node.

Ruud.
Back to top
View user's profile Send private message Visit poster's website
Display posts from previous:   
Post new topic   Reply to topic    Sudoku Programmers Forum Index -> Programming sudoku All times are GMT
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum
Sudoku Programmers topic RSS feed 


Powered by phpBB © 2001, 2005 phpBB Group

Igloo Theme Version 1.0 :: Created By: Andrew Charron