Dancing Links


Xi Chen
Student, Computer Science and Engineering
University of New South Wales
Kensington
hypernewbie@gmail.com


Introduction

Dancing links is the development that resulted from a simple idea. This simple idea just seems to fit right into the brute force backtracking depth-first search algorithm to solve the "Exact Cover" NP-Complete problem. Many popular puzzles such as the widely known Sodoku puzzle can be reduced to "Exact Cover".

The "Exact Cover" problem is as follows:
Given an array of M*N bits, either 0 or 1, find a set of one or more rows in the array such that every column has exactly one "1". For example, in the following array, an answer would be rows 1, 4 and 5:

0 0 1 0 1 1 0
1 0 0 1 0 0 1
0 1 1 0 0 1 0
1 0 0 1 0 0 0
0 1 0 0 0 0 1
0 0 0 1 1 0 1


The Data Structure

The simple idea behind it all is that in a linked list, a node can be easily removed (covered), and a node can also easily be put back in (uncovered). To cover a node in a linked list, we simply do this:

  Node.Right.Left = Node.Left;
  Node.Left.Right = Node.Right;

And to put it back in, we simply do this:

  Node.Right.Left = Node;
  Node.Left.Right = Node;

We can then build a 2D circular linked array (matrix), with pointers for every node having pointers to Up, Down, Left and Right. A "Header" row marks the "beginning" of every column. The last node of every column circularly links back to the Header, and vice versa. The last node of every row links back to the first node, and vice versa. The "header" row is given a special "root" node which marks the entry of the whole data structure. One way to visualise the whole thing is as a torus, a donut, hanging from a string, the string being the "root" node.

A visualisation of the data structure for the above input:

How to build such a data structure is up to personal preference, but my way to implement the construction is to input the array of 0's and 1's, and then for each 1, loop through to find its left, right, up and down neighbours.


The Algorithm

There is technically no "Dancing Links algorithm", since Dancing Links is simply the application of a simple idea onto the optimisation of the naive backtracking brute force solution to Exact Cover. With this optimisation, the brute force seems to work unbelievably well.

Ok, now, onto the algorithm. Lets start with the brute force solution:

  Find an unfilled column C - if theres no such column, we have a solution.
  For every row R that fills this column C:
      Try using row R in our solution - which means:
	      1. Removing R; so we don't consider it again
	      2. Removing every column that R fills; we don't need to fill those again
	      3. For every column that R fills, remove the other rows that also fill the column; we don't need any other row that 
	         fill what this row has already filled.
      Recurse with the new state.
      "Undo" all the things removed while trying row R; restore to previous state.
  Next

Bring the two together

Now, without the dancing links data structure, one could probably use an array of booleans which they loop through lots of times. On the above example; this won't seem so bad. But on extremely sparse graphs, having to loop through the entire column in order to find one that is unfilled could take heaps of time.

This is where the data structure comes in!

When a node is removed in a link list, it is actually removed, in the sense of "the node is not in the linked list anymore". This means, Finding an unfilled column is simply an O(1) operation. Removing a column oce its been "filled" is also an O(1) operation, since you can just cover the header node. When looping through "For every row R that fills this column C", we don't need to loop through every single row of the array to find one that has not been marked "used", we simply traverse the linked list!


Implementation

The nodes can easily be represented in C as structures. The structure may contain:
  * A Header pointer - I like to have every node point to the "Header" node of its column, for easy and fast access.
  * A Left pointer
  * A Right pointer
  * A Up Pointer
  * A Down pointer
  * Some quick form of ID - Being able to name nodes is nice when you're debugging or testing, for outputting, so you don't
    have to look at pointer addresses all day and try to decode them.
The matrix can be built, as mentioned above, from a 2-D data array of either 0 or 1. The data in this array can, in the above example case, but just hardcoded. The data array could also be generated by code, in the case of the sodoku solver.

I like to keep things relatively tidy, and I made my "Cover" function take a column, and make that column "solved", i.e. covering the column, and covering every row that solves this column. I also made a similar "UnCover" function that does the reverse, for backtracking.

The actual search algorithm is just a simple recursive brute force depth-first search that loops through a bunch of linked lists, to search through every possibile solution.


Bringing Sodoku to Exact Cover

Ok, now we have an extremely fast solution to the Exact Cover problem, now let's solve sodoku with it? How?

We bring sodoku down and turn it into an Exact Cover problem.

Lets say, you're doing your exam for your piano grade. Lets think of the columns not as columns, but as a precise list of tricks you must show to the examiner that you are able to do before he/she can give you the pass. The examiner is also very easily bored, and will immediately fail you if you do the same trick twice.
Now, think of the rows not as rows, but as an exact song list you can choose from to play to the examiner. You may choose one or more songs from your list of songs. You can choose to play all the songs in your list.
Songs can show the examiner different tricks, so she/he can mark you off for them.

You must pick a set of songs such that every trick on the examiner's list is fufilled exactly once.

Lets carry over this way of thinking of rows and columns of the matrix over to sodoku.
The columns is a list of "requirements" that must be met. There are 4 types of requirements in sodoku:

  1. Every square must have a number in it - [A Number has to be in 0,0] [A Number has to be in 0,1]...etc...[A Number as to be in 8,8] 
     = 9*9 = 81 requirements.
  2. Every row must have numbers 1-9 in it - [Row 0 must have a 1] [Row 0 must have a 2] ...etc... [Row 1 must have a 1] [Row 2 must have a 2]
     ...etc...[Row 8 must have a 1]....etc....[Row 8 must have a 9] = 9*9 = 81 requirements.
  3. Every column must have numbers 1-9 in it - [Column 0 must have a 1]......etc.....[Column 8 must have a 9] = 9*9 = 81 requirements.
  4. Every box must have numbers 1-9 in it - [Box 0 must have a 1] [Box 0 must have a 2] ...etc...[Box 8 must have a 9] = 9*9 = 81 requirements.

Therefore, we have 81 * 4 = 324 requirements which need to be met. If they're all met exactly once, we have a solution!
This translates to 324 columns in our matrix.

Carrying the same thinking over, rows are our list of songs we can pick. In sodoku, rows are our list of things that we can do: we can place a number [1-9] in any square ( 0 to 8, 0 to 8 ). This gives us 9*9*9=729 things we can do, or "songs to choose from". Add the 1 extra row for the "header" row, that gives us 729 + 1 = 730 rows in our matrix! Therefore, each row can represent "a number placement".

Still carrying the same thinking over, in our above example scenario, each song shows the examiner a different set of tricks. In sodoku, each row, or "number placement", solves a set of requirements. For instance, "Placing a 3 in square (1,1)" solves the requirements [A Number has to be in 1,1], [Row 1 must have a 3], [Col 1 must have a 3], and [Box 0 must have a 3].

When we choose out "right set of songs" to pass our piano exam, the same applies where: we choose the right set of "number placements" to satisfy all our requirements for a complete sodoku.


Example C Code

The C code comes down to:

#include <stdio.h>

#define MAX_COL 750
#define MAX_ROW 750

#define SQ_OFFSET 0
#define RW_OFFSET 81
#define CL_OFFSET 162
#define BX_OFFSET 243

struct str_node {
    
    struct str_node * Header;
    
    struct str_node * Left;
    struct str_node * Right;
    struct str_node * Up;
    struct str_node * Down;
    
    char IDName;
    int  IDNum;
} ;

int nCol;
int nRow;
struct str_node Matrix[MAX_COL][MAX_ROW];
struct str_node Root;
struct str_node *RootNode = &Root;
struct str_node *RowHeader[MAX_ROW];
char Data[MAX_COL][MAX_ROW];
int Result[MAX_ROW]; int nResult = 0;
char Finished;
int GlobalProgressUpdate;
int MaxK;

// --> Initialisation functions
inline int dataLeft(int i) { return i-1<0?nCol-1:i-1; }
inline int dataRight(int i) { return (i+1)%nCol; }
inline int dataUp(int i) { return i-1<0?nRow-1:i-1; }
inline int dataDown(int i) { return (i+1)%nRow; }

void CreateMatrix(void) {
    int a,b, i, j;
    //Build toroidal linklist matrix according to data bitmap
    for(a=0;a<nCol;a++) {
        for(b=0;b<nRow;b++) {
            if(Data[a][b]!=0) {
                // Left pointer
                i = a; j = b; do { i = dataLeft(i); } while (Data[i][j]==0);
                Matrix[a][b].Left = &Matrix[i][j]; 
                // Right pointer
                i = a; j = b; do { i = dataRight(i); } while (Data[i][j]==0);
                Matrix[a][b].Right = &Matrix[i][j];
                // Up pointer
                i = a; j = b; do { j = dataUp(j); } while (Data[i][j]==0);
                Matrix[a][b].Up = &Matrix[i][j];
                // Down pointer
                i = a; j = b; do { j = dataDown(j); } while (Data[i][j]==0);
                Matrix[a][b].Down = &Matrix[i][j]; 
                // Header pointer
                Matrix[a][b].Header = &Matrix[a][nRow-1];
                Matrix[a][b].IDNum = b;
                //Row Header
                RowHeader[b] = &Matrix[a][b];
            }
        }
    }
    for(a=0;a<nCol;a++) {
        Matrix[a][nRow-1].IDName = 'C';  
        Matrix[a][nRow-1].IDNum = a;
    }
    //Insert root
    Root.IDName = 'R'; 
    Root.Left = &Matrix[nCol-1][nRow-1]; Root.Right = &Matrix[0][nRow-1];
    Matrix[nCol-1][nRow-1].Right = &Root; Matrix[0][nRow-1].Left = &Root;
}

// --> DLX Algorithm functions
struct str_node *ChooseColumn(void) {
    return RootNode->Right;
}

void Cover(struct str_node *ColNode) {
    struct str_node *RowNode, *RightNode;
    ColNode->Right->Left = ColNode->Left;
    ColNode->Left->Right = ColNode->Right;
    for(RowNode = ColNode->Down; RowNode!=ColNode; RowNode = RowNode->Down) {
        for(RightNode = RowNode->Right; RightNode!=RowNode; RightNode = RightNode->Right) {
            RightNode->Up->Down = RightNode->Down;
            RightNode->Down->Up = RightNode->Up;
        }
    }
}

void UnCover(struct str_node *ColNode) {
    struct str_node *RowNode, *LeftNode;
    for(RowNode = ColNode->Up; RowNode!=ColNode; RowNode = RowNode->Up) {
        for(LeftNode = RowNode->Left; LeftNode!=RowNode; LeftNode = LeftNode->Left) {
            LeftNode->Up->Down = LeftNode;
            LeftNode->Down->Up = LeftNode;
        }
    }
    ColNode->Right->Left = ColNode;
    ColNode->Left->Right = ColNode;
}

void SolutionRow(struct str_node *RowNode) {
    Cover(RowNode->Header);
    struct str_node *RightNode;
    for(RightNode = RowNode->Right; RightNode!=RowNode; RightNode = RightNode->Right) { Cover(RightNode->Header); }
}

void PrintSolution(void);

void Search(int k) {
    /*if(GlobalProgressUpdate < k) {
        printf("== Search(%d)\n", k);
        PrintSolution();
        GlobalProgressUpdate = k;
    }*/
    if((RootNode->Left == RootNode && RootNode->Right==RootNode) || k == (81-MaxK)) {
        //Valid solution!
        printf("----------- SOLUTION FOUND -----------\n");
        PrintSolution();
        Finished = 1;
        return;
    }
    struct str_node *Column = ChooseColumn();
    Cover(Column);
    
    struct str_node *RowNode;
    struct str_node *RightNode;
    for(RowNode = Column->Down; RowNode!=Column && !Finished; RowNode = RowNode->Down) {
        // Try this row node on!
        Result[nResult++] = RowNode->IDNum;
        for(RightNode = RowNode->Right; RightNode!=RowNode; RightNode = RightNode->Right) {
            Cover(RightNode->Header);
        }
        Search(k+1);
        // Ok, that node didn't quite work
        for(RightNode = RowNode->Right; RightNode!=RowNode; RightNode = RightNode->Right) {
            UnCover(RightNode->Header);
        }
        Result[--nResult] = 0;
    }
    
    UnCover(Column);
}

// --> Sodoku to Exact Cover conversion

// Functions that extract data from a given 3-digit integer index number in the format [N] [R] [C].
inline int retNb(int N) { return N/81; }
inline int retRw(int N) { return (N/9)%9; }
inline int retCl(int N) { return N%9; }
inline int retBx(int N) { return ((retRw(N)/3)*3) + (retCl(N)/3); }
inline int retSq(int N) { return retRw(N)*9 + retCl(N); }
inline int retRn(int N) { return retNb(N)*9 + retRw(N); }
inline int retCn(int N) { return retNb(N)*9 + retCl(N); }
inline int retBn(int N) { return retNb(N)*9 + retBx(N); }
// Function that get 3-digit integer index from given info
inline int getIn(int Nb, int Rw, int Cl) { return Nb*81 + Rw*9 + Cl; }

void PrintSolution(void) {
    int a,b,c;
    int Sodoku[9][9] = {};
    for(a=0;a<9;a++) for(b=0;b<9;b++) Sodoku[a][b] = -1;
    for(a=0;a<nResult;a++) Sodoku[retRw(Result[a])][retCl(Result[a])] = retNb(Result[a]);
    for(a=0;a<9;a++) {
        for(b=0;b<9;b++) {
            if(a>0&&a%3==0&&b==0) { for(c=0;c<9;c++) printf("--"); printf("\n"); } //horizontal lines
            if(Sodoku[a][b]>=0) printf("%d%c", Sodoku[a][b]+1, b%3==2?'|':' ' );
            else printf(". ");
        } printf("\n");
    }
}

void BuildData(void) {
    int a,b,c;
    int Index;
    nCol = 324; nRow = 729 + 1;
    for(a=0;a<9;a++) {
        for(b=0;b<9;b++) {
            for(c=0;c<9;c++) {
                Index = getIn(c, a, b);
                Data[SQ_OFFSET + retSq(Index)][Index] = 1; //Constraint 1: Only 1 per square
                Data[RW_OFFSET + retRn(Index)][Index] = 1; //Constraint 2: Only 1 of per number per Row
                Data[CL_OFFSET + retCn(Index)][Index] = 1; //Constraint 3: Only 1 of per number per Column
                Data[BX_OFFSET + retBn(Index)][Index] = 1; //Constraint 4: Only 1 of per number per Box
            }
        }
    }
    for(a=0;a<nCol;a++) { Data[a][nRow-1] = 2; }
    CreateMatrix();
    for(a=0;a<RW_OFFSET;a++) { Matrix[a][nRow-1].IDName = 'S'; Matrix[a][nRow-1].IDNum = a; }
    for(a=RW_OFFSET;a<CL_OFFSET;a++) { Matrix[a][nRow-1].IDName = 'R'; Matrix[a][nRow-1].IDNum = a-RW_OFFSET; }
    for(a=CL_OFFSET;a<BX_OFFSET;a++) { Matrix[a][nRow-1].IDName = 'C'; Matrix[a][nRow-1].IDNum = a-CL_OFFSET; }
    for(a=BX_OFFSET;a<nCol;a++) { Matrix[a][nRow-1].IDName = 'B'; Matrix[a][nRow-1].IDNum = a-BX_OFFSET; }
}

inline void AddNumber(int N, int R, int C) {
    SolutionRow(RowHeader[getIn(N, R, C)]);
    MaxK++;
    Result[nResult++] = getIn(N, R, C);
}

void LoadPuzzle(char *FileName) {
    FILE * a_file = fopen(FileName, "r");
    if(a_file==NULL) { printf("File load fail!\n"); return; } 
    int a,b; char temp;
    for(a=0;a<9;a++) {
        for(b=0;b<9;b++) {
            if(!fscanf(a_file, "%c\n", &temp)) {printf("File loading error!\n"); return; }
            if(temp>='1'&&temp<='9') { AddNumber((temp-'0')-1, a, b);}
        }
    }
}

int main(void) {
    BuildData();
    LoadPuzzle("puzzle.txt");
    Search(0);
    return 0;
}
Download here: dlx.c


Conclusion

Dancing links, after being made popular by the legendary computer scientist Donald E. Knuth, has become one of the fastest methods of solving Exact Cover problems, including the famous sodoku. Such a simple idea could spark something much larger, and that is one of the beauties of science, computer science being no different.


References

"Dancing Links" - Dr. Donald Knuth, Stanford University - 15 Nov. 2000
http://www.ocf.berkeley.edu/~jchu/publicportal/sudoku/sudoku.paper.html
http://en.wikipedia.org/wiki/Knuth%27s_Algorithm_X
http://en.wikipedia.org/wiki/Knuth%27s_Algorithm_X
http://en.wikipedia.org/wiki/Dancing_Links
http://www.sudopedia.org/wiki/Dancing_Links