COMP9024 23T3 - Week 4 Problem Set

Graph Data Structures and Graph Search

path
- smallest: any path with one edge (e.g. 1-2 or 6-13)
- largest: some path including all but two nodes (e.g. 3-8-9-2-1-6-5-4-11-12-7)
cycle
- smallest: any cycle with 4 nodes (e.g. 10-2-3-8-10)
- largest: any cycle with 10 nodes (e.g. 1-2-3-8-7-12-11-4-5-6-1)
spanning tree
- smallest: any spanning tree must include all nodes (an example is the largest path above plus the edges 6-13 and 2-10)
- largest: same
vertex degree
- smallest: there are several nodes of degree 2 (e.g., nodes 1 and 10)
- largest: in this graph, 4 (nodes 2 and 8)
clique
- smallest: any vertex by itself is a clique of size 1
- largest: any two nodes that are connected, e.g. 1 and 2 (this graph has no cliques of size 3 or higher)

The following algorithm uses two nested loops to compute the degree of each vertex. Hence its asymptotic running time is O(n²).

degrees(g):
   Input  graph g
   Output array of vertex degrees

   for all vertices v∈g do
      deg[v]=0
      for all vertices w∈g, v≠w do
         if v,w adjacent in g then
            deg[v]=deg[v]+1
         end if
      end for
   end for
   return deg

The following algorithm uses three nested loops to print all 3-cliques in order. Hence its asymptotic running time is O(n³).

show3Cliques(g):
   Input graph g of n vertices numbered 0..n-1

   for all i=0..n-3 do
      for all j=i+1..n-2 do
         if i,j adjacent in g then
            for all k=j+1..n-1 do
               if i,k adjacent in g and j,k adjacent in g then
                  print i"-"j"-"k
               end if
            end for
         end if
      end for
   end for

The following algorithm uses two nested loops to calculate the graph density. Hence its asymptotic running time is O(n²).

density(g):
   Input  graph g
   Output graph density

   m=0
   for all vertices v∈g do
      for all vertices w∈g, w>v do
         if v,w adjacent in g then
            m=m+1
         end if
      end for
   end for
   return (2*m)/(n*(n-1))

Sample graphAnalyser.c:

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include "Graph.h"

#define MAX_NODES 1000

// determine and output the sequence of vertex degrees
void degrees(Graph g) {
   int nV = numOfVertices(g);
   int v, w, degree;

   printf("Vertex degrees:\n");
   for (v = 0; v < nV; v++) {
      degree = 0;
      for (w = 0; w < nV; w++) {
         if (adjacent(g,v,w)) {
            degree++;
         }
      }
      printf("%d", degree);
      if (v < nV-1) {
         printf(", ");
      } else {
         printf("\n");
      }
   }
}

// show all 3-cliques of graph g
void show3Cliques(Graph g) {
   int i, j, k;
   int nV = numOfVertices(g);

   printf("Triplets:\n");
   for (i = 0; i < nV-2; i++) {
      for (j = i+1; j < nV-1; j++) {
         if (adjacent(g,i,j)) {
            for (k = j+1; k < nV; k++) {
               if (adjacent(g,i,k) && adjacent(g,j,k)) {
                  printf("%d-%d-%d\n", i, j, k);
               }
            }
         }
      }
   }
}

// calculate and output the density of undirected graph g
void density(Graph g) {
   int nV = numOfVertices(g);
   int nE = 0;
   int v, w;
   double density;

   for (v = 0; v < nV-1; v++) {
      for (w = v+1; w < nV; w++) {
         if (adjacent(g,v,w)) {
            nE++;
         }
      }
   }

   density = (2.0 * (double)nE) / ((double)nV * ((double)nV - 1.0));
   printf("Density: %.3lf\n", density);
}

int main(void) {
   Edge e;
   int nV;

   printf("Enter the number of vertices: ");
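   if (scanf("%d", &nV) != 1 || nV < 0) {   // reject missing or negative input
      fprintf(stderr, "Invalid number of vertices\n");
      return 1;
   }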
   Graph g = newGraph(nV);

   printf("Enter an edge (from): ");
   while (scanf("%d", &e.v) == 1) {
      printf("Enter an edge (to): ");
      scanf("%d", &e.w);
      insertEdge(g, e);
      printf("Enter an edge (from): ");
   }
   printf("Done.\n");

   degrees(g);
   show3Cliques(g);
   density(g);
   freeGraph(g);

   return 0;
}

The adjacency matrix representation always requires a V×V matrix, regardless of the number of edges, where each element is 1 byte long. It also requires an array of V pointers. This gives a fixed size of V·8+V² bytes.

The adjacency list representation requires an array of V pointers (the start of each list), with each being 8 bytes long, and then one list node for each edge in each list. The total number of edge nodes is 2E (each edge (v,w) is stored twice, once in the list for v and once in the list for w). Since each node requires 16 bytes (vertex+padding+pointer), this gives a size of V·8+16·2·E. The total storage is thus V·8+32·E.

Since both representations involve V pointers, the difference is based on V² vs 32E. So, if 32E < V² (or, equivalently, E:V < V/32), then the adjacency list representation will be more storage-efficient. Conversely, if E:V > V/32, then the adjacency matrix representation will be more storage-efficient.

To pick a concrete example, if V=60 and if we have 112 or fewer edges (112/60 = 1.867 < 60/32 = 1.875), then the adjacency list will be more storage-efficient, otherwise the adjacency matrix will be more storage-efficient.
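The comparison can also be checked numerically. The following is a minimal sketch that uses the sizes assumed in the text (1-byte matrix entries, 8-byte pointers, 16-byte list nodes); the names matrixBytes and listBytes are made up for this illustration and are not part of the Graph ADT.

```c
#include <assert.h>

// Sizes assumed in the text above: 8-byte pointers, 1-byte matrix
// entries, 16-byte list nodes (vertex + padding + pointer).
#define PTR_BYTES   8
#define ENTRY_BYTES 1
#define NODE_BYTES  16

// storage for an adjacency matrix: V row pointers plus V*V entries
long matrixBytes(long V) {
   assert(V >= 0);
   return V * PTR_BYTES + V * V * ENTRY_BYTES;
}

// storage for adjacency lists: V list heads plus two nodes per edge
long listBytes(long V, long E) {
   assert(V >= 0 && E >= 0);
   return V * PTR_BYTES + 2 * E * NODE_BYTES;
}
```

For V=60, matrixBytes(60) is 4080, while listBytes(60,112) is 4064 and listBytes(60,113) is 4096, which agrees with the threshold of 112 edges.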

The following solution uses a loop to compute the correct index in the 1-dimensional edges[] array:

adjacent(g,v,w):
   Input  graph g in upper-triangle matrix representation
          v, w vertices such that v≠w
   Output true if v and w adjacent in g, false otherwise

   if v>w then
      swap v and w        // to ensure v<w
   end if
   chunksize=g.nV-1, offset=0
   for all i=0..v-1 do
      offset=offset+chunksize
      chunksize=chunksize-1
   end for
   offset=offset+w-v-1
   if g.edges[offset]=0 then return false
                        else return true
   end if

Alternatively, you can compute the overall offset directly via the formula `(nV-1)+(nV-2)+...+(nV-v)+(w-v-1) = v*(2*nV-v-1)/2+(w-v-1)` (assuming that v < w). The product v*(2*nV-v-1) is always even, so the division by 2 is exact.
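To see that the loop and the closed formula agree, here is a small sketch in C. It works on plain integers instead of the graph structure, and the function names offsetByLoop and offsetByFormula are chosen for this illustration only.

```c
#include <assert.h>

// offset of edge (v,w) in the 1-dimensional upper-triangle array,
// computed with the loop from the pseudocode above
int offsetByLoop(int nV, int v, int w) {
   assert(v != w && v >= 0 && w >= 0 && v < nV && w < nV);
   if (v > w) { int t = v; v = w; w = t; }   // ensure v < w
   int chunksize = nV - 1, offset = 0;
   for (int i = 0; i < v; i++) {
      offset += chunksize;
      chunksize--;
   }
   return offset + w - v - 1;
}

// same offset via the closed formula; v*(2*nV-v-1) is always even,
// so the integer division by 2 loses nothing
int offsetByFormula(int nV, int v, int w) {
   assert(v != w && v >= 0 && w >= 0 && v < nV && w < nV);
   if (v > w) { int t = v; v = w; w = t; }   // ensure v < w
   return v * (2 * nV - v - 1) / 2 + (w - v - 1);
}
```

With nV=4, the six possible edges 0-1, 0-2, 0-3, 1-2, 1-3, 2-3 map to the offsets 0 to 5 in that order.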

DFS starting at 7:

Current    Stack (top at left)
-          7
7          5
5          2 3 4 6
2          1 3 3 4 6
1          0 3 4 3 3 4 6
0          3 4 3 3 4 6
3          4 4 3 3 4 6
4          4 3 3 4 6
...
6          -

DFS starting at 4:

Current    Stack (top at left)
-          4
4          1 3 5
1          0 2 3 3 5
0          2 3 3 5
2          3 5 3 3 5
3          5 5 3 3 5
5          6 7 5 3 3 5
6          7 5 3 3 5
7          5 3 3 5
...
-          -
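The stack traces above follow an iterative DFS that pushes every unvisited neighbour of the current vertex, so a vertex can sit on the stack more than once. The following is a sketch of that scheme on a plain adjacency matrix rather than the Graph ADT; MAX_V and dfsOrder are hypothetical names introduced for this example.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_V 16   // hypothetical upper bound on the number of vertices

// Iterative DFS in the style of the traces above: all unvisited neighbours
// of the current vertex are pushed (largest first, so the smallest ends up
// on top), and vertices visited in the meantime are skipped when popped.
// Stores the visiting order in order[] and returns the number of visited vertices.
int dfsOrder(int nV, int adj[MAX_V][MAX_V], int src, int order[]) {
   assert(nV > 0 && nV <= MAX_V && src >= 0 && src < nV);
   bool visited[MAX_V] = { false };
   int stack[MAX_V * MAX_V];   // a vertex may be on the stack several times
   int top = 0, count = 0;

   stack[top++] = src;
   while (top > 0) {
      int v = stack[--top];
      if (visited[v]) {
         continue;             // stale duplicate, already explored
      }
      visited[v] = true;
      order[count++] = v;
      for (int w = nV - 1; w >= 0; w--) {
         if (adj[v][w] && !visited[w]) {
            stack[top++] = w;
         }
      }
   }
   return count;
}
```

The graph itself is not reproduced here. Assuming the edges 0-1, 1-2, 1-3, 1-4, 2-3, 2-5, 3-4, 3-5, 4-5, 5-6 and 5-7 (inferred from the traces, so treat them as an assumption), a call with src=7 visits the vertices in the order 7 5 2 1 0 3 4 6, and a call with src=4 in the order 4 1 0 2 3 5 6 7.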

BFS starting at 7:

Current    Queue (front at left)
-          7
7          5
5          2 3 4 6
2          3 4 6 1
3          4 6 1
4          6 1
6          1
1          0
0          -

BFS starting at 4:

Current    Queue (front at left)
-          4
4          1 3 5
1          3 5 0 2
3          5 0 2
5          0 2 6 7
0          2 6 7
2          6 7
6          7
7          -
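In the queue traces above no vertex appears in the queue twice, which corresponds to marking a vertex as visited at the moment it is enqueued. Below is a minimal sketch of such a BFS on a plain adjacency matrix; MAX_V and bfsOrder are hypothetical names, and the Graph ADT from the lecture is deliberately not used so that the example stands alone.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_V 16   // hypothetical upper bound on the number of vertices

// BFS from src on an adjacency matrix; neighbours are enqueued in
// ascending order. Stores the visiting order in order[] and returns
// the number of vertices reached.
int bfsOrder(int nV, int adj[MAX_V][MAX_V], int src, int order[]) {
   assert(nV > 0 && nV <= MAX_V && src >= 0 && src < nV);
   bool visited[MAX_V] = { false };
   int queue[MAX_V];            // each vertex is enqueued at most once
   int front = 0, back = 0, count = 0;

   visited[src] = true;
   queue[back++] = src;
   while (front < back) {
      int v = queue[front++];
      order[count++] = v;
      for (int w = 0; w < nV; w++) {
         if (adj[v][w] && !visited[w]) {
            visited[w] = true;  // mark on enqueue to avoid duplicates
            queue[back++] = w;
         }
      }
   }
   return count;
}
```

Because every vertex enters the queue at most once, an array of MAX_V entries is enough and no wrap-around is needed.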

hasCycle(G):
|  Input  graph G
|  Output true if G has a cycle, false otherwise
|
|  mark all vertices as unvisited
|  for each vertex v∈G do           // make sure to check all connected components
|  |  if v has not been visited then
|  |     if dfsCycleCheck(G,v,v) then
|  |        return true
|  |     end if
|  |  end if
|  end for
|  return false

dfsCycleCheck(G,v,u):      // look for a cycle that does not go back directly to u
|  mark v as visited
|  for each (v,w)∈edges(G) do
|  |  if w has not been visited then
|  |  |  if dfsCycleCheck(G,w,v) then
|  |  |     return true
|  |  |  end if
|  |  else if w≠u then
|  |  |  return true
|  |  end if
|  end for
|  return false

The following two C functions implement this algorithm:

#define MAX_NODES 1000
int visited[MAX_NODES];

bool dfsCycleCheck(Graph g, Vertex v, Vertex u) {
   visited[v] = true;
   Vertex w;
   for (w = 0; w < numOfVertices(g); w++) {
      if (adjacent(g, v, w)) {
         if (!visited[w]) {
            if (dfsCycleCheck(g, w, v))
               return true;
         } else if (w != u) {
            return true;
         }
      }
   }
   return false;
}

bool hasCycle(Graph g) {
   int v, nV = numOfVertices(g);
   for (v = 0; v < nV; v++)
      visited[v] = false;
   for (v = 0; v < nV; v++)
      if (!visited[v])
         if (dfsCycleCheck(g, v, v))
            return true;
   return false;
}

After removing d, cc[] = {0,0,0,1,1,1,0,0,0,1} (i.e. unchanged)
After removing b, cc[] = {0,0,2,1,1,1,2,0,2,1} with nC=3

Inserting an edge may reduce the number of connected components:

insertEdge(g,(v,w)):
|  Input graph g, edge (v,w)
|
|  if g.edges[v][w]=0 then               // (v,w) not in graph
|  |  g.edges[v][w]=1, g.edges[w][v]=1   // set to true
|  |  g.nE=g.nE+1
|  |  if g.cc[v]≠g.cc[w] then            // v,w in different components?
|  |  |  c=min{g.cc[v],g.cc[w]}          // ⇒ merge components c and d
|  |  |  d=max{g.cc[v],g.cc[w]}
|  |  |  for all vertices u∈g do         // u is a fresh loop variable, distinct from edge endpoint v
|  |  |     if g.cc[u]=d then
|  |  |        g.cc[u]=c                 // move node from component d to c
|  |  |     else if g.cc[u]=g.nC-1 then
|  |  |        g.cc[u]=d                 // replace largest component ID by d
|  |  |     end if
|  |  |  end for
|  |  |  g.nC=g.nC-1
|  |  end if
|  end if

Removing an edge may increase the number of connected components:

removeEdge(g,(v,w)):
|  Input graph g, edge (v,w)
|
|  if g.edges[v][w]≠0 then               // (v,w) in graph
|  |  g.edges[v][w]=0, g.edges[w][v]=0   // set to false
|  |  if not hasPath(g,v,w) then         // v,w no longer connected?
|  |     dfsNewComponent(g,v,g.nC)       // ⇒ put v + connected vertices into new component
|  |     g.nC=g.nC+1
|  |  end if
|  end if

dfsNewComponent(g,v,componentID):
|  Input graph g, vertex v, new componentID for v and connected vertices
|
|  g.cc[v]=componentID
|  for all vertices w adjacent to v do
|     if g.cc[w]≠componentID then
|        dfsNewComponent(g,w,componentID)
|     end if
|  end for

Sample GraphCC.h:

// Graph ADT interface
#include <stdbool.h>

typedef struct GraphRep *Graph;

// vertices are ints
typedef int Vertex;

// edges are pairs of vertices (end-points)
typedef struct Edge {
   Vertex v;
   Vertex w;
} Edge;

Graph newGraph(int);
int   numOfVertices(Graph);
void  insertEdge(Graph, Edge);
void  removeEdge(Graph, Edge);
bool  adjacent(Graph, Vertex, Vertex);
void  showGraph(Graph);
void  showComponents(Graph);
void  freeGraph(Graph);

Sample GraphCC.c:

// Graph ADT
// Adjacency Matrix Representation
#include "GraphCC.h"
#include <assert.h>
#include <stdlib.h>
#include <stdio.h>
#include <stdbool.h>

typedef struct GraphRep {
   int  **edges;   // adjacency matrix
   int    nV;      // #vertices
   int    nE;      // #edges
   int    nC;      // #connected components
   int   *cc;      /* which component each vertex is contained in
                      i.e. array [0..nV-1] of 0..nC-1 */
   bool  *visited; // to track exploration during path check
} GraphRep;

Graph newGraph(int V) {
   assert(V >= 0);
   int i;

   Graph g = malloc(sizeof(GraphRep));
   assert(g != NULL);
   g->nV = V;
   g->nE = 0;
   g->nC = V;

   // allocate memory for connected components array
   g->cc = malloc(V * sizeof(int));
   assert(g->cc != NULL);
   // allocate memory for visited array
   g->visited = malloc(V * sizeof(bool));
   assert(g->visited != NULL);
   // allocate memory for each row
   g->edges = malloc(V * sizeof(int *));
   assert(g->edges != NULL);
   // allocate memory for each column and initialise with 0
   for (i = 0; i < V; i++) {
      g->cc[i] = i;
      g->edges[i] = calloc(V, sizeof(int));
      assert(g->edges[i] != NULL);
   }

   return g;
}

int numOfVertices(Graph g) {
   return g->nV;
}

// check if vertex is valid in a graph
bool validV(Graph g, Vertex v) {
   return (g != NULL && v >= 0 && v < g->nV);
}

void insertEdge(Graph g, Edge e) {
   assert(g != NULL && validV(g,e.v) && validV(g,e.w));
   int c, d, i;

   if (!g->edges[e.v][e.w]) {                    // edge e not in graph
      g->edges[e.v][e.w] = 1;
      g->edges[e.w][e.v] = 1;
      g->nE++;
      if (g->cc[e.v] != g->cc[e.w]) {            // v,w in different components?
         if (g->cc[e.v] < g->cc[e.w]) {          // ⇒ merge components c and d
            c = g->cc[e.v];
            d = g->cc[e.w];
         } else {
            c = g->cc[e.w];
            d = g->cc[e.v];
         }         
         for (i = 0; i < g->nV; i++) {
            if (g->cc[i] == d) {
               g->cc[i] = c;                    // move node from component d to c
            } else if (g->cc[i] == g->nC - 1) {
               g->cc[i] = d;                    // replace largest component ID by d
            }
         }
         g->nC--;
      }
   }
}

bool dfsPathCheck(Graph g, Vertex v, Vertex dest) {
   assert(g != NULL);
   Vertex w;

   g->visited[v] = true;
   if (v == dest) {
      return true;
   } else {
      for (w = 0; w < numOfVertices(g); w++) {
         if (adjacent(g, v, w) && !g->visited[w]) {
            if (dfsPathCheck(g, w, dest)) {
               return true;
            }
         }
      }
   }
   return false;
}

bool hasPath(Graph g, Vertex src, Vertex dest) {
   assert(g != NULL);
   Vertex v;

   for (v = 0; v < numOfVertices(g); v++) {
      g->visited[v] = false;
   }
   return dfsPathCheck(g, src, dest);
}

void dfsNewComponent(Graph g, Vertex v, int componentID) {
   Vertex w;

   g->cc[v] = componentID;
   for (w = 0; w < numOfVertices(g); w++) {
      if (adjacent(g, v, w)) {
         if (g->cc[w] != componentID) {
            dfsNewComponent(g, w, componentID);
         }
      }
   }
}

void removeEdge(Graph g, Edge e) {
   assert(g != NULL && validV(g,e.v) && validV(g,e.w));
   if (g->edges[e.v][e.w]) {   // edge e in graph
      g->edges[e.v][e.w] = 0;  // set to false
      g->edges[e.w][e.v] = 0;
      g->nE--;
      if (!hasPath(g, e.v, e.w)) {
         dfsNewComponent(g, e.v, g->nC);
         g->nC++;
      }
   }
}

bool adjacent(Graph g, Vertex v, Vertex w) {
   assert(g != NULL && validV(g,v) && validV(g,w));

   return (g->edges[v][w] != 0);
}

void showGraph(Graph g) {
   assert(g != NULL);
   int i, j;

   printf("Number of vertices: %d\n", g->nV);
   printf("Number of edges: %d\n", g->nE);
   for (i = 0; i < g->nV; i++) {
      for (j = i+1; j < g->nV; j++) {
         if (g->edges[i][j]) {
            printf("Edge %d - %d\n", i, j);
         }
      }
   }
}

void showComponents(Graph g) {
   assert(g != NULL);
   int i;

   printf("Connected components:\n");
   for (i = 0; i < g->nV; i++) {
      printf("%d", g->cc[i]);
      if (i < g->nV-1) {
         printf(", ");
      } else {
         printf("\n");
      }
   }
}

void freeGraph(Graph g) {
   assert(g != NULL);

   int i;
   for (i = 0; i < g->nV; i++) {
      free(g->edges[i]);
   }
   free(g->edges);
   free(g->cc);
   free(g->visited);
   free(g);
}

Sample Makefile:

CC      = gcc
CFLAGS  = -Wall -Werror -std=c11

.PHONY : clean

connectedComponents : connectedComponents.o GraphCC.o
   $(CC) $(CFLAGS) -o $@ connectedComponents.o GraphCC.o

connectedComponents.o : connectedComponents.c GraphCC.h
   $(CC) $(CFLAGS) -c connectedComponents.c

GraphCC.o : GraphCC.c GraphCC.h
   $(CC) $(CFLAGS) -c GraphCC.c

clean : 
   rm -f -- *.o connectedComponents

Graph 1: has both Euler and Hamiltonian paths (e.g. 0-1-2), but cannot have circuits as there are no cycles.

Graph 2: has both Euler paths (e.g. 0-1-2-0) and Hamiltonian paths (e.g. 0-1-2); also has both Euler and Hamiltonian circuits (e.g. 0-1-2-0).

Graph 3: has neither Euler nor Hamiltonian paths, nor Euler nor Hamiltonian circuits.

Graph 4: has Hamiltonian paths (e.g. 0-1-2-3) and Hamiltonian circuits (e.g. 0-1-2-3-0); it has neither an Euler path nor an Euler circuit.

An Euler path: 2-6-5-2-3-0-1-5-0-4-5

No Euler circuit since two vertices (2 and 5) have odd degree.
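The degree argument rests on a standard fact: a connected undirected graph has an Euler circuit exactly when no vertex has odd degree, and an Euler path exactly when zero or two vertices have odd degree. The sketch below counts the odd-degree vertices of a graph given as a plain adjacency matrix; MAX_V and oddDegreeCount are hypothetical names for this illustration, and connectivity is assumed rather than checked.

```c
#include <assert.h>

#define MAX_V 16   // hypothetical upper bound on the number of vertices

// number of vertices with odd degree in an undirected graph
// (assumes a symmetric adjacency matrix without self-loops)
int oddDegreeCount(int nV, int adj[MAX_V][MAX_V]) {
   assert(nV >= 0 && nV <= MAX_V);
   int odd = 0;
   for (int v = 0; v < nV; v++) {
      int degree = 0;
      for (int w = 0; w < nV; w++) {
         if (adj[v][w]) {
            degree++;
         }
      }
      if (degree % 2 == 1) {
         odd++;
      }
   }
   return odd;
}
```

If the graph consists of exactly the ten edges traversed by the Euler path 2-6-5-2-3-0-1-5-0-4-5, the function returns 2 (vertices 2 and 5), so an Euler path exists but an Euler circuit does not.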

Computing the maximum size of a clique in a graph is an NP-hard problem, so no tractable algorithm for it is known. Here is a sample brute-force algorithm that essentially generates and tests all possible subsets of vertices to determine the maximum size of a complete subgraph.

maxCliqueSize(g,v,clique,k):
|  Input  g       graph with n nodes 0..n-1
|         v       next vertex to consider
|         clique  some subset of nodes 0..v-1 that forms a clique
|         k       size of that clique
|  Output size of largest complete subgraph of g that extends clique with nodes from v..n-1
|
|  if v=n then                            // no more vertices to consider
|     return k
|  else
|  |  k1=maxCliqueSize(g,v+1,clique,k)    /* find largest complete subgraph that
|  |                                         extends clique without considering v */
|  |  for all w∈clique do                 // check if v can be added to clique:
|  |  |  if v is not adjacent to w then   // if v not adjacent to some node in clique
|  |  |     return k1                     // ⇒ return largest clique size without v
|  |  |  end if
|  |  end for
|  |  add v to clique
|  |  k2=maxCliqueSize(g,v+1,clique,k+1)  // find largest clique extending clique ∪ {v}
|  |  if k2>k1 then return k2
|  |           else return k1
|  |  end if
|  end if

Starting with an empty clique, the function call maxCliqueSize(g,0,clique,0) will return the maximum clique size of graph g.

The following C function implements this algorithm:

int maxCliqueSize(Graph g, int v, int clique[], int k) {
//
// g         graph
// v         next vertex to consider
// clique[]  some subset of nodes 0..v-1 that forms a clique
// k         size of that clique
//
// returns size of largest complete subgraph of g that extends clique[] with nodes from v..nV-1
//
   if (v == numOfVertices(g)) {                    // no more vertices to consider
      return k;
   } else {
      int k1 = maxCliqueSize(g, v+1, clique, k);   /* find largest complete subgraph that
                                                      extends clique[] without considering v */
      int i;
      for (i = 0; i < k; i++)                      // check if v can be added to clique[]:
         if (!adjacent(g, v, clique[i]))           // if v not adjacent to some node in clique[]
            return k1;                             // => return largest clique size without v

      clique[k] = v;                               // add v to clique[]
      int k2 = maxCliqueSize(g, v+1, clique, k+1); // find largest clique extending clique[]+v
      if (k2 > k1)
         return k2;
      else
         return k1;
   }
}

To call this recursive function on a graph g with nV vertices:

int *clique = malloc(nV * sizeof(int));            // allocate memory for an array of vertices
int m = maxCliqueSize(g, 0, clique, 0);            // start at vertex 0 with clique of size 0
free(clique);

For the keen: Here you can read more about the computational aspects of computing cliques including some references to more sophisticated algorithms.

Assessment