Week 09 Tutorial Answers
Sorting
Resources
Questions
-
Explain what these properties mean in relation to sorting algorithms:
- Stability
- Adaptability
- Comparison-based
Answer:
- Stability: A sorting algorithm is stable if it maintains the relative order of items with equal sort keys.
- Adaptiveness: Whether or not the pre-sortedness of the input affects the running time. A sorting algorithm is adaptive if it runs faster when the input is already partially sorted.
- Comparison-based: A sorting algorithm is comparison-based if it examines the data only through comparison operators (such as < and ≤). Examples of where sorting algorithms are not comparison based are when an item is used with the modulo operator (as in radix sort) or as an index (as in counting sort).
-
Consider the following simple table of enrolments, sorted by course code:
COMP1927 Jane 3970 COMP1927 John 3978 COMP1927 Pete 3978 MATH1231 John 3978 MATH1231 Adam 3970 PSYC1011 Adam 3970 PSYC1011 Jane 3970 Now we wish to sort it by student name, so that we can easily see what courses each student is studying. Show an example of what the final array would look like if
- we used a stable sorting algorithm
- we used an unstable sorting algorithm
Answer:
After a stable sort, the relative positions of items with equal sort keys is the same. In the context of this problem, this means that the subjects for a given student should be in the same order as in the original table (which happens to be sorted on subject):
MATH1231 Adam 3970 PSYC1011 Adam 3970 COMP1927 Jane 3970 PSYC1011 Jane 3970 COMP1927 John 3978 MATH1231 John 3978 COMP1927 Pete 3978 An unstable sorting algorithm could produce any order that is sorted on the sort key (in this case name), including the order that is produced by the sorting stable algorithm. For example:
PSYC1011 Adam 3970 MATH1231 Adam 3970 PSYC1011 Jane 3970 COMP1927 Jane 3970 MATH1231 John 3978 COMP1927 John 3978 COMP1927 Pete 3978 The result will definitely be sorted by name, but not necessarily be sorted by course-code within each name.
-
Sorting algorithms generally use comparison on the key to determine ordering. Ordering of items with equal keys has been left to the stability of the sorting algorithm used. Sometimes, it is important that items with the same key be ordered on a secondary key.
Write a function that takes two items and determines an order based on the primary key
k
and then on the value of a secondary keyj
. The function should return -1 if the first item is "less than" the second item, 1 if the first item is "greater than" the second item, and 0 if the items are equal. The function has the following signature:typedef struct item { int k; int j; /* ... other fields ...*/ } Item; int itemCmp(Item a, Item b) { ... }
Answer:
The following function would do what we want:
int itemCmp(Item a, Item b) { if (a.k < b.k) { return -1; } else if (a.k > b.k) { return 1; } else { // (a.k == b.k) if (a.j < b.j) { return -1; } else if (a.j > b.j) { return 1; } else { return 0; } } }
-
How many comparisons would be required to sort this array
int a[] = {4, 3, 6, 8, 2};
for each of the following sorting algorithms:
- selection sort
- bubble sort
- insertion sort
Answer:
The comparisons are highlighted in red.
-
selection sort
Iteration 1: 4 3 6 8 2 # assume 4 is the smallest item 4 3 6 8 2 # found a smaller item, 3 4 3 6 8 2 4 3 6 8 2 4 3 6 8 2 # found a smaller item, 2 2 3 6 8 4 # swap smallest item with item at index 0 Iteration 2: 2 3 6 8 4 # assume 3 is the next-smallest item 2 3 6 8 4 2 3 6 8 4 2 3 6 8 4 2 3 6 8 4 # swap next-smallest item with item at index 1 Iteration 3: 2 3 6 8 4 # assume 6 is the next-smallest item 2 3 6 8 4 2 3 6 8 4 # found a smaller item, 4 2 3 4 8 6 # swap next-smallest item with item at index 2 Iteration 4: 2 3 4 8 6 # assume 8 is the next-smallest item 2 3 4 8 6 # found a smaller item, 6 2 3 4 6 8 # swap next-smallest item with item at index 3 Done
10 comparisons are made in total.
-
bubble sort (bubble-up)
Iteration 1: 4 3 6 8 2 # 4 and 3 are out of order, swap 3 4 6 8 2 3 4 6 8 2 3 4 6 8 2 # 8 and 2 are out of order, swap 3 4 6 2 8 Iteration 2: 3 4 6 2 8 3 4 6 2 8 3 4 6 2 8 # 6 and 2 are out of order, swap 3 4 2 6 8 Iteration 3: 3 4 2 6 8 3 4 2 6 8 # 4 and 2 are out of order, swap 3 2 4 6 8 Iteration 4: 3 2 4 6 8 # 3 and 2 are out of order, swap 2 3 4 6 8 Done
10 comparisons are made in total.
adaptive bubble sort (bubble-down)
Iteration 1: 4 3 6 8 2 # 8 and 2 are out of order, swap 4 3 6 2 8 # 6 and 2 are out of order, swap 4 3 2 6 8 # 3 and 2 are out of order, swap 4 2 3 6 8 # 4 and 2 are out of order, swap 2 4 3 6 8 Iteration 2: 2 4 3 6 8 2 4 3 6 8 2 4 3 6 8 # 4 and 3 are out of order, swap 2 3 4 6 8 Iteration 3: 2 3 4 6 8 2 3 4 6 8 2 3 4 6 8 Finished an iteration without swaps, done
9 comparisons are made in total.
-
insertion sort
Iteration 1: 4 3 6 8 2 # in this iteration, we "insert" 3 4 3 6 8 2 # 4 and 3 are out of order, so we shift 3 down 3 4 6 8 2 # reached bottom of array, so 3 must be in the right place Iteration 2: 3 4 6 8 2 # in this iteration, we "insert" 6 3 4 6 8 2 # 4 and 6 are in order, so we stop 3 4 6 8 2 Iteration 3: 3 4 6 8 2 # in this iteration, we "insert" 8 3 4 6 8 2 # 6 and 8 are in order, so we stop 3 4 6 8 2 Iteration 4: 3 4 6 8 2 # in this iteration, we "insert" 2 3 4 6 8 2 # 8 and 2 are out of order, so we shift 2 down 3 4 6 2 8 # 6 and 2 are out of order, so we shift 2 down 3 4 2 6 8 # 4 and 2 are out of order, so we shift 2 down 3 2 4 6 8 # 3 and 2 are out of order, so we shift 2 down 2 3 4 6 8 # reached bottom of array, so 2 must be in the right place Done
7 comparisons are made in total.
-
Merge sort is typically implemented as follows:
void mergeSort(int A[], int lo, int hi) { if (lo >= hi) return; int mid = (lo + hi) / 2; mergeSort(A, lo, mid); mergeSort(A, mid + 1, hi); merge(A, lo, mid, hi); } void merge(int A[], int lo, int mid, int hi) { int nitems = hi - lo + 1; int *tmp = malloc(nitems * sizeof(int)); int i = lo; int j = mid + 1; int k = 0; // scan both segments into tmp while (i <= mid && j <= hi) { if (A[i] <= A[j]) { tmp[k++] = A[i++]; } else { tmp[k++] = A[j++]; } } // copy items from unfinished segment while (i <= mid) tmp[k++] = A[i++]; while (j <= hi) tmp[k++] = A[j++]; // copy items from tmp back to main array for (i = lo, k = 0; i <= hi; i++, k++) { A[i] = tmp[k]; } free(tmp); }
Suppose that at a particular point during the execution of the
mergeSort
function, the arrayA
looks like this:{ 1, 4, 5, 6, 7, 2, 3, 4, 7, 9 }
Show how the call
merge(A, 0, 4, 9)
would merge the elements of the two sorted subarrays into a single sorted array.Answer:
At the beginning of the function, a temporary array is created to sort the elements (before they are copied back into the original array at the end).
Then, three index variables are created:
i
is 0, which is the index into the beginning of the first sorted subarray, j is 5, which is the index into the beginning of the second sorted subarray, andk
is 0, which is just an index intotmp
so we know how many elements we have inserted intotmp
so far (and where the next element should be inserted).A: { 1, 4, 5, 6, 7, 2, 3, 4, 7, 9 } ^ ^ i j tmp: { , , , , , , , , , } ^ k
Now we enter the first loop, which does the majority of the merging. The loop compares the smallest elements of each subarray that haven't been consumed yet (i.e.,
A[i]
andA[j]
), and inserts the smaller of the two elements intotmp
. If the two elements are the same, it takes the element from the first subarray - this ensures stability. It continues to do this until one of the subarrays has been exhausted.A[i] (1) is less than A[j] (2) - so copy A[i] into tmp and advance i and k A: { 1, 4, 5, 6, 7, 2, 3, 4, 7, 9 } ^ ^ i j tmp: { 1, , , , , , , , , } ^ k
A[j] (2) is less than A[i] (4) - so copy A[j] into tmp and advance j and k A: { 1, 4, 5, 6, 7, 2, 3, 4, 7, 9 } ^ ^ i j tmp: { 1, 2, , , , , , , , } ^ k
A[j] (3) is less than A[i] (4) - so copy A[j] into tmp and advance j and k A: { 1, 4, 5, 6, 7, 2, 3, 4, 7, 9 } ^ ^ i j tmp: { 1, 2, 3, , , , , , , } ^ k
A[i] (4) is equal to A[j] (4) - so copy A[i] into tmp and advance i and k A: { 1, 4, 5, 6, 7, 2, 3, 4, 7, 9 } ^ ^ i j tmp: { 1, 2, 3, 4, , , , , , } ^ k
A[j] (4) is less than A[i] (5) - so copy A[j] into tmp and advance j and k A: { 1, 4, 5, 6, 7, 2, 3, 4, 7, 9 } ^ ^ i j tmp: { 1, 2, 3, 4, 4, , , , , } ^ k
...
A[i] (7) is equal to A[j] (7) - so copy A[i] into tmp and advance i and k A: { 1, 4, 5, 6, 7, 2, 3, 4, 7, 9 } ^ ^ i j tmp: { 1, 2, 3, 4, 4, 5, 6, 7, , } ^ k
The first subarray has been exhausted, which means the condition
(i <= mid)
is false, so the loop stops executing. (If the second subarray was exhausted instead, then the condition(j <= hi)
would be false, so the loop would also stop executing.) The next two loops simply copy the remaining elements from the non-exhausted subarray intotmp
. In this example, the first subarray was exhausted, so the first loop doesn't actually do anything - the second loop copies the remaining elements from the second subarray intotmp
.A: { 1, 4, 5, 6, 7, 2, 3, 4, 7, 9 } ^ ^ i j tmp: { 1, 2, 3, 4, 4, 5, 6, 7, 7, 9 } ^ k
Notice that
tmp
contains all of the elements of the entire subarray in sorted order. All that needs to be done now is to copy the elements fromtmp
back into the original array.tmp
is no longer needed, so it is freed.A: { 1, 2, 3, 4, 4, 5, 6, 7, 7, 9 }
-
An important operation in the quicksort algorithm is partition, which takes a element called the pivot, and reorganises the elements in the subarray
A[lo..hi]
such that all elements to the left of the pivot in the subarray are less than or equal to the pivot, and all elements to the right of the pivot in the subarray are greater than the pivot.Here is one possible implementation of partition:
int partition(int A[], int lo, int hi) { int pivot = A[lo]; // pivot int l = lo + 1; int r = hi; while (l < r) { while (l < r && A[l] <= pivot) l++; while (l < r && A[r] >= pivot) r--; if (l == r) break; swap(A, l, r); } int m = A[l] <= pivot ? l : l - 1; swap(A, lo, m); return m; }
Show how the call
partition(A, 0, 9)
would partition each of these arrays. What index is returned by each of these calls?{ 5, 3, 9, 6, 4, 2, 9, 8, 1, 7 }
{ 5, 9, 8, 7, 6, 0, 1, 2, 3, 4 }
{ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }
{ 9, 8, 7, 6, 5, 4, 3, 2, 1, 0 }
Answer:
-
{ 4, 3, 1, 2, 5, 6, 9, 8, 9, 7 }
=> returns index 4 -
{ 0, 4, 3, 2, 1, 5, 6, 7, 8, 9 }
=> returns index 5 -
{ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }
=> returns index 0 -
{ 0, 8, 7, 6, 5, 4, 3, 2, 1, 9 }
=> returns index 9
-
It is easy to check whether an array of integers is sorted. A trickier problem is to determine whether a sorting algorithm produced a stable sort. Assume that the array is composed of
Item
s with two fields, as in:typedef struct { int a; int b; } Item;
Assume also that the array has been sorted on the
Item.a
field.Given the final sorted array alone, you cannot determine whether the sort was stable. We also assume that the sorting client keeps a copy of the original array, which is available for checking.
Given the above, write a function that takes the original array, the sorted array, and determines whether the sort was stable. The function has the following interface:
int isStableSort(Item original[], Item sorted[], int size) { ... }
Answer:
A simple solution would be to run a sorting algorithm that is known to be stable (such as bubble sort, insertion sort, or merge sort) on the original array, and check whether the output of this algorithm matches the given sorted array. If they match, then the sort was stable (as a stable sorting algorithm has only one possible output). If they don't, then the sort was unstable.
Here is a more inefficient solution, which uses the definition of stability to check that items with the same sort key remain in the same relative order.
bool isStableSort(Item original[], Item sorted[], int size) { int i, j, k, key; for (i = 0; i < size; i++) { key = sorted[i].a; j = 0; k = 0; while (j < size && k < size) { for (; j < size; j++) if (sorted[j].a == key) break; for (; k < size; k++) if (original[k].a == key) break; if (j < size && k < size) { if (sorted[j].b != original[k].b) return false; j++; k++; } } } return true; }
Here is an efficient solution which uses a hash table that maps the type of the key (in this case, an integer) to a Queue.
bool isStableSort(Item original[], Item sorted[], int size) { HashTable ht = HashTableNew(); for (int i = 0; i < size; i++) { if (!HashTableContains(ht, original[i].a)) { HashTableSet(ht, original[i].a, QueueNew()); } Queue q = HashTableGet(ht, original[i].a); QueueEnqueue(q, original[i].b); } for (int i = 0; i < size; i++) { Queue q = HashTableGet(ht, sorted[i].a); if (QueueDequeue(q) != sorted[i].b) { return false; } } return true; }
-
Write a version of selection sort that sorts a linked list. The sort should modify the original list - do not create any new nodes. The function has the following signature:
typedef struct Node *List; struct Node { int value; List next; }; List selectionSort(List ls) { ... }
Answer:
Keeping the spirit of array-based selection sort, each iteration of this algorithm does the following:
- find the largest element remaining in the original list
- remove the largest element from the original list
- insert the element at the front of the new list
List selectionSort(List ls) { List old = ls; List sorted = NULL; // result while (old != NULL) { // find largest node in old list List max = listMax(old); // unlink largest node in old list old = listUnlinkNode(old, max); // add node to beginning of list max->next = sorted; sorted = max; } return sorted; } // Returns a pointer to the node containing the largest // element of the given list. static List listMax(List ls) { List max = ls; for (List curr = ls; curr != NULL; curr = curr->next) { if (curr->value > max->value) { max = curr; } } return max; } // Removes a given node from a list without freeing it. static List listUnlinkNode(List ls, List node) { if (ls == node) { return ls->next; } else { ls->next = listUnlinkNode(ls->next, node); return ls; } }
-
Write a program that reads two sorted files and merges them onto standard output. The names of the files are provided as command-line arguments.
./merge File1 File2
If either file is not readable, exit with the error message:
Invalid input file(s).
Answer:
int main(int argc, char *argv[]) { if (argc != 3) { fprintf(stderr, "Usage: ./merge File1 File2\n"); return 1; // error code } FILE *in1 = fopen(argv[1], "r"); FILE *in2 = fopen(argv[2], "r"); if (in1 == NULL || in2 == NULL) { fprintf(stderr, "Invalid input file(s).\n"); return 1; // error code } // now start merging char line1[MAXLINE] char line2[MAXLINE]; bool more1 = fgets(line1, MAXLINE, in1); bool more2 = fgets(line2, MAXLINE, in2); while (more1 && more2) { int diff = strcmp(line1, line2); if (strcmp(line1, line2) <= 0) { fputs(line1, stdout); more1 = fgets(line1, MAXLINE, in1); } else { fputs(line2, stdout); more2 = fgets(line2, MAXLINE, in2); } } while (fgets(line1, MAXLINE, in1) != NULL) fputs(line1, stdout); while (fgets(line2, MAXLINE, in2) != NULL) fputs(line2, stdout); fclose(in1); fclose(in2); }
-
Write a program that reads data about students in the form:
5059413 Daisy 3762 15 3461045 Dotty 3648 42 3474881 Daisy 8543 16 5061020 David 3970 3
from standard input, and then:
- stores it in an array of Student structs
{zid, name, prog, favnum}
- uses
qsort()
to sort it, making use of thestuCmp()
function (see below) - order is initially by name, and then by zID if the names are the same
- writes out sorted array, one Student per line
- use the following format for printing students
"%7d %-20s %4d %d\n"
You can assume that all of the input lines contain valid student entries, and that the
MAXxxx
constants are defined, and large enough to deal with any input data. You do not need to implement thestuCmp()
function; just assume it exists.Use the following template for the program:
typedef struct student { int zid; // 7-digit number char name[MAXNAME]; // names are stored *within* the struct int prog; // 4-digit number int favnum; // favourite number } Student; // return -ve if a < b, +ve if a > b, 0 if a == b int stuCmp(const void *a, const void *b); int main(int argc, char *argv[]) { Student students[MAXSTUDENTS]; // read stdin line-by-line into students[] // sort the students[] array // print the contents of the students[] array }
Answer:
int main(int argc, char *argv[]) { Student students[MAXSTUDENTS]; // read stdin line-by-line into students[] int i = 0; while (scanf("%d %s %d %d", &students[i].zid, students[i].name, &students[i].prog, &students[i].favnum) == 4) { i++; } // sort the students[] array int nstudents = i; qsort(students, nstudents, sizeof(Student), stuCmp); // print the contents of the students[] array for (i = 0; i < nstudents; i++) { printf("%7d %-20s %4d %d\n", students[i].zid, students[i].name, students[i].prog, students[i].favnum); } }
- stores it in an array of Student structs