COMP2011 Tutorial Solutions 9, Week 10

Hash Tables, Sorting

  1. R-8.4 Draw the 11-item hash table that results from using the hash function

    h(i) = (2i+5) mod 11

    to hash the keys 12, 44, 13, 88, 23, 94, 11, 39, 20, 16 and 5, assuming collisions are handled by chaining.

           +-----------------------------------+
           |  0  1  2  3  4  5  6  7  8  9  10 |
           +-----------------------------------+
                 |        |  |  |  |     |
                20       16 44 94 12    13
                          |  |  |  |
                          5 88 39 23
                             |
                            11
          
    R-8.5 What is the result of the previous exercise, assuming collisions are handled by linear probing?
           +----------------------------------+
           |  0  1  2  3  4  5  6  7  8  9 10 |
           +----------------------------------+
           | 11 39 20  5 16 44 88 12 23 13 94 |
           +----------------------------------+
          
    R-8.7 What is the result of Exercise R-8.4 assuming collisions are handled by double hashing, using the secondary hash function

    h'(k) = 7 - (k mod 7) ?
           +----------------------------------+
           |  0  1  2  3  4  5  6  7  8  9 10 |
           +----------------------------------+
           | 11 23 20 16 39 44 94 12 88 13  5 |
           +----------------------------------+
          
  2. R-10.7,8: Consider a version of the quick-sort algorithm for which, in an n-element sequence, the "middle" element (at index n/2) is chosen as pivot. What is the running time of this algorithm on a sequence that is already sorted? Describe the kind of sequence that would cause this version of quick-sort to run in O(n2) time.

    Ans: On an already sorted sequence it will run in O(n log n) time. This is because the choice of the middle element as pivot will split the input almost exactly into two equal halves each time. The computation tree will therefore be almost full and balanced. Then apply the fact that the partition algorithm (that still has to run at each phase) takes O(k) time for a piece of size k. At depth d, the total size of all the pieces is n. (independent of depth!). The depth of the tree is O(log n), so all told O(n log n) for the run time.

    The worst-case sequence is when each step only peels one number off the sequence. Such a sequence can be built up recursively from the bottom up, as follows:

    246895371
    24681537|9
    2467153|8
    246315|7
    24531|6
    2413|5
    231|4
    21|3
    1|2
    
  3. Suppose we are given a sequence S of n elements, each of which is coloured red or blue. Assuming S is represented as an array, give an in-place method for ordering S so that all the blue elements are listed before all the red elements. Can you extend your approach to three colours?

    Ans: Start with a "low" marker at the beginning of the array and a "high" marker at the end. If the low marker is pointing to a blue element, keep incrementing it until it points to a red one. If the high marker is pointing to a red element, keep decrementing it until it points to a blue one. Swap these two elements, and repeat the process.

    For three colours: first use the above method to put the red elements at the end of the array (with the other two colours in jumbled order at the beginning). Then start the low marker at the beginning again and repeat the process to sort the other two colours. (Note that the high marker does not need to be re-set).

  4. C-10.11: Suppose we are given an n-element sequence S such that each element in S represents a different vote for class president, where each vote is given as an integer representing the student ID of the chosen candidate. Without making any assumptions about who is running or even how many candidates there are, design an O(n log n)-time algorithm to see who wins the election S represents, assuming the candidate with the most votes wins.

    Ans: Start by sorting the list, which takes O(n log n) time. Then move through the list, keeping track of the which candidate has the most votes so far, and how many votes they have received. If the number of votes for the current candidate exceeds the previous maximum, make it the new new maximum. The second (counting) stage takes O(n) time, so the whole process takes O(n log n) time.

  5. C-10.12: Consider the voting problem from the previous exercise, but now suppose that we know the number k < n of candidates running. Describe an O(n log k)-time algorithm for determining who wins the election.

    Ans: In this case, the candidates can be stored in a balanced binary tree (for example, an AVL Tree). Each node should store a student ID and the number of votes they have received so far. As each vote in the sequence is processed, search the tree for the student ID of the chosen candidate (which takes O(log k) time). If the ID is found, add one to its number of votes. Otherwise, create a new node with this ID and with one vote. At the end, go through all the nodes in the tree to find the one with the most votes. The process could be sped up even further in the average case (though not in the worst case) by replacing the AVL Tree with a Hash Table.


Author: Alan Blair