Assignment 1

Balanced Binary Search Trees

Changelog

All important changes to the assignment specification and files will be listed here.

  • [11/03 16:19] Clarified that other libraries are not to be #include'd.
  • [12/03 15:44] Added stdbool.h to bBST.h for prototypes. This shouldn't affect you as the file is #include'd in bBST.c after stdbool.h anyway, unless you make your own test program.
  • [19/03 19:46] Rephrased TreeKthSmallest and TreeKthLargest to specify bounds on \(k\).
  • [20/03 11:02] Released testBBST.c.
  • [25/03 17:55] Corrected the note in analysis.txt. It now says:
    NOTE:
    - Your time complexities should be in big-O notation.
    - Your time complexities should be in terms of n, where n is the number
      of nodes in the tree, and, if applicable:
      - the given value of k, or
      - m, the size of the returned list
    

Aims

  • To implement different operations in a balanced binary search tree and analyse the time complexity of its operations
  • To give you practice with binary search trees and complexity analysis
  • To appreciate the importance of using efficient data structures and algorithms

Admin

Marks
contributes 15% towards your final mark (see Assessment section for more details)
Submit
see the Submission section
Deadline
5pm Wednesday Week 7
Late penalty
0.2% per hour or part thereof deducted from the attained mark, submissions later than 5 days not accepted

Prerequisite Knowledge

  • Recursion
  • Abstract Data Types
  • Binary Search Trees
  • Balanced BSTs (including AVL Trees)
  • Analysis of Algorithms

Background

Binary Trees

A binary tree is a data structure which can be used for searching (binary search trees), sorting (heaps), binary space partitioning (as in 3D video games), compression algorithms (Huffman Coding) to name a few.

In the Trees and Search Trees lecture, we introduced some fundamental tree operations:

  • insert a key into a tree

  • delete a key from a tree

  • search for a key in a tree

  • Traversals: preorder, inorder, postorder

These operations are only guaranteed to be efficient if the tree is relatively balanced. It was shown that the efficiency of many operations are improved if the tree is height balanced. In the class we discussed AVL trees which are examples of Balanced Binary Search Trees. Throughout this assignment we will consider implementations using AVL trees.

Setting Up

Change into the directory you created for the assignment and run the following command:

unzip /web/cs2521/23T1/ass/ass1/downloads/files.zip

If you're working at home, download files.zip by clicking on the above link and then unzip the downloaded file.

You should now have the following files:

Makefile
a set of dependencies used to control compilation
bBST.h
interface to the balanced binary search tree
bBST.c
implementation of the balanced binary search tree (incomplete)
List.h
interface to List ADT (complete)
List.c
implementation of the List ADT (complete)
testBBST.c
driver code with a few basic tests
analysis.txt
a template for you to enter your time complexity analysis for selected functions

Note that the only files that you are allowed to submit are bBST.c and analysis.txt. Therefore, the only files you should modify are bBST.c, analysis.txt and any testing files (although you will NOT be submitting your tests).

Task 1: Implementation

Your task is to design and implement some operations on a balanced binary search tree, efficiently. We have divided the operations into groups - you must fully implement all the operations in each group before moving on to the next group.

Note

When we say you must use a balanced binary search tree, we mean a height-balanced binary search tree. This means for every node in the tree, the absolute difference in height between its left and right subtrees must be no greater than 1.

Important Constraint

The method of converting the given tree(s) into an array or linked list, solving the main problem using the array/linked list is strictly forbidden. You will receive zero for any functions implemented this way.

Group 1: Basic Operations

See the comments above function prototypes for the specifics of each function, including their required time complexities.

Operation/Function Description
TreeNew Creates a new empty tree
TreeFree Frees all memory allocated for a tree
TreeSearch Searches for a key in the tree
TreeInsert Inserts a key into the given tree. Any integer may be inserted (including negative integers) except for the value UNDEFINED.
TreeDelete Deletes a key from the tree
TreeToList Creates a list containing all the keys in a tree in ascending order

Group 2: Other Operations

See the comments above function prototypes for the specifics of each function, including their required time complexities.

Operation/Function Description
TreeKthSmallest Finds the \(k\)-th smallest key in the tree. \(k\) is guaranteed to be at least 1 and at most the number of nodes in the tree.
TreeKthLargest Finds the \(k\)-th largest key in the tree. \(k\) is guaranteed to be at least 1 and at most the number of nodes in the tree.
TreeLCA Given two keys, find the key of the node which is the lowest common ancestor of the nodes containing the keys. The lowest common ancestor of two nodes is the root node of the smallest subtree that contains both nodes.
TreeFloor Given a key \( v \), returns the largest key less than or equal \( v \) in the tree, or UNDEFINED if there is no such key.
TreeCeiling Given a key \( v \), returns the smallest key greater than or equal to \( v \) in the tree, or UNDEFINED if there is no such key.
TreeSearchBetween Searches for all keys between the two given keys (inclusive) and returns the keys in a list in ascending order.

Examples

Consider the following tree:

Example Tree
TreeKthSmallest and TreeKthLargest
  • For \(k = 1\), the \(k\)-th smallest key is 3 and the \(k\)-th largest key is 27
  • For \(k = 4\), the \(k\)-th smallest key is 10 and the \(k\)-th largest key is 19
TreeLCA
  • The lowest common ancestor of 3 and 14 is 10
  • The lowest common ancestor of 3 and 27 is 16
  • The lowest common ancestor of 20 and 27 is 25
  • The lowest common ancestor of 20 and 20 is 20
  • The lowest common ancestor of 2 and 25 is UNDEFINED
TreeFloor and TreeCeiling
  • The floor of 6 is 3, and the ceiling of 6 is 7
  • The floor of 21 is 20, and the ceiling of 21 is 25
  • The floor of 10 is 10, and the ceiling of 10 is 10
TreeSearchBetween
  • Searching between 5 and 17 should return a list containing 7, 9, 10, 14, 15 and 16
  • Searching between 10 and 25 should return a list containing 10, 14, 15, 16, 19, 20 and 25

Task 2: Complexity Analysis

You are required to determine the worst-case time complexity of your implementation of the following operations, and write your answers in analysis.txt along with an explanation of each answer.

  • TreeKthSmallest
  • TreeKthLargest
  • TreeLCA
  • TreeFloor
  • TreeCeiling
  • TreeSearchBetween

Your answers should be in big-O notation and be in terms of \( n \), where \( n \) is the size of the tree.

Full marks will be given only to efficient solutions.

In your explanations, you may use time complexities established in lectures, as long as you indicate that it was established in lectures. For example, you may simply state (without explanation/proof) that the worst-case time complexity of searching in an AVL tree is \(O(\log n)\), as this was established in lectures.

Testing

We have provided a main program testBBST.c containing basic test cases.

To run these tests, first run make, which compiles your code and creates an executable called testBBST, and then run ./testBBST.

We strongly recommend you to add your own tests, as the given tests are very simple. You can easily add your own tests by creating a new function in testBBST.c and then calling it from the main function. You are also free to completely restructure the testing program if you want.

You can call individual tests by specifying their number(s). For example, to run tests 4, 2, and 7, run

./testBBST 4 2 7
testTreeKthSmallest...
Test passed.

testTreeDelete...
Test passed.

testTreeFloor...
Test passed.

All 3 tests passed!

We have provided 9 functions testing basic functionality, but you can modify the tests and numTests to add your own. Providing no command-line arguments will default to running all tests:

./testBBST
testTreeInsertSearch...
Test passed.

testTreeDelete...
Test passed.

testTreeToList...
-- TreeList returned: [1, 3, 5, 10, 15, 17, 20, 21, 22, 23, 25]
Test passed.

testTreeKthSmallest...
Test passed.

testTreeKthLargest...
Test passed.

testTreeLCA...
Test passed.

testTreeFloor...
Test passed.

testTreeCeiling...
Test passed.

testTreeSearchBetween...
-- TreeSearchBetween(1, 25) returned: [1, 3, 5, 10, 15, 17, 20, 21, 22, 23, 25]
-- TreeSearchBetween(17, 17) returned: [17]
-- TreeSearchBetween(10, 30) returned: [10, 15, 17, 20, 21, 22, 23, 25]
-- TreeSearchBetween(-2, 16) returned: [1, 3, 5, 10, 15]
-- TreeSearchBetween(11, 14) returned: []
Test passed.

All 9 tests passed!

Debugging

Debugging is an important and inevitable aspect of programming. We expect everyone to have an understanding of basic debugging methods, including using print statements, knowing basic GDB commands and running Valgrind. In the forum and in help sessions, tutors will expect you to have made a reasonable attempt to debug your program yourself using these methods before asking for help.

You can learn about GDB and Valgrind in the Debugging with GDB and Valgrind lab exercise.

Frequently Asked Questions

  • Are we allowed to create our own functions? You are always allowed to create your own functions. All additional functions you create should be made static.
  • Are we allowed to create our own #defines and structs? Yes.
  • Are we allowed to #include other libraries? No. All the libraries required for this assignment are provided already in the bBST.c file.
  • Are we allowed to modify bBST.h in any way? No.
  • Are we allowed to use functions used in AVL trees as discussed in class? Yes, certainly and encouraged.
  • Are we allowed to change the signatures of the given functions? No. If you change these, your code won't compile and we won't be able to test it.
  • What errors do we need to handle? You should handle common errors such as NULL returned from malloc by printing an error message to stderr and terminating the program. You are not required to handle other errors.
  • What invalid inputs do we need to handle? You are not required to handle invalid inputs, such as NULL being passed in as a tree. It is the user's responsibility to provide valid inputs.
  • Will we be assessed on our tests? No. You will not be submitting any test files, and therefore you will not be assessed on your tests.
  • Are we allowed to share tests? No. Testing is an important part of software development. Students that test their code more will likely have more correct code, so to ensure fairness, each student should independently develop their own tests.

Submission

You must submit the files bBST.c, and analysis.txt. You can submit via the command line using the give command:

give cs2521 ass1 bBST.c analysis.txt

You can also submit via give's web interface. You can submit multiple times. Only your last submission will be marked. You can check the files you have submitted here.

Compilation

You must ensure that your final submission compiles on CSE machines. Submitting non-compiling code leads to extra administrative overhead and will result in a 10% penalty.

Every time you make a submission, a dryrun test will be run on your code to check that it compiles. Please ensure that your final submission successfully compiles, even for parts that you have not completed.

Assessment Criteria

This assignment will contribute 15% to your final mark.

Correctness

75% of the marks for this assignment will be based on the correctness of your code, and will be based on autotesting. Marks for correctness will be distributed as follows:

Group Function Complexity Weight
Group 1 TreeSearch \(O(\log{n})\) 5%
TreeInsert \(O(\log{n})\) 15%
TreeDelete \(O(\log{n})\) 18%
TreeToList \(O(n)\) 2%
Group 2 TreeKthSmallest \(O(\log{n} + k)\) 5%
TreeKthLargest \(O(\log{n} + k)\) 2%
TreeLCA \(O(\log{n})\) 7%
TreeFloor \(O(\log{n})\) 7%
TreeCeiling \(O(\log{n})\) 7%
TreeSearchBetween \(O(\log{n} + m)^*\) 7%

*Note: \(m\) is the size of the list returned by TreeSearchBetween.

Please note that we expect you to prioritise completing functions correctly over implementing as many functions as possible (potentially incorrectly), so we will not be awarding sympathy marks for functions that don't work (i.e., the marks you receive from autotesting will be your final correctness mark). You are expected to exercise your debugging skills to fix problems with the function you are currently working on, before moving on to the next function.

Complexity analysis

15% of the marks for this assignment will be based on the correctness of your time complexity analysis in analysis.txt and the quality of your explanations.

Full marks will be given only to efficient solutions. Naive solutions will get only half the marks for this section, where a more efficient solution exists.

Style

10% of the marks for this assignment will come from hand marking of the readability of the code you have written. These marks will be awarded on the basis of clarity, commenting, elegance and style. The following is an indicative list of characteristics that will be assessed, though your program will be assessed wholistically so other characteristics may be assessed too (see the style guide for more details):

  • Consistent and sensible indentation and spacing
  • Using blank lines and whitespace
  • Using functions to avoid repeating code
  • Decomposing code into functions and not having overly long functions
  • Using comments effectively and not leaving planning or debugging comments

The course staff may vary the assessment scheme after inspecting the assignment submissions but it will remain broadly similar to the description above.

Originality of Work

This is an individual assignment. The work you submit must be your own work and only your work apart from the exceptions below. Joint work is not permitted. At no point should a student read or have a copy of another student's assignment code.

You may use any amount of code from the lectures and labs of the current iteration of this course. You must clearly attribute the source of this code in a comment.

You may use small amounts (< 10 lines) of general purpose code (not specific to the assignment) obtained from sites such as Stack Overflow or other publicly available resources. You must clearly attribute the source of this code in a comment.

You are not permitted to request help with the assignment apart from in the course forum, help sessions or from course lecturers or tutors.

Do not provide or show your assignment work to any other person (including by posting it publicly on the forum) apart from the teaching staff of COMP2521. When posting on the course forum, teaching staff will be able to view the assignment code you have recently submitted with give.

The work you submit must otherwise be entirely your own work. Submission of work partially or completely derived from any other person or jointly written with any other person is not permitted. The penalties for such an offence may include negative marks, automatic failure of the course and possibly other academic discipline. Assignment submissions will be examined both automatically and manually for such issues.

Relevant scholarship authorities will be informed if students holding scholarships are involved in an incident of plagiarism or other misconduct. If you knowingly provide or show your assignment work to another person for any reason, and work derived from it is submitted, then you may be penalised, even if the work was submitted without your knowledge or consent. This may apply even if your work is submitted by a third party unknown to you.