Week 04 Lab Exercise

Binary Search Trees

Objectives

  • To explore the implementation of binary search trees
  • To get some practice with binary search trees
  • To implement a level-order traversal
  • To get some practice with complexity analysis

Admin

Marks
5 (see the Assessment section for more details)
Demo
in the Week 4, 5 or 7 lab session
Submit
see the Submission section
Deadline to submit to give
5pm Monday of Week 5
Late penalty
0.2% per hour or part thereof deducted from the attained mark, submissions later than 5 days not accepted

Background

In lectures, we introduced the binary search tree data type and implemented some operations on it. In this week's lab, we will implement some additional operations on it.

Recall that a binary search tree is an ordered binary tree with the following properties:

  • the tree consists of a (possibly empty) collection of linked nodes
  • each node contains a single integer value, and has links to two subtrees
  • either or both subtrees of a given node may be empty
  • all values in a node's left subtree will be less than the value in the node
  • all values in a node's right subtree will be greater than the value in the node

Setting Up

Create a directory for this lab, change into it, and run the following command:

unzip /web/cs2521/24T1/labs/week04/downloads/files.zip

If you're working at home, download files.zip by clicking on the above link and then unzip the downloaded file.

If you've done the above correctly, you should now have the following files:

Makefile
a set of dependencies used to control compilation
bst.h
interface for the BST module
bst.c
implementation of the BST module (incomplete)
Queue.h
interface for the Queue ADT
Queue.c
implementation of the Queue ADT (complete)
runBst.c
interactive test program for the BST module
analysis.txt
a template for you to enter your time complexity analysis

Once you've got these files, the first thing to do is to run the command

make

This will compile the initial version of the files, and produce the ./runBst executable.

File Walkthrough

runBst.c

This program allows you to test the BST module interactively by entering commands in the terminal. Here is an example run of the program once all the tasks have been completed:

./runBst
Interactive BST Tester
Enter ? to see the list of commands.
> + 6 8 4 1 9 3 5
Inserted 6
Inserted 8
Inserted 4
Inserted 1
Inserted 9
Inserted 3
Inserted 5
> p
    6
   / \
  4   8
 / \   \
1   5   9
 \
  3
> l
The BST contains 3 leaves
> r
The range of the BST is 8
> P
Pre-order traversal: 6 4 1 3 5 8 9 
> L
Level-order traversal: 6 4 8 1 5 9 3 
> d
Deleted all the leaves in the BST
> p
    6
   / \
  4   8
 /
1
> l
The BST contains 2 leaves
> c 3
Closest value to 3 in the BST: 4
> L
Level-order traversal: 6 4 8 1 
> d
Deleted all the leaves in the BST
> p
  6
 /
4
> d
Deleted all the leaves in the BST
> p
6
> q
bst.c

This file contains functions related to creating, querying and manipulating BSTs. Most of these functions are complete, and the rest of the functions need to be implemented in the tasks below.

Queue.h

Queue.h defines the interface to the Queue ADT. The Queue ADT is currently not used, but it will be used in one of the tasks below.

Task 1 - Count Leaves in a BST

Your task is to implement the following function in bst.c:

int bstNumLeaves(struct node *t);

This function should return a count of the number of leaf nodes in the given BST. A leaf node is a node that has no child nodes. The following diagram shows some sample trees, with the leaf nodes highlighted in red.

Note: You are allowed to use any approach for the tasks in this lab. Both iterative and recursive solutions are allowed.

Once you think you've got the function working, test it by recompiling with make and running the runBst program.

When you're certain that the function works correctly, determine its worst-case time complexity and write it in analysis.txt along with an explanation. The time complexity should be in terms of \( n \) or \( h \), where \( n \) is the number of nodes in the tree and \( h \) is the height of the tree. You should decide which is more appropriate for your implementation.

Since the binary search trees in this lab are not guaranteed to be balanced, the height of a tree with \( n \) nodes can vary significantly. This means for some algorithms, expressing their time complexity in terms of \( h \) can give a more accurate measure of efficiency. An example of this is searching a BST - an efficient implementation involves only going down one branch of the tree, so its efficiency depends on the height of the tree, rather than the total number of nodes.

Task 2 - Find the Range of a BST

Your task is to implement the following function in bst.c:

int bstRange(struct node *t);

This function should return the range of the given BST. The range of a BST is the difference between its smallest and largest values. If the given tree is empty, the function should return -1. For example, consider the following BST:

The range of this BST is 27, because the largest value is 30, the smallest value is 3, and \( 30 - 3 = 27 \).

Once you think you've got the function working, test it by recompiling with make and running the runBst program.

When you're certain that the function works correctly, determine its worst-case time complexity and write it in analysis.txt along with an explanation. The time complexity should be in terms of \( n \) or \( h \), where \( n \) is the number of nodes in the tree and \( h \) is the height of the tree. You should decide which is more appropriate for your implementation.

Task 3 - Delete the Leaves of a BST

Your task is to implement the following function in bst.c:

struct node *bstDeleteLeaves(struct node *t);

This function should delete all the leaf nodes in the given BST and return the root of the updated tree. The following diagram shows the result of repeatedly applying the function to a sample tree.

Once you think you've got the function working, test it by recompiling with make and running the runBst program.

When you're certain that the function works correctly, determine its worst-case time complexity and write it in analysis.txt along with an explanation. The time complexity should be in terms of \( n \) or \( h \), where \( n \) is the number of nodes in the tree and \( h \) is the height of the tree. You should decide which is more appropriate for your implementation.

Task 4 - Find the Closest Value in a BST

Your task is to implement the following function in bst.c:

int bstClosest(struct node *t, int val);

This function should return the value in the BST which is closest to the given value. The function can assume the given BST is not empty. If there is more than one closest value, the function may return any of them. For example, consider the following BST:

  • If the given value is 30, then the function should return 30, because the closest value to 30 in the BST is 30.
  • If the given value is 15, then the function should return 18, because the closest value to 15 in the BST is 18.
  • If the given value is 7, then the function should return 3 or 11, because the closest values to 7 in the BST are 3 or 11 and they are equally close to 7.
Hint: You may find the abs function (provided by stdlib.h) useful.
To get full marks for this task, your function must have a worst-case time complexity of \(O(h)\), where \(h\) is the height of the BST. A solution which is slower than \(O(h)\) can earn at most half of the marks for this task.

Once you think you've got the function working, test it by recompiling with make and running the runBst program.

No time complexity analysis is required for this task.

Task 5 - Level-Order Traversal of a BST

Your task is to implement the following function in bst.c:

void bstLevelOrder(struct node *t);

This function should print the level-order traversal of the given BST on a single line separated by spaces (i.e., the same format as the other traverse-and-print functions). The following diagram aims to give an idea of how level-order traversal scans the nodes in a tree:

The output from this traversal would be: 5 3 7 2 4 6 9 1 8.

Level-order traversal cannot be done recursively (at least not easily) and is typically implemented using a queue. The algorithm is roughly as follows:

Level Order Traversal(BST t):
	initialise an empty queue
	add t's root node to the queue
	while the queue is not empty do
		remove the node at the front of the queue
		print the value in the node
		add its left child (if any) to the queue
		add its right child (if any) to the queue
	end while

You must implement this algorithm by making use of the Queue ADT provided. Note that the Queue ADT stores pointers (void * is a generic pointer type). This is because you should store node pointers in the queue rather than integers - if the queue stored integers then after dequeuing an integer, there would be no easy way to add its children to the queue.

Once you think you've got the function working, test it by recompiling with make and running the runBst program.

When you're certain that the function works correctly, determine its worst-case time complexity and write it in analysis.txt along with an explanation. The time complexity should be in terms of \( n \) or \( h \), where \( n \) is the number of nodes in the tree and \( h \) is the height of the tree. You should decide which is more appropriate for your implementation.

Submission

Submit via the command line using the give command:

give cs2521 lab04 bst.c analysis.txt

You can also submit via give's web interface. You can submit multiple times. Only your last submission will be marked. You can check the files you have submitted here.

Assessment

Most of the marks for this lab will come from automarking. To receive the rest of the marks, you must show your work to your tutor during your Week 4, 5 or 7 lab session. You will be marked based on the following criteria:

Aspect Marks Criteria
Code correctness 4 Your code for each task will be automarked, with each task worth 0.8 marks. Automarking will be run after submissions have closed. After automarking is run you will be able to view your results here.
Complexity analysis 1 This mark is based on how accurate you were with your time complexity analysis and the quality of your explanations in analysis.txt. Analysis of each function is worth 0.25 marks.