Week 01 Lab Exercise
Sanitizers, Makefiles and C Revision
Objectives
In this lab, you will:
- Get reacquainted with C programming
- Practice programming with arrays and linked lists
- Learn how to read and understand sanitizer error messages
- Learn how to use the
make
command
Admin
- Marks
- 5 (see the Assessment section for more details)
- Demo
- in the Week 1, 2 or 3 lab session
- Submit
- see the Submission section
- Deadline to submit to give
- 12pm Monday of Week 2
- Late penalty
- 0.2% per hour or part thereof deducted from the attained mark, submissions later than 5 days not accepted
Background
Welcome to COMP2521!
The aim of this lab exercise is to introduce you to the tooling used in COMP2521, as well as give you practice programming with arrays and linked lists, as many of the data structures we will be covering are built upon them.
Setting Up
To keep your files manageable, it's worth doing each lab exercise in a separate directory (folder). We suggest creating a subdirectory in your home directory called "COMP2521", "cs2521" or "2521", and then creating subdirectories under that called "lab01", "lab02", etc.
Create a directory for this lab, change into it, and run the following command:
unzip /web/cs2521/24T3/labs/week01/downloads/files.zip
If you're working at home, download files.zip
by clicking on the above link and then unzip the downloaded file.
If you've done the above correctly, you should now have three subdirectories in your lab01 directory:
sanitizers
, which is for Part 1makefiles
, which is for Part 2revision
, which is for Part 3
Part 1 - Sanitizers 💧️👏️
clang
In COMP1511/COMP1911, you used dcc
, a compiler created at CSE which is intended to help novice programmers by providing easy-to-understand explanations.
In COMP2521, since you are no longer a novice programmer, we will be using clang
, a real-world compiler! (Fun fact: dcc
actually uses clang
.)
When you used dcc
, you always compiled your programs like this:
dcc -o prog prog.c
When using clang
, there are three other flags which we always compile with: -Wall
, -Werror
and -g
. -Wall
causes the compiler to warn us about (almost) all possible syntax mistakes in our program, -Werror
causes warnings to be treated like errors (which prevents the program from being compiled), and -g
causes the compiler to generate debug information (such as line numbers and variable names), which can be used by debuggers such as gdb
.
Sanitizers
In this course, we often compile with sanitizers, which are tools that insert extra code into the program to detect various errors at runtime. Each sanitizer detects different kinds of errors. We use the following sanitizers:
- AddressSanitizer (ASan): detects invalid memory accesses
- MemorySanitizer (MSan): detects uses of uninitialized memory
- UndefinedBehaviourSanitizer (UBSan): detects various kinds of undefined behaviour (e.g., integer overflow during addition)
- LeakSanitizer (LSan): detects memory leaks
To compile with sanitizers, we use the -fsanitize=
flag followed by a comma-separated list of sanitizers. For example, to compile a program with AddressSanitizer and UndefinedBehaviourSanitizer, we would use the following command:
clang -Wall -Werror -g -fsanitize=address,undefined -o prog prog.c
While it would be nice if we could compile our program with all the sanitizers at once, not all the sanitizers are compatible with each other. For example, AddressSanitizer is not compatible with MemorySanitizer. Thus, to ensure our programs do not contain any errors detected by either sanitizer, we must compile and run our programs twice: once with AddressSanitizer, and once with MemorySanitizer.
Since the error messages produced by sanitizers are not as friendly or as easily comprehensible, the goal of this activity is to teach you the meaning of common error types and how to locate useful information in error messages.
Example
Sanitizers produce detailed error messages which describe the type of error, the line of code that triggered the error, and more.
Here is an example error message produced by AddressSanitizer (with important parts bolded):
================================================================= ==796003==ERROR: AddressSanitizer: heap-use-after-free on address 0x602000000018 at pc 0x0000004c877a bp 0x7ffc206af420 sp 0x7ffc206af418 READ of size 8 at 0x602000000018 thread T0 #0 0x4c8779 in func /tmp/sanitizer-example/prog.c:49:15 #1 0x4c8359 in main /tmp/sanitizer-example/prog.c:35:9 #2 0x7fa437b12d09 in __libc_start_main csu/../csu/libc-start.c:308:16 #3 0x41e2b9 in _start (/tmp/sanitizer-example/prog+0x41e2b9) 0x602000000018 is located 8 bytes inside of 16-byte region [0x602000000010,0x602000000020) freed by thread T0 here: #0 0x4980ad in free (/tmp/sanitizer-example/prog+0x4980ad) #1 0x4c86cf in func /tmp/sanitizer-example/prog.c:48:2 #2 0x4c8359 in main /tmp/sanitizer-example/prog.c:35:9 #3 0x7fa437b12d09 in __libc_start_main csu/../csu/libc-start.c:308:16 previously allocated by thread T0 here: #0 0x49832d in malloc (/tmp/sanitizer-example/prog+0x49832d) #1 0x4c8066 in main /tmp/sanitizer-example/prog.c:21:20 #2 0x7fa437b12d09 in __libc_start_main csu/../csu/libc-start.c:308:16 SUMMARY: AddressSanitizer: heap-use-after-free /tmp/sanitizer-example/prog.c:49:15 in func Shadow bytes around the buggy address: ... ==796003==ABORTING
Even without the program's source code, there are a number of things we can deduce from this message:
- The type of error is "heap-use-after-free". This means the program freed a block of memory, and then tried to access it afterwards. It is an error to access a memory block after it has been freed, because by freeing it, the program is declaring that it can be reused for other purposes, and it has potentially been re-allocated for another purpose and overwritten with other data already.
- The error occurred when the program tried to read from a memory address.
- The error occurred in the
func
function, on line 49 ofprog.c
. This function was called by themain
function on line 35 ofprog.c
. - The memory block was 16 bytes, and was freed in the
func
function on line 48 ofprog.c
. This function was called by themain
function on line 35 ofprog.c
. - The memory block was allocated by
malloc
, which was called from themain
function on line 21 ofprog.c
.
Tip: The groups of lines starting with # are stack traces. Recall that when a program calls a function, a stack frame is created for it on the stack which remains until the function returns. A stack trace lists all the function calls that were on the stack at a particular point in time while the program was running, as well as the line that the program was up to in each function. Simply put, a stack trace is used to describe where a program was up to.
For example, consider the first stack trace in the above error message. This tells us that when the heap-use-after-free error occurred, the function calls on the stack (from earliest to latest) were:
- _start
- __libc_start_main
- main (line 35)
- func (line 49)
This tells us that the main
function called the func
function on line 35, and then the error occurred on line 49. You can ignore _start and __libc_start_main, as these are not part of the program's source code (these functions/procedures perform some initialisation for the program before the main
function is actually called).
Sanitizers guide
The COMP2521 Sanitisers Guide (written by Franco Reyes, one of our awesome tutors) is a helpful guide that explains how to read error messages, and contains detailed explanations and examples for all common error types.
Your task
The sanitizers/
directory contains several buggy programs. For each of the programs below, your task is to:
- Compile the program as instructed
- Run the program until you get an error message from AddressSanitizer or MemorySanitizer
- With help from the Sanitisers Guide, identify the following pieces of information from the error message:
- The type of error, and what it means
- The name of the function in which the error occurred, and the source file and line number of the instruction that triggered the error
- The series of functions that were on the function call stack when the error occurred
- The meaning of any other stack traces in the error message (these usually have a line of pink text above them)
- Deduce why the error occurred
- Deduce how to fix the program (but don't actually fix it)
- In the file
part-1-answers.txt
(in therevision
directory), under the heading for the program, paste the error message given by the program, briefly explain why the error occurred and briefly describe how it can be fixed. During lab marking, your tutor will ask you about some of the programs, so the notes you write in this file will be helpful.Note: Thepart-1-answers.txt
file is in therevision
directory because when you submit the lab, all the files you submit will need to be in the same directory.
Program 1: listAddStart.c
The purpose of this program is to repeatedly read values and insert them at the beginning of a linked list. Here is the intended behaviour of the program:
./listAddStart Enter values: [] ./listAddStart Enter values: 3 1 4 1 5 [5, 1, 4, 1, 3]
Compile the program with the following command:
clang -Wall -Werror -g -fsanitize=address,leak,undefined -o listAddStart listAddStart.c
Program 2: shuffleArray.c
The purpose of this program is to read values into an array and then shuffle the array. Here is the intended behaviour of the program:
./shuffleArray Enter array size: 5 Enter array values: 3 1 4 1 5 Original array: [3, 1, 4, 1, 5] Shuffled array: [1, 3, 1, 5, 4]
Note that the program randomly shuffles the array, so it is expected to produce different outputs each time it is run.
Compile the program with the following command:
clang -Wall -Werror -g -fsanitize=address,leak,undefined -o shuffleArray shuffleArray.c
scanf_common
in shuffleArray.c
). This suggests that the error occurred in a library function. In this case, you should look further down the function call stack until you find a function that exists in the source code.
Program 3: reverseArray.c
The purpose of this program is to read values into an array and then reverse the array. Here is the intended behaviour of the program:
./reverseArray Enter array size: 5 Enter array values: 3 1 4 1 5 Original array: [3, 1, 4, 1, 5] Reversed array: [5, 1, 4, 1, 3]
Compile the program with the following command:
clang -Wall -Werror -g -fsanitize=address,leak,undefined -o reverseArray reverseArray.c
Program 4: listPrepend.c
The purpose of this program is to repeatedly read values and insert them at the beginning of a linked list. Here is the intended behaviour of the program:
./listPrepend Enter values: [] ./listPrepend Enter values: 3 1 4 1 5 [5, 1, 4, 1, 3]
Compile the program with the following command:
clang -Wall -Werror -g -fsanitize=memory,undefined -fsanitize-memory-track-origins -o listPrepend listPrepend.c
Program 5: listAppend.c
The purpose of this program is to read values and insert them at the end of a list. Here is the intended behaviour of the program:
./listAppend Enter values: [] ./listAppend Enter values: 3 1 4 1 5 [3, 1, 4, 1, 5]
Compile the program with the following command:
clang -Wall -Werror -g -fsanitize=address,leak,undefined -o listAppend listAppend.c
Program 6: listDeleteFirst.c
The purpose of this program is to read values into a list and then delete the first value. Here is the intended behaviour of the program:
./listDeleteFirst Enter values: Original list: [] List after deleting first value: [] ./listDeleteFirst Enter values: 3 1 4 1 5 Original list: [3, 1, 4, 1, 5] List after deleting first value: [1, 4, 1, 5]
Compile the program with the following command:
clang -Wall -Werror -g -fsanitize=address,leak,undefined -o listDeleteFirst listDeleteFirst.c
Program 7: listNumEvens.c
The purpose of this program is to read values into a list and then count the number of even values. Here is the intended behaviour of the program:
./listNumEvens Enter values: Number of even values: 0 ./listNumEvens Enter values: 3 1 4 1 5 Number of even values: 1
This program does not contain any errors that would be detected by AddressSanitizer, but it does contain errors that would be detected by MemorySanitizer. Compile the program with the following command:
clang -Wall -Werror -g -fsanitize=memory,undefined -fsanitize-memory-track-origins -o listNumEvens listNumEvens.c
Program 8: mostFrequentLetter.c
The purpose of this program is to read a line of text and then determine the most frequent letter. Here is the intended behaviour of the program:
./mostFrequentLetter Enter some text: hello world The most frequent letter is: l
Compile the program with the following command:
clang -Wall -Werror -g -fsanitize=memory,undefined -fsanitize-memory-track-origins -o mostFrequentLetter mostFrequentLetter.c
Part 2 - Makefiles 🛠️
make
command. You won't be assessed on your understanding of Makefiles and make
.
If you got tired of typing out the compilation commands in Part 1, you're not alone! In COMP2521, we will usually not type out the compilation commands to compile our programs; instead, we will use Makefiles and the make
command.
A Makefile is a file that specifies the build products of a project, their dependencies (e.g., source code files) and instructions for building them (e.g., compilation commands). If a Makefile exists, the make
command will use the Makefile to determine what needs to be built (or rebuilt), and then run the commands needed to build them.
The main benefit of using Makefiles is that make
uses the modification timestamps of files to decide whether something needs to be rebuilt, and does not rebuild something if its dependencies have not been modified since it was last built. This means, for example, if we have only modified one file since we last ran make
, and then we run make
again, only parts of the project that depend on the file we modified will be rebuilt. This saves a lot of build time in larger projects.
In COMP2521, we will always provide Makefiles for any coding activity, including in labs, assignments and the exam, so you don't need to know how to write a Makefile. However, you should know how to run the make
command.
Let's take a look at a simple Makefile. Here is the Makefile from the makefiles
directory of this lab:
CC = clang CFLAGS = -Wall -Wvla -Werror -g ######################################################################## asan: CFLAGS += -fsanitize=address,leak,undefined asan: all msan: CFLAGS += -fsanitize=memory,undefined -fsanitize-memory-track-origins msan: all nosan: all ######################################################################## all: concatStrings getMax concatStrings: concatStrings.c $(CC) $(CFLAGS) -o concatStrings concatStrings.c getMax: getMax.c $(CC) $(CFLAGS) -o getMax getMax.c ######################################################################## clean: rm -f concatStrings getMax
The important thing to note here are the targets, which are the words that appear before a colon. The targets in this Makefile are: asan
, msan
, nosan
, all
, concatStrings
, getMax
and clean
. Note that some targets refer to actual files that will be produced (concatStrings
and getMax
), while the other targets don't refer to actual files. When we run make
, we can either specify one or more targets to build, or not specify any targets, in which case make
will simply build the first target in the Makefile.
The dependencies of a target are listed after the colon on the same line. For example, the asan
target has one dependency, which is the all
target. Meanwhile, the all
target has two dependencies: concatStrings
and getMax
.
It's time to try out the make
command! Change to the makefiles/
directory, then run make
. You should get the following output:
make clang -Wall -Wvla -Werror -g -fsanitize=address,leak,undefined -o concatStrings concatStrings.c clang -Wall -Wvla -Werror -g -fsanitize=address,leak,undefined -o getMax getMax.c
You may be wondering why make
compiled concatStrings
and getMax
when the first target in the Makefile is asan
. That is because concatStrings
and getMax
are indirect dependencies of the asan
target - the asan
target depends on the all
target, which depends on concatStrings
and getMax
.
Let's see what happens if we try to run make
again. You should get the following output:
make make: Nothing to be done for 'asan'.
Since the dependencies of asan
have not been modified since we last ran make
, make
deduces that nothing needs to be rebuilt, and thus outputs "Nothing to be done".
As we mentioned before, in Makefiles, you can create targets which don't actually refer to real files. This can be useful because (1) you may want to automate tasks that don't involve any compilation, but are still related to the compilation process or (2) you may want to compile the same program in different ways. For example, in all of our Makefiles, we will provide a target called "clean" which removes all build products from the directory. To use it, specify the clean
target. You should get the following output:
make clean rm -f concatStrings getMax
We will also include a few targets to let you specify what sanitizers you want to compile with, as not all sanitizers are compatible with each other. Since AddressSanitizer is the most useful sanitizer, this is what make
will compile with if you run it without any arguments (which you've seen above). However, you can explicitly state that you want to compile with AddressSanitizer like so:
make asan
If you want to compile with MemorySanitizer, then you must specify the msan
target, like so:
make msan
If you wish to compile without any sanitizers, which you should only need to do if you want to use valgrind
, then you must specify the nosan
target like so:
make nosan
Note that if you build each of these targets one after the other without modifying any files in between, make
will not rebuild the programs and instead output "Nothing to be done for [target]". This is because even though asan
, msan
and nosan
are different targets, they all have the same base dependencies (concatStrings.c
and getMax.c
), which have not been modified. If you want to compile a program with AddressSanitizer, and then compile it again with MemorySanitizer, you will need to run make clean
in between.
Your task
The makefiles/
directory contains two buggy programs. One of these programs contains an error that will be caught by AddressSanitizer, while the other program contains an error that will be caught by MemorySanitizer. Follow these steps:
- Use
make
to compile the programs with AddressSanitizer. - Run each of the programs. Observe that one of the programs produces an error, while the other exits normally.
- In the file
part-2-answers.txt
(in therevision
directory), paste the error message under the AddressSanitizer heading.
- In the file
- Now use
make
to compile the programs with MemorySanitizer. - Run each of the programs again. Observe that the program that produced an error in step 2 now exits normally, while the other program now produces an error.
- In the file
part-2-answers.txt
(in therevision
directory), paste the error message under the MemorySanitizer heading.
- In the file
Part 3 - Revision Exercises 💪️
The remainder of this lab exercise consists of some array and linked list exercises to give you some practice programming with them again. These exercises are in the revision/
directory.
Task 1 - Shifting an Array
Your task is to implement the following function in arrayShift.c
:
void shift(int arr[], int size, int n);
This function shifts an array of the given size to the right \( n \) times. When an array is shifted to the right, each element is moved one index to the right (i.e., if an element was at index \(i\), it will now be at index \(i + 1\)), and the element at the end of the array (at index \(N - 1\) where \(N\) is the size of the array) is moved to the start (index \(0\)).
\(n\) can be assumed to be non-negative.
Testing
arrayShift.c
contains a main function which allows you to test your shift
function. The main function:
- Reads the size of the array
- Reads the values of the array
- Displays the array
- Reads the value of \(n\)
- Calls
shift
- Displays the updated array
Here are some examples of its usage:
make ... ./arrayShift Enter array size: 3 Enter array values: 16 5 2 Array: [16, 5, 2] Enter shift: 1 Array after shifting 1 time(s): [2, 16, 5] ./arrayShift Enter array size: 5 Enter array values: 16 5 2 -3 4 Array: [16, 5, 2, -3, 4] Enter shift: 3 Array after shifting 3 time(s): [2, -3, 4, 16, 5] ./arrayShift Enter array size: 5 Enter array values: 16 5 2 -3 4 Array: [16, 5, 2, -3, 4] Enter shift: 10 Array after shifting 10 time(s): [16, 5, 2, -3, 4]
Task 2 - Shifting a Linked List
Your task is to implement the following function in listShift.c
:
struct node *shift(struct node *list, int n);
This function shifts a linked list to the right \( n \) times and returns a pointer to the start of the updated list. When a linked list is shifted to the right, the node at the end of the list is moved to the start.
\(n\) can be assumed to be non-negative.
Constraints
- You must not use
malloc
or any of its variants - You must not modify the integer values in any nodes
Testing
listShift.c
contains a main function which allows you to test your shift
function. The main function:
- Reads the size of the list
- Reads the values of the list
- Displays the list
- Reads the value of \(n\)
- Calls
shift
- Displays the updated list
Here are some examples of its usage:
make ... ./listShift Enter list size: 3 Enter list values: 16 5 2 List: [16, 5, 2] Enter shift: 1 List after shifting 1 time(s): [2, 16, 5] ./listShift Enter list size: 5 Enter list values: 16 5 2 -3 4 List: [16, 5, 2, -3, 4] Enter shift: 3 List after shifting 3 time(s): [2, -3, 4, 16, 5] ./listShift Enter list size: 5 Enter list values: 16 5 2 -3 4 List: [16, 5, 2, -3, 4] Enter shift: 10 List after shifting 10 time(s): [16, 5, 2, -3, 4]
Task 3 - Inserting into an Ordered Array
Your task is to implement the following function in arrayInsertOrdered.c
:
void insertOrdered(int arr[], int size, int value);
This function takes an array which is sorted in increasing order, and inserts the given value into the array such that the array remains ordered.
The function can assume that it can store an element at index size
. For example, suppose that the inputs are:
Then the value should be inserted like so:
Testing
arrayInsertOrdered.c
contains a main function which allows you to test your insertOrdered
function. The main function:
- Reads the size of the array
- Reads the values of the array
- Displays the array
- Reads the value to be inserted
- Calls
insertOrdered
- Displays the updated array
Here are some examples of its usage:
make ... ./arrayInsertOrdered Enter array size: 4 Enter array values (must be in ascending order): 2 5 7 8 Array: [2, 5, 7, 8] Enter value to insert: 6 Array after inserting 6: [2, 5, 6, 7, 8] ./arrayInsertOrdered Enter array size: 5 Enter array values (must be in ascending order): -8 -3 -1 2 4 Array: [-8, -3, -1, 2, 4] Enter value to insert: -9 Array after inserting -9: [-9, -8, -3, -1, 2, 4] ./arrayInsertOrdered Enter array size: 6 Enter array values (must be in ascending order): -5 0 2 6 9 14 Array: [-5, 0, 2, 6, 9, 14] Enter value to insert: 2 Array after inserting 2: [-5, 0, 2, 2, 6, 9, 14]
Task 4 - Inserting into an Ordered Linked List
Your task is to implement the following function in listInsertOrdered.c
:
struct node *insertOrdered(struct node *list, int value);
This function takes a linked list which is sorted in increasing order, inserts the given value into the list such that the list remains ordered, and then returns a pointer to the start of the updated list.
Testing
listInsertOrdered.c
contains a main function which allows you to test your insertOrdered
function. The main function:
- Reads the size of the list
- Reads the values of the list
- Displays the list
- Reads the value to be inserted
- Calls
insertOrdered
- Displays the updated list
Here are some examples of its usage:
make ... ./listInsertOrdered Enter list size: 4 Enter list values (must be in ascending order): 2 5 7 8 List: [2, 5, 7, 8] Enter value to insert: 6 List after inserting 6: [2, 5, 6, 7, 8] ./listInsertOrdered Enter list size: 5 Enter list values (must be in ascending order): -8 -3 -1 2 4 List: [-8, -3, -1, 2, 4] Enter value to insert: -9 List after inserting -9: [-9, -8, -3, -1, 2, 4] ./listInsertOrdered Enter list size: 6 Enter list values (must be in ascending order): -5 0 2 6 9 14 List: [-5, 0, 2, 6, 9, 14] Enter value to insert: 2 List after inserting 2: [-5, 0, 2, 2, 6, 9, 14]
Optional Challenge Task
Note: This task is optional. It is not marked.
In tasks 1 and 2, implement the shift
function without using any nested loops (non-nested loops are fine) and without using malloc
, calloc
or any temporary arrays.
Submission
Submit via the command line using the give
command (make sure you are in the revision
directory first):
give cs2521 lab01
You can also submit via give's web interface. You can submit multiple times. Only your last submission will be marked. You can check the files you have submitted here.
Assessment
Parts 1 and 2 will be handmarked, and Part 3 will be automarked. Please note that handmarking must be completed in Weeks 1, 2 or 3; you cannot ask your tutor to handmark your lab01 work in Week 4 or later (they won't be able to enter your mark). The marking criteria are as follows:
Part | Marks | Criteria |
---|---|---|
Part 1 - Sanitizers | 2 | Your tutor will pick two programs from the list of programs in Part 1. For each program, they will ask you to run the program until AddressSanitizer or MemorySanitizer produces an error message, and then ask you identify the following:
|
Part 2 - Makefiles | 1 | Your tutor will ask you to use the make command to compile the programs in the makefiles directory, once with AddressSanitizer, and once with MemorySanitizer. They will also check your answers in part-2-answers.txt . |
Part 3 - Revision Exercises | 2 | 0.5 marks per task. Automarking will be run after submissions have closed. After automarking is run you will be able to view your results here. |