Week 08 Laboratory Exercises
Objectives
- learning how to access files
- learning how to work with binary data
- learning how to use lseek
Preparation
Before the lab you should re-read the relevant lecture slides and their accompanying examples.
Getting Started
lab08
and changing to this directory.
mkdir lab08 cd lab08
There are some provided files for this lab which you can fetch with this command:
1521 fetch lab08
If you're not working at CSE, you can download the provided files as a zip file or a tar file.
Exercise — individual:
Create a File of Integers
Write a C program, create_integers_file
,
which takes 3 arguments:
- a filename,
- the beginning of a range of integers, and
- the end of a range of integers;
and which creates a file of this name containing the specified integers. You can assume the range will always be strictly ascending. For example:
./create_integers_file fortytwo.txt 40 42 cat fortytwo.txt 40 41 42 ./create_integers_file a.txt 1 5 cat a.txt 1 2 3 4 5 ./create_integers_file 1000.txt 1 1000 wc 1000.txt 1000 1000 3893 1000.txt
Your program should print a suitable error message if given the wrong number of arguments, or if the file can not be created.
When you think your program is working,
you can use autotest
to run some simple automated tests:
1521 autotest create_integers_file
When you are finished working on this exercise,
you must
submit your work by running give
:
give cs1521 lab08_create_integers_file create_integers_file.c
You must run give
before Monday 04 November 12:00 (midday) (2024-11-04 12:00:00)
to obtain the marks for this lab exercise.
Note that this is an individual exercise,
the work you submit with give
must be entirely your own.
Exercise — individual:
Print the Bytes of A File
Write a C program, print_bytes
,
which takes one argument, a filename,
and which should read the specified file
and print one line for each byte of the file.
The line should show the byte in decimal and hexadecimal.
If that byte is an ASCII printable character,
its ASCII value should also be printed.
Assume ASCII printable characters are those for which
the ctype.h
function isprint(3) returns
a non-zero value.
Follow the format in this example exactly:
echo "Hello Andrew!" >hello.txt ./print_bytes hello.txt byte 0: 72 0x48 'H' byte 1: 101 0x65 'e' byte 2: 108 0x6c 'l' byte 3: 108 0x6c 'l' byte 4: 111 0x6f 'o' byte 5: 32 0x20 ' ' byte 6: 65 0x41 'A' byte 7: 110 0x6e 'n' byte 8: 100 0x64 'd' byte 9: 114 0x72 'r' byte 10: 101 0x65 'e' byte 11: 119 0x77 'w' byte 12: 33 0x21 '!' byte 13: 10 0x0a
When you think your program is working,
you can use autotest
to run some simple automated tests:
1521 autotest print_bytes
When you are finished working on this exercise,
you must
submit your work by running give
:
give cs1521 lab08_print_bytes print_bytes.c
You must run give
before Monday 04 November 12:00 (midday) (2024-11-04 12:00:00)
to obtain the marks for this lab exercise.
Note that this is an individual exercise,
the work you submit with give
must be entirely your own.
Exercise — individual:
Create a Binary File
Write a C program, create_binary_file
,
which takes at least one argument: a filename,
and subsequently, integers in the range 0…255 inclusive
specifying byte values.
It should create a file of the specified name,
containing the specified bytes.
For example:
./create_binary_file hello.txt 72 101 108 108 111 33 10 cat hello.txt Hello! ./create_binary_file count.binary 1 2 3 251 252 253 254 255 ./print_bytes count.binary byte 0: 1 0x01 byte 1: 2 0x02 byte 2: 3 0x03 byte 3: 251 0xfb byte 4: 252 0xfc byte 5: 253 0xfd byte 6: 254 0xfe byte 7: 255 0xff ./create_binary_file 4_bytes.binary 222 173 190 239 ./print_bytes 4_bytes.binary byte 0: 222 0xde byte 1: 173 0xad byte 2: 190 0xbe byte 3: 239 0xef
Your program should print a suitable error message if given the wrong number of arguments, or if the file can not be created.
When you think your program is working,
you can use autotest
to run some simple automated tests:
1521 autotest create_binary_file
When you are finished working on this exercise,
you must
submit your work by running give
:
give cs1521 lab08_create_binary_file create_binary_file.c
You must run give
before Monday 04 November 12:00 (midday) (2024-11-04 12:00:00)
to obtain the marks for this lab exercise.
Note that this is an individual exercise,
the work you submit with give
must be entirely your own.
Exercise — individual:
Read a file of little-endian integers
Write a C program, read_lit_file
,
which takes one argument: a filename,
specifying the path to a file using the LIT
file format.
A LIT
, or Little-endian Integer file is
a format for storing a sequence of integers in a file which we have defined.
The format for a LIT
file starts off with a header. The header
format is as follows:
name | length | type | description |
---|---|---|---|
magic number | 3 B | characters sequence | The magic number for LIT files, which is the sequence of bytes
0x4C, 0x49, 0x54 (ASCII LIT ).
|
After the header, the file contains a sequence of records. Each record uses the following format:
name | length | type | description |
---|---|---|---|
number of bytes | 1 B | ASCII character | One of the characters '1' , '2' , '3' ,
'4' , '5' , '6' , '7' or
'8' , indicating the number of bytes forming the integer
stored in this record.
|
value | num-bytes | little-endian integer | The integer value stored in this record, using the specified number of bytes. |
Your program should read the file specified by the argument, and print out the integers stored in the file, one per line, in the order they appear in the file.
Your program should print a suitable error message to stderr(3) and then exit(3) with status 1
if:
- the user does not provide exactly one argument
- the file does not exist
- the magic number is not correct, or the file is not long enough to contain a complete header
- the file contains a record with an invalid number of bytes
- the file ends before the end of a record
unzip examples.zip ... extracting ... dcc -o read_lit_file read_lit_file.c xxd examples/one_record.lit 00000000: 4c49 5433 5432 32 LIT3T22 ./read_lit_file examples/one_record.lit 3289684 ./read_lit_file examples/assortment.lit 123 250 4242 4242 424242424242424242 18374686479671623680 71776119061217280 71776119061217280 49 12345678987654321 4242424242424242424 ./read_lit_file examples/completely_empty.lit Failed to read magic ./read_lit_file examples/too_short_1.lit 42 Failed to read record ./read_lit_file examples/invalid_num_bytes_1.lit 42 Invalid record length
When you think your program is working,
you can use autotest
to run some simple automated tests:
1521 autotest read_lit_file
When you are finished working on this exercise,
you must
submit your work by running give
:
give cs1521 lab08_read_lit_file read_lit_file.c
You must run give
before Monday 04 November 12:00 (midday) (2024-11-04 12:00:00)
to obtain the marks for this lab exercise.
Note that this is an individual exercise,
the work you submit with give
must be entirely your own.
Exercise — individual:
Print File Modes
Write a C program, file_modes.c
, which is given one or more pathnames as command line arguments.
It should print one line for each pathnames which gives the permissions of the file or directory.
Follow the output format below.
dcc file_modes.c -o file_modes ls -ld file_modes.c file_modes diary.c diary -rwxr-xr-x 1 z5555555 z5555555 116744 Nov 2 13:00 diary -rw-r--r-- 1 z5555555 z5555555 604 Nov 2 12:58 diary.c -rwxr-xr-x 1 z5555555 z5555555 222672 Nov 2 13:00 file_modes -rw-r--r-- 1 z5555555 z5555555 2934 Nov 2 12:59 file_modes.c ./file_modes file_modes file_modes.c diary diary.c -rwxr-xr-x file_modes -rw-r--r-- file_modes.c -rwxr-xr-x diary -rw-r--r-- diary.c chmod 700 file_modes chmod 640 diary.c chmod 600 file_modes.c ls -ld file_modes.c file_modes diary.c diary -rwxr-xr-x 1 z5555555 z5555555 116744 Nov 2 13:00 diary -rw-r----- 1 z5555555 z5555555 604 Nov 2 12:58 diary.c -rwx------ 1 z5555555 z5555555 222672 Nov 2 13:00 file_modes -rw------- 1 z5555555 z5555555 2934 Nov 2 12:59 file_modes.c ./file_modes file_modes file_modes.c diary diary.c -rwx------ file_modes -rw------- file_modes.c -rwxr-xr-x diary -rw-r----- diary.cThe first character on each line should be '-' for ordinary files and 'd' for directories. For example:
./file_modes /tmp /web/cs1521/index.html drwxrwxrwx /tmp -rw-r--r-- /web/cs1521/index.html
When you think your program is working,
you can use autotest
to run some simple automated tests:
1521 autotest file_modes
When you are finished working on this exercise,
you must
submit your work by running give
:
give cs1521 lab08_file_modes file_modes.c
You must run give
before Monday 04 November 12:00 (midday) (2024-11-04 12:00:00)
to obtain the marks for this lab exercise.
Note that this is an individual exercise,
the work you submit with give
must be entirely your own.
Challenge Exercise — individual:
Extract ASCII from a Binary File
We are distributing programs as binaries, and would like to know what if the C compiler is leaving any confidential information in the binaries as ASCII strings.
Only 95 of 256 byte values correspond to printable ASCII characters, so several byte values in a row corresponding to printable characters probably will occur infrequently in non-ASCII data. There is only a 2% chance that four (independent, uniform) random byte values will correspond to ASCII printable characters.
Write a C program, hidden_strings
,
which takes one argument, a filename;
it should read that file,
and print all sequences of length 4 or longer
of consecutive byte values
corresponding to printable ASCII characters.
In other words,
your program should read through the bytes of the file,
and if it finds 4 bytes in a row containing printable characters,
it should print those bytes,
and any following bytes containing ASCII printable characters.
Print each sequence on a separate line.
Assume ASCII printable characters are those for which
the ctype.h
function isprint(3)
returns a non-zero value.
Do not read the entire file into an array.
Use the create_binary_file
program from the previous exercise
to create simple test data. For example:
dcc hidden_strings.c -o hidden_strings ./create_binary_file test_file 72 101 108 108 111 255 255 65 110 100 114 101 119 ./hidden_strings test_file Hello Andrew
When you think your program is working, try extracting strings from a compiled binary. For example:
cat secret.c #define secret_hash_define 1 // secret comment int secret_global_variable; int main(void) { int secret_local_variable; char *s = "secret string"; } int secret_function_name() { } gcc secret.c -o binary1 gcc secret.c -g -o binary2 gcc secret.c -s -o binary3 ./hidden_strings binary1 /lib64/ld-linux-x86-64.so.2 libc.so.6 __cxa_finalize __libc_start_main GLIBC_2.2.5 ... ./hidden_strings binary1|grep secret secret string secret.c secret_function_name secret_global_variable ./hidden_strings binary2|grep secret secret string secret.c secret.c secret_global_variable secret_function_name secret_local_variable secret.c secret_function_name secret_global_variable ./hidden_strings binary3|grep secret secret string
The above example shows that, by default, gcc(1) leaves function names, global variables names and the filename in the binary.
If you specify the -g
command line option,
variable names are also left in the binary.
This is part of information left for debuggers
such as gdb(1) (which dcc uses).
This information allows debuggers to print the current value of variables.
If you specify the -s
command line option,
all names are stripped from the binary but the string remains.
When you think your program is working,
you can use autotest
to run some simple automated tests:
1521 autotest hidden_strings
When you are finished working on this exercise,
you must
submit your work by running give
:
give cs1521 lab08_hidden_strings hidden_strings.c
You must run give
before Monday 04 November 12:00 (midday) (2024-11-04 12:00:00)
to obtain the marks for this lab exercise.
Note that this is an individual exercise,
the work you submit with give
must be entirely your own.
Challenge Exercise — individual:
Print The Last line of Huge Files
Write a C program, last_line
,
which takes one argument, a filename,
and which should print the last line of that file.
For example:
dcc last_line.c -o last_line echo -e 'hello\ngood bye' >hello_goodbye.txt cat hello_goodbye.txt hello good bye ./last_line hello_goodbye.txt good bye
You program should not assume the last byte of the file is a newline character.
echo -n -e 'hello\ngoodbye' >no_last_newline.txt hello goodbye./last_line no_last_newline.txt goodbye
Your program should handle extremely large files. It should not read the entire file. As this is a challenge exercise, marks will not be awarded for programs which read the entire file.
For example, it should be able to print the last line of a one-terabyte file:
echo -e 'Hello\nGood Bye'|dd status=none seek=1T bs=1 of=/tmp/gigantic_file$$ ls -l /tmp/gigantic_file$$ -rw-r--r-- 1 z5555555 z5555555 1099511627791 Oct 26 17:27 gigantic_file12345 ./last_line /tmp/gigantic_file$$ Good Bye
The gigantic file created above is a sparse file, consisting almost entirely of zero bytes: it uses little actual disk space, but, to be safe, remove it when you finish the exercise.
The commands above create the sparse file in /tmp to avoid it accidentally being backed up or otherwise copied.
Sparse files can create problems if they are accidentally copied by a program which doesn't handle them specially — and most programs don't.
BTW the $$ in the above command is replaced by the shell process id. This is because /tmp is shared so we'd like to use a filename that is (more or less) unique.
When you think your program is working,
you can use autotest
to run some simple automated tests:
1521 autotest last_line
When you are finished working on this exercise,
you must
submit your work by running give
:
give cs1521 lab08_last_line last_line.c
You must run give
before Monday 04 November 12:00 (midday) (2024-11-04 12:00:00)
to obtain the marks for this lab exercise.
Note that this is an individual exercise,
the work you submit with give
must be entirely your own.
Submission
give
.
You can run give
multiple times.
Only your last submission will be marked.
Don't submit any exercises you haven't attempted.
If you are working at home, you may find it more convenient to upload your work via give's web interface.
Remember you have until Week 9 Monday 12:00:00 (midday) to submit your work without receiving a late penalty.
You cannot obtain marks by e-mailing your code to tutors or lecturers.
You check the files you have submitted here.
Automarking will be run by the lecturer several days after the submission deadline,
using test cases different to those autotest
runs for you.
(Hint: do your own testing as well as running autotest
.)
After automarking is run by the lecturer you can view your results here. The resulting mark will also be available via give's web interface.
Lab Marks
When all components of a lab are automarked you should be able to view the the marks via give's web interface or by running this command on a CSE machine:
1521 classrun -sturec