Assignment 2: Standard Input/Output Library

version: 1.0.0 last updated: 2026-02-28 20:00:00

Aims

  • building a concrete understanding of file system objects;
  • understanding the file operation abstractions provided by the C standard library;
  • practising C, including bitwise operations and robust error handling.

The Task

So far you have used stdio.h, part of the C standard library, to implement user programs. In this assignment, you will be writing your own implementation of stdio.h.

A large part of your task will be defining high level functions like fgets, fwrite, etc. in terms of low level calls to read and write.

Below, the specification for this assignment is structured with the following sections:

Overview
What you need to implement. Where relevant, changes that might need to be made to your existing implementation.
Behaviour
The behaviour of the relevant function.
Errors
Error cases you must handle.
Undefined behaviour
Cases which are undefined, and will not be tested. You may assume that any parameters that would be classified as undefined behaviour will never be passed to your function.
Differences to standard implementation
Differences to the "real" implementation of stdio. Note that for almost all functions, there are none. You are writing a mostly complete implementation!
References
Manual pages which will provide additional information about the behaviour of functions (both the functions you are implementing, and the underlying syscall wrappers that you are calling). Any manual page in section (2) refers to a syscall wrapper. Wherever a manual page conflicts with this specification, take the specification as correct.

We have provided an implementation of cs1521_printf for debugging your code. It might interest you to look at the implementation - the only thing which is not in the scope of COMP1521 is the use of va_args, which walks the argument registers and the stack to obtain function arguments when more arguments are needed for the format string.

Getting Started

Create a new directory for this assignment, change to this directory, and fetch the provided code by running

mkdir -m 700 stdio
cd stdio
1521 fetch stdio

If you're not working at CSE, you can download the provided files as a zip file or a tar file.

This will give you the following files:

cs1521_stdio.c
is the only file you need to change: it contains stubbed definitions of library functions to which you need to add code to complete the assignment. You can also add your own functions to this file.
cs1521_stdio.h
contains function declarations and constant definitions. Do not change this file.
cs1521_stdio.mk
contains a Makefile fragment for compiling test cases and examples.

You should run 1521 stdio-examples to get a directory called examples/ full of test files and example files to test your program against.

1521 stdio-examples

You can run make(1) to compile any of the examples, and then run the resulting program.

make examples/subset0/fclose_fileno_errors
dcc		 -o examples/subset0/fclose_fileno_errors examples/subset0/fclose_fileno_errors.c cs1521_stdio.c
./examples/subset0/fclose_fileno_errors
failed to fopen("examples/data/mips.txt", "r")

You can compile all of the examples at once by invoking:

make
dcc		 -o examples/subset0/a_mode_works examples/subset0/a_mode_works.c cs1521_stdio.c
dcc		 -o examples/subset0/fclose_fileno_errors examples/subset0/fclose_fileno_errors.c cs1521_stdio.c
dcc		 -o examples/subset0/fclose_monitored examples/subset0/fclose_monitored.c cs1521_stdio.c
...

You can also compile all of the examples for a specific subset by invoking:

make subset1
dcc		 -o examples/subset1/fgets_eof examples/subset1/fgets_eof.c cs1521_stdio.c
dcc		 -o examples/subset1/fgets_errors examples/subset1/fgets_errors.c cs1521_stdio.c
dcc		 -o examples/subset1/fgets_full examples/subset1/fgets_full.c cs1521_stdio.c
...

You may optionally create extra .c or .h files. You can modify the provided Makefile fragment if you choose to do so.

Subset 0

To complete subset 0, you need to implement the following functions and data types:

FILE (struct file)
a struct that wraps around an underlying file descriptor.
fopen
performs an underlying open() call and returns a FILE* wrapping the obtained file descriptor.
fclose
performs an underlying close() call and frees any resources associated with the FILE*.
fileno
retrieves the underlying file descriptor for a FILE*.
fgetc
performs an underlying read() call to read one byte from a file.
fputc
performs an underlying write() call to write one byte into a file.

To test your implementations, you can use 1521 autotest stdio subset0.


FILE

Overview

You should implement struct file, which is currently defined as an empty struct in cs1521_stdio.c.

You should only need to include a field for the file descriptor. However, later in the assignment, you will likely need to add additional fields.


fopen

Overview
You should implement cs1521_FILE *cs1521_fopen(char *pathname, char *mode), which is currently stubbed in cs1521_stdio.c.
Behaviour
  • fopen should attempt to open pathname with the mode specified by mode, using an underlying call to open.
  • If fopen calls open with the O_CREAT flag, permissions for the new file must be provided to open as a secret third argument. You should use the mode 0644 in this case.
  • fopen should allocate memory for a FILE struct using malloc, and return a pointer to the new FILE.
  • The new FILE should record, at a minimum, the file descriptor returned by open.
The mode argument should be one of the following strings:
  • "r"
  • "w"
  • "a"
  • "r+"
  • "w+"
  • "a+"
The meaning of each string should be as specified by the fopen(3) manual page.
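
As a rough illustration of the mode strings above, the following sketch maps each one to the open(2) flags described in fopen(3). The helper name mode_to_flags is hypothetical and not part of the provided code; it is one possible way to structure the check, not the required one.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>

// Hypothetical helper: translate an fopen-style mode string into open(2) flags.
// Returns the flags on success, or -1 with errno set to EINVAL for an unknown mode.
int mode_to_flags(const char *mode) {
    if (strcmp(mode, "r") == 0)  return O_RDONLY;
    if (strcmp(mode, "w") == 0)  return O_WRONLY | O_CREAT | O_TRUNC;
    if (strcmp(mode, "a") == 0)  return O_WRONLY | O_CREAT | O_APPEND;
    if (strcmp(mode, "r+") == 0) return O_RDWR;
    if (strcmp(mode, "w+") == 0) return O_RDWR | O_CREAT | O_TRUNC;
    if (strcmp(mode, "a+") == 0) return O_RDWR | O_CREAT | O_APPEND;
    errno = EINVAL;  // not one of the six supported strings
    return -1;
}
```

A caller could then pass the result to open together with the permissions, for example open(pathname, flags, 0644).
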
Errors
  • If the underlying call to open fails, then fopen should return NULL. errno will have already been set by open.
  • If the provided mode is not one of the six strings listed above, you should set errno to EINVAL and return NULL.
  • If a call to malloc fails, you should return NULL. errno will have already been set by malloc.
Undefined Behaviour
  • pathname is not a valid nul-terminated string.
  • mode is not a valid nul-terminated string.
Differences to standard implementation
  • Our implementation does not handle 'b' or 'e' modes.
  • A bad mode is typically undefined behaviour, but we define it as an error case.
References
The following manual pages will be useful:
  • fopen(3)
  • open(2)
  • errno(3)
  • malloc(3)

fclose

Overview
You should implement int cs1521_fclose(cs1521_FILE *stream), which is currently stubbed inside cs1521_stdio.c.
Behaviour
  • fclose should close the underlying file descriptor referred to by stream (with an underlying call to close) and free any associated resources (e.g. free any memory allocated with malloc).
  • On success, fclose should return 0.
Errors
  • If the underlying call to close fails, then fclose should return -1. errno will have already been set by close.
Undefined behaviour
  • stream is not a pointer to a valid FILE.
  • fclose has already been called on stream.
Differences to standard implementation
  • No notable differences.
References
The following manual pages will be useful:
  • fclose(3)
  • close(2)
  • free(3)

fileno

Overview
You should implement int cs1521_fileno(cs1521_FILE *stream), which is currently stubbed in cs1521_stdio.c.
Behaviour
  • fileno should return the underlying file descriptor for stream.
Errors
  • There are no errors.
Undefined behaviour
  • stream is not a pointer to a valid FILE.
Differences to standard implementation
  • No notable differences.
References
The following manual pages will be useful:
  • fileno(3)

fgetc

Overview
You should implement int cs1521_fgetc(cs1521_FILE *stream), which is currently stubbed in cs1521_stdio.c.
Behaviour
  • fgetc should read one byte from the file referenced by stream with an underlying call to read.
  • On success, that byte should be returned.
  • On failure (due to reaching end of file or due to an error encountered by read), cs1521_EOF should be returned instead.
Errors
  • If the underlying call to read fails, then fgetc should return cs1521_EOF. errno will have already been set by read.
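
A minimal sketch of the single-byte read described above, working directly on a file descriptor. The names read_one_byte and DEMO_EOF are hypothetical; DEMO_EOF stands in for cs1521_EOF, whose actual value is defined in cs1521_stdio.h. Reading into an unsigned char means a byte such as 0xFF comes back as 255 rather than as a negative number that could be confused with the EOF value.

```c
#include <unistd.h>

#define DEMO_EOF (-1)  // assumption: stand-in for cs1521_EOF

// Read one byte from fd. Returns the byte as a value in 0..255, or DEMO_EOF on
// end of file or error (errno is left as set by read).
int read_one_byte(int fd) {
    unsigned char byte;
    ssize_t n = read(fd, &byte, 1);
    if (n != 1) {
        return DEMO_EOF;
    }
    return byte;  // unsigned char promotes to a non-negative int
}
```
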
Undefined behaviour
  • stream is not a pointer to a valid FILE.
Differences to standard implementation
  • No notable differences.
References
The following manual pages will be useful:
  • read(2)
  • fgetc(3)

fputc

Overview
You should implement int cs1521_fputc(int c, cs1521_FILE *stream).
Behaviour
  • fputc should write the least significant byte of c into the file referenced by stream with an underlying call to write.
  • On success, fputc should return the value of the byte written.
  • On failure (due to an error encountered by write), cs1521_EOF should be returned instead.
Errors
  • If the underlying call to write fails, then fputc should return cs1521_EOF. errno will have already been set by write.
Undefined behaviour
  • stream is not a pointer to a valid FILE.
Differences to standard implementation
  • No notable differences.
References
The following manual pages will be useful:
  • write(2)
  • fputc(3)

Subset 1

To complete subset 1, you need to implement the following functions and global variables:

stdin
a global FILE* wrapping the file descriptor 0
stdout
a global FILE* wrapping the file descriptor 1
stderr
a global FILE* wrapping the file descriptor 2
fread
performs one or more underlying read() calls to read multiple bytes from a file.
fwrite
performs an underlying write() call to write multiple bytes into a file.
fgets
performs one or more underlying read() calls to read one line from a file.
fputs
performs an underlying write() call to write a string to a file.

To test your implementations, you can use 1521 autotest stdio subset1.


stdin

Overview

You should create a global struct file initialised to contain the file descriptor 0.

cs1521_FILE *cs1521_stdin should be a separate global variable which points to the struct file.

stdin should behave as if it was opened with the mode "r".


stdout

Overview

You should create a global struct file initialised to contain the file descriptor 1.

cs1521_FILE *cs1521_stdout should be a separate global variable which points to the struct file.

stdout should behave as if it was opened with the mode "w".


stderr

Overview

You should create a global struct file initialised to contain the file descriptor 2.

cs1521_FILE *cs1521_stderr should be a separate global variable which points to the struct file.

stderr should behave as if it was opened with the mode "w".


fread

Overview
You should implement size_t cs1521_fread(void *ptr, size_t size, size_t nitems, cs1521_FILE *stream), which is currently stubbed in cs1521_stdio.c.
Behaviour
  • fread should attempt to read size * nitems bytes from the file referenced by stream into the buffer at ptr.
  • fread should return the number of items (not the number of bytes) successfully and fully read.
  • If the underlying call to read partially fulfills the request, then fread should continue to repeatedly call read until either:
    • The request is fully fulfilled;
    • EOF is reached (read returns 0); or
    • read fails (read returns -1)
  • Clarification: if any call to read fails with error, 0 is returned (even if valid data has been read prior). Note: we won't be testing this behaviour anyway.
Errors
  • If an underlying call to read fails, then fread should return 0. errno will have already been set by read.
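
The retry loop described above could look roughly like the sketch below, which works on a raw file descriptor rather than a FILE*. The name read_items is hypothetical, and the sketch assumes size * nitems does not overflow.

```c
#include <stddef.h>
#include <unistd.h>

// Hypothetical helper: keep calling read until size * nitems bytes have arrived,
// end of file is reached, or read fails. Returns the number of complete items.
size_t read_items(int fd, void *ptr, size_t size, size_t nitems) {
    if (size == 0 || nitems == 0) {
        return 0;
    }
    size_t wanted = size * nitems;  // assumption: the product does not overflow
    size_t got = 0;
    char *dst = ptr;
    while (got < wanted) {
        ssize_t n = read(fd, dst + got, wanted - got);
        if (n < 0) {
            return 0;               // read failed: the specification asks for 0
        }
        if (n == 0) {
            break;                  // end of file
        }
        got += (size_t)n;
    }
    return got / size;              // a partial trailing item is not counted
}
```

For instance, if only 10 bytes are available and the caller asks for 3 items of 4 bytes each, the result is 2, because the last 2 bytes do not make up a full item.
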
Undefined behaviour
  • ptr is not a pointer to a valid buffer which is large enough to store size * nitems bytes.
  • stream is not a pointer to a valid FILE.
Differences to standard implementation
  • No notable differences.
References
The following manual pages will be useful:
  • read(2)
  • fread(3)

fwrite

Overview
You should implement size_t cs1521_fwrite(void *ptr, size_t size, size_t nitems, cs1521_FILE *stream), which is currently stubbed in cs1521_stdio.c.
Behaviour
  • fwrite should attempt to write size * nitems bytes into the file referenced by stream from the buffer at ptr.
  • fwrite should return the number of items (not the number of bytes) successfully written.
  • Unlike with fread, repeatedly calling write is not required, and won't be tested for.
Errors
  • If the underlying call to write fails, then fwrite should return 0. errno will have already been set by write.
Undefined behaviour
  • ptr is not a pointer to a valid buffer which contains at least size * nitems bytes.
  • stream is not a pointer to a valid FILE.
Differences to standard implementation
  • The standard implementation of fwrite will repeatedly call write if a partial success occurs. Our implementation does not do this.
References
The following manual pages will be useful:
  • write(2)
  • fwrite(3)

fgets

Overview
You should implement char *cs1521_fgets(char *ptr, size_t size, cs1521_FILE *stream), which is currently stubbed in cs1521_stdio.c.
Behaviour
  • fgets should read at most size - 1 bytes from the file referenced by stream into the buffer at ptr.
  • If fgets encounters a newline byte '\n' or the end of the file before reading all size - 1 bytes, reading stops immediately. If reading stopped due to a newline byte, then that newline byte is written into the buffer.
  • After reading into the buffer, fgets adds a null terminator character '\0' immediately after the last byte read into the buffer.
  • If at least one byte was successfully read (and no errors occurred), then fgets returns ptr. Otherwise, NULL is returned.
Errors
  • If size is 0 then fgets should set errno to EINVAL and return NULL.
  • If an underlying call to read fails, then fgets should return NULL. errno will have already been set by read.
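
The rules above can be sketched as a loop that reads one byte at a time from a file descriptor, which is the simplest unbuffered approach. The name read_line is hypothetical, and the sketch ignores the eof and error indicators introduced in subset 2.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

// Hypothetical helper: read one line from fd into ptr, one byte per read call.
// Returns ptr if at least one byte was read, or NULL otherwise.
char *read_line(int fd, char *ptr, size_t size) {
    if (size == 0) {
        errno = EINVAL;
        return NULL;
    }
    size_t len = 0;
    while (len < size - 1) {
        char c;
        ssize_t n = read(fd, &c, 1);
        if (n < 0) {
            return NULL;   // read failed; errno was set by read
        }
        if (n == 0) {
            break;         // end of file
        }
        ptr[len++] = c;
        if (c == '\n') {
            break;         // the newline is kept in the buffer
        }
    }
    ptr[len] = '\0';
    return len > 0 ? ptr : NULL;
}
```
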
Undefined Behaviour
  • ptr is not a pointer to a valid buffer which is large enough to store size bytes.
  • stream is not a pointer to a valid FILE.
Differences to standard implementation
  • Typically a zero size is undefined behaviour. We instead define it as an explicit error case.
References
The following manual pages will be useful:
  • read(2)
  • fgets(3)

fputs

Overview
You should implement int cs1521_fputs(char *s, cs1521_FILE *stream), which is currently stubbed in cs1521_stdio.c.
Behaviour
  • fputs should write the array of characters at s to the file identified by stream.
  • The terminating null byte should not be written.
  • fputs should return 0 on success, and cs1521_EOF on failure.
  • If write only partially succeeds, it should be treated as a failure.
Errors
  • If an underlying call to write fails, then fputs should return cs1521_EOF. errno will have already been set by write.
Undefined Behaviour
  • s is not a pointer to a valid nul-terminated string.
  • stream is not a pointer to a valid FILE.
Differences to standard implementation
  • No notable differences
References
The following manual pages will be useful:
  • fputs(3)
  • write(2)

Subset 2

To complete subset 2, you need to implement the following functions and update previous subsets to implement the eof and error indicators:

fseek
performs an underlying lseek() call to reposition a FILE*.
ftell
performs an underlying lseek() call to obtain the current offset of a FILE*.
perror
perform an underlying write() call to print an error message to stderr based on the current value of errno.
feof
check if the eof indicator is set for a FILE*.
ferror
check if the error indicator is set for a FILE*.
clearerr
clear the error and eof indicators for a FILE*.

To test your implementations, you can use 1521 autotest stdio subset2.


fseek

Overview
You should implement int cs1521_fseek(cs1521_FILE *stream, long offset, int whence), which is currently stubbed in cs1521_stdio.c.
Behaviour
  • fseek should change the offset of the file referenced by stream.
  • whence determines the relative position in the file to which we add offset, as per the manual page for fseek(3)
  • A call to fseek which causes the offset to be beyond the end of the file is valid and should not cause an error.
  • On success, fseek should return 0.
Errors
  • If stream refers to a pipe, fifo, or socket, then fseek should set errno to ESPIPE and return -1.
  • If whence is not one of SEEK_SET, SEEK_CUR or SEEK_END, then fseek should set errno to EINVAL and return -1.
  • If the new offset would be negative, then fseek should set errno to EINVAL and return -1.
  • If an underlying call to lseek fails, then fseek should return -1. errno will have already been set by lseek.
Undefined Behaviour
  • stream is not a pointer to a valid FILE.
Differences to standard implementation
  • No notable differences
References
The following manual pages will be useful:
  • fseek(3)
  • lseek(2)
  • stat(3)
  • inode(7)

ftell

Overview
You should implement long cs1521_ftell(cs1521_FILE *stream), which is currently stubbed in cs1521_stdio.c.
Behaviour
  • ftell should return the current offset of stream.
Errors
  • If an underlying call to lseek fails, then ftell should return -1. errno will have already been set by lseek.
Undefined Behaviour
  • stream is not a pointer to a valid FILE.
Differences to standard implementation
  • No notable differences
References
The following manual pages will be useful:
  • ftell(3)
  • lseek(2)

perror

Overview
You should implement void cs1521_perror(char *msg), which is currently stubbed in cs1521_stdio.c.
Behaviour
  • perror should obtain the system error message for the current value of errno and print it to the standard error stream.
  • If msg is NULL or the empty string, the system error message should be output, followed by a newline byte.
  • Otherwise, the output should instead be msg, then exactly ": ", then the system error message, and finally a newline byte.
  • perror may make only one underlying call to write (calling fputs is okay, as long as the implementation of fputs uses only one call to write.)
  • Students attempting subset 4 may assume that perror is not called when the buffering behaviour of stderr has been altered - this case will not be tested.
Errors
  • If the underlying call to write (or fputs) fails, then perror should silently fail.
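
One way to satisfy the single-write requirement is to assemble the whole line in a local buffer first. The sketch below assumes a fixed 512-byte buffer is large enough; the names append_text, build_perror_line and demo_perror are hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

// Append s to out (which already holds len bytes), truncating at cap bytes.
static size_t append_text(char *out, size_t cap, size_t len, const char *s) {
    size_t n = strlen(s);
    if (n > cap - len) {
        n = cap - len;
    }
    memcpy(out + len, s, n);
    return len + n;
}

// Build "msg: error text\n" (or just "error text\n") in out.
// Returns the number of bytes stored; out is not nul-terminated.
size_t build_perror_line(char *out, size_t cap, const char *msg, int errnum) {
    size_t len = 0;
    if (msg != NULL && msg[0] != '\0') {
        len = append_text(out, cap, len, msg);
        len = append_text(out, cap, len, ": ");
    }
    len = append_text(out, cap, len, strerror(errnum));
    len = append_text(out, cap, len, "\n");
    return len;
}

// Emit the line with exactly one write to file descriptor 2.
void demo_perror(const char *msg) {
    char line[512];  // assumption: long enough for msg plus the error text
    size_t n = build_perror_line(line, sizeof line, msg, errno);
    if (write(STDERR_FILENO, line, n) < 0) {
        // fail silently, as the specification requires
    }
}
```
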
Undefined Behaviour
  • msg is not a valid nul-terminated string
Differences to standard implementation
  • No notable differences
References
The following manual pages will be useful:
  • perror(3)
  • errno(3)
  • strerror(3)
  • strcat(3)

feof

Overview
You should implement int cs1521_feof(cs1521_FILE *stream), which is currently stubbed in cs1521_stdio.c.

You must also modify your previous work to correctly set and clear the eof indicator for a stream. This will require adding at least one field to struct file.

The eof indicator should be set when:

  • Any of fgetc, fread, or fgets encounters the end of a file while trying to read bytes.

The eof indicator for a stream should be cleared when:

  • fseek is successfully called on the stream
  • clearerr is called on the stream.

When the eof indicator is set, any attempts to read from the stream should not even attempt to read the underlying file and instead immediately return whatever is appropriate for that function upon encountering EOF.

Behaviour
  • feof should return 1 if the eof indicator is set for the stream, or 0 otherwise.
Errors
  • There are no errors for feof.
Undefined Behaviour
  • stream is not a pointer to a valid FILE.
Differences to standard implementation
  • No notable differences
References
The following manual pages will be useful:
  • feof(3)
  • fseek(3)

ferror

Overview
You should implement int cs1521_ferror(cs1521_FILE *stream), which is currently stubbed in cs1521_stdio.c.

You must also modify your previous work to correctly set and clear the error indicator for a stream. This will require adding at least one field to struct file.

The error indicator should be set when:

  • Any IO operation on a stream fails due to error (i.e. if read or write fail during a call to a function like fgetc, fputs, etc.).

The error indicator for a stream should be cleared when:

  • clearerr is called on the stream.
Behaviour
  • ferror should return 1 if the error indicator is set for the stream, or 0 otherwise.
Errors
  • There are no errors for ferror.
Undefined Behaviour
  • stream is not a pointer to a valid FILE.
Differences to standard implementation
  • No notable differences
References
The following manual pages will be useful:
  • ferror(3)

clearerr

Overview
You should implement void cs1521_clearerr(cs1521_FILE *stream), which is currently stubbed in cs1521_stdio.c.
Behaviour
  • clearerr should clear both the eof and error indicators for stream.
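
Since one of the aims is to practise bitwise operations, one possible representation of the two indicators is a pair of bits in a single flags field. The struct layout and all names below are hypothetical; separate int or bool fields would work equally well.

```c
// Hypothetical flag bits; any two distinct bits would do.
#define DEMO_FLAG_EOF   0x1u
#define DEMO_FLAG_ERROR 0x2u

struct demo_file {
    int fd;
    unsigned flags;  // bitwise OR of the DEMO_FLAG_* values currently set
};

// Test a single bit; the comparison normalises the result to 0 or 1.
int demo_feof(const struct demo_file *f) {
    return (f->flags & DEMO_FLAG_EOF) != 0;
}

int demo_ferror(const struct demo_file *f) {
    return (f->flags & DEMO_FLAG_ERROR) != 0;
}

// Clear both bits at once by masking with the complement.
void demo_clearerr(struct demo_file *f) {
    f->flags &= ~(DEMO_FLAG_EOF | DEMO_FLAG_ERROR);
}
```

An input function would set an indicator with something like f->flags |= DEMO_FLAG_EOF when read returns 0.
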
Errors
  • There are no errors for clearerr.
Undefined Behaviour
  • stream is not a pointer to a valid FILE.
Differences to standard implementation
  • No notable differences
References
The following manual pages will be useful:
  • clearerr(3)

Subset 3

To complete subset 3, you need to implement the following functions:
fgetwc
reads UTF-8 encoded characters from a file and returns them as UTF-32 characters.
fputwc
takes UTF-32 characters and outputs them to a file as UTF-8 encoded characters.
posix_spawnp
note: not typically part of the stdio.h library. spawn a process from an executable file using underlying calls to fork() and execve(), finding the file's location by searching the PATH environment variable.

To test your implementations, you can use 1521 autotest stdio subset3.


fgetwc

Overview
You should implement cs1521_wchar_t cs1521_fgetwc(cs1521_FILE *stream), which is currently stubbed in cs1521_stdio.c.
Behaviour
  • fgetwc should read a UTF-8 encoded codepoint from the file referred to by stream.
  • The codepoint should be converted to UTF-32 and returned as a cs1521_wchar_t.
  • In other words, the literal value of the unicode codepoint should be returned.
  • If no character could be read (EOF or error), then fgetwc should return cs1521_WEOF.
  • If the UTF-8 codepoint is encoded across more bytes than necessary (overlong encoding), fgetwc should successfully read it regardless.

For example, if the file contains the four bytes:

0b11110000 0b10011111 0b10011000 0b10001010,

then fgetwc should extract the unicode codepoint:

0b11110000 0b10011111 0b10011000 0b10001010

= 0b000011111011000001010 (0x1f60a), and return this value.
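
The bit extraction in this example can be sketched as follows. The sketch decodes from an in-memory byte array rather than from a stream, so that the masking and shifting is easy to follow; the name utf8_decode is hypothetical. In line with the behaviour above, overlong encodings are accepted.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: decode one UTF-8 sequence from bytes[0..len).
// Returns the number of bytes consumed, or 0 if the sequence is invalid or incomplete.
size_t utf8_decode(const unsigned char *bytes, size_t len, uint32_t *out) {
    if (len == 0) {
        return 0;
    }
    unsigned char lead = bytes[0];
    size_t total;
    uint32_t cp;
    if ((lead & 0x80) == 0x00)      { total = 1; cp = lead; }         // 0xxxxxxx
    else if ((lead & 0xE0) == 0xC0) { total = 2; cp = lead & 0x1F; }  // 110xxxxx
    else if ((lead & 0xF0) == 0xE0) { total = 3; cp = lead & 0x0F; }  // 1110xxxx
    else if ((lead & 0xF8) == 0xF0) { total = 4; cp = lead & 0x07; }  // 11110xxx
    else return 0;  // stray continuation byte or invalid lead byte
    if (len < total) {
        return 0;   // sequence is cut short
    }
    for (size_t i = 1; i < total; i++) {
        if ((bytes[i] & 0xC0) != 0x80) {
            return 0;  // every following byte must look like 10xxxxxx
        }
        cp = (cp << 6) | (bytes[i] & 0x3F);  // append the 6 payload bits
    }
    *out = cp;
    return total;
}
```

Applied to the four bytes above, the lead byte contributes 000 and the three continuation bytes contribute 011111, 011000 and 001010, giving 0x1f60a.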

Errors
  • If an underlying call to read fails, then fgetwc should return cs1521_WEOF. errno will have already been set by read.
  • If a UTF character cannot be read due to invalid or incomplete encoding, then errno should be set to EILSEQ.
Undefined Behaviour
  • stream is not a pointer to a valid FILE.
Differences to standard implementation
  • The standard implementation of fgetwc respects the locale(7) of the C program. Our implementation assumes wchar_t is UTF-32, and always reads from the stream using UTF-8 encoding.
References
The following manual pages will be useful:
  • fgetwc(3)
  • read(2)

fputwc

Overview
You should implement cs1521_wchar_t cs1521_fputwc(cs1521_wchar_t wc, cs1521_FILE *stream), which is currently stubbed in cs1521_stdio.c.
Behaviour
  • fputwc should write the UTF-32 encoded codepoint wc into the file referred to by stream using UTF-8 encoding. fputwc must use the smallest number of bytes possible to encode wc.
  • On success, wc should be returned. On failure, cs1521_WEOF should be returned.
  • If write partially succeeds, it should be treated as a failure.

For example, if wc = 0b11111011000001010 (0x1f60a),

then fputwc should identify the minimum number of UTF-8 bytes needed to encode this value (4), construct those UTF-8 bytes:

0b11110000 0b10011111 0b10011000 0b10001010 and then write them into the file referenced by stream.

Errors
  • If an underlying call to write fails, then fputwc should return cs1521_WEOF. errno will have already been set by write.
  • If wc is an invalid UTF-32 character, then fputwc should return cs1521_WEOF and set errno=EILSEQ. UTF-32 characters are invalid if they are larger than the largest unicode codepoint 0x10FFFF, or if they are in the range of values used for encoding surrogate UTF-16 pairs (0xd800 to 0xdfff)
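
The encoding step and the validity check above can be sketched as below. The sketch writes the bytes into an array instead of a file so that it stays self-contained; the name utf8_encode is hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: encode cp as UTF-8 using the fewest bytes possible.
// Returns the number of bytes stored in out, or 0 with errno set to EILSEQ
// if cp is not a valid codepoint.
size_t utf8_encode(uint32_t cp, unsigned char out[4]) {
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
        errno = EILSEQ;
        return 0;
    }
    if (cp < 0x80) {          // up to 7 bits: 0xxxxxxx
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {         // up to 11 bits: 110xxxxx 10xxxxxx
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {       // up to 16 bits: 1110xxxx 10xxxxxx 10xxxxxx
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    // up to 21 bits: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
}
```

The returned length and the out array could then be handed to a single write call.
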
Undefined Behaviour
  • stream is not a pointer to a valid FILE.
Differences to standard implementation
  • The standard implementation of fputwc respects the locale(7) of the C program. Our implementation assumes wchar_t is UTF-32, and always outputs to the stream using UTF-8 encoding.
References
The following manual pages will be useful:
  • fputwc(3)
  • write(2)

posix_spawnp

Overview
You should implement int cs1521_posix_spawnp(int *pid, char *file, cs1521_posix_spawnp_file_actions_t *file_actions, cs1521_posix_spawnpattr_t *attrp, char *argv[], char *envp[]), which is currently stubbed in cs1521_stdio.c.
Behaviour
  • posix_spawnp should create a child process from the specified executable file name file using fork and execve.
  • The provided file will be an incomplete path (like "cat"). To find the full path for the specified file name, posix_spawnp should search all of the directories included in the PATH environment variable for a file with that name. PATH contains a list of directories separated by a ':' character. Your implementation does not need to handle complete paths (like "/bin/cat" or "./program").
  • The first matching file should be passed to execve as an absolute path (for example, "/usr/bin/cat", if PATH was something like "/usr/bin:/bin:/usr/local/bin" and a file named "cat" was found in the /usr/bin directory).
  • You may assume that all directories specified in PATH are separated by a ':' character, and you may also assume that the final pathname is no longer than 1023 bytes. You may not assume that all the entries in PATH are valid directories.
  • pid should be set to the process ID of the new child process.
  • The arguments file_actions and attrp are beyond the scope of this assignment and can be ignored in your implementation.
  • On success, you should return 0.
  • When searching PATH directories, a recursive search is not required. You are only required to find top-level files like /bin/cat - finding a file like /bin/directory/cat2 is not expected.
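
The PATH search described above might be sketched like this. The sketch checks each candidate with stat(2) rather than listing directories with opendir and readdir, which is one of several workable approaches; the names find_in_path and DEMO_PATH_MAX are hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/stat.h>

#define DEMO_PATH_MAX 1024  // the final pathname is at most 1023 bytes plus '\0'

// Hypothetical helper: look for a regular file called `file` in each
// ':'-separated directory of path_var. On success, stores the full path in
// out and returns 0. Otherwise sets errno to ENOENT and returns -1.
int find_in_path(const char *file, const char *path_var, char out[DEMO_PATH_MAX]) {
    size_t file_len = strlen(file);
    const char *dir = path_var;
    while (dir != NULL && *dir != '\0') {
        const char *end = strchr(dir, ':');
        size_t dir_len = end ? (size_t)(end - dir) : strlen(dir);
        if (dir_len > 0 && dir_len + 1 + file_len < DEMO_PATH_MAX) {
            memcpy(out, dir, dir_len);
            out[dir_len] = '/';
            memcpy(out + dir_len + 1, file, file_len + 1);  // copies the '\0' too
            struct stat st;
            if (stat(out, &st) == 0 && S_ISREG(st.st_mode)) {
                return 0;  // first match wins
            }
        }
        dir = end ? end + 1 : NULL;  // move to the next entry, if any
    }
    errno = ENOENT;
    return -1;
}
```

After a successful search, the caller would fork, and the child would call execve on the path found, exiting with status 127 if execve returns.
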
Errors
  • If no file can be found in any directory from PATH with the name file, posix_spawnp should set errno to ENOENT and return -1. No child process should be created.
  • If the child process fails to invoke execve (for example, if the executable file is found but ends up having incorrect permissions), it should exit with status 127.
  • If any other underlying function call fails, posix_spawnp should return -1. errno will have already been set by the underlying function.
Undefined Behaviour
  • pid is not a valid reference to a pid_t.
  • file is not a valid nul-terminated string.
  • argv is not a valid NULL terminated array of valid nul-terminated strings.
  • envp is not a valid NULL terminated array of valid nul-terminated strings.
  • TOCTOU (Time of check to time of use) errors (i.e. if the file is found, but is removed by someone else right before we call execve).
Differences to standard implementation
  • posix_spawnp typically supports the two arguments which we have chosen to ignore.
  • Our implementation of posix_spawnp has simplified error handling compared to a typical implementation.
References
The following manual pages will be useful:
  • posix_spawnp(3)
  • fork(2)
  • execve(2)
  • opendir(3)
  • readdir(3)
  • closedir(3)
  • stat(2)

Subset 4

To complete subset 4, you need to implement the following buffering behaviour and functions.
Input buffering
input should be buffered.
Output buffering
output should be buffered.
Default buffering
stdin, stdout, stderr and files created by fopen should have default buffering behaviour.
Flushing/refilling
buffers must be flushed and refilled according to this scheme so that you pass our autotests.
Handling fseek
fseek must behave in a particular way to support buffering, particularly for files which are opened with read AND write permissions.
Buffering and errors
some errors must be reported, even on buffered files when the output has not yet actually occurred.
fflush
flush an output buffer for a FILE* with an underlying write().
fpurge
clear a buffer for a FILE*, discarding any buffered data.
setvbuf
modify the buffering behaviour for a FILE*.

To test your implementations, you can use 1521 autotest stdio subset4.


Buffering overview

You should extend your implementation of previous subsets to implement input/output buffering. This will require adding more fields to struct file and modifying your logic for input/output. Details are in the below sections.


Input buffering

When input is buffered, attempting to read data from a FILE* does not translate directly to an underlying call to read of the same size. Instead, whenever data is needed from the underlying file, a larger read is made to fill the input buffer. Future requests to read from the underlying file can then be fulfilled from extra data remaining in the input buffer without needing more calls to read.

The behaviour of input buffering should be as follows:

  • If an input stream is fully buffered, then each read from the underlying file should request at least a full buffer worth of data to be stored in the internal buffer. Any extra data leftover after fulfilling the initial read from the FILE* should be used to fulfil further reads until the buffer is exhausted. Only when insufficient data remains in the internal buffer should further reads from the FILE* trigger another read from the underlying file to refill the buffer. Even in the case where refilling is required, you should still try and return a full read to the caller of the stdio input function (which contains some data from the original buffer, and some from the new buffer).
  • If an input stream is line buffered, it should behave the same as a fully buffered input stream.
  • If an input stream is unbuffered, then each read from the FILE* immediately invokes a read from the underlying file.
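
To make the refill idea concrete, here is a small sketch of a buffered single-byte reader. The buffer is deliberately tiny, and the struct, the reads counter and all names are hypothetical; DEMO_BUFSIZ stands in for cs1521_BUFSIZ.

```c
#include <stddef.h>
#include <unistd.h>

#define DEMO_BUFSIZ 8  // tiny on purpose; stands in for cs1521_BUFSIZ

struct demo_inbuf {
    int fd;
    unsigned char data[DEMO_BUFSIZ];
    size_t pos;  // index of the next unread byte in data
    size_t len;  // number of valid bytes in data
    int reads;   // how many times read was called (for illustration only)
};

// Return the next byte, calling read only when the buffer is exhausted.
// Returns -1 on end of file or error.
int demo_getc(struct demo_inbuf *b) {
    if (b->pos == b->len) {
        ssize_t n = read(b->fd, b->data, sizeof b->data);
        b->reads++;
        if (n <= 0) {
            return -1;
        }
        b->pos = 0;
        b->len = (size_t)n;
    }
    return b->data[b->pos++];
}
```

Reading five bytes one at a time through this structure costs a single read call; only the attempt to read past them triggers a second one.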

Output buffering

When output is buffered, attempting to write data to the FILE* does not result in an immediate call to write. Instead, data to be written is collected in a buffer (an array, e.g. allocated with malloc) and an underlying write is performed only when there is enough data accumulated in the buffer.

The behaviour of output buffering should be as follows:

  • If an output stream is fully buffered, then the internal buffer of output data should only be written out to the underlying file when it is completely full, or when it is explicitly flushed ("flushing" a buffer refers to emptying the in-memory buffer by writing the contents out to the actual file).
  • If an output stream is line buffered, then the internal buffer of output data should only be written out to the underlying file when a newline character is written to the buffer, or when the buffer is completely full, or when it is explicitly flushed.
  • If an output stream is unbuffered, then there is no internal buffer and all writes to the FILE* immediately invoke a write to the underlying file.
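
The three output modes above can be compared side by side in a small sketch. It uses a fixed 8-byte array rather than a malloc'd buffer, and the struct, the writes counter and all names are hypothetical.

```c
#include <stddef.h>
#include <unistd.h>

#define DEMO_BUFSIZ 8  // tiny on purpose; stands in for cs1521_BUFSIZ

enum demo_mode { DEMO_UNBUFFERED, DEMO_LINE, DEMO_FULL };

struct demo_outbuf {
    int fd;
    enum demo_mode mode;
    char data[DEMO_BUFSIZ];
    size_t used;  // number of bytes waiting in data
    int writes;   // how many times write was called (for illustration only)
};

// Write out whatever is buffered. Returns 0 on success, -1 on failure.
int demo_flush(struct demo_outbuf *b) {
    if (b->used == 0) {
        return 0;
    }
    ssize_t n = write(b->fd, b->data, b->used);
    b->writes++;
    if (n != (ssize_t)b->used) {
        return -1;  // errors and partial writes are both treated as failure
    }
    b->used = 0;
    return 0;
}

// Queue one byte, flushing according to the buffering mode.
int demo_putc(struct demo_outbuf *b, char c) {
    if (b->mode == DEMO_UNBUFFERED) {
        b->writes++;
        return write(b->fd, &c, 1) == 1 ? 0 : -1;
    }
    if (b->used == DEMO_BUFSIZ && demo_flush(b) != 0) {
        return -1;  // an earlier flush failed and there is still no room
    }
    b->data[b->used++] = c;
    if (b->used == DEMO_BUFSIZ || (b->mode == DEMO_LINE && c == '\n')) {
        return demo_flush(b);
    }
    return 0;
}
```

With a fully buffered stream, the first seven bytes cause no write at all and the eighth triggers one; with a line buffered stream, a newline triggers the write straight away.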

For any files opened in read/write mode (e.g. w+, r+, a+), the behaviour of both input and output buffering should apply. The FILE* should still only have one internal buffer.

In order to implement buffering for read/write files, you may make the assumption that whenever the FILE* switches from being used for input to being used for output (or vice versa), fseek will be called in between the two input/output calls. As specified below, calling fseek will always flush an output buffer.


Default buffering behaviour

Unless modified by setvbuf, FILE*s should have the following buffering behaviour:

  • stdout should be line buffered
  • stdin should be line buffered
  • stderr should be unbuffered
  • When creating a FILE*, if it refers to a terminal device (see isatty(3)), then it should be line buffered
  • Otherwise, new FILE*s are fully buffered
  • The default buffer size should be cs1521_BUFSIZ, which is already defined in cs1521_stdio.h.

Flushing and refilling buffers

Explicit flushing of a buffer occurs in the following situations:

  • When fclose is called on the FILE*.
  • When fseek is called on the FILE*.
  • When fflush is called on the FILE*.

Flushing of a buffer also occurs when buffers become full or when line-buffered files output a newline character, as detailed in the above sections on buffering.

To test your implementation of buffering, autotest will enforce the following restrictions on the number of read/write syscalls made by your program. Your implementation of buffering must satisfy these restrictions for fully buffered files:

  • If a read request can be satisfied by data from the internal buffer, no call should be made to read.
  • If a read request cannot be satisfied by data from the internal buffer, the buffer should be repeatedly refilled with a call to read as many times as needed. You should still try to return a full read to the caller of the stdio input function (which, for a large read, might contain many buffers' worth of data).
  • If a write request can be satisfied just by writing to the internal buffer, then no call should be made to write.
  • If a write request cannot be satisfied just by writing to the internal buffer, the buffer should be repeatedly written to and flushed with a call to write as many times as needed.
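
The two read restrictions can be sketched as a refill loop. This is only one possible shape, with hypothetical names (struct sketch_in, sketch_read); the n_reads counter exists purely to make the number of read calls visible and is not something your implementation needs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

// Hypothetical input stream; the real cs1521_FILE layout is your choice.
struct sketch_in {
    int fd;       // underlying file descriptor
    char *buf;    // internal buffer
    size_t cap;   // capacity of buf in bytes
    size_t pos;   // index of the next unread byte in buf
    size_t len;   // number of valid bytes in buf
    int n_reads;  // demonstration only: how many times read was called
};

// Copy up to `want` bytes into dest. Returns the number of bytes copied
// (short only at end of file or after an error), or -1 if an error
// occurred before anything was copied.
static ssize_t sketch_read(struct sketch_in *s, char *dest, size_t want) {
    assert(s != NULL && s->buf != NULL && s->cap > 0);
    size_t copied = 0;
    while (copied < want) {
        if (s->pos == s->len) {
            // Buffer exhausted: refill it with a single call to read.
            ssize_t n = read(s->fd, s->buf, s->cap);
            s->n_reads++;
            if (n < 0) {
                return copied > 0 ? (ssize_t)copied : -1;
            }
            if (n == 0) {
                break; // end of file
            }
            s->pos = 0;
            s->len = (size_t)n;
        }
        size_t available = s->len - s->pos;
        size_t remaining = want - copied;
        size_t chunk = remaining < available ? remaining : available;
        memcpy(dest + copied, s->buf + s->pos, chunk);
        s->pos += chunk;
        copied += chunk;
    }
    return (ssize_t)copied;
}
```

With a 4-byte buffer, a 3-byte request costs one read, a following 1-byte request costs none, and a following 6-byte request costs two more.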

It must satisfy the following restrictions for line buffered files:

  • If a read request can be satisfied by data from the internal buffer, no call should be made to read.
  • If a read request cannot be satisfied by data from the internal buffer, the buffer should be repeatedly refilled with a call to read as many times as needed. You should still try to return a full read to the caller of the stdio input function (which, for a large read, might contain many buffers' worth of data).
  • Each time a newline character is encountered, all existing data must be flushed (including the newline itself). Otherwise, if the internal buffer becomes full without encountering a newline, it should be flushed as usual. In both cases, a call to write should be made.
  • This means each individual line should cause at least one call to write, and possibly more if the line is larger than the size of the internal buffer.

You may notice that this implementation could be further optimised (i.e. we can produce sane buffering behaviour with fewer calls to write or read). You are regardless required to satisfy these restrictions as they are simple and easier for us to autotest (and for you to implement!).


Handling fseek

Your implementation of fseek must always call lseek, even if your implementation simply updates an internal file offset into an input buffer. This ensures that fseek correctly detects errors when attempting to seek on a non-seekable stream (e.g. a pipe).

Your implementation of ftell must always call lseek, even if your implementation tracks the current file offset and could simply return it. This ensures that ftell correctly detects errors when attempting to find the offset of a non-seekable stream (e.g. a pipe).

In addition, your implementation of fseek must always flush output buffers before seeking. This ensures that append streams behave correctly, and also enables read/write streams to function with a single buffer.
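
A minimal sketch of the ordering described above: flush pending output, then always call lseek. The struct and function names are hypothetical, and the SEEK_CUR adjustment is an assumption about how unread input in a single shared buffer might be accounted for; your design may track this differently.

```c
#include <assert.h>
#include <errno.h>      // callers inspect errno (e.g. ESPIPE) after a failure
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical stream with one buffer used for either input or output.
struct seek_stream {
    int fd;             // underlying file descriptor
    char *buf;          // internal buffer
    size_t pending_out; // buffered output bytes not yet written
    size_t unread_in;   // buffered input bytes not yet consumed
};

// Returns 0 on success, or -1 with errno set by write or lseek.
static int sketch_fseek(struct seek_stream *s, off_t offset, int whence) {
    assert(s != NULL);
    // 1. Flush pending output before moving the file offset.
    size_t done = 0;
    while (done < s->pending_out) {
        ssize_t n = write(s->fd, s->buf + done, s->pending_out - done);
        if (n < 0) {
            // Keep only the unwritten tail so a later flush can retry it.
            memmove(s->buf, s->buf + done, s->pending_out - done);
            s->pending_out -= done;
            return -1;
        }
        done += (size_t)n;
    }
    s->pending_out = 0;
    // 2. Assumption: read-ahead has moved the kernel offset past the
    //    caller's logical position by unread_in bytes.
    if (whence == SEEK_CUR) {
        offset -= (off_t)s->unread_in;
    }
    // 3. Always call lseek so non-seekable streams report an error.
    if (lseek(s->fd, offset, whence) == (off_t)-1) {
        return -1;
    }
    s->unread_in = 0; // any buffered input is now stale
    return 0;
}
```

On a pipe, the flush still happens but lseek then fails with errno set to ESPIPE.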


Buffering and errors

Buffering complicates reporting errors. Output errors which would have occurred immediately are now delayed until we flush the output buffer. We will require you to "emulate" some simple errors (so they immediately return a failure to the caller), while most errors are permitted to occur only when the buffer is actually flushed.

The following errors must be reported immediately, even if the output is not yet being performed:

  • Attempting to write to a read-only FILE*: errno=EBADF
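
One way to emulate this error is to check the stream's access mode before touching the buffer. The sketch below assumes the stream remembers the flags it passed to open; struct mode_stream and sketch_check_writable are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>

// Hypothetical stream that remembers the flags it was opened with.
struct mode_stream {
    int open_flags; // e.g. O_RDONLY, O_WRONLY | O_APPEND, O_RDWR
};

// Returns 0 if the stream may be written to; otherwise sets errno to
// EBADF and returns -1, before any data is buffered.
static int sketch_check_writable(const struct mode_stream *s) {
    assert(s != NULL);
    // O_ACCMODE masks out everything except the access mode bits.
    if ((s->open_flags & O_ACCMODE) == O_RDONLY) {
        errno = EBADF;
        return -1;
    }
    return 0;
}
```

Calling a check like this at the top of every output function reports the failure immediately rather than at the next flush.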

Buffering can also cause inconsistencies if you open the same file multiple times at once (buffered writes are not visible to other readers). We won't test this, but it might interest you to try and think of a solution for this.

All other errors may be reported by fclose, fseek or fflush as relevant (since these are the only functions which force the buffer to be flushed).


fflush

Overview
You should implement int cs1521_fflush(cs1521_FILE *stream), which is currently stubbed in cs1521_stdio.c.
Behaviour
  • fflush should flush all the data in the internal buffer for stream with an underlying call to write.
  • If the internal buffer is empty, then fflush should succeed without calling write.
Errors
  • If the underlying call to write fails, then fflush should return -1. errno will have already been set by write.
Undefined Behaviour
  • stream is not a pointer to a valid FILE.
  • stream is an input stream (or if opened for reading and writing, stream has an internal buffer containing buffered input).
Differences to standard implementation
  • No notable differences
References
The following manual pages will be useful:
  • fflush(3)
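
A minimal sketch of the behaviour and error case above, using a hypothetical struct flush_stream rather than the real cs1521_FILE. Looping on short writes is an assumption of this sketch; the specification only requires that a failed write produces -1.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

// Hypothetical output stream holding pending bytes.
struct flush_stream {
    int fd;      // underlying file descriptor
    char *buf;   // internal buffer
    size_t used; // bytes waiting in buf
};

// Returns 0 on success (including an empty buffer, where write is never
// called), or -1 if write fails; errno is left as write set it.
static int sketch_fflush(struct flush_stream *s) {
    assert(s != NULL);
    size_t done = 0;
    while (done < s->used) {
        ssize_t n = write(s->fd, s->buf + done, s->used - done);
        if (n < 0) {
            // Keep only the unwritten tail so a later flush can retry it.
            memmove(s->buf, s->buf + done, s->used - done);
            s->used -= done;
            return -1;
        }
        done += (size_t)n;
    }
    s->used = 0;
    return 0;
}
```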

fpurge

Overview
You should implement int cs1521_fpurge(cs1521_FILE *stream), which is currently stubbed in cs1521_stdio.c.
Behaviour
  • fpurge should discard all the data in the internal buffer for stream (if one exists) without any underlying call to write (if applicable).
Errors
  • There are no errors for fpurge.
Undefined Behaviour
  • stream is not a pointer to a valid FILE.
Differences to standard implementation
  • No notable differences
References
The following manual pages will be useful:
  • fpurge(3)

setvbuf

Overview
You should implement int cs1521_setvbuf(cs1521_FILE *stream, char *buf, int type, size_t size), which is currently stubbed in cs1521_stdio.c.
Behaviour
  • setvbuf should change the buffering for stream.
  • The new buffering behaviour should use buf as the underlying buffer, or if buf is NULL then setvbuf should allocate a buffer on behalf of the caller. Note that buffers provided by the user should not be freed when the file is closed, while buffers allocated by setvbuf should be freed.
  • If setvbuf allocates a buffer, it should be size bytes large, or cs1521_BUFSIZ bytes if size is zero.
  • type is the new buffering behaviour. It can be either cs1521__IONBF for unbuffered, cs1521__IOLBF for line buffered, or cs1521__IOFBF for fully buffered. These constants are all already defined in cs1521_stdio.h.
  • size is the size of the new buffer.
  • If an output stream stream has data in an underlying buffer when setvbuf is called, it should attempt to copy the existing data into the new buffer without flushing data with a call to write. However, if the new buffer is not large enough, then all existing data should be flushed with a call to write.
  • For an input stream stream, data not fitting into the new buffer should be discarded and the underlying file descriptor should seek backwards as required. Data that has already been read from the FILE* should not be copied into the new buffer.
Errors
  • If the request cannot be honored for any reason (malloc fails, flushing existing data fails, etc.), then setvbuf should return -1. However, the stream must remain functional, retaining its previous buffering behaviour.
Undefined Behaviour
  • stream is not a pointer to a valid FILE.
  • buf is not NULL but does not refer to a valid buffer of size size.
  • type is not one of the above types.
Differences to standard implementation
  • No notable differences
References
The following manual pages will be useful:
  • setvbuf(3)
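
The buffer ownership rule for output streams can be sketched as follows. Everything here is a hypothetical illustration: SKETCH_BUFSIZ stands in for cs1521_BUFSIZ, the struct is not the real cs1521_FILE, and the buffering type and input streams are ignored to keep the focus on who frees the buffer and on leaving the stream untouched when something fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define SKETCH_BUFSIZ 1024 // hypothetical stand-in for cs1521_BUFSIZ

struct sketch_stream {
    int fd;        // underlying file descriptor
    char *buf;     // current buffer, or NULL if none
    size_t cap;    // capacity of buf in bytes
    size_t used;   // pending output bytes in buf
    bool owns_buf; // true if the library allocated buf and must free it
};

// Write out all pending bytes. Returns 0, or -1 on error.
static int sketch_flush(struct sketch_stream *s) {
    size_t done = 0;
    while (done < s->used) {
        ssize_t n = write(s->fd, s->buf + done, s->used - done);
        if (n < 0) {
            memmove(s->buf, s->buf + done, s->used - done);
            s->used -= done;
            return -1;
        }
        done += (size_t)n;
    }
    s->used = 0;
    return 0;
}

// Install a new buffer. Returns 0, or -1 leaving the old buffer in place.
static int sketch_set_buffer(struct sketch_stream *s, char *user_buf, size_t size) {
    assert(s != NULL);
    size_t new_cap = (user_buf == NULL && size == 0) ? SKETCH_BUFSIZ : size;
    assert(new_cap > 0);
    char *new_buf = user_buf;
    if (new_buf == NULL) {
        new_buf = malloc(new_cap);
        if (new_buf == NULL) {
            return -1;
        }
    }
    // Pending data that does not fit must be flushed, not dropped.
    if (s->used > new_cap && sketch_flush(s) != 0) {
        if (user_buf == NULL) {
            free(new_buf);
        }
        return -1;
    }
    if (s->used > 0) {
        memmove(new_buf, s->buf, s->used); // carry pending data across
    }
    if (s->owns_buf) {
        free(s->buf); // only free buffers the library allocated
    }
    s->buf = new_buf;
    s->cap = new_cap;
    s->owns_buf = (user_buf == NULL);
    return 0;
}

// What closing the stream would do with the buffer.
static void sketch_release_buffer(struct sketch_stream *s) {
    if (s->owns_buf) {
        free(s->buf);
    }
    s->buf = NULL;
    s->cap = 0;
    s->used = 0;
    s->owns_buf = false;
}
```

The owns_buf flag is the design choice worth noting: without it, fclose cannot tell a caller's array from a malloc'd buffer.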

Assumptions and Clarifications

Like all good programmers, you should make as few assumptions as possible. If anything is unclear in this specification, please post on the class forum.

  • Your submitted code must be a single C library only. You may not submit code in other languages.

  • You can call some functions from the C standard library available by default on CSE Linux systems: including, e.g., stdlib.h, string.h, math.h, assert.h, as well as any C POSIX libraries used in lectures or lecture slides such as unistd.h, sys/types.h, sys/stat.h, fcntl.h, dirent.h.

  • However, you are forbidden from using the following libraries, or any functions from within them. If your solution uses them, you will receive a zero for performance: stdio.h, spawn.h

  • We will compile your code with dcc when marking. Run-time errors from illegal or invalid C will cause your code to fail automarking (and will likely result in you losing marks).

  • Your program must not require extra compile options. It must compile successfully with the provided Makefile:

    make
    
  • You may not use functions from other libraries.

  • Make sure you disable any debugging output before submission. If your program prints debugging output, it will fail automarking tests.

  • You may not create or use temporary files.

  • With the exception of implementing posix_spawnp, you may not create subprocesses: you may not use posix_spawn(3), posix_spawnp(3), system(3), popen(3), fork(2), vfork(2), clone(2), or any of the exec* family of functions, like execve(2).

  • stdio should make a reasonable attempt to free all memory it has allocated and close any open files when errors occur. However, marks will not be deducted for minor memory management mistakes which do not cause any incorrect behaviour.
  • stdio should be able to recover from errors defined in this specification. Errors should not compromise the functionality of the library.

  • If multiple error conditions are met, the order in which errors are reported is not specified and will not be tested.

You are required to submit intermediate versions of your assignment. See below for details.

Subset weighting

The weighting of each subset in the performance mark is as follows:
  • Subset 0: 30%
  • Subset 1: 25%
  • Subset 2: 20%
  • Subset 3: 15%
  • Subset 4: 10%

Change Log

Version 1.0.0
(2026-02-28 20:00:00)
  • First draft

Assessment

Testing

When you think your program is working, you can use autotest to run some simple automated tests:

1521 autotest stdio [optionally: any extra .c or .h files]
You can also run autotests for a specific subset. For example, to run all tests from subset 1:
1521 autotest stdio subset1 [optionally: any extra .c or .h files]
Some tests are more complex than others. If you are failing more than one test, you are encouraged to focus on solving the first of those failing tests. To do so, you can run a specific test by giving its name to the autotest command:
1521 autotest stdio subset1_fread_simple [optionally: any extra .c or .h files]

1521 autotest will not test everything.
Always do your own testing.

Automarking will be run by the lecturer after the submission deadline, using a superset of tests to those autotest runs for you.

Submission

When you are finished working on the assignment, you must submit your work by running give:

give cs1521 ass2_stdio cs1521_stdio.c [optionally: any extra .c or .h files]

You must run give before Week 10 Friday 18:00:00 to obtain the marks for this assignment. Note that this is an individual exercise; the work you submit with give must be entirely your own.

You can run give multiple times.
Only your last submission will be marked.

If you are working at home, you may find it more convenient to upload your work via give's web interface.

You cannot obtain marks by emailing your code to tutors or lecturers.

You can check your latest submission on CSE servers with:

1521 classrun check ass2_stdio

You can check the files you have submitted here.

Manual marking will be done by your tutor, who will mark for style and readability, as described in the Assessment section below. After your tutor has assessed your work, you can view your results here. The resulting mark will also be available via give's web interface.

Due Date

This assignment is due Week 10 Friday 18:00:00 (2026-04-24 18:00:00).

The UNSW standard late penalty for assessment is 5% per day for 5 days - this is implemented hourly for this assignment.

Your assignment mark will be reduced by 0.2% for each hour (or part thereof) late past the submission deadline.

For example, if an assignment worth 60% was submitted half an hour late, it would be awarded 59.8%, whereas if it was submitted more than 10 hours (but at most 11 hours) late, it would be awarded 57.8%.

Beware - submissions 5 or more days late will receive zero marks. This again is the UNSW standard assessment policy.

Assessment Scheme

This assignment will contribute 15 marks to your final COMP1521 mark.

80% of the marks for assignment 2 will come from the performance of your code on a large series of tests.

20% of the marks for assignment 2 will come from hand marking. These marks will be awarded on the basis of clarity, commenting, elegance and style. In other words, you will be assessed on how easy it is for a human to read and understand your program.

An indicative assessment scheme for performance follows. The lecturer may vary the assessment scheme after inspecting the assignment submissions, but it is likely to be broadly similar to the following:

100% for performance completely working subsets 1, 2, 3, 4 & 5 - everything works!
90% for performance completely working subsets 1, 2, 3 & 4.
80% for performance completely working subsets 1, 2 & 3
65% for performance completely working subsets 1 & 2.
50% for performance completely working subset 1.
30-40% for performance good progress, but not passing subset 1 autotests.
0% knowingly providing your work to anyone and it is subsequently submitted (by anyone).
0 FL for COMP1521 submitting any other person's work; this includes joint work.
academic misconduct submitting another person's work without their consent; paying another person to do work for you.
An indicative assessment scheme for style follows. The lecturer may vary the assessment scheme after inspecting the assignment submissions, but it is likely to be broadly similar to the following:

100% for style perfect style
90% for style great style, almost all style characteristics perfect.
80% for style good style, one or two style characteristics not well done.
70% for style good style, a few style characteristics not well done.
60% for style ok style, an attempt at most style characteristics.
≤ 50% for style an attempt at style.

An indicative style rubric follows:

  • Formatting (6/20):
    • Whitespace (e.g. 1 + 2 instead of 1+2)
    • Indentation (consistent, tabs or spaces are okay)
    • Line length (below 80 characters unless very exceptional)
    • Line breaks (using vertical whitespace to improve readability)
  • Documentation (8/20):
    • Header comment (with name and zID)
    • Function comments (above each function with a good description)
    • Descriptive variable names (e.g. char *home_directory instead of char *h)
    • Descriptive function names (e.g. get_home_directory instead of get_hd)
    • Sensible commenting throughout the code (don't comment every single line; leave comments when necessary)
  • Elegance (5/20):
    • Does this code avoid redundancy? (e.g. Don't repeat yourself!)
    • Are helper functions used to reduce complexity? (functions should be small and simple where possible)
    • Are constants appropriately created and used? (magic numbers should be avoided)
  • Portability (1/20):
    • Would this code be able to compile and behave as expected on other POSIX-compliant machines? (using standard libraries without platform-specific code)
    • Does this code make any assumptions about the endianness of the machine it is running on?

Note that the following penalties apply to your total mark for plagiarism:

0 for asst2 knowingly providing your work to anyone and it is subsequently submitted (by anyone).
0 FL for COMP1521 submitting any other person's work; this includes joint work.
academic misconduct submitting another person's work without their consent; paying another person to do work for you.

Intermediate Versions of Work

You are required to submit intermediate versions of your assignment.

Every time you work on the assignment and make some progress you should copy your work to your CSE account and submit it using the give command above. It is fine if intermediate versions do not compile or otherwise fail submission tests. Only the final submitted version of your assignment will be marked.

Assignment Conditions

  • Joint work is not permitted on this assignment.

    This is an individual assignment. The work you submit must be entirely your own work: submission of work even partly written by any other person is not permitted.

    Do not request help from anyone other than the teaching staff of COMP1521 — for example, in the course forum, or in help sessions.

    Do not post your assignment code to the course forum. The teaching staff can view code you have recently submitted with give, or recently autotested.

    Assignment submissions are routinely examined both automatically and manually for work written by others.

    Rationale: this assignment is designed to develop the individual skills needed to produce an entire working program. Using code written by, or taken from, other people will stop you learning these skills. Other CSE courses focus on skills needed for working in a team.

  • The use of generative tools such as GitHub Copilot, ChatGPT, Google Gemini is not permitted on this assignment.

    Rationale: this assignment is designed to develop your understanding of basic concepts. Using synthesis tools will stop you learning these fundamental concepts, which will significantly impact your ability to complete future courses.

  • Sharing, publishing, or distributing your assignment work is not permitted.

    Do not provide or show your assignment work to any other person, other than the teaching staff of COMP1521. For example, do not message your work to friends.

    Do not publish your assignment code via the Internet. For example, do not place your assignment in a public GitHub repository.

    Rationale: by publishing or sharing your work, you are facilitating other students using your work. If other students find your assignment work and submit part or all of it as their own work, you may become involved in an academic integrity investigation.

  • Sharing, publishing, or distributing your assignment work after the completion of COMP1521 is not permitted.

    For example, do not place your assignment in a public GitHub repository after this offering of COMP1521 is over.

    Rationale: COMP1521 may reuse assignment themes covering similar concepts and content. If students in future terms find your assignment work and submit part or all of it as their own work, you may become involved in an academic integrity investigation.

Violation of any of the above conditions may result in an academic integrity investigation, with possible penalties up to and including a mark of 0 in COMP1521, and exclusion from future studies at UNSW. For more information, read the UNSW Student Code, or contact the course account.