2. Introduction
In this assignment you will be implementing a set of file-related
system calls. Upon completion, your operating system will be able to
run a single application at user-level and perform some basic file
I/O.
A substantial part of this assignment is understanding how OS/161
works and determining what code is required to implement the required
functionality. Expect to spend at least as long browsing and digesting
OS/161 code as actually writing and debugging your own code.
If you attempt the advanced part, you will add process-related
system calls and the ability to run multiple applications.
Your current OS/161 system has minimal support for running
executables, nothing that could be considered a true
process. Assignment 2 starts the transformation of OS/161 into
something closer to a true operating system. After this assignment, it
will be capable of running a process from actual compiled programs
stored in your account. The program will be loaded into OS/161 and
executed in user mode by System/161; this will occur under the control
of your kernel. First, however, you must implement part of the
interface between user-mode programs ("userland") and the kernel. As
usual, we provide part of the code you will need. Your job is to
design and build the missing pieces.
Our code can run one user-level C program at a time as long as it
doesn't want to do anything but shut the system down. We have provided
sample user programs that do this (reboot, halt, poweroff), as well as
others that make use of features you might be adding in this and future
assignments. So far, all the code you have written for OS/161 has
only been run within, and only been used by, the operating system
kernel. In a real operating system, the kernel's main function is to
provide support for user-level programs. Most such support is accessed
via "system calls." We give you two system calls: sys_reboot()
in startup/main.c and sys_time() in
syscall/time_syscalls.c. In GDB, if you put a breakpoint on
sys_reboot() and run the "reboot" program, you can use
"backtrace" to see how it got there.
For those attempting the advanced version of the assignment, you
will also be implementing the subsystem that keeps track of multiple
tasks. You must decide what data structures you will need to hold the
data pertinent to a "process" (hint: look at kernel include files of
your favorite operating system for suggestions, specifically the proc
structure.) The first step is to read and understand the parts of the
system that we have written for you.
Our System/161 simulator can run normal C programs if they
are compiled with a cross-compiler, os161-gcc. This runs on a
host (e.g., a Linux x86 machine) and produces MIPS executables; it is the
same compiler used to compile the OS/161 kernel. To create new user
programs, you will need to edit the Makefile in bin, sbin, or testbin
(depending on where you put your programs) and then create a directory
similar to those that already exist. Use an existing program and its
Makefile as a template.
In the beginning, you should tackle this assignment by producing a
DESIGN DOCUMENT. The design document should clearly reflect the
development of your solution. It should not merely explain what you
programmed. If you try to code first and design later, or even if you
design hastily and rush into coding, you will most certainly end up in
a software "tar pit". Don't do it! Plan everything you will do. Don't
even think about coding until you can precisely explain to your
partner what problems you need to solve and how the pieces relate to
each other. Note that it can often be hard to write (or talk) about
new software design: you are facing problems that you have not seen
before, and therefore even finding terminology to describe your ideas
can be difficult. There is no magic solution to this problem, but it
gets easier with practice. The important thing is to go ahead and
try. Always try to describe your ideas and designs to someone else. In
order to reach an understanding, you may have to invent terminology
and notation; this is fine. If you do this, by the time you have
completed your design, you will find that you have the ability to
efficiently discuss problems that you have never seen before. Why do
you think that CS is filled with so much jargon? To help you get
started, we have provided the following questions as a guide for
reading through the code. We recommend that you answer questions for
the different modules and be prepared to discuss them in your Week 8
tutorial. Once you have prepared the answers, you should be ready to
develop a strategy for designing your code for this assignment.
Bring your answers to the code walk-through questions to your week 8
tutorial.
This directory contains some syscall implementations, and the files
that are responsible for the loading and running of user-level
programs. Currently, the only files in the directory are
loadelf.c, runprogram.c, and time_syscalls.c,
although you may want to add more of your own during this assignment. Understanding
these files is the key to getting started with the assignment,
especially the advanced part, the implementation of multiprogramming. Note
that to answer some of the questions, you will have to look in files
outside this directory.
loadelf.c
This file contains the functions responsible for loading an ELF
executable from the filesystem and into virtual memory space. Of
course, at this point this virtual memory space does not provide what
is normally meant by virtual memory: although there is translation
between the addresses that executables "believe" they are using and
physical addresses, there is no mechanism for providing more memory
than exists physically.
runprogram.c
This file contains only one function, runprogram(), which is the
function that is responsible for running a program from the kernel
menu. Once you have designed your file system calls, a program started
by runprogram() should have the standard file descriptors (stdout,
stderr) available while it's running.
In the advanced assignment, runprogram() is a good base for
writing the execv() system call, but only a base. When writing your
design doc, you should determine what more is required for execv()
that runprogram() does not need to worry about. Once you have designed
your process framework, runprogram() should be altered to start
processes properly within this framework.
Questions
- What are the ELF magic numbers?
- In runprogram(), why is it important to call vfs_close
before going to usermode?
- What function forces the processor to switch into usermode? Is this
function machine dependent?
This file contains functions for moving data between kernel and user
space. Knowing when and how to cross this boundary is critical to
properly implementing user-level programs, so this is a good file to
read very carefully. You should also examine the code in
lib/uio.c.
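As a rough illustration of what crossing this boundary involves, the following host-side sketch simulates a copy from user memory that refuses any range falling outside the user's region. Everything in it is hypothetical: mock_user_memory stands in for a user address space, and mock_copyin() is not the OS/161 function, only a simplified model of the kind of check a kernel has to make before trusting an address and length supplied by a user program.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one process's user memory. */
static unsigned char mock_user_memory[64];

/*
 * Hypothetical copyin-style helper: copy len bytes starting at "user
 * offset" uoffset into the kernel buffer kdest. Returns 0 on success,
 * or EFAULT if any part of the range lies outside user memory.
 */
int mock_copyin(size_t uoffset, void *kdest, size_t len)
{
    /* Written so that uoffset + len is never computed and cannot overflow. */
    if (uoffset > sizeof(mock_user_memory) ||
        len > sizeof(mock_user_memory) - uoffset) {
        return EFAULT;
    }
    memcpy(kdest, mock_user_memory + uoffset, len);
    return 0;
}
```

The sketch only models range checking. A real kernel must also cope with addresses that are inside the user region but not mapped, which is something to look for when you read the supplied code.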
Questions
- What is the difference between UIO_USERISPACE and
UIO_USERSPACE? When should one use UIO_SYSSPACE instead?
- Why can the struct uio that is used to read in an ELF segment be
allocated on the stack in load_segment() (i.e., where does the
memory read actually go)?
- In what file are copyin() and copyout() defined? What about
memmove()? Why can't copyin() and copyout() be
implemented as simply as memmove()?
Exceptions are the key to operating systems; they are the mechanism
that enables the OS to regain control of execution and therefore do
its job. You can think of exceptions as the interface between the
processor and the operating system. When the OS boots, it installs an
"exception handler" (carefully crafted assembly code) at a specific
address in memory. When the processor raises an exception, it invokes
this, which sets up a "trap frame" and calls into the operating
system. Since "exception" is such an overloaded term in computer
science, operating system lingo for an exception is a "trap", when the
OS traps execution. Interrupts are exceptions, and more significantly
for this assignment, so are system calls. Specifically, syscall.c
handles traps that happen to be syscalls. Understanding at least the C
code in this directory is key to being a real operating systems
junkie, so we highly recommend reading through it carefully.
locore/trap.c
mips_trap() is the key function for returning control to the operating
system. This is the C function that gets called by the assembly
exception handler. kill_curthread() is the function for
handling broken user programs; when the processor is in usermode and
hits something it can't handle (say, a bad instruction), it raises an
exception. There's no way to recover from this, so the OS needs to
kill off the process. The advanced part of this assignment will include
writing a useful version of this function.
syscall/syscall.c
syscall() is the function that delegates the actual work of a
system call off to the kernel function that implements it. Notice that
reboot and time are the only cases currently handled.
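The delegation pattern can be sketched on the host as follows. This is a simplified model, not the supplied code: struct mock_trapframe keeps only a few registers, the call numbers and mock_sys_add() are invented for the example, and the register convention shown (call number and result in v0, a3 used as a success/failure flag) is our reading of the comments in syscall.c, so verify it against your own tree. The real function also does further bookkeeping on the trap frame that is omitted here.

```c
#include <errno.h>
#include <stdint.h>

/* Simplified stand-in for the MIPS trap frame (assumption: subset of fields). */
struct mock_trapframe {
    uint32_t tf_v0;   /* in: call number; out: return value or error code */
    uint32_t tf_a0;   /* first argument */
    uint32_t tf_a1;   /* second argument */
    uint32_t tf_a3;   /* out: 0 on success, 1 on failure */
};

/* Hypothetical call number; real ones are in kern/include/kern/syscall.h. */
enum { MOCK_SYS_ADD = 1 };

/* Hypothetical handler: returns 0 or an errno, result goes through retval. */
static int mock_sys_add(int32_t a, int32_t b, int32_t *retval)
{
    *retval = a + b;
    return 0;
}

/* Dispatch on the call number, then encode the outcome in the trap frame. */
void mock_syscall(struct mock_trapframe *tf)
{
    int32_t retval = 0;
    int err;

    switch (tf->tf_v0) {
    case MOCK_SYS_ADD:
        err = mock_sys_add((int32_t)tf->tf_a0, (int32_t)tf->tf_a1, &retval);
        break;
    default:
        err = ENOSYS;   /* unknown call number */
        break;
    }

    if (err) {
        tf->tf_v0 = (uint32_t)err;
        tf->tf_a3 = 1;
    } else {
        tf->tf_v0 = (uint32_t)retval;
        tf->tf_a3 = 0;
    }
}
```

Keeping each handler's signature in the form "return an error code, pass the result through a pointer" means the dispatcher needs only one place that knows how results are encoded for user level.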
Questions
- What is the numerical value of the exception code for a MIPS system call?
- How many bytes is an instruction in MIPS? (Answer this by reading
syscall() carefully, not by looking somewhere else.)
- What are the contents of the struct trapframe? Where is the
struct trapframe that is passed into syscall() stored?
- What would be required to implement a system call that took more than 4 arguments?
- What is the purpose of userptr_t?
There's only one file in here, mips/crt0.S, which contains the MIPS
assembly code that receives control first when a user-level program is
started. It calls main(). This is the code that your execv()
implementation will be interfacing to, so be sure to check what values
it expects to appear in what registers and so forth.
There's obviously a lot of code in the OS/161 C library (and a lot
more yet in a real system's C library...) We don't expect you to read
it all, although it may be instructive in the long run to do so. Job
interviewers have an uncanny habit of asking people to implement
simple standard C functions on the whiteboard. For present purposes
you need only look at the code that implements the user-level side of
system calls.
errno.c
This is where the global variable errno is defined.
syscalls-mips.S
This file contains the machine-dependent code necessary for
implementing the user-level side of MIPS system calls.
syscalls.S is created from this file at compile time and is the
actual file assembled to put into the C library. The actual names of
the system calls are placed in this file using a script called
gensyscalls.sh that reads them from the kernel's header files. This
avoids having to make a second list of the system calls. In a real
system, typically each system call stub is placed in its own source
file, to allow selectively linking them in. OS/161 puts them all
together to simplify the makefiles.
Questions
- What is the purpose of the SYSCALL macro?
- What is the MIPS instruction that actually triggers a system call?
(Answer this by reading the source in this directory, not looking
somewhere else.)
The files vfs.h and vnode.h in this directory
contain function declarations and comments that are directly relevant
to this assignment.
Questions
- How are vfs_open and vfs_close used? What other
vfs_() calls are relevant?
- What are VOP_READ and VOP_WRITE? How are they used?
- What does VOP_TRYSEEK do?
- Where is the struct thread defined? What does this structure contain?
3.7. fork()
Answer these questions by reading the fork() man page and the
sections dealing with fork() in the textbook.
Questions
- What is the purpose of the fork() system call?
- What process state is shared between the parent and child?
- What process state is copied between the parent and child?
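While you read the man page, it may help to try fork() on an ordinary POSIX host. The sketch below is a host program, not OS/161 code, and fork_exit_status() is a name made up for this example. It shows only the basic calling pattern: the call returns twice, once in the child with 0 and once in the parent with the child's pid.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork once. The child exits immediately with child_code; the parent
 * waits for it and returns the exit status it observed, or -1 on error.
 */
int fork_exit_status(int child_code)
{
    pid_t pid = fork();

    if (pid < 0) {
        return -1;              /* fork failed, no child was created */
    }
    if (pid == 0) {
        _exit(child_code);      /* child: fork() returned 0 */
    }

    /* parent: fork() returned the child's pid */
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling fork_exit_status(7) on a host should return 7 in the parent, since the child received its own copy of child_code.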
Remember to use a 3231 subshell (or continue using your
modified PATH) for this assignment, as outlined in asst0.
SVN setup
Only one of you needs to do the following.
In this section, you will be setting up the svn repository that will
contain your code. We suggest your partner sit in on this part of the
assignment.
Import the OS/161 sources into your repository as follows:
% cd /home/cs3231/assigns
% svn import asst2/src file:///home/osprjXXX/repo/asst2/trunk -m "Initial import"
Make an immediate branch of this import for easy reference when you generate your diff:
% svn copy -m "Tag initial import" file:///home/osprjXXX/repo/asst2/trunk file:///home/osprjXXX/repo/asst2/initial
Checkout
The following instructions are now for both partners.
You have now completed setting up a shared repository for both partners. We'll
assume your cs3231 directory is intact from the previous assignment.
Change to your directory:
% cd ~/cs3231
Now checkout a copy of the os161 sources to work on from your shared
repository:
% svn checkout file:///home/osprjXXX/repo/asst2/trunk asst2-src
You should now have an asst2-src directory to work on.
Configure OS/161 for Assignment 2
Before proceeding further, configure your new sources.
% cd ~/cs3231/asst2-src
% ./configure
Unlike the previous assignment, you will need to build
and install the user-level programs that will be run by your kernel in
this assignment.
% cd ~/cs3231/asst2-src
% bmake
% bmake install
For your kernel development, we have again provided a
framework for running your solutions for ASST2.
You have to reconfigure your kernel before you can use this
framework. The procedure for configuring a kernel is the same as in
ASST0 and ASST1, except you will use the ASST2 configuration file:
% cd ~/cs3231/asst2-src/kern/conf
% ./config ASST2
You should now see an ASST2 directory in the compile directory.
Building for ASST2
When you built OS/161 for ASST1, you ran bmake from compile/ASST1. In
ASST2, you run bmake from (you guessed it) compile/ASST2.
% cd ../compile/ASST2
% bmake depend
% bmake
% bmake install
If you are told that the compile/ASST2 directory does not exist, make
sure you ran config for ASST2.
Command Line Arguments to OS/161
Your solutions to ASST2 will be tested by running OS/161 with command
line arguments that correspond to the menu options in the OS/161 boot
menu.
IMPORTANT: Please DO NOT change these menu option strings!
Running "asst2"
For this assignment, we have supplied a user-level OS/161 program that
you can use for testing. It is called asst2, and its sources
live in src/testbin/asst2.
You can test your assignment by typing p /testbin/asst2 at the
OS/161 menu prompt. As a shortcut, you can also specify menu arguments on the
command line, for example: sys161 kernel "p /testbin/asst2".
Note: On cygwin, you need to type p /testbin/asst2.exe.
Note: If you don't have a sys161.conf file, you can use
this one.
Running the program produces output similar to the following prior to
starting the assignment.
Unknown syscall 55
Unknown syscall 55
Unknown syscall 55
Unknown syscall 55
:
:
Unknown syscall 55
Unknown syscall 55
Unknown syscall 3
asst2 produces the following output on a (maybe partially) working
assignment.
OS/161 kernel [? for menu]: p /testbin/asst2
Operation took 0.000212160 seconds
OS/161 kernel [? for menu]:
**********
* File Tester
**********
* write() works for stdout
**********
* write() works for stderr
**********
* opening new file "test.file"
* open() got fd 3
* writing test string
* wrote 45 bytes
* writing test string again
* wrote 45 bytes
* closing file
**********
* opening old file "test.file"
* open() got fd 3
* reading entire file into buffer
* attemping read of 500 bytes
* read 90 bytes
* attemping read of 410 bytes
* read 0 bytes
* reading complete
* file content okay
**********
* testing lseek
* reading 10 bytes of file into buffer
* attemping read of 10 bytes
* read 10 bytes
* reading complete
* file lseek okay
* closing file
**********
* testing fork
Unknown syscall 0
* fork FAILED.
Unknown syscall 3
Implement the following file-based system calls. The full range of system
calls is listed in kern/include/kern/syscall.h. For this
assignment you should implement: open, read, write,
lseek, close, dup2. Note: You are implementing
the kernel code that implements the system call functionality within the
kernel. The C stubs that user-level applications call to invoke the system
calls are already automatically generated when you build OS/161.
Note that the basic assignment does not involve implementing fork()
(that's part of the advanced assignment). However, the design and implementation
of your system calls should nonetheless not assume a single process.
It's crucial that your syscalls handle all error conditions
gracefully (i.e., without crashing OS/161.) You should consult the
OS/161 man pages
(also included in the distribution) and understand fully the
system calls that you must implement. You must return the error codes
as described in the man pages. Additionally, your syscalls must return
the correct value (in case of success) or error code (in case of
failure) as specified in the man pages. Some of the auto-marking
scripts rely on the return of appropriate error codes; adherence to
the guidelines is as important as the correctness of the
implementation.
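From the user program's side, the convention the man pages describe looks like the following on any POSIX host: a failing call returns -1 and leaves the reason in errno. The helper name errno_after_bad_read() is made up for this sketch, and the program runs on the host rather than under OS/161, but a test program can check your kernel's error codes in the same way.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Attempt a read on a descriptor that cannot be valid and report the
 * errno value the call produced (0 if the call unexpectedly succeeded).
 */
int errno_after_bad_read(void)
{
    char buf[8];
    ssize_t n;

    errno = 0;
    n = read(-1, buf, sizeof buf);   /* -1 is never an open descriptor */
    return (n == -1) ? errno : 0;
}
```

On a POSIX host this reports EBADF. The auto-marking scripts make comparisons of this kind, which is why returning a different (even if plausible) error code counts as a failure.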
The file user/include/unistd.h contains the user-level
interface definition of the system calls that you will be writing for
OS/161 (including ones you will implement in later assignments). This
interface is different from that of the kernel functions that you will
define to implement these calls. You need to design this interface and
put it in kern/include/syscall.h. As you discovered (ideally) in
Assignment 0, the integer codes for the calls are defined in
kern/include/kern/syscall.h. You need to think about a variety of
issues associated with implementing system calls. Perhaps, the most
obvious one is: can two different user-level processes (or user-level
threads, if you choose to implement them) find themselves running a
system call at the same time?
open(), read(), write(), lseek(), close(), and dup2()
For any given process, the first file descriptors (0, 1, and 2) are
considered to be standard input (stdin), standard output (stdout), and
standard error (stderr) respectively. For this basic assignment, the file
descriptors 1 (stdout) and 2 (stderr) must start out attached to the
console device ("con:"). You will probably modify runprogram() to
achieve this. Your implementation must allow programs to use dup2() to
change stdin, stdout, stderr to point elsewhere.
Although these system calls may seem to be tied to the filesystem,
in fact, these system calls are really about manipulation of file
descriptors, or process-specific filesystem state. A large part of
this assignment is designing and implementing a system to track this
state. Some of this information (such as the cwd) is specific only to
the process, but some (such as the file offset) is specific to the process
and file descriptor. Don't rush this design. Think carefully about the
state you need to maintain, how to organise it, and when and how it
has to change.
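One possible shape for this state, offered only as a starting point for your own design, is a per-process table of descriptors that point at reference-counted open-file records. All names below (mock_fdtable, mock_openfile, MOCK_OPEN_MAX, and the functions) are hypothetical. The sketch runs on the host, has no locking, and leaves out the vnode and open flags that a real record would carry. It assumes the POSIX behaviour in which descriptors duplicated with dup2() share one offset; check the OS/161 man pages before adopting that.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define MOCK_OPEN_MAX 8            /* hypothetical per-process limit */

/* One open file; several descriptors may point at the same record. */
struct mock_openfile {
    long offset;                   /* current position, shared by duplicates */
    int refcount;                  /* number of descriptors referring to this */
};

/* Per-process table: descriptor number -> open file (NULL if unused). */
struct mock_fdtable {
    struct mock_openfile *fds[MOCK_OPEN_MAX];
};

static int fd_in_use(const struct mock_fdtable *t, int fd)
{
    return fd >= 0 && fd < MOCK_OPEN_MAX && t->fds[fd] != NULL;
}

/* Allocate the lowest free descriptor. Returns 0 or an error code. */
int mock_fd_open(struct mock_fdtable *t, int *retfd)
{
    for (int fd = 0; fd < MOCK_OPEN_MAX; fd++) {
        if (t->fds[fd] == NULL) {
            struct mock_openfile *of = calloc(1, sizeof *of);
            if (of == NULL) {
                return ENOMEM;
            }
            of->refcount = 1;
            t->fds[fd] = of;
            *retfd = fd;
            return 0;
        }
    }
    return EMFILE;                 /* table is full */
}

/* Drop one reference; free the record when the last one goes away. */
int mock_fd_close(struct mock_fdtable *t, int fd)
{
    if (!fd_in_use(t, fd)) {
        return EBADF;
    }
    struct mock_openfile *of = t->fds[fd];
    t->fds[fd] = NULL;
    assert(of->refcount > 0);
    if (--of->refcount == 0) {
        free(of);
    }
    return 0;
}

/* Make newfd refer to the same open file as oldfd. */
int mock_fd_dup2(struct mock_fdtable *t, int oldfd, int newfd)
{
    if (!fd_in_use(t, oldfd) || newfd < 0 || newfd >= MOCK_OPEN_MAX) {
        return EBADF;
    }
    if (oldfd == newfd) {
        return 0;
    }
    if (t->fds[newfd] != NULL) {
        mock_fd_close(t, newfd);   /* newfd is silently closed first */
    }
    t->fds[newfd] = t->fds[oldfd];
    t->fds[newfd]->refcount++;
    return 0;
}
```

Separating the descriptor table from the open-file records is what lets two descriptors share an offset, and it is also the part of the design that fork() will stress later, so decide early what protects refcount and offset from concurrent updates.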
While this assignment requires you to implement file-system-related
system calls, you actually have to write virtually no
low-level file system code in this assignment. You will use the
existing VFS layer to do most of the work. Your job is to construct
the subsystem that implements the interface expected by user-level
programs by invoking the appropriate VFS and vnode operations.
While you are not restricted to only modifying these files,
please place most of your implementation in the following files:
function prototypes and data types for your file subsystem in
kern/include/file.h, and the function implementations and
variable instantiations in kern/syscall/file.c.
A note on errors and error handling of system calls
The man pages in the OS/161 distribution contain a description of the
error return values that you must return. If there are conditions that
can happen that are not listed in the man page, return the most
appropriate error code from kern/include/kern/errno.h. If
none seem particularly appropriate, consider adding a new one. If
you're adding an error code for a condition for which UNIX has a
standard error code symbol, use the same symbol if possible. If not,
feel free to make up your own, but note that error codes should always
begin with E, should not be EOF, etc. Consult UNIX man pages to learn
about error codes. Note that if you add an error code to
kern/include/kern/errno.h you need to add a corresponding
error message to the file user/lib/libc/string/strerror.c.
Here are some additional questions and issues to aid you in developing
your design. They are by no means comprehensive, but they are a
reasonable place to start developing your solution.
What primitive operations exist to support the transfer of data to
and from kernel space? Do you want to implement more on top of these?
You will need to "bullet-proof" the OS/161 kernel from user program
errors. There should be nothing a user program can do to crash the
operating system when invoking the file system calls. It is okay in
the basic assignment for the kernel to panic for an unimplemented
system call (e.g. execv()), or a user-level program error.
Decide which functions you need to change and which structures
you may need to create to implement the system calls.
How will you keep track of open files? For which system calls is
this useful?
For additional background, consult one or more of the following
texts for details of how similar existing operating systems structure
their file system management:
- Section 10.6.3 and "NFS implementation" in Section 10.6.4,
Tanenbaum, Modern Operating Systems.
- Section 6.4 and Section 6.5, McKusick et al., The
Design and Implementation of the 4.4BSD Operating System.
- Chapter 8, Vahalia, UNIX Internals: The New Frontiers.
- The original VFS paper is available here.
Documenting your solution
This is a compulsory component of this assignment. You must
submit a small design document identifying the basic issues in this
assignment, and then describe your solution to the problems you have
identified. The design document you developed in the planning phase
(outlined above) would be an ideal start. The document must be plain
ASCII text. We expect such a document to be roughly 500 to 1000 words,
i.e. clear and to the point.
The document will be used to guide our markers in their evaluation
of your solution to the assignment. In the case of poor results in
the functional testing combined with a poor design document, we will
base our assessment on these components alone. If you can't describe
your own solution clearly, you can't expect us to reverse engineer the
code to a poor and complex solution to the assignment.
Place your design document in design.txt (which we have
created for you) at the top of the source tree to OS/161
(i.e. in ~/cs3231/asst2-src/design.txt).
When you later commit your changes into your repository, your
design doc will be included in the commit, and later in your
submission.
Also, please word wrap your design doc if you have not already
done so. You can use the unix fmt command to achieve this if
your editor cannot.
As with the previous assignments, you again will be submitting a diff
of your changes to the original tree.
You should first commit your changes back to the repository using
the following command. Note: You will have to supply a comment on your
changes. You also need to coordinate with your partner that the
changes you have (or potentially both have) made are committed
consistently by you and your partner, such that the repository
contains the work you want from both partners.
% cd ~/cs3231/asst2-src
% svn commit
If the above fails, you may need to run svn update to bring
your source tree up to date with commits made by your partner. If you
do this, you should double check and test your assignment prior to
submission.
Beware! If you have created new files for this assignment, they will
not be included in your submission unless you add them, using svn add:
% svn add filename.c
If you add files after running svn commit, you will need to run
svn commit again.
Now generate a file containing the diff.
% cd ~
% svn diff file:///home/osprjXXX/repo/asst2/initial file:///home/osprjXXX/repo/asst2/trunk >~/asst2.diff
Testing Your Submission
Even though the generated diff output should represent all the
changes you have made to the supplied code, occasionally students do
something "ingenious" and generate non-representative diff output.
We strongly suggest keeping your svn repository intact to allow
for recovery of your work if need be.
The advanced assignment is to complete the basic assignment, plus the
additional task below. Remember that you must finish the basic
assignment, then get approval from the lecturer in charge, Kevin
Elphinstone, at least one week before the due date.
Given you're doing the advanced version of the assignment, I'm assuming
you're competent with managing your SVN repository and don't need simple
directions. You basically need to generate a diff between your final
version and the base. There are two ways you can do this: the simpler
(but messier) option is to continue developing along your mainline branch
and generate the diff in the same way as for the basic assignment. A neater
approach is to create a new branch in SVN to work on your advanced solution.
Whichever approach you take, make sure you test your diff before you
submit it!
The amount of code to implement fork is quite small; the main
challenge is to understand what needs to be done. We strongly
encourage you to implement the file-related system calls first, with
fork in mind.
A pid, or process ID, is a unique number that identifies a process. The
implementation of getpid() is not terribly challenging, but pid
allocation and reclamation are the important concepts that you must implement.
It is not OK for your system to crash because over the lifetime of its
execution you've used up all the pids. Design your pid system; implement all
the tasks associated with pid maintenance, and only then implement
getpid(). When your pid system is working correctly, change your
fork() implementation to return the child's pid to the parent, rather
than 1.
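To make "allocation and reclamation" concrete, here is a minimal host-side sketch of a pid allocator that reuses freed pids instead of running out. It is one possible approach, not a required design. The names, the tiny pid range, and the choice of EAGAIN for exhaustion are all assumptions of the example (consult the fork() man page for the error code it actually specifies), and a kernel version would need a lock around the shared table.

```c
#include <errno.h>
#include <stdbool.h>

#define MOCK_PID_MIN 2
#define MOCK_PID_MAX 9     /* deliberately tiny so exhaustion is easy to hit */

static bool mock_pid_used[MOCK_PID_MAX + 1];
static int mock_next_pid = MOCK_PID_MIN;

/*
 * Hand out the next unused pid, scanning circularly from where the last
 * allocation stopped. Returns 0 on success, EAGAIN if every pid is in use.
 */
int mock_pid_alloc(int *retpid)
{
    for (int tries = 0; tries <= MOCK_PID_MAX - MOCK_PID_MIN; tries++) {
        int pid = mock_next_pid;

        mock_next_pid = (pid == MOCK_PID_MAX) ? MOCK_PID_MIN : pid + 1;
        if (!mock_pid_used[pid]) {
            mock_pid_used[pid] = true;
            *retpid = pid;
            return 0;
        }
    }
    return EAGAIN;
}

/* Return a pid to the pool so a later allocation can reuse it. */
void mock_pid_free(int pid)
{
    if (pid >= MOCK_PID_MIN && pid <= MOCK_PID_MAX) {
        mock_pid_used[pid] = false;
    }
}
```

The circular scan delays reuse of a pid that has just been freed, which makes stale-pid bugs a little easier to notice. The harder design question, which this sketch does not address, is deciding when it is safe to free a pid at all.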
Now generate a file containing the diff. (NOTE: How this works depends
on how you have set up branches in svn.)