2. Introduction
In this assignment you will be implementing a set of file-related
system calls. Upon completion, your operating system will be able to
run a single application at user-level and perform some basic file
I/O.
A substantial part of this assignment is understanding how OS/161
works and determining what code is required to implement the required
functionality. Expect to spend at least as long browsing and digesting
OS/161 code as actually writing and debugging your own code.
If you attempt the advanced part, you will add process-related
system calls and the ability to run multiple applications.
Your current OS/161 system has minimal support for running
executables, but nothing that could be considered a true
process. Assignment 2 starts the transformation of OS/161 into
something closer to a true operating system. After this assignment, it
will be capable of running a process from actual compiled programs
stored in your account. The program will be loaded into OS/161 and
executed in user mode by System/161; this will occur under the control
of your kernel. First, however, you must implement part of the
interface between user-mode programs ("userland") and the kernel. As
usual, we provide part of the code you will need. Your job is to
design and build the missing pieces.
Our code can run one user-level C program at a time as long as it
doesn't want to do anything but shut the system down. We have provided
sample user programs that do this (reboot, halt, poweroff), as well as
others that make use of features you might be adding in this and future
assignments. So far, all the code you have written for OS/161 has
only been run within, and only been used by, the operating system
kernel. In a real operating system, the kernel's main function is to
provide support for user-level programs. Most such support is accessed
via "system calls." We give you one system call, reboot(), which is
implemented in the function sys_reboot() in main.c. In GDB, if you put
a breakpoint on sys_reboot and run the "reboot" program, you can use
"backtrace" to see how it got there.
For those attempting the advanced version of the assignment, you
will also be implementing the subsystem that keeps track of multiple
tasks. You must decide what data structures you will need to hold the
data pertinent to a "process" (hint: look at kernel include files of
your favorite operating system for suggestions, specifically the proc
structure.) The first step is to read and understand the parts of the
system that we have written for you.
Our System/161 simulator can run normal C programs if they
are compiled with a cross-compiler, cs161-gcc. This runs on a
host (e.g., a Linux x86 machine) and produces MIPS executables; it is the
same compiler used to compile the OS/161 kernel. To create new user
programs, you will need to edit the Makefile in bin, sbin, or testbin
(depending on where you put your programs) and then create a directory
similar to those that already exist. Use an existing program and its
Makefile as a template.
In the beginning, you should tackle this assignment by producing a
DESIGN DOCUMENT. The design document should clearly reflect the
development of your solution. It should not merely explain what you
programmed. If you try to code first and design later, or even if you
design hastily and rush into coding, you will most certainly end up in
a software "tar pit". Don't do it! Plan everything you will do. Don't
even think about coding until you can precisely explain to your
partner what problems you need to solve and how the pieces relate to
each other. Note that it can often be hard to write (or talk) about
new software design: you are facing problems that you have not seen
before, and therefore even finding terminology to describe your ideas
can be difficult. There is no magic solution to this problem, but it
gets easier with practice. The important thing is to go ahead and
try. Always try to describe your ideas and designs to someone else. In
order to reach an understanding, you may have to invent terminology
and notation; this is fine. If you do this, by the time you have
completed your design, you will find that you have the ability to
efficiently discuss problems that you have never seen before. Why do
you think that CS is filled with so much jargon? To help you get
started, we have provided the following questions as a guide for
reading through the code. We recommend that you answer questions for
the different modules and be prepared to discuss them in your Week 7
tutorial. Once you have prepared the answers, you should be ready to
develop a strategy for designing your code for this assignment.
Bring your answers to the code walk-through questions to your Week 7
tutorial.
This directory contains the files that are responsible for the loading
and running of user-level programs. Currently, the only files in the
directory are loadelf.c, runprogram.c, and uio.c, although you may
want to add more of your own during this assignment. Understanding
these files is the key to getting started with the assignment,
especially the advanced part, the implementation of multiprogramming. Note
that to answer some of the questions, you will have to look in files
outside this directory.
loadelf.c
This file contains the functions responsible for loading an ELF
executable from the filesystem into virtual memory space. Of
course, at this point this virtual memory space does not provide what
is normally meant by virtual memory: although there is translation
between the addresses that executables "believe" they are using and
physical addresses, there is no mechanism for providing more memory
than exists physically.
runprogram.c
This file contains only one function, runprogram(), which is the
function that is responsible for running a program from the kernel
menu. Once you have designed your file system calls, a program started
by runprogram() should have the standard file descriptors (stdout,
stderr) available while it's running.
In the advanced assignment, runprogram() is a good base for
writing the execv() system call, but only a base. When writing your
design doc, you should determine what more is required for execv()
that runprogram() does not need to worry about. Once you have designed
your process framework, runprogram() should be altered to start
processes properly within this framework.
uio.c
This file contains functions for moving data between kernel and user
space. Knowing when and how to cross this boundary is critical to
properly implementing user-level programs, so this is a good file to
read very carefully. You should also examine the code in
lib/copyinout.c.
Questions
1. What are the ELF magic numbers?
2. What is the difference between UIO_USERISPACE and
UIO_USERSPACE? When should one use UIO_SYSSPACE instead?
3. Why can the struct uio that is used to read in a segment be allocated on
the stack in load_segment() (i.e., where does the memory read actually
go)?
4. In runprogram(), why is it important to call vfs_close before going to
usermode?
5. What function forces the processor to switch into usermode? Is this
function machine dependent?
6. In what file are copyin and copyout defined? memmove? Why can't
copyin and copyout be implemented as simply as memmove?
7. What (briefly) is the purpose of userptr_t?
Exceptions are the key to operating systems; they are the mechanism
that enables the OS to regain control of execution and therefore do
its job. You can think of exceptions as the interface between the
processor and the operating system. When the OS boots, it installs an
"exception handler" (carefully crafted assembly code) at a specific
address in memory. When the processor raises an exception, it invokes
this handler, which sets up a "trap frame" and calls into the operating
system. Since "exception" is such an overloaded term in computer
science, operating system lingo for an exception is a "trap": the
OS traps execution. Interrupts are exceptions, and more significantly
for this assignment, so are system calls. Specifically, syscall.c
handles traps that happen to be syscalls. Understanding at least the C
code in this directory is key to being a real operating systems
junkie, so we highly recommend reading through it carefully.
trap.c
mips_trap() is the key function for returning control to the operating
system. This is the C function that gets called by the assembly
exception handler. md_usermode() is the key function for returning
control to user programs. kill_curthread() is the function for
handling broken user programs; when the processor is in usermode and
hits something it can't handle (say, a bad instruction), it raises an
exception. There's no way to recover from this, so the OS needs to
kill off the process. The advanced part of this assignment will include
writing a useful version of this function.
syscall.c
mips_syscall() is the function that delegates the actual work of a
system call off to the kernel function that implements it. Notice that
reboot is the only case currently handled.
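The delegation described above can be pictured with a small host-side
sketch. Everything in it is hypothetical and simplified: struct
fake_trapframe, the FAKE_* numbers, and the handler names are stand-ins
chosen for illustration and do not match the real kernel definitions.
Read syscall.c for the actual register conventions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real values live in the kernel headers. */
enum { FAKE_SYS_REBOOT = 1, FAKE_SYS_CLOSE = 2 };
enum { FAKE_ENOSYS = 1, FAKE_EBADF = 2 };

/* Simplified picture of the state a trap hands to the kernel. */
struct fake_trapframe {
    int callno;   /* which system call was requested */
    int args[4];  /* up to four arguments */
    int retval;   /* result on success, error code on failure */
    int failed;   /* nonzero if the call failed */
};

/* Each handler returns 0 or an error code; results go through *retval. */
static int fake_sys_reboot(int code, int *retval)
{
    (void)code;   /* a real kernel would act on the code here */
    *retval = 0;
    return 0;
}

static int fake_sys_close(int fd, int *retval)
{
    if (fd < 0) {
        return FAKE_EBADF;
    }
    *retval = 0;
    return 0;
}

/* Pick the handler by call number, then record success or failure. */
void dispatch(struct fake_trapframe *tf)
{
    int retval = 0;
    int err;

    assert(tf != NULL);
    switch (tf->callno) {
    case FAKE_SYS_REBOOT:
        err = fake_sys_reboot(tf->args[0], &retval);
        break;
    case FAKE_SYS_CLOSE:
        err = fake_sys_close(tf->args[0], &retval);
        break;
    default:
        err = FAKE_ENOSYS;  /* unknown numbers are an error, not a crash */
        break;
    }
    tf->failed = (err != 0);
    tf->retval = err ? err : retval;
}
```

Keeping "return an error code, pass the result through a pointer" as the
handler convention means the dispatcher is the only place that needs to
know how results travel back to the user program.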
Questions
8. What is the numerical value of the exception code for a MIPS system
call?
9. Why does mips_trap() set curspl to SPL_HIGH "manually", instead of
using splhigh()?
10. How many bytes is an instruction in MIPS? (Answer this by reading
mips_syscall() carefully, not by looking somewhere else.)
11. What are the contents of the struct trapframe? Where is
the struct trapframe that is passed into mips_syscall() stored?
12. What would be required to implement a system call that took more than
4 arguments?
13. What is the purpose of userptr_t?
There's only one file in here, mips-crt0.S, which contains the MIPS
assembly code that receives control first when a user-level program is
started. It calls main(). This is the code that your execv
implementation will be interfacing to, so be sure to check what values
it expects to appear in what registers and so forth.
There's obviously a lot of code in the OS/161 C library (and a lot
more yet in a real system's C library...) We don't expect you to read
it all, although it may be instructive in the long run to do so. Job
interviewers have an uncanny habit of asking people to implement
simple standard C functions on the whiteboard. For present purposes
you need only look at the code that implements the user-level side of
system calls.
errno.c
This is where the global variable errno is defined.
syscalls-mips.S
This file contains the machine-dependent code necessary for
implementing the user-level side of MIPS system calls.
syscalls.S
This file is created from syscalls-mips.S at compile time and is the
actual file assembled to put into the C library. The actual names of
the system calls are placed in this file using a script called
callno-parse.sh that reads them from the kernel's header files. This
avoids having to make a second list of the system calls. In a real
system, typically each system call stub is placed in its own source
file, to allow selectively linking them in. OS/161 puts them all
together to simplify the makefiles.
Questions
14. What is the purpose of the SYSCALL macro?
15. What is the MIPS instruction that actually triggers a system call?
(Answer this by reading the source in this directory, not looking
somewhere else.)
The files vfs.h and vnode.h in this directory
contain function declarations and comments that are directly relevant
to this assignment.
Questions
16. How are vfs_open and vfs_close used? What other
vfs_() calls are relevant?
17. What are VOP_READ, VOP_WRITE? How are they used?
18. What does VOP_TRYSEEK do?
19. Where is the struct thread defined? What does this
structure contain?
fork()
Answer these questions by reading the fork() man page and the
sections dealing with fork() in the textbook.
Questions
20. What is the purpose of the fork() system call?
21. What process state is shared between the parent and
child?
22. What process state is copied between the parent and
child?
Remember to use a 3231 subshell (or continue using your
modified PATH) for this assignment, as outlined in ASST0.
Obtaining and setting up ASST2 in SVN
In this section, you will be setting up the svn repository that will
contain your code. Only one of you needs to do the
following. We suggest your partner sit in on this part of the
assignment.
- Import the OS/161 sources into your repository as follows
% cd /home/cs3231/assigns
% svn import asst2/src file:///home/osprjXXX/repo/asst2/trunk -m "Initial import"
- Make an immediate branch of this import for easy reference when you generate your diff:
% svn copy -m "Tag initial import" file:///home/osprjXXX/repo/asst2/trunk \
    file:///home/osprjXXX/repo/asst2/initial
You have now completed setting up a shared repository for both
partners. The following instructions are now for both partners.
Configure OS/161 for Assignment 2
Before proceeding further, configure your new sources.
% cd ~/cs3231/asst2-src
% ./configure
Unlike the previous assignment, you will need to build
and install the user-level programs that will be run by your kernel in
this assignment.
% cd ~/cs3231/asst2-src
% make
Note: "make" in this directory does both "make" and "make install".
For your kernel development, we have again provided a
framework for running your solutions for
ASST2.
You have to reconfigure your kernel before you can use this
framework. The procedure for configuring a kernel is the same as in
ASST0 and ASST1, except you will use the ASST2 configuration file:
% cd ~/cs3231/asst2-src/kern/conf
% ./config ASST2
You should now see an ASST2 directory in the compile directory.
Building for ASST2
When you built OS/161 for ASST1, you ran make from compile/ASST1. In
ASST2, you run make from (you guessed it) compile/ASST2.
% cd ../compile/ASST2
% make depend
% make
% make install
If you are told that the compile/ASST2 directory does not exist, make
sure you ran config for ASST2.
Command Line Arguments to OS/161
Your solutions to ASST2 will be tested by running OS/161 with command
line arguments that correspond to the menu options in the OS/161 boot
menu.
IMPORTANT: Please DO NOT change these menu option strings!
Running "asst2"
For this assignment, we have supplied a user-level OS/161 program that
you can use for testing. It is called asst2, and its sources
live in src/testbin/asst2.
You can test your assignment by typing p /testbin/asst2
at the OS/161 menu prompt. As a shortcut, you can also specify menu
arguments on the command line, for example: sys161 kernel "p /testbin/asst2".
Note: On cygwin, you need to type p
/testbin/asst2.exe.
Note: If you don't have a sys161.conf file, you can use this one.
Running the program produces output similar to the following prior to
starting the assignment.
Unknown syscall 6
Unknown syscall 6
Unknown syscall 6
Unknown syscall 6
Unknown syscall 6
Unknown syscall 6
Unknown syscall 0
asst2 produces the following output on a (maybe partially) working
assignment.
OS/161 kernel [? for menu]: p /testbin/asst2
Operation took 0.000212160 seconds
OS/161 kernel [? for menu]:
**********
* File Tester
**********
* write() works for stdout
**********
* write() works for stderr
**********
* opening new file "test.file"
* open() got fd 3
* writing test string
* wrote 45 bytes
* writing test string again
* wrote 45 bytes
* closing file
**********
* opening old file "test.file"
* open() got fd 3
* reading entire file into buffer
* attemping read of 500 bytes
* read 90 bytes
* attemping read of 410 bytes
* read 0 bytes
* reading complete
* file content okay
**********
* testing lseek
* reading 10 bytes of file into buffer
* attemping read of 10 bytes
* read 10 bytes
* reading complete
* file lseek okay
* closing file
Unknown syscall 0
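To get a feel for the kind of sequence shown in the output above, here
is a rough sketch that runs on a POSIX host, not inside OS/161. The
function name exercise_file_calls(), the test string, and the path
argument are assumptions made for illustration; this is not the source
of the asst2 tester.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Write a string twice, reopen the file, read it back, then seek to the
 * second copy and re-read its first bytes. Returns 0 if every step
 * behaved as expected, or a negative number naming the failed step.
 */
int exercise_file_calls(const char *path)
{
    const char *msg = "The quick brown fox jumped over the lazy dog.";
    size_t len = strlen(msg);
    char buf[500];
    int result = 0;
    int fd;

    assert(path != NULL);

    /* Create the file and write the same string twice. */
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) {
        return -1;
    }
    if (write(fd, msg, len) != (ssize_t)len ||
        write(fd, msg, len) != (ssize_t)len) {
        result = -2;
    }
    if (close(fd) != 0 && result == 0) {
        result = -3;
    }
    if (result != 0) {
        unlink(path);
        return result;
    }

    /* Reopen and read everything; a further read reports end of file. */
    fd = open(path, O_RDONLY);
    if (fd < 0) {
        unlink(path);
        return -4;
    }
    if (read(fd, buf, sizeof(buf)) != (ssize_t)(2 * len) ||
        memcmp(buf, msg, len) != 0 ||
        memcmp(buf + len, msg, len) != 0) {
        result = -5;
    } else if (read(fd, buf, sizeof(buf)) != 0) {
        result = -6;
    } else if (lseek(fd, (off_t)len, SEEK_SET) != (off_t)len) {
        result = -7;
    } else if (read(fd, buf, 10) != 10 || memcmp(buf, msg, 10) != 0) {
        /* The second copy starts at offset len, so it begins like msg. */
        result = -8;
    }
    close(fd);
    unlink(path);
    return result;
}
```

A user program written this way only sees the user-level interface; the
work in this assignment is the kernel side that makes each of these
calls behave as the man pages describe.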
Implement the following file-based system calls. The full range of system calls
that we think you might want over the course of the semester is listed in
kern/include/kern/callno.h. For this assignment you should implement: open,
read, write, lseek, close, dup2. Note: You are implementing the kernel code
that implements the system call functionality within the kernel. The C
stubs that user-level applications call to invoke the system calls are already
automatically generated when you build OS/161.
Note that the basic assignment does not involve implementing
fork (that's part of the advanced assignment). However, the
design and implementation of your system calls should nonetheless not
assume a single process.
It's crucial that your syscalls handle all error conditions
gracefully (i.e., without crashing OS/161.) You should consult the
OS/161 man pages included in the distribution and understand fully the
system calls that you must implement. You must return the error codes
as described in the man pages. Additionally, your syscalls must return
the correct value (in case of success) or error code (in case of
failure) as specified in the man pages. Some of the auto-marking
scripts rely on the return of appropriate error codes; adherence to
the guidelines is as important as the correctness of the
implementation. The file include/unistd.h contains the user-level
interface definition of the system calls that you will be writing for
OS/161 (including ones you will implement in later assignments). This
interface is different from that of the kernel functions that you will
define to implement these calls. You need to design this interface and
put it in kern/include/syscall.h. As you discovered (ideally) in
Assignment 0, the integer codes for the calls are defined in
kern/include/kern/callno.h. You need to think about a variety of
issues associated with implementing system calls. Perhaps the most
obvious one is: can two different user-level processes (or user-level
threads, if you choose to implement them) find themselves running a
system call at the same time?
open(), read(), write(), lseek(), close(), and dup2()
For any given process, the first file descriptors (0, 1, and 2) are
considered to be standard input (stdin), standard output (stdout), and
standard error (stderr) respectively. For this basic assignment, the file
descriptors 1 (stdout) and 2 (stderr) must start out attached to the
console device ("con:"). You will probably modify runprogram() to
achieve this. Your implementation must allow programs to use dup2() to
change stdin, stdout, stderr to point elsewhere.
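One possible way to think about dup2() is as an operation on a
per-process table of pointers to reference-counted open-file objects.
The sketch below is a host-side illustration under that assumption; the
names sketch_file, sketch_fdtable, and sketch_dup2(), the table size,
and the negative error value are all hypothetical, and the return
convention is simplified compared with what your kernel functions will
need.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_OPEN_MAX 8     /* hypothetical table size */
#define SKETCH_EBADF    (-1)  /* hypothetical error value */

/* One open-file object; several descriptors may refer to it. */
struct sketch_file {
    int refcount;             /* number of descriptors pointing here */
};

/* Per-process table: slot i is descriptor i, NULL means closed. */
struct sketch_fdtable {
    struct sketch_file *slot[SKETCH_OPEN_MAX];
};

static int fd_is_open(const struct sketch_fdtable *t, int fd)
{
    return fd >= 0 && fd < SKETCH_OPEN_MAX && t->slot[fd] != NULL;
}

/* Drop one reference; a real kernel would clean up when it hits zero. */
static void file_release(struct sketch_file *f)
{
    assert(f->refcount > 0);
    f->refcount--;
}

/* Make newfd refer to the same open file as oldfd. */
int sketch_dup2(struct sketch_fdtable *t, int oldfd, int newfd)
{
    assert(t != NULL);
    if (!fd_is_open(t, oldfd) || newfd < 0 || newfd >= SKETCH_OPEN_MAX) {
        return SKETCH_EBADF;
    }
    if (oldfd == newfd) {
        return newfd;                     /* nothing to do */
    }
    if (t->slot[newfd] != NULL) {
        file_release(t->slot[newfd]);     /* implicitly close newfd */
    }
    t->slot[newfd] = t->slot[oldfd];
    t->slot[newfd]->refcount++;
    return newfd;
}
```

With a layout like this, redirecting stdout amounts to pointing slot 1
at a different open-file object, which is why the console attachment
cannot be hard-wired into write().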
Although these system calls may seem to be tied to the filesystem,
in fact, these system calls are really about manipulation of file
descriptors, or process-specific filesystem state. A large part of
this assignment is designing and implementing a system to track this
state. Some of this information (such as the cwd) is specific only to
the process, but other information (such as the offset) is specific to
the process and file descriptor. Don't rush this design. Think carefully about the
state you need to maintain, how to organise it, and when and how it
has to change.
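As an example of state that "has to change", consider the file offset
during a seek. The following host-side sketch shows one way the new
position might be computed and validated before it is stored. The names
sk_openfile and sk_lseek(), the SK_* constants, and the size field
(a stand-in for asking the file for its length) are assumptions for
illustration only; check the lseek man page for the required behaviour.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical constants; the real ones come from the kernel headers. */
enum { SK_SEEK_SET = 0, SK_SEEK_CUR = 1, SK_SEEK_END = 2 };
#define SK_EINVAL 1

struct sk_openfile {
    long long offset;  /* current position in the file */
    long long size;    /* stand-in for querying the file's length */
};

/*
 * Compute the new offset relative to the start, the current position,
 * or the end of the file. The stored offset changes only on success.
 */
int sk_lseek(struct sk_openfile *f, long long pos, int whence,
             long long *newpos)
{
    long long base;
    long long target;

    assert(f != NULL && newpos != NULL);
    switch (whence) {
    case SK_SEEK_SET: base = 0;         break;
    case SK_SEEK_CUR: base = f->offset; break;
    case SK_SEEK_END: base = f->size;   break;
    default:
        return SK_EINVAL;               /* unknown whence */
    }
    if (pos > 0 && base > LLONG_MAX - pos) {
        return SK_EINVAL;               /* base + pos would overflow */
    }
    target = base + pos;
    if (target < 0) {
        return SK_EINVAL;               /* cannot seek before the start */
    }
    f->offset = target;
    *newpos = target;
    return 0;
}
```

Where the offset lives (in the descriptor table entry or in a shared
open-file object) determines what two descriptors created by dup2()
observe after one of them seeks, so it is worth deciding deliberately.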
While this assignment requires you to implement file-system-related
system calls, you actually have to write virtually no
low-level file system code in this assignment. You will use the
existing VFS layer to do most of the work. Your job is to construct
the subsystem that implements the interface expected by user-level
programs by invoking the appropriate VFS and vnode operations.
While you are not restricted to only modifying these files,
please place most of your implementation in the following files:
function prototypes and data types for your file subsystem in
src/kern/include/file.h, and the function implementations and
variable instantiations in src/kern/userprog/file.c.
A note on errors and error handling of system calls
The man pages in the OS/161 distribution contain a description of the
error return values that you must return. If there are conditions that
can happen that are not listed in the man page, return the most
appropriate error code from kern/errno.h. If none seem particularly
appropriate, consider adding a new one. If you're adding an error code
for a condition for which Unix has a standard error code symbol, use
the same symbol if possible. If not, feel free to make up your own,
but note that error codes should always begin with E, should not be
EOF, etc. Consult Unix man pages to learn about Unix error codes.
Note that if you add an error code to src/kern/include/kern/errno.h,
you need to add a corresponding error message to the file
src/lib/libc/strerror.c.
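The reason the two files must be kept in step can be seen in a small
sketch of a number-to-message table. The numbering, the messages, and
the names sk_errlist and sk_strerror() are hypothetical and are not
copied from the OS/161 sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical numbering; the real values live in kern/errno.h. */
enum { SK_ENOSYS = 1, SK_EBADF = 2, SK_EMYNEW = 3, SK_NERR = 4 };

/* One message per error number, indexed by the number itself. */
static const char *const sk_errlist[SK_NERR] = {
    "Operation succeeded",
    "No such system call",
    "Bad file number",
    "My new error",        /* added together with SK_EMYNEW */
};

/* Map an error number to its message, tolerating out-of-range values. */
const char *sk_strerror(int errcode)
{
    if (errcode < 0 || errcode >= SK_NERR) {
        return "Unknown error number";
    }
    /* A NULL here means a code was added without its message. */
    assert(sk_errlist[errcode] != NULL);
    return sk_errlist[errcode];
}
```

If a new code is added to the enum but the table is not extended, the
missing entry is NULL, which is exactly the mismatch the note above
warns about.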
Here are some additional questions and issues to aid you in developing
your design. They are by no means comprehensive, but they are a
reasonable place to start developing your solution.
What primitive operations exist to support the transfer of data to
and from kernel space? Do you want to implement more on top of these?
You will need to "bullet-proof" the OS/161 kernel from user program
errors. There should be nothing a user program can do to crash the
operating system when invoking the file system calls. It is okay in
the basic assignment for the kernel to panic for an unimplemented
system call (e.g. execv()), or a user-level program error.
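A large part of bullet-proofing is refusing to trust any address that
a user program hands to the kernel. The sketch below simulates that
idea on a host machine: "user memory" is a small array, user addresses
are plain 32-bit numbers, and sk_copyin() checks the whole range before
copying. The names, the base address, and the error value are all
hypothetical; in the kernel you would rely on the existing primitives
(see lib/copyinout.c) rather than writing your own checks like these.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SK_EFAULT    1            /* hypothetical error value */
#define SK_USER_BASE 0x00400000u  /* hypothetical address of sk_user_mem[0] */

static unsigned char sk_user_mem[256];  /* stand-in for user memory */

/*
 * Copy len bytes from simulated user address uaddr into kdest.
 * Returns 0 on success or SK_EFAULT if any part of the range is
 * outside the simulated user region.
 */
int sk_copyin(uint32_t uaddr, void *kdest, size_t len)
{
    uint32_t end;

    assert(kdest != NULL);
    if (len > sizeof(sk_user_mem)) {
        return SK_EFAULT;               /* cannot possibly fit */
    }
    end = uaddr + (uint32_t)len;        /* may wrap around */
    if (end < uaddr) {
        return SK_EFAULT;               /* address arithmetic overflowed */
    }
    if (uaddr < SK_USER_BASE ||
        end > SK_USER_BASE + sizeof(sk_user_mem)) {
        return SK_EFAULT;               /* range leaves the user region */
    }
    memcpy(kdest, &sk_user_mem[uaddr - SK_USER_BASE], len);
    return 0;
}
```

The point of the sketch is the shape of the failure path: a bad pointer
becomes an error code returned to the caller, never a kernel crash.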
Decide which functions you need to change and which structures
you may need to create to implement the system calls.
How will you keep track of open files? For which system calls is
this useful?
For additional background, consult one or more of the following
texts for details of how similar existing operating systems structure
their file system management:
- Section 10.6.3 and "NFS implementation" in Section 10.6.4,
Tanenbaum, Modern Operating Systems.
- Section 6.4 and Section 6.5, McKusick et al., The
Design and Implementation of the 4.4 BSD Operating System.
- Chapter 8, Vahalia, UNIX Internals: The New Frontiers.
- The original VFS paper is available here.
Documenting your solution
This is a compulsory component of this assignment. You must
submit a small design document that identifies the basic issues in this
assignment and then describes your solution to the problems you have
identified. The design document you developed in the planning phase
(outlined above) would be an ideal start. The document must be plain
ASCII text. We expect such a document to be roughly 500 - 1000 words,
i.e. clear and to the point.
The document will be used to guide our markers in their evaluation
of your solution to the assignment. In the case of poor results in
the functional testing combined with a poor design document, we will
base our assessment on these components alone. If you can't describe
your own solution clearly, you can't expect us to reverse engineer the
code of a poor and complex solution to the assignment.
Place your design document at the top of the OS/161 source tree
(~/cs3231/asst2-src), and include it in svn as follows.
% cd ~/cs3231/asst2-src
% svn add design.txt
When you later commit your changes into your repository, your
design doc will be included in the commit, and later in your
submission.
Also, please word wrap your design doc if you have not already
done so. You can use the unix fmt command to achieve this if
your editor cannot.
As with the previous assignments, you again will be submitting a diff
of your changes to the original tree.
You should first commit your changes back to the repository using
the following command. Note: You will have to supply a comment on your
changes. You also need to coordinate with your partner that the
changes you have (or potentially both have) made are committed
consistently by you and your partner, such that the repository
contains the work you want from both partners.
% cd ~/cs3231/asst2-src
% svn commit
If the above fails, you may need to run svn update to bring
your source tree up to date with commits made by your partner. If you
do this, you should double check and test your assignment prior to
submission.
Beware! If you have created new files for this assignment, they
will not be included in your submission unless you add them, using svn add:
% svn add filename.c
If you add files after running svn commit, you will need to run svn commit
again.
Now generate a file containing the diff.
% cd ~
% svn diff file:///home/osprjXXX/repo/asst2/initial \
    file:///home/osprjXXX/repo/asst2/trunk >~/asst2.diff
Testing Your Submission
Even though the generated diff output should represent all the
changes you have made to the supplied code, occasionally students do
something "ingenious" and generate unrepresentative diff output.
We strongly suggest keeping your svn repository intact to allow
for recovery of your work if need be.
Given you're doing the advanced version of the assignment, I'm
assuming you're competent with managing your SVN repository and don't
need simple directions. You basically need to generate a diff between
your final version and the base. There are two ways you can do this:
the simpler (but messier) option is to continue developing along your
mainline branch and generate the diff in the same way as for asst2. A
neater approach is to create a new branch in SVN to work on your
advanced solution. Whichever approach you take, make sure you test
your diff before you submit it!
Passing arguments from one user program, through the kernel, into
another user program, is a bit of a chore. What form does this take in
C? This is rather tricky, and there are many ways to be led
astray. You will probably find that very detailed pictures and several
walk-throughs will be most helpful.
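As one small piece of that picture, the following host-side sketch
packs an argument vector into a single flat buffer, padding each string
so that the next one starts on an aligned boundary, and records where
each string begins. It is only an illustration of the bookkeeping
involved: sk_pack_args(), SK_ALIGN, and SK_E2BIG are hypothetical
names, and the layout your execv() ultimately needs is determined by
what the user-level startup code expects.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SK_E2BIG 1  /* hypothetical error value */
#define SK_ALIGN 4  /* hypothetical alignment for each string slot */

/*
 * Copy argc strings into buf back to back, padding each to a multiple
 * of SK_ALIGN bytes, and store each string's starting offset.
 * Returns 0 and sets *used, or SK_E2BIG if buf is too small.
 */
int sk_pack_args(int argc, const char *const argv[], char *buf,
                 size_t buflen, size_t offsets[], size_t *used)
{
    size_t pos = 0;
    int i;

    assert(argv != NULL && buf != NULL && offsets != NULL && used != NULL);
    for (i = 0; i < argc; i++) {
        size_t len = strlen(argv[i]) + 1;   /* include the NUL */
        size_t padded = (len + SK_ALIGN - 1) / SK_ALIGN * SK_ALIGN;

        if (padded > buflen - pos) {
            return SK_E2BIG;                /* arguments do not fit */
        }
        memcpy(buf + pos, argv[i], len);
        memset(buf + pos + len, 0, padded - len);
        offsets[i] = pos;
        pos += padded;
    }
    *used = pos;
    return 0;
}
```

Working with offsets rather than pointers is deliberate in this sketch:
pointers that are valid in one address space mean nothing in another,
so they have to be recomputed once the final location of the buffer is
known.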
How will you determine: (a) the stack pointer initial value; (b)
the initial register contents; (c) the return value; (d) whether you
can exec the program at all?
How will you manage file accesses? When your shell invokes the cat
command, and the cat command starts to read file1, what will happen if
the shell also tries to read file1? What would you like to happen?
How will you keep track of running processes? For which system
calls is this useful?
How will you implement the execv system call? How is the
argument passing in this function different from that of other system
calls?
Now generate a file containing the diff. (NOTE: How this works depends on whether you have set up branches in SVN. See above.)