Assignment 2: System Calls and Processes
- Due dates and mark distribution
- Code Walk-Through
- Basic Assignment
- Advanced Assignment
1. Due Dates and Mark Distribution
Due Date: 8am, Monday September 27th.
Marks: Worth 30 marks (of the 100 available for the class mark component of the course)
- The 10% bonus for one week early is available for the basic assignment. Deadline: 8am, Monday September 20th
- The extra 5% bonus for a submitted, working assignment within 48 hours of release, is also available for the basic assignment. Deadline: Midnight, Wednesday September 8th. See course intro for exact details. Note the demonstration requirement.
- Up to a 20% bonus is available for the advanced part of the
To attempt the advanced part, you must finish the standard assignment at least one week early, and get permission from the lecturer in charge, Gabriele Keller.
Any bonus awarded for the advanced part can be used to make up for lost marks in any class mark component of the course.
- The familiarisation questions contained herein or the subject of your week 8 tutorial. Please answer the questions and bring them to your tutorial.
In this assignment you will be implementing a set of file- and
process-related system calls. Upon completion, your operating system
will be able to run multiple copies of a single application at
user-level and perform some basic file I/O.
A substantial part of this assignment is understanding how OS/161 works and determining what code is required to implement the required functionality. Expect to spend at least as long browsing and digesting OS/161 code as actually writing and debugging your own code.
If you attempt the advanced part, you will add the ability to run multiple applications.
Your current OS/161 system has minimal support for running executables, nothing that could be considered a true process. Assignment 2 starts the transformation of OS/161 into something closer to a true multi-tasking operating system. After this assignment, it will be capable of running multiple processes from actual compiled programs stored in your account. The program will be loaded into OS/161 and executed in user mode by System/161; this will occur under the control of your kernel. First, however, you must implement part of the interface between user-mode programs ("userland") and the kernel. As usual, we provide part of the code you will need. Your job is to design and build the missing pieces.
Our code can run one user-level C program at a time as long as it doesn't want to do anything but shut the system down. We have provided sample user programs that do this (reboot, halt, poweroff), as well as others that make use of features you might be adding in this and future assignments. So far, all the code you have written for OS/161 has only been run within, and only been used by, the operating system kernel. In a real operating system, the kernel's main function is to provide support for user-level programs. Most such support is accessed via "system calls." We give you one system call, reboot(), which is implemented in the function sys_reboot() in main.c. In GDB, if you put a breakpoint on sys_reboot and run the "reboot" program, you can use "backtrace" to see how it got there.
For those attempting the advanced version of the assignment, you will also be implementing the subsystem that keeps track of multiple tasks. You must decide what data structures you will need to hold the data pertinent to a "process" (hint: look at kernel include files of your favorite operating system for suggestions, specifically the proc structure.) The first step is to read and understand the parts of the system that we have written for you.
3. Code walk-through
Bring your answers to the code walk-through questions to your week 8
loadelf.cThis file contains the functions responsible for loading an ELF executable from the filesystem and into virtual memory space. Of course, at this point this virtual memory space does not provide what is normally meant by virtual memory, although there is translation between the addresses that executables "believe" they are using and physical addresses, there is no mechanism for providing more memory than exists physically.
runprogram.cThis file contains only one function, runprogram(), which is the function that is responsible for running a program from the kernel menu. Once you have designed your file system calls, a program started by runprogram() should have the standard file descriptors (stdout, stderr) available while it's running.
In the advanced assignment, runprogram() is a good base for writing the execv() system call, but only a base. When writing your design doc, you should determine what more is required for execv() that runprogram() does not need to worry about. Once you have design your process framework, runprogram() should be altered to start processes properly within this framework.
uio.cThis file contains functions for moving data between kernel and user space. Knowing when and how to cross this boundary is critical to properly implementing userlevel programs, so this is a good file to read very carefully. You should also examine the code in lib/copyinout.c.
1. What are the ELF magic numbers?
2. What is the difference between UIO_USERISPACE and UIO_USERSPACE? When should one use UIO_SYSSPACE instead?
3. Why can the struct uio that is used to read in a segment be allocated on the stack in load_segment() (i.e., where does the memory read actually go)?
4. In runprogram(), why is it important to call vfs_close before going to usermode?
5. What function forces the processor to switch into usermode? Is this function machine dependent?
6. In what file are copyin and copyout defined? memmove? Why can't copyin and copyout be implemented as simply as memmove?
7. What (briefly) is the purpose of userptr_t?
trap.cmips_trap() is the key function for returning control to the operating system. This is the C function that gets called by the assembly exception handler. md_usermode() is the key function for returning control to user programs. kill_curthread() is the function for handling broken user programs; when the processor is in usermode and hits something it can't handle (say, a bad instruction), it raises an exception. There's no way to recover from this, so the OS needs to kill off the process. The advance part of this assignment will include writing a useful version of this function.
syscall.cmips_syscall() is the function that delegates the actual work of a system call off to the kernel function that implements it. Notice that reboot is the only case currently handled. You will also find a function, md_forkentry, which is a stub where you will place your code to implement the fork system call.
8. What is the numerical value of the exception code for a MIPS system call?
9. Why does mips_trap() set curspl to SPL_HIGH "manually", instead of using splhigh()?
10. How many bytes is an instruction in MIPS? (Answer this by reading mips_syscall() carefully, not by looking somewhere else.)
11. What is the contents of the struct trapframe? Where the struct trapframe that is passed into mips_syscall stored?
12. What would be required to implement a system call that took more than 4 arguments?
13. What is the purpose of userptr_t?
errno.cThis is where the global variable errno is defined.
syscalls-mips.SThis file contains the machine-dependent code necessary for implementing the userlevel side of MIPS system calls.
syscalls.SThis file is created from syscalls-mips.S at compile time and is the actual file assembled to put into the C library. The actual names of the system calls are placed in this file using a script called callno-parse.sh that reads them from the kernel's header files. This avoids having to make a second list of the system calls. In a real system, typically each system call stub is placed in its own source file, to allow selectively linking them in. OS/161 puts them all together to simplify the makefiles.
14. What is the purpose of the SYSCALL macro?
15. What is the MIPS instruction that actually triggers a system call? (Answer this by reading the source in this directory, not looking somewhere else.)
16. How are vfs_open, vfs_close used? What other vfs_() calls are relevant?
17. What are VOP_READ, VOP_WRITE? How are they used?
18. What does VOP_TRYSEEK do?
19. Where is the struct thread defined? What does this structure contain?
fork()Answer these questions by reading the fork() man page and the sections dealing with fork() in the textbook.
20. What is the purpose of the fork() system call?
21. What process state is shared between the parent and child?
22. What process state is copied between the parent and child?
4. Basic Assignment
% cp -r ~cs3231/assigns/asst2/src ~/cs3231You should now have a src directory to work on.
Building and Testing Your Assignment
Configure OS/161 for Assignment 2
Before proceeding further, configure your new sources.
% cd ~/cs3231/src % ./configure
Unlike previous the previous assignment, you will need to build and install the user-level programs that will be run by your kernel in this assignment.
% cd ~/cs3231/src % makeNote: "make" in this directory does both "make" and "make install".
For your kernel development, again we have provided you with a framework for you to run your solutions for ASST2.
You have to reconfigure your kernel before you can use this framework. The procedure for configuring a kernel is the same as in ASST0 and ASST1, except you will use the ASST2 configuration file:
% cd ~/cs3231/src/kern/conf % ./config ASST2You should now see an ASST2 directory in the compile directory.
Building for ASST2When you built OS/161 for ASST1, you ran make from compile/ASST1 . In ASST2, you run make from (you guessed it) compile/ASST2.
% cd ../compile/ASST2 % make depend % make % make installIf you are told that the compile/ASST2 directory does not exist, make sure you ran config for ASST2.
Command Line Arguments to OS/161Your solutions to ASST2 will be tested by running OS/161 with command line arguments that correspond to the menu options in the OS/161 boot menu.
IMPORTANT: Please DO NOT change these menu option strings!
Running "asst2"For this assignment, we have supplied a user-level OS/161 program that you can use for testing. It is called asst2, and its sources live in src/testbin/asst2.
You can test your assignment by typing p /testbin/asst2
at the OS/161 menu prompt. As a short cut, you can also specify menu
arguments on the command line, example: sys161 kernel "p /testbin/asst2".
Note: On cygwin, you need to type p /testbin/asst2.exe.
Note: If you don't have a sys161.conf file, you can use this one.
Running the program produces output similar to the following prior to starting the assignment.
Unknown syscall 6 Unknown syscall 6 Unknown syscall 6 Unknown syscall 6 Unknown syscall 6 Unknown syscall 6 Unknown syscall 0asst2 produces the following output on a (maybe partially) working assignment.
OS/161 kernel [? for menu]: p /testbin/asst2 Operation took 0.000212160 seconds OS/161 kernel [? for menu]: ********** * File Tester ********** * write() works for stdout ********** * write() works for stderr ********** * opening new file "test.file" * open() got fd 3 * writing test string * wrote 45 bytes * writing test string again * wrote 45 bytes * closing file ********** * opening old file "test.file" * open() got fd 3 * reading entire file into buffer * attemping read of 500 bytes * read 90 bytes * attemping read of 410 bytes * read 0 bytes * reading complete * file content okay ********** * testing lseek * reading 10 bytes of file into buffer * attemping read of 10 bytes * read 10 bytes * reading complete * file lseek okay * closing file ********** * testing fork * Forked, in parent * Forked, in child Unknown syscall 0
The Assignment: File System CallsImplement the following file-based system calls. The full range of system calls that we think you might want over the course of the semester is listed in kern/include/kern/callno.h. For this assignment you should implement: open, read, write, lseek, close, dup2. You should also implement the fork system call. Note: You are implementing the kernel code that implements the system call functionality within the kernel. The C stubs that user-level applications call to invoke the system calls are already automatically generated when you build OS/161.
Note: the file-system related system calls are worth 85% of this assignment. You should only implement fork when you are confident your file-system syscalls works. If you find that you are having trouble with this assignment, you can still get a good mark by concentrating on the file-system related system calls. Note, however, that even if you do not implement fork, your implementation should not assume a single process.
It's crucial that your syscalls handle all error conditions gracefully (i.e., without crashing OS/161.) You should consult the OS/161 man pages included in the distribution and understand fully the system calls that you must implement. You must return the error codes as decribed in the man pages. Additionally, your syscalls must return the correct value (in case of success) or error code (in case of failure) as specified in the man pages. Some of the auto-marking scripts rely on the return of appropriate error codes; adherence to the guidelines is as important as the correctness of the implementation. The file include/unistd.h contains the user-level interface definition of the system calls that you will be writing for OS/161 (including ones you will implement in later assignments). This interface is different from that of the kernel functions that you will define to implement these calls. You need to design this interface and put it in kern/include/syscall.h. As you discovered (ideally) in Assignment 0, the integer codes for the calls are defined in kern/include/kern/callno.h. You need to think about a variety of issues associated with implementing system calls. Perhaps, the most obvious one is: can two different user-level processes (or user-level threads, if you choose to implement them) find themselves running a system call at the same time?
open(), read(), write(), lseek(), close(), and dup2()For any given process, the first file descriptors (0, 1, and 2) are considered to be standard input (stdin), standard output (stdout), and standard error (stderr) respectively. For this basic assignment, the file descriptors 1 (stdout) and 2 (stderr) must start out attached to the console device ("con:"). You will probably modify runprogram() to achieve this. Your implementation must allow programs to use dup2() to change stdin, stdout, stderr to point elsewhere.
Although these system calls may seem to be tied to the filesystem, in fact, these system calls are really about manipulation of file descriptors, or process-specific filesystem state. A large part of this assignment is designing and implementing a system to track this state. Some of this information (such as the cwd) is specific only to the process, but others (such as offset) is specific to the process and file descriptor. Don't rush this design. Think carefully about the state you need to maintain, how to organise it, and when and how it has to change.
While this assignment requires you to implement file-system-related system calls, you actually have to write virtually no low-level file system code in this assignment. You will use the existing VFS layer to do most of the work. Your job is to construct the subsystem that implements the interface expected by user-level programs by invoking the appropriate VFS and vnode operations.
While you are not restricted to only modifying these files, please place most of your implementation in the following files: function prototypes and data types for your file subsystem in src/kern/include/file.h, and the function implementations and variable instantiations in src/kern/userprog/file.c.
fork()For the basic assignment, you will implement a simplified form of the fork() system call. Your implementation of fork should be the same as that described in the man page, except that it should return 1 to the parent (rather than the child's PID).
The amount of code to implement fork is quite small; the main challenge is to understand what needs to be done. We strongly encourage you to implement the file-related system calls first, with fork in mind.
- Read the comments above mips_usermode() in kern/arch/mips/mips/trap.c
- Read the comments in kern/include/addrspace.h, particularly as_copy.
- You will need to copy the trapframe from the parent to the child. You should be careful how you do this, as there is a possible race condition (where?/why?).
- You may wish to base your implementation on the thread_fork() function in kern/thread/thread.c.
A note on errors and error handling of system callsThe man pages in the OS/161 distribution contain a description of the error return values that you must return. If there are conditions that can happen that are not listed in the man page, return the most appropriate error code from kern/errno.h. If none seem particularly appropriate, consider adding a new one. If you're adding an error code for a condition for which Unix has a standard error code symbol, use the same symbol if possible. If not, feel free to make up your own, but note that error codes should always begin with E, should not be EOF, etc. Consult Unix man pages to learn about Unix error codes. Note that if you add an error code to src/kern/include/kern/errno.h, you need to add a corresponding error message to the file src/lib/libc/strerror.c.
Design QuestionsHere are some additional questions and issues to aid you in developing your design. They are by no means comprehensive, but they are a reasonable place to start developing your solution.
What primitive operations exist to support the transfer of data to and from kernel space? Do you want to implement more on top of these?
You will need to "bullet-proof" the OS/161 kernel from user program errors. There should be nothing a user program can do to crash the operating system when invoking the file system calls. It is okay in the basic assignment for the kernel to panic for an unimplemented system call (e.g. execv()), or a user-level program error.
Decide which functions you need to change and which structures you may need to create to implement the system calls.
How you will keep track of open files? For which system calls is this useful?
For additional background, consult one or more of the following texts for details how similar existing operating systems structure their file system management:
- Section 10.6.3 and "NFS implementation" in Section 10.6.4, Tannenbaum, Modern Operating Systems .
- Section 6.4 and Section 6.5, McKusick et al., The Design and Implementation of the 4.4 BSD Operating System.
- Chapter 8, Vahalia, Unix Internals: the new frontiers.
- The original VFS paper is available here.
Basic Assignment SubmissionOnce your assignment works, you now need to generate a diff of the changes you made to OS/161. Firstly, you need to clean the source tree.
% cd ~/cs3231/src % make clean % cd kern/compile/ASST2 % make cleanYou will be submitting a diff of your changes to the original tree. So generate a file containing this diff.
% cd ~/cs3231 % diff -N -r -u -X ~cs3231/assigns/asst2/src/.diffex ~cs3231/assigns/asst2/src src > ~/asst2.diffThe arguments to diff are as follows
- -N: Include any new files you add to diff's output.
- -r: Recursively compare directories
- -u: Generate unified diff output.
- -X: Exclude files listed in .diffex from being included in the diff.
Testing Your SubmissionLook here for information on testing and resubmitting your assignment.
Submitting Your AssignmentNow submit the diff as your assignment.
% cd ~ % give cs3231 asst2 asst2.diffYou're now done.
Even though the generated diff output should represent all the changes you have made to the supplied code, occasionally students do something "ingenious" and generate non representative diff output.
We suggest keeping a copy of the source tree as a compressed tar file until you get your results back. After getting your results back, you should be confident the diff was representative and you can safely keep the diff file as a record of your work, and consequently remove the tar file.
To create a compressed tar file and remove the sources to asst2:
% cd ~/cs3231 % tar cvzf asst2.tar.gz srcYou can confirm the tar file contains your assignment by listing its contents.
% tar tvfz asst2.tar.gzIf all looks well, you can remove the sources if you're not going to attempt the advanced assignment.
% rm -rf srcYou're now really done.
5. Advanced Assignment
The advanced assignment is to complete the basic assignment, plus one of either 5.1 or 5.2 below. Remember that you must finish the basic assignment, then get approval from the lecturer in charge, Gabriele Keller, at least one week before the due date. Whether you attempt the first or second advanced assignment, you should submit using the instructions at the bottom. [If you attempt both parts, submit a diff containing the code for both - they should not conflict.]
This syscall returns the number of seconds that have passed since the kernel was started. This is not a real system call, though similar ones exist. Unlike the other calls you have implemented so far it doesn't exist in the os/161 framework, so you will need to add support for it to the user-level side as well as all the relevant kernel code.
A pid, or process ID, is a unique number that identifies a process. The implementation of getpid() is not terribly challenging, but pid allocation and reclamation are the important concepts that you must implement. It is not OK for your system to crash because over the lifetime of its execution you've used up all the pids. Design your pid system; implement all the tasks associated with pid maintenance, and only then implement getpid().
execv(), waitpid(), _exit()These system calls are probably the most difficult part of the whole assignment, but also the most rewarding. They enable multiprogramming and make OS/161 a much more useful entity. fork() is your mechanism for creating new processes. It should make a copy of the invoking process and make sure that the parent and child processes each observe the correct return value (that is, 0 for the child and the newly created pid for the parent). You will want to think carefully through the design of fork and consider it together with execv to make sure that each system call is performing the correct functionality. execv(), although "only" a system call, is really the heart of this assignment. It is responsible for taking newly created processes and make them execute something useful (i.e., something different than what the parent is executing). Essentially, it must replace the existing address space with a brand new one for the new executable (created by calling as_create in the current dumbvm system) and then run it. While this is similar to starting a process straight out of the kernel (as runprogram() does), it's not quite that simple. Remember that this call is coming out of userspace, into the kernel, and then returning back to userspace. You must manage the memory that travels across these boundaries very carefully. (Also, notice that runprogram() doesn't take an argument vector, but these must of course be handled correctly in execv()).
Although it may seem simple at first, waitpid() requires a fair bit of design. Read the specification carefully to understand the semantics, and consider these semantics from the ground up in your design. You may also wish to consult the UNIX man page, though keep in mind that you are not required to implement all the things UNIX waitpid() supports, nor is the UNIX parent/child model of waiting the only valid or viable possibility. The implementation of _exit() is intimately connected to the implementation of waitpid(). They are essentially two halves of the same mechanism. Most of the time, the code for _exit() will be simple and the code for waitpid() relatively complicated, but it's perfectly viable to design it the other way around as well. If you find both are becoming extremely complicated, it may be a sign that you should rethink your design.
kill_curthread()Feel free to write kill_curthread() in as simple a manner as possible. Just keep in mind that essentially nothing about the current thread's userspace state can be trusted if it has suffered a fatal exception—it must be taken off the processor in as judicious a manner as possible, but without returning execution to the user level.
Write and test a simple user shellA shell is a user program that reads in user comands and calls on the kernel to execute them. When you log into your CSE account, you are probably running a shell called bash. Your OS/161 shell should provide a prompt (e.g. "wallaby> ") and accept user commands. Each comand and its argument list (an array of character pointers) should be passed to the execv() system call, but only after calling fork() to get a new thread of execution. The argument list, argv, should contain the command name as its first string, argv. Once the kernel has finished executing the command, your shell should provide another prompt and wait for input. You should be able to exit the shell by entering "exit".
Your shell should also support the basic job control function: &, which executes a process in the background. That is, a prompt appears, allowing the user to enter another command, even before the last command has finished executing. Your shell should not waste an excessive amount of space remembering exit codes if you run a lot of jobs in the background. The skeleton for your shell is in bin/sh/sh.c. Since the OS/161 system calls needed by your shell are remarkably similar to those supported by Linux, you should be able to write your shell and test it by compiling for Linux. (The supplied makefile already does this.) This way, you can write your shell before the rest of Assignment 2 is finished. Keep in mind that this is the same shell you will use later as the user interface for OS/161; it must run as a user program and leave sufficient space for other user programs to run. Consequently, be careful to keep your shell lean. Avoid adding the ultimate X-interface you've always wanted to have; however, it must include the following built-in commands: cd exit
Test your shell.Under OS/161, once you have the system calls for this assignment, you should be able to use the following user programs from the bin directory: cat, cp, false, pwd, and true. You may also find some of the programs in the testbin directory helpful. Under Linux, you should be able to run anything as long as you don't try to use wildcards, fancy shell quoting, pipes, etc. Note: You may co-develop or share your shell with anyone you wish. The shell is not an accessible component of the advanced assignment, but is required to test your system.
char *filename = "/bin/cp"; char *args; pid_t pid; args = "cp"; args = "file1"; args = "file2"; args = NULL; pid = fork(); if (pid == 0) execv(filename, argv);which will load the executable file cp from, install it as a new process, and execute it. The new process will then find file1 on the disk and copy it to file2.
Some questions to think about:
Passing arguments from one user program, through the kernel, into another user program, is a bit of a chore. What form does this take in C? This is rather tricky, and there are many ways to be led astray. You will probably find that very detailed pictures and several walk-throughs will be most helpful.
How will you determine: (a) the stack pointer initial value; (b) the initial register contents; (c) the return value; (d) whether you can exec the program at all?
What new data structures will you need to manage multiple processes?
What relationships do these new structures have with the rest of the system?
How will you manage file accesses? When your shell invokes the cat command, and the cat command starts to read file1, what will happen if the shell also tries to read file1? What would you like to happen?
How you will keep track of running processes. For which system calls is this useful?
How you will implement the execv system call. How is the argument passing in this function different from that of other system calls?
What functionality your shell should have.
Advanced Assignment SubmissionSubmission for the advanced assignment is similar to the basic assignment, except the advance component is given to a distinguished assignment name: asst2_adv Again, you need to clean the source tree.
% cd ~/cs3231/src % make clean % cd kern/compile/ASST2 % make cleanGenerate a file containing this diff.
% cd ~/cs3231 % diff -N -r -u -X ~cs3231/assigns/asst2/src/.diffex ~cs3231/assigns/asst2/src src > ~/asst2_adv.diff
% cd ~ % give cs3231 asst2_adv asst2_adv.diffYou're now done, but it would be wise to create a backup (tar archive as above or similar) of your source tree just in case.