Questions
and Answers
Files and file systems
Q2: The Unix inode structure contains a reference count. What is the reference count for? Why can't we just remove the inode without checking the reference count when a file is deleted?
Inodes contain a reference count due to hard links. The reference count is equal to the number of directory entries that reference the inode. For hard-linked files, multiple directory entries reference a single inode. The inode must not be removed until no directory entries are left (ie, the reference count is 0) to ensure that the filesystem remains consistent.
Q3: Inode-based filesystems typically use block groups. Each block group consists of a number of contiguous physical disk blocks. Inodes for a given block group are stored in the same physical location as the block groups. What are the advantages of this scheme? Are they any disadvantages?
Block groups keep the inodes physically closer to the files they refer to than they would be (on average) on a system without block groups. Since accessing and updating files also involves accessing or updating its inode, having the inode and the file's block close together reduces disk seek time, and thus improves performance. The OS must take care that all blocks remain within the block group of their inode.
Q4: Assume an inode with 10 direct blocks, as well as single, double and triple indirect block pointers. Taking into account creation and accounting of the indirect blocks themselves, what is the largest possible number of block reads and writes in order to:
- Read 1 byte
- Write 1 byte
Assume the inode is cached in memory.
To write 1 byte, in the worst case:
- 4 writes: create single indirect block, create double
indirect block, create triple indirect block, write data block.
- 3 reads, 2 writes: read single indirect, read double indirect,
read triple indirect, write triple indirect, write data block
- Other combinations are possible
To read 1 byte, in the worst case:
- 4 reads: read single indirect, read double indirect, read
triple indirect, read data block
Q5: Consider a file currently consisting of 100 records
of 400 bytes. The filesystem uses fixed blocking,
i.e. one 400 byte record is stored per 512 byte block. Assume that
the file control block (and the index block, in the case of indexed
allocation) is already in memory. Calculate how many disk I/O
operations are required for contiguous, linked, and indexed
(single-level) allocation strategies, if, for one record, the
following conditions hold. In the contiguous-allocation case,
assume that there is no room to grow at the beginning, but there is
room to grow at the end of the file. Assume that the record
information to be added is stored in memory.
- The record is added at the beginning.
- The record is added in the middle.
- The record is added at the end.
- The record is removed from the beginning.
- The record is removed from the middle.
- The record is removed from the end.
| | Contiguous | Linked | Indexed |
| a. | 100r/101w | 0r/1w | 0r/1w |
| b. | 50r/51w | 50r/2w | 0r/1w |
| c. | 0r/1w | 100r/2w | 0r/1w |
| d. | 0r/0w | 1r/0w | 0r/0w |
| e. | 49r/49w | 50r/1w | 0r/0w |
| | 51r/1w | |
| f. | 0r/0w | 99r/1w | 0r/0w |
Q6:
Old versions of UNIX allowed you to write to directories. Newer ones
do not even allow the superuser to write to them? Why? Note that many
unices allow you read directories.
To prevent total corruption of the fs.
eg cat /dev/zero > /
Q7:
What persmissions would you have on the following files:
om:[/tmp]% ls -ld t* .
drwxrwxrwt 6 root root 4096 May 21 12:19 .
-rw-rw---- 1 nash stud 216 May 18 18:59 t1
-rw--w---- 1 nash stud 260 May 18 18:59 t2
-rw------- 1 nash stud 458 May 18 18:59 t3
-rwsrwsr-x 1 nash stud 138 May 21 12:19 t4
-rwsrwxr-x 1 nash stud 285 May 21 12:19 t5
The following are the 'interesting' things to note about each entry in
the listing.
- '.': Everybody can create, list and access files (depending on
individual file permissions ) in the
directory. Note the "sticky" bit 't' is set on the directory.
From man 2 stat on linux:
When the sticky bit is set on a directory, files in that directory may
be unlinked or renamed only by root or their owner. Without the
sticky bit, anyone able to write to the directory can delete or rename
files. The sticky bit is commonly found on directories, such as /tmp,
that are world-writable.
- t1: nash and any member of the group stud can read and write this
file.
- t2: nash can read/write, group stud can only write.
- t3: Only nash can read/write the file.
- t4: Everybody can read and execute the file, only nash and group
stud can modify it. Both the setuid and setgid bits ('s') are set. The
executable t4 when executed while change the effective user id to nash
and the effective group id to stud for the duration of execution.
- t5: As above, except the effect group id will be unchanged and
default to the executor's default group.
Q8:
Why is there VFS Layer in Unix?
- It provides a framework to support multiple file system types
concurrently without requiring each file system to be aware of other
file system types.
- Provides transparent access to all supported file systems
including network file systems (e.g. NFS, CODA)
- It provides a clean interface between the file system independent
kernel code and the file system specific kernel code.
- Provide support for special file system types like
/proc.
Q9:
Assume you have an inode-based filesystem. The filesystem has 512 byte
blocks. Each inode has 10 direct, 1 single indirect, 1 double indirect, and 1
triple indirect block pointer. Block pointers are 4 bytes each. Assume the
inode and any block free list is always in memory. Blocks are not cached.
- What is the maximum file size that can be stored before
- the single indirect pointer is needed?
- the double indirect pointer is needed?
- the triple indirect pointer is needed?
- What is the maximum file size supported?
- What is the number of disk block reads required to read 1 byte
from a file
- in the best case?
- in the worst case?
- What is the number of disk block reads and writes required to
write 1 byte to a file
- in the best case?
- in the worst case?
-
- 5K
- 69K
- 8261K
- 1056837K
-
- 1
- 4
- What is the number of disk block reads and writes required to
write 1 byte to a file
- 1w
- 4r/1w
Q10:
How does choice of block size affect file system performance. You
should consider both sequential and random access.
- Sequential Access
The larger the block size, the fewer I/O operations required and
the more contiguous the disk accesses. Compare loading a single 16K
block with loading 32 512-byte blocks.
- Random Access
The larger the block size, the more unrelated data loaded.
Spatial locality of access can improve the situation.
Q11:
A typical UNIX inode stores both the file's size and the number of
blocks currently used to store the file. Why store both? Should not
blocks = size / block size?
Blocks used to store the file are only indirectly related to file
size.
- The blocks used to store a file includes and indirect blocks used
by the filesystem to keep track of the file data blocks themselves.
- File systems only store blocks that actually contain file
data. Sparsely populated files can have large regions that are unused
within a file.
Q12:
Why does Linux pre-allocate up to 8 blocks on a write to a file.
Pre-allocating provides better locality when many writes to
independent files are interleaved.
Q13:
Why does Linux divide a partition up into smaller block groups?
- Each group contains a redundant superblock. This make the file
system more robust to disk block failures.
- The inodes and data block within a group are closer (better
locality) than in the case of a single 'group' with the inodes at one
extreme of the disk and the data blocks at the other.
Q14:
Linux uses a buffer cache to improve performance. What is the
drawback of such a cache? In what scenario is it problematic? What
alternative would be more appropriate where a buffer cache is inappropriate?
The buffering writes in the buffer cache provides the opportunity
for data to be lost if the system stops prior to the cache being
flushed.
Removable storage devices are particular problematic if users
don't "unmout" them first.
Robustness can be improved by using a write-through cache at the
expense of poor write performance.
|