#### Copyright Notice

#### These slides are distributed under the Creative Commons Attribution 3.0 License

- You are free:
  - to share: to copy, distribute and transmit the work
  - to remix: to adapt the work
- Under the following conditions:
  - Attribution. You must attribute the work (but not in any way that suggests that the author endorses you or your use of the work) as follows:
    - "Courtesy of Gernot Heiser, [Institution]", where [Institution] is one of "UNSW", "NICTA", or "Open Kernel Labs"
- The complete license text can be found at http://creativecommons.org/licenses/by/3.0/legalcode

©2008 Gernot Heiser UNSW/NICTA/OKL. Distributed under Creative Commons Attribution License

### Motivation

- Early operating systems had very little structure
- A strictly layered approach was promoted by Dijkstra
  - THE Operating System [Dij68]
- Later OSes (more or less) followed that approach (e.g., Unix)
- Such systems are known as monolithic kernels

### Advantages

- Kernel has access to everything:
  - all optimisations possible
  - all techniques/mechanisms/concepts implementable
- Kernel can be extended by adding more code, e.g. for:
  - new services
  - support for new hardware

### Problems

- Widening range of services and applications:
  - OS bigger, more complex, slower, more error-prone
- Need to support same OS on different hardware
- Like to support various OS environments
- Distribution:
  - impossible to provide all services from same (local) kernel

### Mach Tasks and Threads

- Active entity (basic unit of CPU utilisation)
- Own stack, kernel scheduled

#### Interpretation

Observations:

- Mach memory penalty (i.e. cache misses or write stalls) is higher
- Mach VM system executes more instructions than Ultrix
  - but has more functionality

Claim:

- Degraded performance is (intrinsic?) result of OS structure
- IPC cost is not a major factor [Ber92]
  - IPC cost known to be high in Mach

# Other Experience with Microkernel Performance

- System call costs are (inherently?) high
  - Typically hundreds of cycles, 900 for Mach/486
- Context (address-space) switching costs are (inherently?) high
  - Getting worse (in terms of cycles) with increasing CPU/memory speed ratios [Ous90]
- IPC (involving system calls and context switches) is inherently expensive

Microkernels heavily depend on IPC, and IPC is expensive:

- Is the microkernel idea flawed?
- Should some code never leave the kernel?
- Do we have to buy flexibility with performance?
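
To put the cycle counts above into perspective, here is a rough back-of-the-envelope sketch. The 900-cycle figure is the Mach/486 system-call cost from the slide. The 50 MHz clock rate, the context-switch cost, and the additive round-trip model are assumptions made for illustration only, not measurements.

```python
def cycles_to_microseconds(cycles: float, clock_mhz: float) -> float:
    """Convert a cycle count to microseconds (clock_mhz = cycles per microsecond)."""
    return cycles / clock_mhz


def round_trip_ipc_cycles(syscall_cycles: float, switch_cycles: float) -> float:
    """Assumed additive model: a round trip is two one-way IPCs, each costing
    one system call plus one address-space switch. Cache and TLB effects,
    which matter in practice, are ignored here."""
    return 2 * (syscall_cycles + switch_cycles)


# System-call cost quoted on the slide for Mach/486.
MACH_486_SYSCALL_CYCLES = 900
# Assumption: a 50 MHz clock, chosen for this example.
ASSUMED_CLOCK_MHZ = 50.0
# Hypothetical placeholder for the address-space switch cost.
HYPOTHETICAL_SWITCH_CYCLES = 500

if __name__ == "__main__":
    syscall_us = cycles_to_microseconds(MACH_486_SYSCALL_CYCLES, ASSUMED_CLOCK_MHZ)
    print(f"system call: {syscall_us} us")  # 18.0 us under the assumed clock

    rt_cycles = round_trip_ipc_cycles(MACH_486_SYSCALL_CYCLES,
                                      HYPOTHETICAL_SWITCH_CYCLES)
    rt_us = cycles_to_microseconds(rt_cycles, ASSUMED_CLOCK_MHZ)
    print(f"round-trip IPC (assumed model): {rt_cycles} cycles, {rt_us} us")
```

Under this simple model, a server-based system that needs several IPC round trips per service request pays the per-call cost several times over, which is why the questions above were taken seriously.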

## A Critique of the Critique

- Data presented earlier:
  - are specific to one (or a few) systems
  - results cannot be generalised without thorough analysis
  - no such analysis had been done
- Cannot trust the conclusions [Lie95]

## Microkernel Design Principles [Lie96]

- Minimality: if it doesn't have to be in the kernel, it shouldn't be in the kernel
- Appropriate abstractions which can be made fast and allow efficient implementation of services
- Well written: it pays to shave a few cycles off the TLB refill handler or the IPC path
- Unportable: must be targeted to specific hardware
  - No problem if it's small, and higher layers are portable
  - Example: Liedtke reports significant rewrite of memory management when porting from 486 to Pentium
    - E.g. size and associativity of cache, TLB
  - Hardware abstraction layer is too costly

We'll revisit those principles later.
