# Virtual Memory THE UNIVERSITY OF REW SOUTH WALES 1

Learning Outcomes

• An understanding of page-based virtual memory in depth.

– Including the R3000's support for virtual memory.

Memory Management Unit (or TLB)

The CPU sends virtual addresses to the MMU

Package

CPU

Memory

Memory

Memory

Memory

The MMU sends physical addresses to the memory

The position and function of the MMU

THE UNIVERSITY OF

THE UNIVERSITY OF

THE UNIVERSITY OF

Virtual Address Page-based VM 15 Space 14 13 Virtual Memory · Physical Memory 12 Divided into equal-11 Divided into sized pages D 10 equal-sized A mapping is a 9 frames translation between · A page and a frame 8 · A page and invalid 7 Mappings defined at 6 runtime · They can change 5 Address space can have holes 4 4 3 3 Process does not have to be 2 В Physical Address contiguous in 1 Space 4 physical memory 0

THE UNIVERSITY OF NEW SOUTH WALES

2

4

6

3

5

Virtual Address **Typical Address** Space **Space Layout** Kernel · Stack region is at top, and can grow down Stack · Heap has free space to Shared Libraries · Text is typically read-only · Kernel is in a reserved, **BSS** protected, shared region (heap) Ε 0-th page typically not Data used, why? B A Text (Code) UNIVERSITY OF V SOUTH WALES

Programmer's perspective: Virtual Address logically present Space System's perspective: Not T mapped, data on disk S A process may be only partially resident T - Allows OS to store individual pages on disk Disk - Saves memory for infrequently used data & code D What happens if we access non-В resident Physical Address memory? Space





Virtual Address 6 15 Space 15 Page 14 T 14 Table S 13 0 13 Page table for 12 12 resident part of 11 11 address space 10 10 9 9 8 8 7 7 F 6 6 5 Ε 3 5 5 D 4 4 4 3 3 3 1 В 2 2 2 Physical 1 1 7 1 <sup>9</sup>0 THE UNIVERSITY OF NEW SOUTH WALES 0 0 Address Space

9

11

**Shared Pages**  Private code and data
 Shared code - Single copy of code - Each process has own copy of code and data shared between all processes executing it Code and data can - Code must not be self appear anywhere in the address space modifying Code must appear at same address in all processes 10 THE UNIVERSITY OF NEW SOUTH WALES

Proc 2 Address Proc 1 Address Space Space T 0 S 5 X Physical Address Spag В Χ N Two (or more) processes running the same program D N 4 and sharing S 1 M the text section B A 7 В 7 Page Page 2 2 Table Table

Page Table Structure · Page table is (logically) an array of 5 frame numbers - Index by page number • Each page-table entry (PTE) also has other bits Caching disabled Modified Present/absent Page frame number 4 7 Page 2 Table 12 THE UNIVERSITY OF NEW SOUTH WALES

12

10

### PTE Attributes (bits)

- · Present/Absent bit
  - Also called valid bit, it indicates a valid mapping for the page
- Modified bit
  - Also called *dirty bit*, it indicates the page may have been modified in memory
- Reference bit
  - Indicates the page has been accessed
- · Protection bits
  - Read permission, Write permission, Execute permission
  - Or combinations of the above
- Caching bit
  - Use to indicate processor should bypass the cache when accessing memory
    - · Example: to access device registers or memory



13

13

\_\_\_\_\_\_

• In a page-based system, translation

memory

frame number
THE UNIVERSITY OF NEW SOUTH WALES

14

14



virtual and physical mem chopped up in pages/frames

- programs use virtual addresses
- virtual to physical mapping by MMU
  - -first check if page present
    (present/absent bit)
  - -if yes: address in page table form MSBs in physical address -if no: bring in the page from disk
  - if no: bring in the page from disk→ page fault



15

### Page Tables

**Address Translation** 

· Every (virtual) memory address issued by

the CPU must be translated to physical

involves replacing the page number with a

- Every *load* and every *store* instruction

- Every instruction fetch

Need Translation Hardware

- · Assume we have
  - 32-bit virtual address (4 Gbyte address space)
  - 4 KByte page size
  - How many page table entries do we need for one process?

THE UNIVERSITY OF NEW SOUTH WALES

16

### Page Tables

- Assume we have
  - 64-bit virtual address (humungous address space)
  - 4 KByte page size
  - How many page table entries do we need for one process?
- Problem:
  - Page table is very large
  - Access has to be fast, lookup for every memory reference
  - Where do we store the page table?
    - Registers?
    - · Main memory?

THE UNIVERSITY OF NEW SOUTH WALES 17

### Page Tables

- Page tables are implemented as data structures in main memory
- Most processes do not use the full 4GB address space
   e.g., 0.1 1 MB text, 0.1 10 MB data, 0.1 MB stack
- We need a compact representation that does not waste space
  - But is still very fast to search
- Three basic schemes
  - Use data structures that adapt to sparsity
  - Use data structures which only represent resident pages
     Use VM techniques for page tables (details left to extended OS)

THE UNIVERSITY OF NEW SOUTH WALES 18





19





21

## Summarising Two-level Page Tables • Translating a 32-bit virtual address into a 32-bit physical • Recall: • the level 1 page table node has 2<sup>10</sup> entries • 2<sup>10</sup> \* 4 = 4 KiB node • the level 2 page table node have 2<sup>10</sup> entries • 2<sup>10</sup> \* 4 = 4 KiB node • the level 3 page table node have 2<sup>10</sup> entries • 2<sup>10</sup> \* 4 = 4 KiB node



23 24





Intel 4-Level Page Tables

Total Page Tables

Total Page Tables

Total Table Tables Table Table



27



Inverted Page Table (IPT)
 "Inverted page table" is an array of page numbers sorted (indexed) by frame number (it's a frame table).
 Algorithm

 Compute hash of page number
 Extract index from hash table
 Use this to index into inverted page table
 Match the PID and page number in the IPT entry
 If match, use the index value as frame # for translation
 If no match, get next candidate IPT entry from chain field
 If NULL chain entry ⇒ page fault

29 30

### **Properties of IPTs**

- · IPT grows with size of RAM, NOT virtual address space
- Frame table is needed anyway (for page replacement, more later)
- · Need a separate data structure for non-resident pages
- Saves a vast amount of space (especially on 64-bit systems)
- Used in some IBM and HP workstations

THE UNIVERSITY OF NEW SOUTH WALES

31

31

### Another look at sharing...

THE UNIVERSITY OF NEW SOUTH WALES

33

### Improving the IPT: Hashed Page Table

- · Retain fast lookup of IPT
  - A single memory reference in best case
- Retain page table sized based on physical memory size (not virtual)
  - Enable efficient frame sharing
  - Support more than one mapping for same frame
- · Key addition: adding frame number to HPT entry

THE UNIVERSITY OF NEW SOUTH WALES

35

35 36

### Given *n* processes

- how many page tables will the system have for
  - 'normal' page tables
  - inverted page tables?

THE UNIVERSITY OF NEW SOUTH WALES

32



34







37

### Sizing the Hashed Page Table

- · HPT sized based on physical memory size
- · With sharing
  - Each frame can have more than one PTE
  - More sharing increases number of slots used
    - · Increases collision likelihood
- · However, we can tune HPT size based on:
  - · Physical memory size
  - · Expected sharing
  - · Hash collision avoidance.
  - HPT a power of 2 multiple of number of physical memory frame



39

39

### VM Implementation Issue

- · Performance?
  - Each virtual memory reference can cause two physical memory accesses
    - · One to fetch the page table entry
    - One to fetch/store the data
    - ⇒Intolerable performance impact!!
- · Solution:
  - High-speed cache for page table entries (PTEs)
    - Called a translation look-aside buffer (TLB)
    - Contains recently used page table entries
    - · Associative, high-speed memory, similar to cache memory
    - · May be under OS control (unlike memory cache)

THE UNIVERSITY OF NEW SOUTH WALES

40

40

### On-CPU hardware device!!! Virtual Address Page # Offset Translation Lookastile Buffer TLB hit TLB miss Data structure in main memory

### Translation Lookaside Buffer

- Given a virtual address, processor examines the TLB
- If matching PTE found (*TLB hit*), the address is translated
- Otherwise (TLB miss), the page number is used to index the process's page table
  - If PT contains a valid entry, reload TLB and restart
  - Otherwise, (page fault) check if page is on disk
    - · If on disk, swap it in
    - · Otherwise, allocate a new page or raise an exception



42

41 42

### **TLB** properties

- Page table is (logically) an array of frame numbers
- · TLB holds a (recently used) subset of PT entries
  - Each TLB entry must be identified (tagged) with the page # it translates
  - Access is by associative lookup:
    - All TLB entries' tags are concurrently compared to the page #
    - TLB is associative (or content-addressable) memory

| $page \ \#$ | frame # | V | W |
|-------------|---------|---|---|
|             |         |   |   |
|             |         |   |   |

43

43

THE UNIVERSITY O NEW SOUTH WALE

### **TLB** properties

- · TLB may or may not be under direct OS control
  - Hardware-loaded TLB
    - On miss, hardware performs PT lookup and reloads TLB
    - Example: x86, ARM
  - Software-loaded TLB
    - On miss, hardware generates a TLB miss exception, and exception handler reloads TLB
    - Example: MIPS, Itanium (optionally)
- TLB size: typically 64-128 entries
- Can have separate TLBs for instruction fetch and data access
- TLBs can also be used with inverted page tables (and others)



44

44

### TLB and context switching

- · TLB is a shared piece of hardware
- · Normal page tables are per-process (address space)
- TLB entries are process-specific
  - On context switch need to *flush* the TLB (invalidate all entries)
    - high context-switching overhead (Intel x86)
  - or tag entries with address-space ID (ASID)
    - called a tagged TLB
    - used (in some form) on all modern architectures
    - TLB entry: ASID, page #, frame #, valid and write-protect bits



45

TLB effect

- Without TLB
  - Average number of physical memory references per virtual reference

= 2

- With TLB (assume 99% hit ratio)
  - Average number of physical memory references per virtual reference
    - = .99 \* 1 + 0.01 \* 2
    - = 1.01



46

46

45

47



Recap - Simplified Components of VM System

Virtual Address Spaces (3 processes)

Inverted Page Table

THE UNIVERSITY OF REW SOUTH WALES

48





<del>J</del>

| R3000 Address<br>Space Layout                                                                                     | 0xffffffff<br>0xC0000000 | kseg2 |
|-------------------------------------------------------------------------------------------------------------------|--------------------------|-------|
| kseg0:     512 megabytes                                                                                          | 0xA0000000               | kseg1 |
| - Fixed translation window to physical memory  • 0x80000000 - 0x9ffffff virtual = 0x00000000 - 0x1ffffff physical | 0x80000000               | kseg0 |
| TLB not used     Cacheable     Only kernel-mode accessible     Usually where the kernel code and data is placed   |                          | kuseg |
| THE UNIVERSITY OF NEW SOUTH WALES  Physical Memory                                                                | 0x00000000               |       |

0xFFFFFFF R3000 Address kseg2 **Space Layout** 0xC0000000 kuseg: kseg1 - 2 gigabytes 0xA0000000 - TLB translated (mapped) kseg0 - Cacheable (depending on 'N' bit) 0x80000000 - user-mode and kernel mode accessible - Page size is 4K kuseg THE UNIVERSITY OF NEW SOUTH WALES 0x00000000

51 52

| R3000 Address OXFFFFFFFFF Space Layout OXC0000000 |                 |            | kseg2           |
|---------------------------------------------------|-----------------|------------|-----------------|
| Switching processes     switches the translation  |                 | 0xA0000000 | kseg1           |
| (page table) for kuseg  0x80000000                |                 | kseg0      |                 |
| Proc 1<br>kuseg                                   | Proc 2<br>kuseg | 0x00000000 | Proc 3<br>kuseg |



53 54