Operating System Concepts


Syllabus
• Class time: Friday 19:35-22:00
• Classroom: M 501
• Textbook: Silberschatz, Galvin, and Gagne, “Operating System Concepts,” Seventh Edition, John Wiley & Sons, Inc., 2006.
• Grading (subject to change): midterm exam (30%), final exam (30%), class participation (40%)

* All rights reserved, Tei-Wei Kuo, National Taiwan University, 2004.

Contents
1. Introduction
2. Computer-System Structures
3. Operating-System Structures
4. Processes
5. Threads
6. CPU Scheduling
7. Process Synchronization
8. Deadlocks
9. Memory Management
10. Virtual Memory
11. File Systems

Chapter 1. Introduction

Introduction
• What is an operating system?
  • A basis for application programs
  • An intermediary between users and hardware
• Amazing variety
  • Mainframe, personal computer (PC), handheld computer, embedded computer without any user view
• Convenient vs. efficient

Computer System Components
[Figure: layered view of a computer system]
• Users: User, User, ..., User
• Application programs: compilers, word processors, spreadsheets, browsers, etc.
• Operating system
• Hardware: CPU, I/O devices, memory, etc.
• OS – a government/environment provider

User View
• The user's view of the computer varies with the interface being used!
• Examples:
  • Personal computer → ease of use
  • Mainframe or minicomputer → maximization of resource utilization
    • Efficiency and fair share
  • Workstations → compromise between individual usability and resource utilization
  • Handheld computer → individual usability
  • Embedded computer without user view → runs without user intervention

System View
• A resource allocator
  • CPU time, memory space, file storage, I/O devices, shared code, data structures, and more
• A control program
  • Controls the execution of user programs
  • Prevents errors and misuse
• OS definitions – US Dept. of Justice against Microsoft in 1998
  • The stuff shipped by vendors as an OS
  • The program that runs at all times

System Goals
• Two conflicting goals:
  • Convenient for the user!
  • Efficient operation of the computer system!
• We should
  • recognize the influences of operating systems and computer architecture on each other
  • and learn why and how OS's are what they are by tracing their evolution and predicting what they will become!

UNIX Architecture
[Figure: layers of a UNIX system, from top to bottom]
• Users: user, user, ..., user
• User interface: shells, compilers, X, application programs, etc.
• System call interface
• UNIX kernel: CPU scheduling, signal handling, virtual memory, paging, swapping, file system, disk drivers, caching/buffering, etc.
• Kernel interface to the hardware
• Hardware: terminal controller, terminals, physical memory, device controller, devices such as disks, memory, etc.

Mainframe Systems
• The first computers used to tackle many commercial and scientific applications!
• 0th generation – 1940s
  • A significant amount of set-up time in the running of a job
  • Programmer = operator
  • Programmed in binary → assembler → (1950) high-level languages

Mainframe – Batch Systems
• Batches sorted and submitted by the operator
• Simple batch systems
  • Off-line processing ~ replace slow input devices with faster units → replace card readers with disks
  • Resident monitor ~ automatically transfers control from one job to the next
[Figure: memory layout – monitor (loader, job sequencing, control card interpreter) and user program area]

Mainframe – Batch Systems
• Spooling (Simultaneous Peripheral Operation On-Line) ~ replace sequential-access devices with random-access devices ⇒ overlap the I/O of one job with the computation of others, e.g., card → disk, CPU services, disk → printer
• Job scheduling
[Figure: card reader → disks → CPU → disks → printer]

Mainframe – Multiprogrammed Systems
• Multiprogramming increases CPU utilization by organizing jobs so that the CPU always has one to execute – early 1960s
• Multiprogrammed batch systems
  • Job scheduling and CPU scheduling
  • Goal: efficient use of scarce resources
[Figure: memory holding the monitor (with CPU scheduling), job1, job2, and job3, backed by a disk]

Mainframe – Time-Sharing Systems
• Time sharing (or multitasking) is a logical extension of multiprogramming!
  • Started in the 1960s and became common in the 1970s.
  • An interactive (or hands-on) computer system
  • Multics, IBM OS/360
• Features: on-line file system, virtual memory, sophisticated CPU scheduling, job synchronization, protection & security, and so on

Desktop Systems
• Personal computers (PCs)
  • Appeared in the 1970s.
  • Goals of operating systems keep changing
    • Less-powerful hardware & isolated environment → poor features
    • Benefited from the development of mainframe OS's and the dropping of hardware cost
      • Advanced protection features
  • User convenience & responsiveness

Parallel Systems
• Tightly coupled: more than one processor in close communication, sharing the computer bus, clock, and sometimes memory and peripheral devices
• Loosely coupled: otherwise
• Advantages
  • Speedup – throughput
  • Lower cost – economy of scale
  • More reliable – graceful degradation → fail-soft (detection, diagnosis, correction)
    • A Tandem fault-tolerance solution

Parallel Systems
• Symmetric multiprocessing model: each processor runs an identical copy of the OS
• Asymmetric multiprocessing model: a master-slave relationship
  ~ Dynamically allocate or pre-allocate tasks
  ~ Commonly seen in extremely large systems
  ~ Hardware and software make a difference?
• Trend: the dropping of microprocessor cost ⇒ OS functions are offloaded to slave processors (back-ends)

Distributed Systems
• Definition: loosely coupled systems – processors do not share memory or a clock
  • Heterogeneous vs. homogeneous
• Advantages or reasons
  • Resource sharing: computation power, peripheral devices, specialized hardware
  • Computation speedup: distribute the computation among various sites – load sharing
  • Reliability: redundancy → reliability
  • Communication: X-window, email

Distributed Systems
• Distributed systems depend on networking for their functionality.
  • Networks vary by the protocols used.
    • TCP/IP, ATM, etc.
  • Types – distance
    • Local-area network (LAN)
    • Wide-area network (WAN)
    • Metropolitan-area network (MAN)
    • Small-area network – distance of a few feet
  • Media – copper wires, fiber strands, satellite wireless transmission, infrared communication, etc.

Distributed Systems
• Client-server systems
  • Compute-server systems
  • File-server systems
• Peer-to-peer systems
  • Network connectivity is an essential component.
• Network operating systems
  • Autonomous computers
  • A distributed operating system – a single OS controlling the network.

Clustered Systems
• Definition: clustered computers that share storage and are closely linked via LAN networking.
• Advantages: high availability, performance improvement, etc.
• Types
  • Asymmetric/symmetric clustering
  • Parallel clustering – multiple hosts that access the same data on the shared storage.
  • Global clusters
• Distributed Lock Manager (DLM)

Real-Time Systems
• Definition: a real-time system is a computer system where a timely response by the computer to external stimuli is vital!
• Hard real-time system: the system has failed if a timing constraint, e.g., a deadline, is not met.
  • All delays in the system must be bounded.
  • Many advanced features are absent.

Real-Time Systems
• Soft real-time system: missing a timing constraint is serious, but does not necessarily result in a failure unless it is excessive
  • A critical task has a higher priority.
  • Supported in most commercial OS's.
• Real-time means on-time instead of fast

Applications for Real-Time Systems
• Virtual reality
• Games
• User interface
• Vision and speech recognition (approx. 100~200 ms)
• PDA, telephone system
• Air traffic control
• Space shuttle
• Navigation
• Multimedia systems
• Industrial control systems
• Home appliance controller
• Nuclear power plant
• And more

Handheld Systems
• Handheld systems
  • E.g., Personal Digital Assistant (PDA)
• New challenges – convenience vs. portability
  • Limited size and weight
  • Small memory size
    • No virtual memory
  • Slow processor
    • Battery power
  • Small display screen
    • Web-clipping

Feature Migration
• MULTIplexed Information and Computing Services (MULTICS)
  • 1965-1970 at MIT as a utility
• UNIX
  • Since 1970 on PDP-11
• Recent OS's
  • MS Windows, IBM OS/2, MacOS X
• OS features being scaled down to fit PCs
  • Personal workstations – large PCs

Computing Environments
• Traditional computing
  • E.g., typical office environment
• Web-based computing
  • Web technology
    • Portals, network computers, etc.
  • Network connectivity
  • New categories of devices
    • Load balancers
• Embedded computing
  • Car engines, robots, VCRs, home automation
  • Embedded OS's often have limited features.


Chapter 2 Computer-System Structure

Computer-System Structure
• Objective: general knowledge of the structure of a computer system.
[Figure: CPU, disk controller (disks), printer controller (printer), and tape-drive controller (tape drives) connected through the memory controller to memory]
• Device controllers: synchronize and manage access to devices.

Booting
• Bootstrap program:
  • Initializes all aspects of the system, e.g., CPU registers, device controllers, memory, etc.
  • Loads and runs the OS
• Operating system: runs init to initialize system processes, e.g., various daemons and login processes, after the kernel has been bootstrapped. (/etc/rc* & init or /sbin/rc* & init)

Interrupt
• Hardware interrupt, e.g., service requests of I/O devices
• Software interrupt (trap), e.g., signals, invalid memory access, division by zero, system calls, etc.
[Figure: process execution → interrupt → handler → return]
• Procedures: generic handler or interrupt vector (MS-DOS, UNIX)

Interrupt Handling Procedure
[Figure: the interrupted process transfers to a handler at a fixed address per interrupt type; the interrupted address, registers, ... are saved on the system stack]
• Saving of the address of the interrupted instruction: fixed locations or stacks
• Interrupt disabling or enabling issues: lost interrupts?! Prioritized interrupts → masking

Interrupt Handling Procedure
• Interrupt handling
  ⇒ Save interrupt information
  ⇒ OS determines the interrupt type (by polling)
  ⇒ Call the corresponding handler
  ⇒ Return to the interrupted job by restoring important information (e.g., saved return addr. → program counter)
[Figure: interrupt vector with entries 0, 1, ..., n, indexed by a unique device number, each entry pointing to an interrupt handler (interrupt service routine)]
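The interrupt-vector idea above can be sketched as a small simulation. This is only an illustration under stated assumptions: the handler names, the device numbers, and the `handle_interrupt` function are hypothetical, and a Python list stands in for the system stack.

```python
# Hypothetical simulation of dispatching through an interrupt vector.
saved_contexts = []   # stands in for the system stack
log = []              # records which handlers ran

def disk_handler():
    log.append("disk serviced")

def printer_handler():
    log.append("printer serviced")

# Interrupt vector: entry i holds the handler for device number i.
interrupt_vector = [disk_handler, printer_handler]

def handle_interrupt(device_number, program_counter, registers):
    # 1. Save the interrupted address and registers.
    saved_contexts.append((program_counter, dict(registers)))
    # 2. Find the handler by indexing the vector (no polling of devices).
    handler = interrupt_vector[device_number]
    # 3. Call the corresponding handler.
    handler()
    # 4. Restore the saved information and resume the interrupted job.
    return saved_contexts.pop()

pc, regs = handle_interrupt(1, 0x4000, {"r0": 7})
```

Indexing the vector by device number replaces the polling step with a single table lookup, which is the usual motivation for a vectored design.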

I/O Structure
• Device controllers are responsible for moving data between the peripheral devices and their local buffer storage.
[Figure: CPU, printer controller (registers, buffers; printer), tape-drive controller (DMA, registers, buffers; tape drives), and disk connected through the memory controller to memory]

I/O Structure
• I/O operation
  a. CPU sets up specific controller registers within the controller.
  b. Read: devices → controller buffers → memory
     Write: memory → controller buffers → devices
  c. Notify the completion of the operation by triggering an interrupt

I/O Types
a. Synchronous I/O
• Issues: overlapping of computations and I/O activities, concurrent I/O activities, etc.
• I/O system call → wait till the completion
  • wait instruction (idle till interrupted), or
  • looping (Loop: jmp Loop), or
  • polling
  • wait for an interrupt

I/O Types
[Figure: Synchronous I/O – timeline in which the requesting process (user) waits while the device driver, interrupt handler, and hardware data transfer (kernel) complete the request]
[Figure: Asynchronous I/O – timeline showing the requesting process (user) together with the device driver, interrupt handler, and hardware data transfer (kernel)]

I/O Types
b. Asynchronous I/O
• sync → wait till the completion
• Wait mechanisms!!
• *Efficiency

I/O Types
• A device-status table approach
[Figure: device-status table with entries card reader 1 (status: idle), line printer 3 (status: busy), disk unit 3 (status: idle), ...; queued requests hang off the table: a request with addr. 38596 and len. 1372, a request for file xx (record, addr., len) from process 1, and a request for file yy (record, addr., len) from process 2]
• Tracking of many I/O requests
• Type-ahead service
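A device-status table can be sketched as a dictionary of per-device entries, each with a status and a queue of pending requests. This is a minimal sketch, not the layout of any particular OS: the helpers `request_io` and `complete_io` are hypothetical, and the second request's address and length are made-up values.

```python
from collections import deque

# One entry per device: a status plus a FIFO queue of pending requests.
device_table = {
    name: {"status": "idle", "queue": deque()}
    for name in ("card reader 1", "line printer 3", "disk unit 3")
}

def request_io(device, process, address, length):
    """Queue a request; the device is busy while any request is pending."""
    entry = device_table[device]
    entry["queue"].append((process, address, length))
    entry["status"] = "busy"

def complete_io(device):
    """Called on the completion interrupt: retire the oldest request."""
    entry = device_table[device]
    finished = entry["queue"].popleft()
    entry["status"] = "busy" if entry["queue"] else "idle"
    return finished

request_io("disk unit 3", "process 1", 38596, 1372)
request_io("disk unit 3", "process 2", 40000, 512)   # hypothetical values
first_done = complete_io("disk unit 3")
```

Keeping a queue per device is what lets the OS track many outstanding requests at once while the requesting processes continue or wait.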

DMA
• Goal: release the CPU from handling excessive interrupts!
  • E.g., 9600-baud terminal: 2-microsecond service / 1000 microseconds
  • High-speed device: 2-microsecond service / 4 microseconds
• Procedure
  • Execute the device driver to set up the registers of the DMA controller.
  • DMA moves blocks of data between the memory and its own buffers.
  • Transfer from its buffers to its devices.
  • Interrupt the CPU when the job is done.
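The two examples on the DMA slide can be turned into a quick calculation. The sketch below assumes that each figure means "one interrupt needing 2 microseconds of CPU service arrives every N microseconds"; the function name `interrupt_overhead` is made up for illustration.

```python
def interrupt_overhead(service_us, interval_us):
    """Fraction of CPU time spent servicing per-transfer interrupts."""
    return service_us / interval_us

# 9600-baud terminal: 2 us of service every 1000 us.
terminal_share = interrupt_overhead(2, 1000)
# High-speed device: 2 us of service every 4 us.
fast_device_share = interrupt_overhead(2, 4)
```

Under that reading the terminal costs about 0.2% of the CPU, while the high-speed device would consume half of it, which is why block transfers with a single completion interrupt are attractive.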

Storage Structure
• Primary storage (volatile storage)
  • Registers (in the CPU) – access time: a cycle
  • Cache (HW-managed) – access time: several cycles
  • Memory (SW-managed) – access time: many cycles
• Secondary storage (nonvolatile storage)
  • Magnetic disks
• Tertiary storage (removable media)
  • CD-ROMs/DVDs
  • Jukeboxes
* Differences: size, cost, speed, volatility

Memory
• Processor can have direct access!
• Intermediate storage for data in the registers of device controllers
• Memory-mapped I/O (PC & Mac)
  (1) Frequently used devices
  (2) Devices must be fast, such as a video controller; otherwise special I/O instructions are used to move data between memory and device-controller registers
• Programmed I/O – polling, or interrupt-driven handling
[Figure: device-controller registers R1, R2, R3, ... mapped into memory]

Magnetic Disks
[Figure: disk mechanism – spindle, platter, track, sector, cylinder, r/w head, disk arm, arm assembly]
• Transfer rate
• Random-access time
  • Seek time in x ms
  • Rotational latency in y ms
    • 60~200 rotations/sec
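The rotation speed quoted above gives a feel for rotational latency. The sketch below assumes the common approximation that the average rotational latency is half of one full rotation; the 9 ms seek time is a made-up value used only to show how the two terms add up.

```python
def average_rotational_latency_ms(rotations_per_second):
    """Assumed model: on average the head waits half a rotation."""
    full_rotation_ms = 1000.0 / rotations_per_second
    return full_rotation_ms / 2

def random_access_time_ms(seek_ms, rotations_per_second):
    """Random-access time = seek time + rotational latency."""
    return seek_ms + average_rotational_latency_ms(rotations_per_second)

slow_disk_latency = average_rotational_latency_ms(60)    # about 8.33 ms
fast_disk_latency = average_rotational_latency_ms(200)   # 2.5 ms
example_access = random_access_time_ms(9.0, 200)         # hypothetical seek time
```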

Magnetic Disks
• Disks
  • Fixed-head disks:
    • More r/w heads vs. fast track switching
  • Moving-head disks (hard disks)
    • Primary concerns: cost, size, speed
    • Computer → host controller → disk controller → disk drives (cache ↔ disks)
• Floppy disk
  • Slow rotation, low capacity, low density, but less expensive
• Tapes: backup or data transfer between machines

Storage Hierarchy
[Figure: storage levels listed from top to bottom, with speed and cost marked along the hierarchy]
• Register
• Cache – high hit rate
  • Instruction & data cache
  • Combined cache
• Main memory – volatile storage
• Electronic disk – faster than magnetic disk; nonvolatile?! Alias: RAM disk
• Magnetic disk
• Optical disk
• Magnetic tape – sequential access

XX GB/350F

Storage Hierarchy
• Caching
  • Information is copied to a faster storage system on a temporary basis
  • Assumption: data will be used again soon.
    • Programmable registers, instr. cache, etc.
• Cache management
  • Cache size and the replacement policy
• Movement of information between hierarchy levels
  • Hardware design & controlling operating systems
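To make "cache size and the replacement policy" concrete, here is a minimal sketch of a fixed-size cache in front of a slower store. The slide does not name a policy; least-recently-used (LRU) is chosen here purely as an assumption, and the class and variable names are hypothetical.

```python
from collections import OrderedDict

class LRUCache:
    """Fixed-size cache that evicts the least recently used entry."""

    def __init__(self, capacity, backing_store):
        self.capacity = capacity
        self.backing_store = backing_store   # the slower storage level
        self.entries = OrderedDict()         # oldest entry first
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if key in self.entries:
            self.hits += 1
            self.entries.move_to_end(key)    # mark as most recently used
            return self.entries[key]
        self.misses += 1
        value = self.backing_store[key]      # copy from the slower level
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the oldest entry
        return value

slow_storage = {block: block * 10 for block in range(8)}
cache = LRUCache(capacity=2, backing_store=slow_storage)
for block in (0, 1, 0, 2, 1):
    cache.read(block)
```

With a capacity of two, the access sequence above hits only on the repeated read of block 0; a larger cache or a different policy would change the hit count, which is the trade-off the slide points at.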

Storage Hierarchy
• Coherency and consistency
  • Among several storage levels (vertical)
    • Multitasking vs. unitasking
  • Among units of the same storage level (horizontal), e.g., cache coherency
    • Multiprocessor or distributed systems
[Figure: CPU – cache – memory, shown for two CPUs]

Hardware Protection
• Goal:
  • Prevent errors and misuse!
    • E.g., input errors of a program in a simple batch operating system
    • E.g., the modification of data and code segments of another process or the OS
• Dual-mode operations – a mode bit
  • User-mode executions, except those after a trap or an interrupt occurs.
  • Monitor mode (system mode, privileged mode, supervisor mode)
    • Privileged instructions: machine instructions that may cause harm

Hardware Protection
• System calls – trap to the OS for executing privileged instructions.
• Resources to protect
  • I/O devices, memory, CPU
• I/O protection (I/O devices are scarce resources!)
  • I/O instructions are privileged.
    • User programs must issue I/O through the OS
    • User programs can never gain control over the computer in the system mode.

Hardware Protection
• Memory protection
  • Goal: prevent a user program from modifying the code or data structures of either the OS or other users!
  • Instructions to modify the memory space for a process are privileged.
  • Base register and limit register ⇔ every memory address is checked by hardware
[Figure: memory layout – kernel, job1, job2, ...]
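The base/limit check can be sketched in a few lines. Real hardware performs this comparison on every memory reference; the function below only imitates that logic, and the register values are hypothetical.

```python
class ProtectionFault(Exception):
    """Stands in for the trap to the OS on an illegal address."""

def check_access(address, base, limit):
    """Allow the access only if base <= address < base + limit."""
    if base <= address < base + limit:
        return address
    raise ProtectionFault(f"address {address} outside [{base}, {base + limit})")

BASE, LIMIT = 300040, 120900   # hypothetical register contents

legal = check_access(300040, BASE, LIMIT)
try:
    check_access(420940, BASE, LIMIT)   # first address past the region
    trapped = False
except ProtectionFault:
    trapped = True
```

Because only privileged instructions may load the two registers, a user program cannot widen its own region.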

Hardware Protection
• CPU protection
  • Goal
    • Prevent user programs from sucking up CPU power!
  • Use a timer to implement time-sharing or to compute the current time.
    • Instructions that modify timers are privileged.
  • Control of the computer is turned over to the OS at the end of every time slice!
    • Terms: time-sharing, context switch
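How a timer returns control to the OS after every time slice can be illustrated with a toy simulation. The job names, their lengths, and the round-robin order are assumptions made for this sketch; a real kernel would be driven by a hardware timer interrupt rather than a loop.

```python
from collections import deque

def run_with_timer(jobs, time_slice):
    """Give each job at most `time_slice` units, then switch to the next."""
    ready = deque(jobs.items())
    schedule = []
    while ready:
        name, remaining = ready.popleft()
        ran = min(time_slice, remaining)       # timer expires after time_slice
        schedule.append((name, ran))
        if remaining > ran:                    # unfinished: context switch,
            ready.append((name, remaining - ran))  # back of the ready queue
    return schedule

schedule = run_with_timer({"job1": 5, "job2": 2}, time_slice=2)
```

No job can hold the CPU for longer than one slice, so a looping program cannot starve the others.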

Network Structure
• Local-Area Network (LAN)
  • Characteristics:
    • Geographically distributed in a small area, e.g., an office with different computers and peripheral devices.
    • More reliable and better speed
      • High-quality cables, e.g., twisted-pair cables for 10BaseT Ethernet or fiber-optic cables for 100BaseFX Ethernet
    • Started in the 1970s
  • Configurations: multiaccess bus, ring, star networks (with gateways)

Network Structure
• Wide-Area Network (WAN)
  • Emerged in the late 1960s (Arpanet in 1968)
• World Wide Web (WWW)
  • Utilizes TCP/IP over ARPANET/Internet.
• Definition of “Intranet”: roughly speaking, any network under one authority, e.g., a company or a school.
  • Often a Local Area Network (LAN), or connected LANs.
  • Has one (or several) gateways to the outside world.
  • In general, it has a higher bandwidth because it is a LAN.

Network Structure – WAN
[Figure: WAN example – HINET and TARNET joined by a gateway; several intranets (one labeled Intranet A) attach to them through gateways and routers]

Network Structure – WAN
• Router
  • With a routing table
    • Uses some routing protocol, e.g., to maintain the network topology by broadcasting.
  • Connects several subnets (of the same IP-or-higher-layer protocols) and forwards packets to the proper subnets.
• Gateway
  • Functionality includes that of routers.
  • Connects several subnets (of different or the same networks, e.g., Bitnet and Internet) and forwards packets to the proper subnets.

Network Structure – WAN
• Connections between networks
  • T1: 1.544 Mbps, T3: 45 Mbps (28 T1)
  • Telephone-system services over T1
• Modems
  • Conversion between analog and digital signals

Network Layers in Linux
[Figure: protocol stack – applications at the top; in the kernel, BSD sockets, then INET sockets, then TCP and UDP, then the Internet Protocol (IP) as the network layer, then PPP, SLIP, and Ethernet, with ARP beside Ethernet]

TCP/IP
• IP address: 140.123.101.1
  • 256*256*256*256 combinations
  • 140.123 → network address
  • 101.1 → host address
• Subnet: 140.123.101 and 140.123.102
• Mapping of IP addresses and host names
  • Static assignments: /etc/hosts
  • Dynamic acquisition: DNS (Domain Name System)
    • /etc/resolv.conf
  • If /etc/hosts is out-of-date, re-check it with DNS!
• Domain name: cs.ccu.edu.tw as a domain name for 140.123.100, 140.123.101, and 140.123.103
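The address arithmetic on the TCP/IP slide can be checked with Python's standard `ipaddress` module. The prefix lengths are an assumption read off the slide's split: /16 for the 140.123 network part and /24 for the 140.123.101 subnet.

```python
import ipaddress

address = ipaddress.ip_address("140.123.101.1")
network = ipaddress.ip_network("140.123.0.0/16")   # assumed /16 network
subnet = ipaddress.ip_network("140.123.101.0/24")  # assumed /24 subnet

total_addresses = 256 ** 4                 # 256*256*256*256 combinations
in_network = address in network
in_subnet = address in subnet

# Mask off the network bits to recover the host part (101.1).
host_part = int(address) & int(network.hostmask)
host_octets = (host_part >> 8, host_part & 0xFF)
```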

TCP/IP
• Transmission Control Protocol (TCP)
  • Reliable point-to-point packet transmissions.
  • Applications that communicate with one another over TCP/IP must provide IP addresses and port numbers.
    • /etc/services
    • Port# 80 for a web server.
• User Datagram Protocol (UDP)
  • Unreliable point-to-point services.
• Both are over IP.

TCP/IP
• Mapping of Ethernet physical addresses and IP addresses
  • Each Ethernet card has a built-in Ethernet physical address, e.g., 08-01-2b-00-50-A6.
  • Ethernet cards only recognize frames with their physical addresses.
  • Linux uses ARP (Address Resolution Protocol) to know and maintain the mapping.
    • Broadcast requests over Ethernet for IP address resolution over ARP.
    • Machines with the indicated IP addresses reply with their Ethernet physical addresses.

TCP/IP
[Figure: encapsulation – a TCP packet (TCP header + data) is carried as the data of an IP packet (IP header + data), which is carried as the data of an Ethernet frame (Ethernet header + data)]
• Each IP packet has an indicator of which protocol is used, e.g., TCP or UDP
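The nesting of TCP inside IP inside Ethernet can be sketched with byte strings. The headers below are deliberately simplified and do not match the real wire formats: the TCP header carries only the two port numbers, the IP header carries only the protocol indicator, and the Ethernet header carries only two 6-byte addresses. The second MAC address is made up; the protocol numbers 6 (TCP) and 17 (UDP) are the real IP values.

```python
import struct

TCP_PROTOCOL, UDP_PROTOCOL = 6, 17   # IP protocol numbers

def tcp_segment(src_port, dst_port, data):
    # Simplified TCP header: source and destination ports only.
    return struct.pack("!HH", src_port, dst_port) + data

def ip_packet(protocol, payload):
    # Simplified IP header: one byte saying which protocol follows.
    return struct.pack("!B", protocol) + payload

def ethernet_frame(dst_mac, src_mac, payload):
    # Simplified Ethernet header: destination and source addresses.
    return dst_mac + src_mac + payload

frame = ethernet_frame(
    bytes.fromhex("08012b0050a6"),
    bytes.fromhex("08012b0050a7"),   # hypothetical sender address
    ip_packet(TCP_PROTOCOL, tcp_segment(1025, 80, b"GET /")),
)

# The receiver peels the layers off in the opposite order.
ip_part = frame[12:]
protocol = ip_part[0]
src_port, dst_port = struct.unpack("!HH", ip_part[1:5])
data = ip_part[5:]
```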


Chapter 3 Operating-System Structures

Operating-System Structures
• Goals: provide a way to understand an operating system
  • Services
  • Interface
  • System components
• The type of system desired is the basis for choices among various algorithms and strategies!

System Components – Process Management
• Process management
  • Process: an active entity
    • Physical and logical resources
      • Memory, I/O buffers, data, etc.
    • Process = program (code) + data structures representing current activities: program counter, stack, data section, CPU registers, and more

System Components – Process Management
• Services
  • Process creation and deletion
  • Process suspension and resumption
  • Process synchronization
  • Process communication
  • Deadlock handling

System Components – Main-Memory Management
• Memory: a large array of words or bytes, where each has its own address
• OS must keep several programs in memory to improve CPU utilization and user response time
• Management algorithms depend on the hardware support
• Services
  • Memory usage and availability
  • Decision of memory assignment
  • Memory allocation and deallocation

System Components – File Management
• Goal:
  • A uniform logical view of information storage
  • Each medium is controlled by a device
    • Magnetic tapes, magnetic disks, optical disks, etc.
• OS provides a logical storage unit: the file
  • Formats:
    • Free form or rigidly formatted.
  • General views:
    • A sequence of bits, bytes, lines, records

System Components – File Management
• Services
  • File creation and deletion
  • Directory creation and deletion
  • Primitives for file and directory manipulation
  • Mapping of files onto secondary storage
  • File backup
* Privileges for file access control

System Components – I/O System Management
• Goal:
  • Hide the peculiarities of specific hardware devices from users
• Components of an I/O system
  • A buffering, caching, and spooling system
  • A general device-driver interface
  • Drivers

System Components – Secondary-Storage Management
• Goal:
  • On-line storage medium for programs & data
  • Backup of main memory
• Services for disk management
  • Free-space management
  • Storage allocation, e.g., contiguous allocation
  • Disk scheduling, e.g., FCFS

System Components – Networking
• Issues
  • Resource sharing
  • Routing & connection strategies
  • Contention and security
• Network access is usually generalized as a form of file access
  • World Wide Web over file-transfer protocol (ftp), network file system (NFS), and hypertext transfer protocol (http)

System Components – Protection System
• Goal
  • Resources may only be accessed by authorized processes.
• Protected resources
  • Files, CPU, memory space, etc.
• Services
  • Detection & controlling mechanisms
  • Specification mechanisms
• Remark: reliability!

System Components – Command-Interpreter System
• Command interpreter
  • Interface between the user and the operating system
  • Friendly interfaces (user-friendly?)
    • Command-line-based interfaces or mouse-based window-and-menu interfaces
  • E.g., UNIX shell and command.com in MS-DOS
• Main loop: get the next command → execute the command

Operating-System Services
• Program execution
  • Loading, running, terminating, etc.
• I/O operations
  • General/special operations for devices:
    • Efficiency & protection
• File-system manipulation
  • Read, write, create, delete, etc.
• Communications
  • Intra-processor or inter-processor communication – shared memory or message passing

Operating-System Services
• Error detection
  • Possible errors from CPU, memory, devices, user programs → ensure correct & consistent computing
• Resource allocation
  • Utilization & efficiency
• Accounting
• Protection & security
• User convenience or system efficiency!

Operating-System Services
• System calls
  • Interface between processes & the OS
• How to make system calls?
  • Assembly-language instructions or subroutine/function calls in a high-level language such as C or Perl?
    • Generation of in-line instructions or a call to a special run-time routine.
• Example: read and copy of a file!
  • Library calls vs. system calls
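The file-copy example can be sketched with Python's `os` module, whose `os.open`, `os.read`, `os.write`, and `os.close` functions are thin wrappers over the corresponding system calls (as opposed to the buffered library call `open`). The helper name `copy_file`, the block size, and the temporary file names are choices made for this illustration.

```python
import os
import tempfile

def copy_file(source_path, destination_path, block_size=4096):
    """Copy a file using open/read/write/close system-call wrappers."""
    copied = 0
    src = os.open(source_path, os.O_RDONLY)
    try:
        dst = os.open(destination_path,
                      os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        try:
            while True:
                block = os.read(src, block_size)   # read system call
                if not block:                      # empty result: end of file
                    break
                while block:                       # write may be partial
                    written = os.write(dst, block)
                    block = block[written:]
                    copied += written
        finally:
            os.close(dst)
    finally:
        os.close(src)
    return copied

workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "in.txt")
destination = os.path.join(workdir, "out.txt")
with open(source, "wb") as handle:                 # library call, buffered
    handle.write(b"operating systems\n" * 3)
bytes_copied = copy_file(source, destination)
```

Even this small copy involves a sequence of system calls: two opens, a read/write loop, and two closes.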

Operating-System Services
• How does a system call occur?
  • Types and information
[Figure: the user program keeps the parameters for the call in a table at address x, loads address x into a register, and issues system call 13; the code for system call 13 uses the parameters from table x]
• Parameter passing
  • Registers
  • Registers pointing to blocks
    • Linux
  • Stacks

Operating-System Services
• System calls
  • Process control
  • File management
  • Device management
  • Information maintenance
  • Communications

Operating-System Services
• Process & job control
  • End (normal exit) or abort (abnormal)
    • Error level or no
  • Load and execute
    • How to return control?
      • E.g., the shell loads & executes commands
  • Creation and/or termination of processes
    • Multiprogramming?

Operating-System Services
• Process & job control (continued)
  • Process control
    • Get or set attributes of processes
  • Wait for a specified amount of time or an event
    • Signal event
  • Memory dumping, profiling, tracing, memory allocation & de-allocation

Operating-System Services
• Examples: MS-DOS & UNIX
[Figure: memory layouts. MS-DOS: kernel, command interpreter, process, free memory. UNIX: kernel, interpreter, process A, process B, free memory]

Operating-System Services
• File management
  • Create and delete
  • Open and close
  • Read, write, and reposition (e.g., rewinding)
    • lseek
  • Get or set attributes of files
  • Operations for directories

Operating-System Services
• Device management
  • Request or release
    • Open and close of special files
      • Files are abstract or virtual devices.
  • Read, write, and reposition (e.g., rewinding)
  • Get or set file attributes
  • Logically attach or detach devices

Operating-System Services
• Information maintenance
  • Get or set date or time
  • Get or set system data, such as the amount of free memory
• Communication
  • Message passing
    • Open, close, accept connections
      • Host ID or process ID
    • Send and receive messages
    • Transfer status information
  • Shared memory
    • Memory mapping & process synchronization

Operating-System Services
- Shared Memory
  - Maximum speed & communication convenience
- Message Passing
  - No access conflict & easy implementation
[Figure: message passing (process A passes message M through the kernel to process B) vs. shared memory (processes A and B access a common shared-memory region above the kernel).]
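The shared-memory model can be sketched on a POSIX system as follows. This is an illustration under assumptions, not the slides' own code: the function name `share_int_with_child` is hypothetical, and `MAP_ANONYMOUS` is a widely supported extension rather than strict POSIX. The child writes into a shared page and the parent reads the value after waiting, which stands in for the process synchronization the slides mention.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on glibc */
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* The child stores `value` in a page shared with the parent; the parent
 * returns what it sees after the child has exited, or -1 on error. */
int share_int_with_child(int value)
{
    int *shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;
    *shared = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(shared, sizeof *shared);
        return -1;
    }
    if (pid == 0) {              /* child: write directly, no kernel copy */
        *shared = value;
        _exit(0);
    }

    waitpid(pid, NULL, 0);       /* synchronization: wait for the writer */
    int seen = *shared;
    munmap(shared, sizeof *shared);
    return seen;
}
```

With an ordinary (private) variable the parent would still see 0, because fork() gives the child its own copy of the address space.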

System Programs
- Goal:
  - Provide a convenient environment for program development and execution
- Types
  - File management, e.g., rm.
  - Status information, e.g., date.
  - File modification, e.g., editors.
  - Program loading and execution, e.g., loader.
  - Programming-language support, e.g., compilers.
  - Communications, e.g., telnet.

System Programs – Command Interpreter
- Two approaches:
  - The interpreter contains the code to execute commands (e.g., del, cd)
    - Fast, but the interpreter tends to be big!
    - Painful to revise!

System Programs – Command Interpreter
  - Implement commands as system programs → search for the executable files that correspond to commands (UNIX)
    - Issues
      a. Parameter passing
         - Potential hazard: virtual memory
      b. Being slow
      c. Inconsistent interpretation of parameters

System Structure – MS-DOS
- MS-DOS layer structure (top to bottom)
  - Application program
  - Resident system program
  - MS-DOS device drivers
  - ROM BIOS device drivers

System Structure – UNIX
- UNIX layer structure (top to bottom)
  - Users
  - User interface: shells, compilers, X, application programs, etc.
  - System call interface
  - Kernel: CPU scheduling, signal handling, virtual memory, paging, swapping, file system, disk drivers, caching/buffering, etc.
  - Kernel interface to the hardware
  - Hardware: terminal controllers, terminals, physical memory, device controllers, devices such as disks, memory, etc.

System Structure
- A Layered Approach – A Myth
  [Figure: Layer M offers new ops and hidden ops, built on the existing ops of Layer M-1.]
  - Advantage: modularity ~ debugging & verification
  - Difficulty: appropriate layer definitions, less efficiency due to overheads!

System Structure
- A Layer Definition Example:
  - L5: User programs
  - L4: I/O buffering
  - L3: Operator-console device driver
  - L2: Memory management
  - L1: CPU scheduling
  - L0: Hardware

System Structure – OS/2
- OS/2 layer structure (top to bottom)
  - Applications
  - Application-program interface
  - Subsystems
  - System kernel: memory management, task scheduling, device management
  - Device drivers
* Some layers of NT were moved from user space to kernel space in NT 4.0

System Structure – Microkernels
- The concept of microkernels was proposed at CMU in the mid-1980s (Mach).
  - Move all nonessential components from the kernel to user or system programs!
  - No consensus on which services belong in the kernel
    - Mostly process and memory management and communication
- Benefits:
  - Ease of OS service extensions → portability, reliability, security

System Structure – Microkernels
- Examples
  - Microkernels: Tru64 UNIX (Mach kernel), Mac OS X (Mach kernel), QNX (message passing, process scheduling, HW interrupts, low-level networking)
  - Hybrid structures: Windows NT
    [Figure: Win32, OS/2, and POSIX applications each talk to their own server (Win32 server, OS/2 server, POSIX server), all running on top of the kernel.]

Virtual Machine
- Virtual machines: provide an interface that is identical to the underlying bare hardware
  [Figure: a non-virtual machine (processes, kernel, hardware) beside a virtual machine system (VM1, VM2, VM3, each with its own processes and kernel, on top of the virtual machine implementation and the hardware).]

Virtual Machine
- Implementation Issues:
  - Emulation of physical devices
    - E.g., disk systems
      - An IBM minidisk approach
  - User/monitor modes
    - (Physical) monitor mode
      - Virtual machine software
    - (Physical) user mode
      - Virtual monitor mode & virtual user mode

Virtual Machine
[Figure: time line of a system call issued by P1 in VM1. Labels: virtual user mode (processes); virtual monitor mode (kernel 1, kernel 2, kernel 3); monitor mode (virtual machine software); hardware. Steps shown: system call, trap, set program counter & register contents and then restart VM1, service for the system call, simulate the effect of the I/O instruction, finish service, restart VM1.]

Virtual Machine
- Disadvantages:
  - Slow!
    - Execute most instructions directly on the hardware
  - No direct sharing of resources
    - Physical devices and communications
* I/O could be slow (interpreted) or fast (spooling)

Virtual Machine
- Advantages:
  - Complete protection – complete isolation!
  - OS research & development
    - System development time
  - Extensions to multiple personalities, such as Mach (software emulation)
    - Emulation of machines and OSs, e.g., Windows over Linux

Virtual Machine – Java
- Sun Microsystems in late 1995
  - Java language and API library
  - Java Virtual Machine (JVM)
    - Class loader (for bytecode .class files)
    - Class verifier
    - Java interpreter
      - An interpreter, a just-in-time (JIT) compiler, hardware
[Figure: java .class files → class loader → verifier → java interpreter → host system.]

Virtual Machine – Java
- JVM
  - Garbage collection
    - Reclaim unused objects
  - The implementation is specific to each system
  - Programs are architecture-neutral and portable

System Design & Implementation
- Design Goals & Specifications:
  - User goals, e.g., ease of use
  - System goals, e.g., reliable
- Rule 1: Separation of Policy & Mechanism
  - Policy: What will be done?
  - Mechanism: How to do things?
  - Example: timer construct and time slice
- Two extreme cases: microkernel-based OS vs. Macintosh OS

System Design & Implementation
- OS Implementation in High-Level Languages
  - E.g., UNIX, OS/2, MS NT, etc.
  - Advantages:
    - Easier to understand & debug
    - Written faster, more compact, and portable
  - Disadvantages:
    - Less efficient, and more storage is needed for code
* Tracing for bottleneck identification, exploring of excellent algorithms, etc.

System Generation
- SYSGEN (System Generation)
  - Ask and probe for information concerning the specific configuration of a hardware system
    - CPU, memory, devices, OS options, etc.
  - Approaches, from least to most flexible at run time:
    - Recompilation of modified source code
    - Linking of modules for the selected OS
    - No recompilation & completely table-driven
- Issues
  - Size, generality, ease of modification


Chapter 4. Processes

Processes
- Objective:
  - Process concept & definitions
- Process classification:
  - Operating system processes executing system code
  - User processes executing system code
  - User processes executing user code

Processes
- Example: Special Processes in Unix
  - PID 0 – swapper (i.e., the scheduler)
    - Kernel process
    - No program on disk corresponds to this process
  - PID 1 – init, responsible for bringing up a Unix system after the kernel has been bootstrapped (/etc/rc* & init or /sbin/rc* & init)
    - User process with superuser privileges
  - PID 2 – pagedaemon, responsible for paging
    - Kernel process

Processes
- Process
  - A basic unit of work from the viewpoint of the OS
  - Types:
    - Sequential processes: an activity resulting from the execution of a program by a processor
    - Multi-thread processes
  - An active entity
    - Program code – a passive entity
    - Stack and data segments
  - The current activity
    - PC, registers, contents of the stack and data segments

Processes
- Process State
  - new → ready: admitted
  - ready → running: scheduled
  - running → ready: interrupt
  - running → waiting: I/O or event wait
  - waiting → ready: I/O or event completion
  - running → terminated: exit

Processes
- Process Control Block (PCB)
  - Process state
  - Program counter
  - CPU registers
  - CPU scheduling information
  - Memory management information
  - Accounting information
  - I/O status information

Processes
- PCB: the repository for any information that may vary from process to process
  [Figure: an array PCB[0 … NPROC-1]; each entry holds a pointer, the process state, the pc, the registers, etc.]
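The PCB table and the process-state transitions above could be sketched in C as follows. The field names, the value of `NPROC`, and the function `pcb_transition` are illustrative assumptions, not taken from any particular kernel.

```c
#include <stdbool.h>

enum proc_state { P_NEW, P_READY, P_RUNNING, P_WAITING, P_TERMINATED };
enum proc_event { EV_ADMITTED, EV_SCHEDULED, EV_INTERRUPT,
                  EV_IO_WAIT, EV_IO_DONE, EV_EXIT };

#define NPROC 64                      /* assumed table size */

struct pcb {
    struct pcb     *next;             /* link for ready/device queues */
    int             pid;
    enum proc_state state;
    unsigned long   pc;               /* saved program counter */
    unsigned long   regs[16];         /* saved CPU registers */
    int             priority;         /* CPU-scheduling information */
    unsigned long   cpu_time_used;    /* accounting information */
};

struct pcb pcb_table[NPROC];          /* PCB[0 .. NPROC-1] */

/* Apply one event to a process. Returns true and updates the state if the
 * transition appears in the state diagram, false (no change) otherwise. */
bool pcb_transition(struct pcb *p, enum proc_event ev)
{
    switch (p->state) {
    case P_NEW:
        if (ev == EV_ADMITTED)  { p->state = P_READY;      return true; }
        break;
    case P_READY:
        if (ev == EV_SCHEDULED) { p->state = P_RUNNING;    return true; }
        break;
    case P_RUNNING:
        if (ev == EV_INTERRUPT) { p->state = P_READY;      return true; }
        if (ev == EV_IO_WAIT)   { p->state = P_WAITING;    return true; }
        if (ev == EV_EXIT)      { p->state = P_TERMINATED; return true; }
        break;
    case P_WAITING:
        if (ev == EV_IO_DONE)   { p->state = P_READY;      return true; }
        break;
    case P_TERMINATED:
        break;
    }
    return false;
}
```

Note that only a running process can exit or start waiting; a waiting process always passes through the ready state before it runs again.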

Processes
- Process Control Block (PCB) – A Unix Example
  - proc[i]
    - Everything the system must know when the process is swapped out
    - pid, priority, state, timer counters, etc.
  - .u
    - Things the system should know when the process is running
    - Signal disposition, statistics accounting, files[], etc.

Processes
- Example: 4.3BSD
  [Figure: 4.3BSD process layout. Labels: proc[i] entry (p_textp, p_addr, p_p0br), text structure (x_caddr), page table, code segment (PC), data segment, heap, user stack (sp; argv, argc, …), red zone, .u area (u_proc), per-process kernel stack.]

Processes
- Example: 4.4BSD
  [Figure: 4.4BSD process layout. Labels: proc[i] entry (p_addr, p_p0br), process grp, file descriptors, VM space, region lists, page table, code segment, data segment, heap, user stack (argv, argc, …), .u area (u_proc), per-process kernel stack.]

Process Scheduling
- The goal of multiprogramming
  - Maximize CPU/resource utilization!
- The goal of time sharing
  - Allow each user to interact with his/her program!
[Figure: the ready queue and the device queues (disk unit 0, tape unit 1) each have head and tail pointers into linked lists of PCBs (PCB1, PCB2, PCB3).]

Process Scheduling – A Queueing Diagram
- ready queue → dispatch → CPU
- From the CPU, a process returns to the ready queue via one of:
  - I/O request → I/O queue → I/O
  - time slice expired
  - fork a child → child executes → child terminates
  - wait for an interrupt → interrupt occurs

Process Scheduling – Schedulers
- Long-Term (/Job) Scheduler: job pool → memory → CPU
  - Goal: select a good mix of I/O-bound and CPU-bound processes
  - Remarks:
    1. Controls the degree of multiprogramming
    2. Can take more time in selecting processes because of the longer interval between executions
    3. May not exist physically

Process Scheduling – Schedulers
- Short-Term (/CPU) Scheduler
  - Goal: efficiently allocate the CPU to one of the ready processes according to some criteria.
- Mid-Term Scheduler
  - Swap processes in and out of memory to control the degree of multiprogramming

Process Scheduling – Context Switches
- Context Switch ~ Pure Overhead
  - Save the state of the old process and load the state of the newly scheduled process.
  - The context of a process is usually reflected in the PCB and other structures, e.g., .u in Unix.
- Issues:
  - The cost depends on hardware support
    - e.g., processors with multiple register sets or computers with advanced memory management.
  - Threads, i.e., light-weight processes (LWPs), are introduced to break this bottleneck!

Operations on Processes
- Process Creation & Termination
  - Restrictions on resource usage
  - Passing of information
  - Concurrent execution
[Figure: process tree. root has children pagedaemon, swapper, and init; init has children user1, user2, and user3.]

Operations on Processes
- Process Duplication
  - A copy of the parent's address space + context is made for the child, except for the value returned from fork():
    - The child returns with the value 0
    - The parent returns with the process id of the child
  - No shared data structures between parent and child – communicate via shared files, pipes, etc.
  - Use execve() to load a new program
  - fork() vs vfork() (Unix)

Operations on Processes
- Example:

    …
    if ((pid = fork()) == 0) {
        /* child process */
        execlp("/bin/ls", "ls", NULL);
    } else if (pid < 0) {
        fprintf(stderr, "Fork Failed");
        exit(-1);
    } else {
        /* parent process */
        wait(NULL);
    }

Operations on Processes
- Termination of Child Processes
  - Reasons:
    - Resource usages, needs, etc.
  - Kill, exit, wait, abort, signal, etc.
- Cascading Termination
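How a parent learns that a child has terminated can be sketched with fork() and waitpid(). This is a minimal illustration; the function name `run_child_and_collect` is hypothetical, and a POSIX system is assumed.

```c
#define _DEFAULT_SOURCE          /* expose POSIX declarations on glibc */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that terminates immediately with exit status `code`
 * (0..255). The parent waits for it and returns the collected status,
 * or -1 if fork/wait failed or the child did not exit normally. */
int run_child_and_collect(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;               /* fork failed */
    if (pid == 0)
        _exit(code);             /* child: terminate right away */

    int status;
    if (waitpid(pid, &status, 0) < 0)   /* parent: block until child ends */
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Until the parent calls waitpid(), the terminated child's process-table entry is kept so that its status can still be reported.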

Cooperating Processes
- Cooperating processes can affect or be affected by the other processes
  - In contrast to independent processes
- Reasons:
  - Information sharing, e.g., files
  - Computation speedup, e.g., parallelism
  - Modularity, e.g., dividing functionality
  - Convenience, e.g., multiple work

Cooperating Processes
- A Consumer-Producer Example:
  - Bounded buffer or unbounded buffer
  - Supported by inter-process communication (IPC) or by hand coding
[Figure: circular buffer[0 … n-1] with pointers in and out; initially in = out = 0.]

Cooperating Processes
Producer:

    while (1) {
        /* produce an item nextp */
        while (((in + 1) % BUFFER_SIZE) == out)
            ;  /* do nothing */
        buffer[in] = nextp;
        in = (in + 1) % BUFFER_SIZE;
    }

Cooperating Processes
Consumer:

    while (1) {
        while (in == out)
            ;  /* do nothing */
        nextc = buffer[out];
        out = (out + 1) % BUFFER_SIZE;
        /* consume the item in nextc */
    }
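The producer and consumer loops above share one circular buffer. A single-threaded sketch of the same index arithmetic, with the busy-wait loops replaced by a boolean return value, makes the full and empty conditions easy to check. The names `struct ring`, `ring_put`, and `ring_get` and the buffer size of 4 are assumptions made for illustration.

```c
#include <stdbool.h>

#define BUFFER_SIZE 4

struct ring {
    int buffer[BUFFER_SIZE];
    int in;                      /* next free slot (producer side) */
    int out;                     /* next full slot (consumer side) */
};

/* Producer step: store `item`, or return false if the buffer is full. */
bool ring_put(struct ring *r, int item)
{
    if ((r->in + 1) % BUFFER_SIZE == r->out)
        return false;            /* full: the producer would spin here */
    r->buffer[r->in] = item;
    r->in = (r->in + 1) % BUFFER_SIZE;
    return true;
}

/* Consumer step: fetch into `*item`, or return false if the buffer is empty. */
bool ring_get(struct ring *r, int *item)
{
    if (r->in == r->out)
        return false;            /* empty: the consumer would spin here */
    *item = r->buffer[r->out];
    r->out = (r->out + 1) % BUFFER_SIZE;
    return true;
}
```

Because in == out is reserved to mean "empty", this scheme can hold at most BUFFER_SIZE - 1 items at a time.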

Interprocess Communication
- Why Inter-Process Communication (IPC)?
  - Exchange of data and control information!
- Why Process Synchronization?
  - Protect critical sections!
  - Ensure the order of executions!

Interprocess Communication
- IPC
  - Shared Memory
  - Message Passing
- Logical Implementation of Message Passing
  - Fixed/variable message size, symmetric/asymmetric communication, direct/indirect communication, automatic/explicit buffering, send by copy or by reference, etc.

Interprocess Communication
- Classification of Communication by Naming
  - Processes must have a way to refer to each other!
  - Types
    - Direct communication
    - Indirect communication

Interprocess Communication – Direct Communication
- Each process must explicitly name the recipient or sender of a communication
  - Send(P, msg), Receive(Q, msg)
- Properties of a link:
  a. Communication links are established automatically.
  b. Two processes per link
  c. One link per pair of processes
  d. Bidirectional or unidirectional

Interprocess Communication – Direct Communication
- Issue in addressing:
  - Symmetric or asymmetric addressing
    - Send(P, msg), Receive(id, msg)
- Difficulty:
  - Process naming vs modularity

Interprocess Communication – Indirect Communication
- Two processes can communicate only if they share a mailbox (or port)
  - send(A, msg) => mailbox A => receive(A, msg)
- Properties:
  1. A link is established between a pair of processes only if they share a mailbox.
  2. n processes per link for n >= 1.
  3. n links can exist for a pair of processes for n >= 1.
  4. Bidirectional or unidirectional

Interprocess Communication – Indirect Communication
- Issues:
  a. Who is the recipient of a message?
     [Figure: P1 sends msgs to a mailbox shared with P2 and P3; which of them receives?]
  b. Owners vs users
     - Process → owner as the sole recipient?
     - OS → let the creator be the owner? Can privileges be passed? Is garbage collection needed?

Interprocess Communication – Synchronization
- Blocking or nonblocking (synchronous versus asynchronous)
  - Blocking send
  - Nonblocking send
  - Blocking receive
  - Nonblocking receive
- Rendezvous – blocking send & receive
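The difference between a blocking and a nonblocking receive can be seen with a POSIX pipe used as a simple message link. This sketch is an illustration under assumptions: the function name `nonblocking_receive_demo` is hypothetical, and a pipe stands in for the generic send/receive primitives of the slides.

```c
#define _DEFAULT_SOURCE          /* expose POSIX declarations on glibc */
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if (a) a nonblocking receive on an empty link fails at once
 * with EAGAIN instead of waiting, and (b) the same receive succeeds after
 * a message has been sent. Returns 0 otherwise, -1 on setup errors. */
int nonblocking_receive_demo(void)
{
    int fds[2];                               /* fds[0]: receive end, fds[1]: send end */
    if (pipe(fds) < 0)
        return -1;

    int flags = fcntl(fds[0], F_GETFL);
    if (flags < 0 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    char c = 0;
    ssize_t n = read(fds[0], &c, 1);          /* nothing sent yet */
    int empty_ok = (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK));

    int sent_ok = (write(fds[1], "m", 1) == 1);   /* send one message */

    n = read(fds[0], &c, 1);                  /* now a message is available */
    int recv_ok = (n == 1 && c == 'm');

    close(fds[0]);
    close(fds[1]);
    return empty_ok && sent_ok && recv_ok;
}
```

Without O_NONBLOCK the first read() would be a blocking receive and, in this single-process sketch, would wait forever because no sender runs concurrently.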

Interprocess Communication – Buffering
- The capacity of a link = the number of messages that can be held in the link.
  - Zero capacity (no buffering)
    - Message transfer must be synchronized – rendezvous!
  - Bounded capacity
    - The sender can continue execution without waiting until the link is full
  - Unbounded capacity
    - The sender is never delayed!
- The last two cases are for asynchronous communication and may need acknowledgement

Interprocess Communication – Buffering
- Special cases:
  a. Messages may be lost if the receiver cannot keep up with message sending → synchronization
  b. Senders are blocked until the receivers have received the messages and replied with reply messages → a Remote Procedure Call (RPC) framework

Interprocess Communication – Exception Conditions
- Process termination
  a. Sender termination → notify or terminate the receiver!
  b. Receiver termination
     - No capacity → the sender is blocked.
     - Buffering → messages are accumulated.

Interprocess Communication – Exception Conditions
- Ways to recover lost messages (due to hardware or network failure):
  - The OS detects the loss & resends the messages.
  - The sender detects the loss & resends the messages.
  - The OS detects the loss & notifies the sender to handle it.
- Issues:
  a. Detection methods, such as timeouts!
  b. Distinguishing multiple copies if retransmission is possible
- Scrambled messages:
  - Usually the OS adds checksums, such as a CRC, inside messages & resends them as necessary!
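As an illustration of the checksum idea, the sketch below computes a CRC over a message so that the receiver can detect a scrambled copy and request a resend. The slides do not say which CRC is meant; the widely used CRC-32 (reflected polynomial 0xEDB88320, as in Ethernet and zlib) is assumed here, and the function name `crc32_ieee` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32: initial value 0xFFFFFFFF, reflected polynomial
 * 0xEDB88320, final bit inversion. Slow but table-free. */
uint32_t crc32_ieee(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= p[i];
        for (int bit = 0; bit < 8; bit++) {
            /* If the low bit is set, shift and XOR in the polynomial. */
            uint32_t mask = 0u - (crc & 1u);
            crc = (crc >> 1) ^ (0xEDB88320u & mask);
        }
    }
    return ~crc;
}
```

The sender would append `crc32_ieee(msg, len)` to the message; the receiver recomputes it over the received bytes and treats a mismatch as a scrambled message.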

Example - Mach
- Mach – a message-based OS from Carnegie Mellon University
  - When a task is created, two special mailboxes, called ports, are also created.
    - The Kernel mailbox is used by the kernel to communicate with the task.
    - The Notify mailbox is used by the kernel to send notifications of event occurrences.

Example - Mach
- Three system calls for message transfer:
  - msg_send
    - Options when the mailbox is full:
      a. Wait indefinitely
      b. Return immediately
      c. Wait at most n ms
      d. Temporarily cache a message
         - One cached message per sending thread for a mailbox
* One task can either own or receive from a mailbox.

Example - Mach
  - msg_receive
    - Receives from a mailbox or a set of mailboxes. Only one task can own a mailbox & have the receive privilege for it.
    * Options when the mailbox is empty:
      a. Wait indefinitely
      b. Return immediately
      c. Wait at most n ms
  - msg_rpc
    - Remote Procedure Calls

Example - Mach
- port_allocate
  - Creates a mailbox (owner)
  - port_status ~ e.g., the number of messages in a link
- All messages have the same priority and are served in a FIFO fashion.
- Message size
  - A fixed-length header + variable-length data + two mailbox names
- Message copying: message copying → remapping of the address space
- System calls are carried out by messages.

Example – Windows 2000
- Local Procedure Call (LPC) – Message Passing on the Same Processor
  1. The client opens a handle to a subsystem's connection port object.
  2. The client sends a connection request.
  3. The server creates two private communication ports, and returns the handle to one of them to the client.
  4. The client and server use the corresponding port handle to send messages or callbacks and to listen for replies.

Example – Windows 2000
- Three Types of Message Passing Techniques
  - Small messages
    - Message copying
  - Large messages – section object
    - To avoid memory copying
    - Sending and receiving of the pointer and size information of the object
  - A callback mechanism
    - When a response cannot be made immediately.

Communication in Client-Server Systems
- Socket
  - An endpoint for communication, identified by an IP address concatenated with a port number
  - A client-server architecture
[Figure: Host X (socket 146.86.5.2:1652) connected to a web server (socket 161.25.19.8:80).]
* /etc/services: port numbers under 1024 ~ 23-telnet, 21-ftp, 80-web server, etc.

Communication in Client-Server Systems
- Three types of sockets in Java
  - Connection-oriented (TCP) – Socket class
  - Connectionless (UDP) – DatagramSocket class
  - MulticastSocket class – DatagramSocket subclass

Server:

    sock = new ServerSocket(5155);
    …
    client = sock.accept();
    pout = new PrintWriter(client.getOutputStream(), true);
    …
    pout.println(new java.util.Date().toString());
    pout.close();
    client.close();

Client:

    sock = new Socket("127.0.0.1", 5155);
    …
    in = sock.getInputStream();
    bin = new BufferedReader(new InputStreamReader(in));
    …
    sock.close();

Communication in Client-Server Systems
- Remote Procedure Call (RPC)
  - A way to abstract the procedure-call mechanism for use between systems with a network connection.
  - Needs:
    - Ports to listen on at the RPC daemon site and to return results, identifiers of functions to call, parameters to pack, etc.
    - Stubs at the client site
      - One for each RPC
      - Locate the proper port and marshal parameters.

Communication in Client-Server Systems
  - Needs (continued)
    - Stubs at the server site
      - Receive the message
      - Invoke the procedure and return the results.
  - Issues for RPC
    - Data representation
      - External Data Representation (XDR)
      - Parameter marshalling
    - Semantics of a call
      - History of all messages processed
    - Binding of the client and server port
      - Matchmaker – a rendezvous mechanism

Communication in Client-Server Systems
RPC execution (client / message / server):
  1. Client: the user calls the kernel to send an RPC msg to procedure X; the kernel sends a msg to the matchmaker.
     Message: Port: matchmaker, Re: addr. to X.
     Server: the matchmaker receives the msg.
  2. Server: the matchmaker replies to the client with port P.
     Message: Port: kernel, Re: port P to X.
     Client: the kernel places port P in the user RPC msg.
  3. Client: the kernel sends the RPC.
     Message: Port: P.
     Server: the daemon listening on port P receives the msg.
  4. Server: the daemon processes the request and sends the output.
     Message: Port: kernel.
     Client: the kernel receives the reply and passes it to the user.

Communication in Client-Server Systems
- An Example for RPC
  - A Distributed File System (DFS)
    - A set of RPC daemons and clients
    - DFS port on a server on which a file operation is to take place:
      - Disk operations: read, write, delete, status, etc. – corresponding to the usual system calls

Communication in Client-Server Systems
- Remote Method Invocation (RMI)
  - Allows a thread to invoke a method on a remote object.
    - boolean val = Server.someMethod(A,B)
- Implementation
  - Stub – a proxy for the remote object
    - Parcel – a method name and its marshalled parameters, etc.
  - Skeleton – for the unmarshalling of parameters, the invocation of the method, and the sending of a parcel back

Communication in Client-Server Systems
- Parameter Passing
  - Local (or nonremote) objects
    - Pass-by-copy – object serialization
  - Remote objects – reside on a different Java virtual machine (JVM)
    - Pass-by-reference
  - Implementation of the interface – java.io.Serializable


Chapter 5. Threads

Threads
- Objectives:
  - Concepts and issues associated with multithreaded computer systems.
- Thread – lightweight process (LWP)
  - A basic unit of CPU utilization
  - A thread ID, a program counter, a register set, and a stack space
- Process – heavyweight process
  - A single thread of control

Threads
- Motivation
  - A web browser
    - Data retrieval
    - Text/image displaying
  - A word processor
    - Displaying
    - Keystroke reading
    - Spelling and grammar checking
  - A web server
    - Clients' services
    - Request listening
[Figure: a multithreaded process. The threads share the code segment, data segment, and files; each thread has its own registers and stack.]

Threads
- Benefits
  - Responsiveness
  - Resource sharing
  - Economy
    - Creation and context switching
      - Process creation is about 30 times slower than thread creation in Solaris 2
      - Process context switching is about 5 times slower than thread context switching in Solaris 2
  - Utilization of multiprocessor architectures

User-Level Threads
- User-level threads are implemented by a thread library at the user level.
- Examples:
  - POSIX Pthreads, Mach C-threads, Solaris 2 UI-threads
- Advantages
  - Context switching among them is extremely fast
- Disadvantages
  - A thread that blocks while executing a system call can block the entire process.

Kernel-Level Threads
- Kernel-level threads are provided through a set of system calls similar to those for processes
- Examples
  - Windows 2000, Solaris 2, Tru64 UNIX
- Advantage
  - Blocking of a thread will not block its entire task.
- Disadvantage
  - The context-switching cost is a little higher because the kernel must do the switching.

Multithreading Models
- Many-to-One Model
  - Many user-level threads to one kernel thread
  - Advantage:
    - Efficiency
  - Disadvantage:
    - One blocking system call blocks all.
    - No parallelism for multiple processors
  - Example: Green threads for Solaris 2

Multithreading Models
- One-to-One Model
  - One user-level thread to one kernel thread
  - Advantage: one system call blocks only one thread.
  - Disadvantage: overhead in creating a kernel thread.
  - Example: Windows NT, Windows 2000, OS/2

Multithreading Models
- Many-to-Many Model
  - Many user-level threads to many kernel threads
  - Advantage:
    - A combination of parallelism and efficiency
  - Example: Solaris 2, IRIX, HP-UX, Tru64 UNIX

Threading Issues
- Fork and Exec System Calls
  - Fork: duplicate all threads, or create a duplicate with one thread?
  - Exec: replace the entire process, including all threads and LWPs.
  - Fork → exec?

Threading Issues
- Thread Cancellation
  - Target thread
  - Two scenarios:
    - Asynchronous cancellation
    - Deferred cancellation
      - Cancellation points in Pthreads.
  - Difficulty
    - Resources have been allocated to a cancelled thread.
    - A thread is cancelled while it is updating data.

Threading Issues
- Signal Handling
  - Signal
    - Synchronous – delivered to the same process that performed the operation causing the signal
      - e.g., illegal memory access or division by zero
    - Asynchronous
      - e.g., ^C or timer expiration
  - Default or user-defined signal handler
  - Signal masking
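A user-defined handler and signal masking can be sketched with the POSIX calls sigaction() and sigprocmask(). This is an illustration for a single-threaded process; the names `on_usr1` and `signal_masking_demo` are hypothetical, and SIGUSR1 is chosen only because it has no predefined meaning.

```c
#define _DEFAULT_SOURCE          /* expose POSIX declarations on glibc */
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t handled = 0;

/* User-defined handler: only records that the signal was delivered. */
static void on_usr1(int signo)
{
    (void)signo;
    handled = 1;
}

/* Returns 1 if SIGUSR1 stays pending while masked and is delivered to the
 * handler once unmasked; 0 otherwise; -1 if the handler cannot be installed. */
int signal_masking_demo(void)
{
    struct sigaction sa;
    sigset_t usr1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr1;             /* replace the default action */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) < 0)
        return -1;

    sigemptyset(&usr1);
    sigaddset(&usr1, SIGUSR1);

    handled = 0;
    sigprocmask(SIG_BLOCK, &usr1, NULL);     /* mask the signal */
    raise(SIGUSR1);                          /* now pending, not delivered */
    int while_masked = handled;

    sigprocmask(SIG_UNBLOCK, &usr1, NULL);   /* unmask: handler runs here */
    return while_masked == 0 && handled == 1;
}
```

In a multithreaded program the per-thread call pthread_sigmask() would be used instead of sigprocmask().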

Threading Issues
- Delivery of a Signal
  - To the thread to which the signal applies
    - e.g., division by zero
  - To every thread in the process
    - e.g., ^C
  - To certain threads in the process
  - Assign a specific thread to receive all signals for the process
    - Solaris 2
- Asynchronous Procedure Calls (APCs)
  - To a particular thread rather than a process

Threading Issues
- Thread Pools
  - Motivations
    - Dynamic creation of threads
    - Limit on the number of active threads
  - Awake and pass a request to a thread in the pool
  - Benefits
    - Faster service delivery and a limit on the number of threads
  - Dynamic or static thread pools
- Thread-specific data – Win32 & Pthreads

Pthreads
- Pthreads (IEEE 1003.1c)
  - API specification for thread creation and synchronization
  - UNIX-based systems, such as Solaris 2.
- User-Level Library
  - Header file: <pthread.h>
  - pthread_attr_init(), pthread_create(), pthread_exit(), pthread_join(), etc.
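The calls listed above can be combined as in the following sketch, which sums 1..upper in a separate thread. It is an illustration rather than the slides' own example: `struct sum_job` and `sum_in_thread` are hypothetical names, and the program may need to be linked with -pthread on some systems.

```c
#include <pthread.h>
#include <stddef.h>

struct sum_job {
    int  upper;                  /* input: sum the integers 1..upper */
    long sum;                    /* output: written by the thread */
};

/* Thread body: returning from it has the same effect as pthread_exit(). */
static void *runner(void *param)
{
    struct sum_job *job = param;
    job->sum = 0;
    for (int i = 1; i <= job->upper; i++)
        job->sum += i;
    return NULL;
}

/* Create a thread with default attributes, wait for it, return its result.
 * Returns -1 if thread creation or joining fails. */
long sum_in_thread(int upper)
{
    struct sum_job job = { upper, 0 };
    pthread_attr_t attr;
    pthread_t tid;

    if (pthread_attr_init(&attr) != 0)           /* default attributes */
        return -1;
    int rc = pthread_create(&tid, &attr, runner, &job);
    pthread_attr_destroy(&attr);
    if (rc != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)            /* wait for the thread */
        return -1;
    return job.sum;              /* safe to read: the thread has finished */
}
```

Passing a pointer to a struct lets the thread return its result without a global variable; pthread_join() guarantees the write is visible to the caller.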

Pthreads

    #include <pthread.h>
    main(int argc, char *argv[]) {
        …
        pthread_attr_init(&attr);
        pthread_create(&tid, &attr, runner, argv[1]);
        pthread_join(tid, NULL);
        …
    }
    void *runner(void *param) {
        int i, upper = atoi(param), sum = 0;
        if (upper > 0)
            for (i = 1; i …

… process priorities
  - Time: completion of higher-priority ISRs, context switch, disabling of certain interrupts, starting of the right ISR (urgent/low-level work, set events)
  - Usually done by preemptible threads
- Remark: reducing non-preemptible code, priority tracking/inheritance (LynxOS), etc.

A General Architecture
- Scheduler
  - A central part of the kernel
  - The scheduler is usually driven periodically by a clock interrupt, except when voluntary context switches occur – thread quantum?
- Timer Resolution
  - Tick size vs interrupt frequency
    - 10ms? 1ms? 1us? 1ns?
  - Fine-grained hardware clock

A General Architecture
- Memory Management
  - No protection for many embedded systems
  - Memory-locking to avoid paging
- Process Synchronization
  - Sources of priority inversion
    - Nonpreemptible code
      - Critical sections
    - A limited number of priority levels, etc.

Algorithm Evaluation
- A General Procedure
  - Select criteria that may include several measures, e.g., maximize CPU utilization while confining the maximum response time to 1 second
  - Evaluate various algorithms
- Evaluation Methods:
  - Deterministic modeling
  - Queueing models
  - Simulation
  - Implementation

Deterministic Modeling
• A typical type of analytic evaluation
  • Take a particular predetermined workload and define the performance of each algorithm for that workload
• Properties
  • Simple and fast
  • By working through a large number of examples, trends might be identified
  • But it needs exact numbers for inputs, and its answers apply only to those cases
    • Too specific, and requires too much exact knowledge, to be useful in general!

Deterministic Modeling

Process   CPU burst time
P1        10
P2        29
P3        3
P4        7
P5        12

FCFS
| P1 | P2 | P3 | P4 | P5 |
0    10   39   42   49   61
Average Waiting Time (AWT) = (0+10+39+42+49)/5 = 28

Nonpreemptive Shortest Job First
| P3 | P4 | P1 | P5 | P2 |
0    3    10   20   32   61
AWT = (10+32+0+3+20)/5 = 13

Round Robin (quantum = 10)
| P1 | P2 | P3 | P4 | P5 | P2 | P5 | P2 |
0    10   20   23   30   40   50   52   61
AWT = (0+(10+20+2)+20+23+(30+10))/5 = 23
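The three averages above can be reproduced mechanically. The following is a minimal sketch, assuming all five processes arrive at time 0 and are listed in arrival order; the function names are illustrative and do not come from the slides.

```c
#include <stdlib.h>

/* Average waiting time when jobs run to completion in the given order.
 * burst[k] is the CPU burst of the k-th job to run; all jobs arrive at time 0. */
double fcfs_average_waiting_time(const int *burst, int n) {
    int clock = 0;      /* time at which the next job starts */
    int total_wait = 0; /* sum of start times, i.e. of waiting times */
    for (int k = 0; k < n; k++) {
        total_wait += clock;
        clock += burst[k];
    }
    return (double)total_wait / n;
}

static int by_burst(const void *a, const void *b) {
    return *(const int *)a - *(const int *)b;
}

/* Nonpreemptive SJF with simultaneous arrivals is FCFS on the bursts sorted ascending. */
double sjf_average_waiting_time(const int *burst, int n) {
    int *sorted = malloc(n * sizeof *sorted);
    if (sorted == NULL) return -1.0;
    for (int k = 0; k < n; k++) sorted[k] = burst[k];
    qsort(sorted, n, sizeof *sorted, by_burst);
    double awt = fcfs_average_waiting_time(sorted, n);
    free(sorted);
    return awt;
}

/* Round robin: cycle through the unfinished jobs in index order, one quantum each.
 * A job's waiting time is its completion time minus its burst. */
double rr_average_waiting_time(const int *burst, int n, int quantum) {
    if (quantum <= 0) return -1.0;
    int *remaining = malloc(n * sizeof *remaining);
    if (remaining == NULL) return -1.0;
    int left = 0;
    for (int k = 0; k < n; k++) {
        remaining[k] = burst[k];
        if (burst[k] > 0) left++;
    }
    int clock = 0, total_wait = 0;
    while (left > 0) {
        for (int k = 0; k < n; k++) {
            if (remaining[k] == 0) continue;
            int run = remaining[k] < quantum ? remaining[k] : quantum;
            clock += run;
            remaining[k] -= run;
            if (remaining[k] == 0) {
                total_wait += clock - burst[k];
                left--;
            }
        }
    }
    free(remaining);
    return (double)total_wait / n;
}
```

Called with the bursts {10, 29, 3, 7, 12} and a quantum of 10, the three functions give the AWT values 28, 13, and 23 shown on the slide.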

Queueing Models
• Motivation:
  • Workloads vary, and there is no static set of processes
• Models (~ queueing-network analysis)
  • Workload:
    a. Arrival rate: the distribution of times when processes arrive.
    b. The distributions of CPU & I/O bursts
  • Service rate

Queueing Models
• Model a computer system as a network of servers. Each server has a queue of waiting processes.
  • Compute average queue length, waiting time, and so on.
• Properties:
  • Generally useful, but applicable only to limited classes of algorithms & distributions
  • Assumptions are made to make problems solvable => inaccurate results

Queueing Models
• Example: Little's formula

  n = λ × w   (steady state!)

  n = # of processes in the queue
  λ = arrival rate
  w = average waiting time in the queue

• If n = 14 & λ = 7 processes/sec, then w = n/λ = 2 seconds.
[Figure: a queue in steady state, with arrival rate λ, waiting time w, and departure rate λ.]

Simulation
• Motivation:
  • Get a more accurate evaluation.
• Procedures:
  • Program a model of the computer system
  • Drive the simulation with various data sets
    • Randomly generated according to some probability distributions => inaccuracy arises because a distribution captures only the occurrence frequency of events; it misses their order & relationships.
    • Trace tapes: monitor the real system & record the sequence of actual events.

Simulation
• Properties:
  • Accurate results can be obtained, but it can be expensive in terms of computation time and storage space.
  • The design, coding, and debugging of a simulator can be a big job.

Implementation
• Motivation:
  • Get more accurate results than a simulation!
• Procedure:
  • Code scheduling algorithms
  • Put them in the OS
  • Evaluate the real behaviors

Implementation
• Difficulties:
  • Cost of coding algorithms and modifying the OS
  • Reaction of users to a constantly changing OS
  • The environment in which algorithms are used will change
    • For example, users may adjust their behaviors according to the selected algorithms
  => Separation of policy and mechanism!

Process Scheduling Model
• Process local scheduling
  • E.g., those for user-level threads
  • Thread scheduling is done locally to each application.
• System global scheduling
  • E.g., those for kernel-level threads
  • The kernel decides which thread to run.

Process Scheduling Model – Solaris 2
• Priority-based process scheduling (classes listed from high to low priority)
  • Real-Time
  • System
    • Kernel-service processes
  • Time-Sharing
    • A default class
  • Interactive
• Each LWP inherits its class from its parent process

Process Scheduling Model – Solaris 2
• Real-Time
  • A guaranteed response
• System
  • The priorities of system processes are fixed.
• Time-Sharing
  • Multilevel feedback queue scheduling – priorities inversely proportional to time slices
• Interactive
  • Prefer windowing processes

Process Scheduling Model – Solaris 2
• The selected thread runs until one of the following occurs:
  • It blocks.
  • It uses up its time slice (if it is not a system thread).
  • It is preempted by a higher-priority thread.
• RR is used when several threads have the same priority.

Process Scheduling Model – Windows 2000
• Priority-based preemptive scheduling
  • Priority class/relationship: 0..31
  • Dispatcher: a process runs until
    • It is preempted by a higher-priority process.
    • It terminates.
    • Its time quantum ends.
    • It calls a blocking system call.
  • Idle thread
• A queue per priority level

Process Scheduling Model – Windows 2000
• Each thread has a base priority that represents a value in the priority range of its class.
• A typical class – Normal_Priority_Class
  • Time quantum – thread
    • Priority increased after some waiting
      • Different for I/O devices.
    • Priority decreased after some computation
      • The priority is never lowered below the base priority.
  • Favor foreground processes (more time quantum)

Process Scheduling Model – Windows 2000

Relative priority | Real-time | High | Above normal | Normal | Below normal | Idle priority
Time-critical     | 31        | 15   | 15           | 15     | 15           | 15
Highest           | 26        | 15   | 12           | 10     | 8            | 6
Above normal      | 25        | 14   | 11           | 9      | 7            | 5
Normal            | 24        | 13   | 10           | 8      | 6            | 4
Below normal      | 23        | 12   | 9            | 7      | 5            | 3
Lowest            | 22        | 11   | 8            | 6      | 4            | 2
Idle              | 16        | 1    | 1            | 1      | 1            | 1

• Columns are priority classes: the Real-time column is the Real-Time Class; the remaining columns form the Variable Class (1..15).
• The Normal column is a typical class; the Normal row gives the base priority of each class.

Process Scheduling Model – Linux
• Three classes (POSIX.1b)
  • Time-Sharing
  • Soft Real-Time: FCFS and RR
• Real-time scheduling algorithms
  • FCFS & RR always run the highest-priority process.
  • FCFS runs a process until it exits or blocks.
• No scheduling in the kernel space for conventional Linux

Process Scheduling Model – Linux
• A time-sharing algorithm for fairness
  • credits = (credits / 2) + priority
    • Recrediting happens when no runnable process has any credits.
    • A mixture of a process's history and its priority
  • Favors interactive or I/O-bound processes
• Background processes can be given lower priorities so that they receive fewer credits.
  • nice in UNIX
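The recrediting rule above explains why I/O-bound processes are favored: a process that is blocked keeps its unused credits, and repeated recrediting pushes them toward (but never to) twice its priority. A minimal sketch of the arithmetic, assuming integer credits and hypothetical function names:

```c
/* One recrediting step: credits = credits / 2 + priority (integer arithmetic). */
int recredit(int credits, int priority) {
    return credits / 2 + priority;
}

/* Credits of a process that stays blocked (spends nothing) across
 * `rounds` consecutive recrediting steps. */
int credits_after_rounds(int credits, int priority, int rounds) {
    for (int r = 0; r < rounds; r++)
        credits = recredit(credits, priority);
    return credits;
}
```

With priority 20, a process starting from 0 credits goes 20, 30, 35, 37, 38, 39 and then stays at 39, while a CPU-bound process that burns all its credits restarts from 20 at every recrediting.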


Chapter 7 Process Synchronization


Process Synchronization
• Why synchronization?
  • To ensure data consistency for concurrent access to shared data!
• Contents:
  • Various mechanisms to ensure the orderly execution of cooperating processes

Process Synchronization
• A Consumer-Producer Example
• Producer:

while (1) {
    while (counter == BUFFER_SIZE) ;
    produce an item in nextp;
    ….
    buffer[in] = nextp;
    in = (in+1) % BUFFER_SIZE;
    counter++;
}

• Consumer:

while (1) {
    while (counter == 0) ;
    nextc = buffer[out];
    out = (out+1) % BUFFER_SIZE;
    counter--;
    consume an item in nextc;
}

Process Synchronization
• counter++ vs counter--

  counter++:            counter--:
  r1 = counter          r2 = counter
  r1 = r1 + 1           r2 = r2 - 1
  counter = r1          counter = r2

• Initially, let counter = 5.
  1. P: r1 = counter
  2. P: r1 = r1 + 1
  3. C: r2 = counter
  4. C: r2 = r2 - 1
  5. P: counter = r1
  6. C: counter = r2

A Race Condition! (counter ends up as 4 instead of 5.)
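To see how the final value depends on the interleaving, the six register-level steps can be replayed deterministically. This is a small single-threaded sketch, not real concurrent code; `run_interleaving` and its schedule-string encoding are assumptions made for illustration.

```c
/* Replay an interleaving of the producer's counter++ (P) and the consumer's
 * counter-- (C), each split into its three register-level steps.
 * schedule is a string of 'P' and 'C'; each letter runs that process's next step. */
int run_interleaving(int counter, const char *schedule) {
    int r1 = 0, r2 = 0;          /* private registers of P and C */
    int p_step = 0, c_step = 0;  /* next step of each process */
    for (const char *s = schedule; *s != '\0'; s++) {
        if (*s == 'P') {
            if (p_step == 0) r1 = counter;        /* load */
            else if (p_step == 1) r1 = r1 + 1;    /* increment */
            else if (p_step == 2) counter = r1;   /* store */
            p_step++;
        } else if (*s == 'C') {
            if (c_step == 0) r2 = counter;        /* load */
            else if (c_step == 1) r2 = r2 - 1;    /* decrement */
            else if (c_step == 2) counter = r2;   /* store */
            c_step++;
        }
    }
    return counter;
}
```

Starting from 5, the schedule "PPCCPC" (the order on the slide) yields 4, "PPCCCP" yields 6, and the serial schedule "PPPCCC" yields the correct value 5.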

Process Synchronization
• A race condition:
  • A situation where the outcome of the execution depends on the particular order of process scheduling.
• The critical-section problem:
  • Design a protocol that processes can use to cooperate.
  • Each process has a segment of code, called a critical section, whose execution must be mutually exclusive.

Process Synchronization
• A general structure for the critical-section problem

do {
    entry section;        (permission request)
    critical section;
    exit section;         (exit notification)
    remainder section;
} while (1);

The Critical-Section Problem
• Three requirements
  1. Mutual Exclusion
     a. Only one process can be in its critical section.
  2. Progress
     a. Only processes not in their remainder section can decide which will enter its critical section.
     b. The selection cannot be postponed indefinitely.
  3. Bounded Waiting
     a. A waiting process only waits for a bounded number of processes to enter their critical sections.

The Critical-Section Problem – A Two-Process Solution
• Notation
  • Processes Pi and Pj, where j=1-i;
• Assumption
  • Every basic machine-language instruction is atomic.
• Algorithm 1
  • Idea: Remember which process is allowed to enter its critical section. That is, process i can enter its critical section if turn == i.

do {
    while (turn != i) ;
    critical section
    turn=j;
    remainder section
} while (1);

The Critical-Section Problem – A Two-Process Solution
• Algorithm 1 fails the progress requirement:
[Figure: timelines of P0 and P1. With turn=0, P0 enters and exits its critical section and sets turn=1; P0 then suspends or quits. P1 enters, exits, and sets turn=0. On its next attempt, P1 is blocked in its entry section.]

The Critical-Section Problem – A Two-Process Solution
• Algorithm 2
  • Initially, flag[0]=flag[1]=false
  • Idea: Remember the state of each process.
    • flag[i]==true → Pi is ready to enter its critical section.
  • Algorithm 2 fails the progress requirement when flag[0]==flag[1]==true;
    • the exact timing of the two processes?

do {
    flag[i]=true;
    while (flag[j]) ;
    critical section
    flag[i]=false;
    remainder section
} while (1);

* Consider the switching of "flag[i]=true" and "while (flag[j]);".

The Critical-Section Problem – A Two-Process Solution
• Algorithm 3
  • Idea: Combine the ideas of Algorithms 1 and 2
  • When (flag[i] && turn==i), Pj must wait.
  • Initially, flag[0]=flag[1]=false, and turn = 0 or 1

do {
    flag[i]=true;
    turn=j;
    while (flag[j] && turn==j) ;
    critical section
    flag[i]=false;
    remainder section
} while (1);
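One way to gain confidence in Algorithm 3 is to enumerate every interleaving of its steps. The sketch below is an assumption-laden model, not the slides' code: each statement is treated as one atomic step (matching the atomicity assumption stated earlier), the remainder section is folded into the exit step, and all names are hypothetical. It explores all reachable states and reports whether both processes can ever be in the critical section together.

```c
#include <stdbool.h>

/* One state of the two-process model: a program counter per process
 * plus the shared variables flag[] and turn. */
typedef struct { int pc[2]; bool flag[2]; int turn; } State;

enum { SET_FLAG, SET_TURN, SPIN, CRITICAL };

/* Execute one atomic step of process i in state s. */
static State step(State s, int i) {
    int j = 1 - i;
    switch (s.pc[i]) {
    case SET_FLAG: s.flag[i] = true; s.pc[i] = SET_TURN; break;
    case SET_TURN: s.turn = j;       s.pc[i] = SPIN;     break;
    case SPIN:     /* while (flag[j] && turn == j) ; */
        if (!(s.flag[j] && s.turn == j)) s.pc[i] = CRITICAL;
        break;
    case CRITICAL: /* leave: flag[i] = false, then start over */
        s.flag[i] = false; s.pc[i] = SET_FLAG; break;
    }
    return s;
}

/* Pack a state into an index in 0..127. */
static int encode(State s) {
    return ((((s.pc[0] * 4 + s.pc[1]) * 2 + s.flag[0]) * 2 + s.flag[1]) * 2) + s.turn;
}

/* Returns true if no reachable state has both processes in the critical section. */
bool peterson_mutual_exclusion_holds(void) {
    bool seen[128] = { false };
    State stack[128];  /* each state is pushed at most once */
    int top = 0;
    for (int t = 0; t < 2; t++) {  /* turn may start as 0 or 1 */
        State init = { {SET_FLAG, SET_FLAG}, {false, false}, t };
        seen[encode(init)] = true;
        stack[top++] = init;
    }
    while (top > 0) {
        State s = stack[--top];
        if (s.pc[0] == CRITICAL && s.pc[1] == CRITICAL) return false;
        for (int i = 0; i < 2; i++) {
            State next = step(s, i);
            int k = encode(next);
            if (!seen[k]) { seen[k] = true; stack[top++] = next; }
        }
    }
    return true;
}
```

The check only covers the mutual-exclusion requirement under the atomic-step assumption; on real hardware the same algorithm additionally needs memory-ordering guarantees that this model does not represent.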

The Critical-Section Problem – A Two-Process Solution
• Properties of Algorithm 3
  • Mutual Exclusion
    • The eventual value of turn determines which process enters the critical section.
  • Progress
    • A process can only be stuck in the while loop, and the process that keeps it waiting must be in its critical section.
  • Bounded Waiting
    • Each process waits for at most one entry by the other process.

The Critical-Section Problem – A Multiple-Process Solution
• Bakery Algorithm
  • Originally designed for distributed systems
  • Processes that are ready to enter their critical section must take a number and wait until the number becomes the lowest.
  • int number[i]: Pi's number if it is nonzero.
  • boolean choosing[i]: Pi is taking a number.

The Critical-Section Problem – A Multiple-Process Solution

do {
    choosing[i]=true;
    number[i]=max(number[0], …, number[n-1])+1;
    choosing[i]=false;
    for (j=0; j < n; j++)
        while (choosing[j]) ;
        while (number[j] != 0 && (number[j],j) …

[…]

… > 0) signal(next); else signal(mutex);

Monitor – Implementation by Semaphores
• For every condition x
  • A semaphore x-sem
  • An integer variable x-count
• Implementation of x.wait() and x.signal():

x.wait():
    x-count++;
    if (next-count > 0) signal(next);
    else signal(mutex);
    wait(x-sem);
    x-count--;

x.signal():
    if (x-count > 0) {
        next-count++;
        signal(x-sem);
        wait(next);
        next-count--;
    }

* x.wait() and x.signal() are invoked within a monitor.

Monitor
• Process-resumption order
  • Queuing mechanisms for a monitor and its condition variables.
  • A solution: x.wait(c);
    • where the expression c is evaluated to determine its process's resumption order.

monitor ResAllc {
    boolean busy;
    condition x;
    void acquire(int time) {
        if (busy) x.wait(time);
        busy=true;
    }
    …
}

R.acquire(t);
    …
    access the resource;
R.release;

Monitor
• Concerns:
  • Processes may access resources without consulting the monitor.
  • Processes may never release resources.
  • Processes may release resources that they never requested.
  • Processes may even request resources twice.

Monitor
• Remark: Is the monitor used correctly? => Requirements for correct computations
  • Processes always make their calls on the monitor in the correct order.
  • No uncooperative process can access a resource directly without using the access protocols.
• Note: Scheduling behavior should consult the built-in monitor scheduling algorithm if resource access RPC are built inside the monitor.

OS Synchronization – Solaris 2
• Semaphores and condition variables
• Adaptive mutex
  • Spin-locking if the lock-holding thread is running; otherwise, blocking is used.
• Readers-writers locks
  • Expensive to implement.
• Turnstile
  • A queue structure containing threads blocked on a lock.
  • Priority inversion → priority-inheritance protocol for kernel threads

OS Synchronization – Windows 2000
• General mechanism
  • Spin-locking for short code segments on a multiprocessor platform.
  • Interrupt disabling when global variables are accessed on a uniprocessor platform.
• Dispatcher object
  • State: signaled or non-signaled
  • Mutex – select one process from its waiting queue to move to the ready queue.
  • Events – select all processes waiting for the event.

Atomic Transactions
• Why atomic transactions?
  • Critical sections ensure mutual exclusion in data sharing, but the relationship between critical sections might also be meaningful! → Atomic transactions
• Operating systems can be viewed as manipulators of data!

Atomic Transactions – System Model
• Transaction – a logical unit of computation
  • A sequence of read and write operations followed by a commit or an abort.
• Beyond "critical sections"
  1. Atomicity: all or nothing
     • An aborted transaction must be rolled back.
     • The effect of a committed transaction must persist and be imposed as a logical unit of operations.

Atomic Transactions – System Model
  2. Serializability:
     • The order of transaction executions must be equivalent to a serial schedule.

T0        T1
R(A)
W(A)
          R(A)
          W(A)
R(B)
W(B)
          R(B)
          W(B)

Two operations Oi & Oj conflict if
  1. They access the same object
  2. One of them is a write

Atomic Transactions – System Model
• Conflict serializable:
  • S is conflict serializable if S can be transformed into a serial schedule by swapping nonconflicting operations.

Schedule S                 Serial schedule
T0        T1               T0        T1
R(A)                       R(A)
W(A)                       W(A)
          R(A)             R(B)
          W(A)             W(B)
R(B)                                 R(A)
W(B)                                 W(A)
          R(B)                       R(B)
          W(B)                       W(B)

Atomic Transactions – Concurrency Control
• Locking protocols
  • Lock modes (a general approach!)
    1. Shared mode: "Reads".
    2. Exclusive mode: "Reads" & "Writes"
  • General rule
    • A transaction must receive a lock of an appropriate mode on an object before it accesses the object. The lock may not be released until the last access of the object is done.

Atomic Transactions – Concurrency Control
[Flowchart: handling a lock request]
  1. Lock request → Locked?
     • No → the lock is granted.
     • Yes → is the request compatible with the current lock?
       • Yes → the lock is granted.
       • No → WAIT.
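The decision in the flowchart above reduces to a compatibility test between the mode currently held and the mode requested. A minimal sketch, assuming the usual rule that only shared locks are compatible with each other; the type and function names are hypothetical.

```c
#include <stdbool.h>

typedef enum { LOCK_NONE, LOCK_SHARED, LOCK_EXCLUSIVE } LockMode;

/* Returns true if a request for `requested` can be granted immediately while
 * other transactions hold the object in mode `held`; false means WAIT. */
bool lock_compatible(LockMode held, LockMode requested) {
    if (held == LOCK_NONE)
        return true;                       /* not locked: grant */
    /* Locked: only a shared request against a shared lock is compatible. */
    return held == LOCK_SHARED && requested == LOCK_SHARED;
}
```

For example, a second reader is admitted alongside an existing reader, whereas any request against an exclusive lock, and an exclusive request against a shared lock, must wait.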

Atomic Transactions – Concurrency Control
• When to release locks w/o violating serializability
  • R0(A) W0(A) R1(A) R1(B) R0(B) W0(B)
• Two-Phase Locking Protocol (2PL) – not deadlock-free
  • Growing phase, followed by a shrinking phase
  • [Figure: the set of 2PL schedules is contained in the set of serializable schedules.]
• How to improve 2PL?
  • Semantics, order of data, access pattern, etc.

Atomic Transactions – Concurrency Control
• Timestamp-based protocols
  • A time stamp TS(Ti) for each transaction
  • Determine the transactions' order in a schedule in advance!
• A general approach:
  • TS(Ti) – system clock or logical counter
    • Unique?
  • Scheduling scheme – deadlock-free & serializable
    • $W\text{-timestamp}(Q) = \max_{T_i \text{ executed } W(Q)} TS(T_i)$
    • $R\text{-timestamp}(Q) = \max_{T_i \text{ executed } R(Q)} TS(T_i)$

Atomic Transactions – Concurrency Control
• R(Q) requested by Ti → check TS(Ti)!
  • Rejected if TS(Ti) is earlier than W-timestamp(Q); granted otherwise.
• W(Q) requested by Ti → check TS(Ti)!
  • Rejected if TS(Ti) is earlier than R-timestamp(Q) or earlier than W-timestamp(Q); granted otherwise.
• Rejected transactions are rolled back and restarted with a new time stamp.
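The checks for R(Q) and W(Q) can be written as two small functions over the timestamps kept for a data item. This is a sketch of the basic timestamp-ordering rules under the assumption that timestamps are plain integers; the `Item` struct and function names are illustrative only.

```c
#include <stdbool.h>

typedef struct {
    int r_timestamp;  /* largest TS of any transaction that read Q */
    int w_timestamp;  /* largest TS of any transaction that wrote Q */
} Item;

/* R(Q) by a transaction with timestamp ts: rejected if Q was already
 * written by a transaction with a later timestamp. */
bool ts_read(Item *q, int ts) {
    if (ts < q->w_timestamp) return false;         /* reject: roll back */
    if (ts > q->r_timestamp) q->r_timestamp = ts;  /* keep the maximum */
    return true;
}

/* W(Q) by a transaction with timestamp ts: rejected if Q was already
 * read or written by a transaction with a later timestamp. */
bool ts_write(Item *q, int ts) {
    if (ts < q->r_timestamp || ts < q->w_timestamp) return false;
    q->w_timestamp = ts;
    return true;
}
```

A rejected caller would then be rolled back and restarted with a new, larger timestamp, as the slide states; that restart logic is outside this sketch.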

Failure Recovery – A Way to Achieve Atomicity
• Failures of volatile and nonvolatile storage!
  • Volatile storage: memory and cache
  • Nonvolatile storage: disks, magnetic tape, etc.
  • Stable storage: storage that never fails.
• Log-based recovery
  • Write-ahead logging
  • Log records
    • < Ti starts >
    • < Ti commits >
    • < Ti aborts >
    • < Ti, Data-Item-Name, Old-Value, New-Value >

Failure Recovery
• Two basic recovery procedures:
  • undo(Ti): restore the old values of the data updated by Ti
  • redo(Ti): set the data updated by Ti to the new values
  • Operations must be idempotent!
[Figure: time line with a crash followed by a restart.]
• Recover the system when a failure occurs:
  • "Redo" committed transactions, and "undo" aborted transactions.

Failure Recovery
• Why checkpointing?
  • Without it, all log entries must be scanned and rerun to redo committed transactions.
• Checkpoint
  • Output all log records, output the DB, and write a checkpoint record to stable storage!
  • Commit: a force-write procedure
[Figure: time line with a checkpoint followed by a crash.]


Chapter 8 Deadlocks


Deadlocks
• A set of processes is in a deadlock state when every process in the set is waiting for an event that can be caused only by another process in the set.
• A system model
  • Competing processes – distributed?
  • Resources:
    • Physical resources, e.g., CPU, printers, memory, etc.
    • Logical resources, e.g., files, semaphores, etc.

Deadlocks
• A normal sequence
  1. Request: granted or rejected
  2. Use
  3. Release
• Remarks
  • No request should exceed the system capacity!
  • Deadlock can involve different resource types!
    • Several instances of the same type!

Deadlock Characterization
• Necessary conditions
  (deadlock → conditions, or ¬ conditions → ¬ deadlock)
  1. Mutual Exclusion – At least one resource must be held in a nonsharable mode!
  2. Hold and Wait – Pi is holding at least one resource and waiting to acquire additional resources that are currently held by other processes!

Deadlock Characterization
  3. No Preemption – Resources are nonpreemptible!
  4. Circular Wait – There exists a set {P0, P1, …, Pn} of waiting processes such that P0 waits for P1, P1 waits for P2, …, Pn-1 waits for Pn, and Pn waits for P0.
• Remark:
  • Condition 4 implies Condition 2.
  • The four conditions are not completely independent!

Resource Allocation Graph
• System resource-allocation graph
  • Vertices
    • Processes: {P1, …, Pn}
    • Resource types: {R1, …, Rm}
  • Edges
    • Request edge: Pi → Rj
    • Assignment edge: Ri → Pj
[Figure: a resource-allocation graph with processes P1, P2, P3 and resource types R1, R2, R3, R4.]

Resource Allocation Graph
• Example
  • No deadlock
  • Vertices
    • P = { P1, P2, P3 }
    • R = { R1, R2, R3, R4 }
  • Edges
    • E = { P1→R1, P2→R3, R1→P2, R2→P2, R2→P1, R3→P3 }
  • Resources
    • R1:1, R2:2, R3:1, R4:3
  • → results in a deadlock.
[Figure: the resource-allocation graph for this example, with P1, P2, P3 and R1, R2, R3, R4.]

Resource Allocation Graph
• Observation
  • Does the existence of a cycle imply a deadlock?
    • One instance per resource type → Yes!!
    • Otherwise → only a necessary condition!!
[Figure: a resource-allocation graph with R1, R2 and P1, P2, P3, P4.]
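Checking for a cycle in a resource-allocation graph is an ordinary directed-graph search. The sketch below uses a depth-first search over an adjacency matrix; the vertex numbering (0..2 for P1..P3, 3..6 for R1..R4) and all identifiers are assumptions for illustration. The edge set is the one listed in the example slide.

```c
#include <stdbool.h>

#define MAX_NODES 16

typedef struct {
    int n;                            /* number of vertices (processes + resource types) */
    bool edge[MAX_NODES][MAX_NODES];  /* edge[u][v]: request or assignment edge u -> v */
} Graph;

/* color: 0 = unvisited, 1 = on the current DFS path, 2 = finished */
static bool dfs(const Graph *g, int u, int *color) {
    color[u] = 1;
    for (int v = 0; v < g->n; v++) {
        if (!g->edge[u][v]) continue;
        if (color[v] == 1) return true;  /* back edge closes a cycle */
        if (color[v] == 0 && dfs(g, v, color)) return true;
    }
    color[u] = 2;
    return false;
}

bool has_cycle(const Graph *g) {
    int color[MAX_NODES] = { 0 };
    for (int u = 0; u < g->n; u++)
        if (color[u] == 0 && dfs(g, u, color)) return true;
    return false;
}

/* The example edge set: vertices 0..2 are P1..P3, vertices 3..6 are R1..R4. */
Graph example_graph(void) {
    Graph g = { .n = 7 };
    g.edge[0][3] = true;  /* P1 -> R1 */
    g.edge[1][5] = true;  /* P2 -> R3 */
    g.edge[3][1] = true;  /* R1 -> P2 */
    g.edge[4][1] = true;  /* R2 -> P2 */
    g.edge[4][0] = true;  /* R2 -> P1 */
    g.edge[5][2] = true;  /* R3 -> P3 */
    return g;
}
```

The example graph has no cycle. Adding a request edge P3 → R2 (`g.edge[2][4] = true`, an edge chosen here purely for illustration) closes the cycle P3 → R2 → P2 → R3 → P3. Because R2 has two instances, a cycle found this way is only a necessary condition for deadlock, as the observation above notes.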

Methods for Handling Deadlocks
• Solutions:
  1. Make sure that the system never enters a deadlock state!
     • Deadlock prevention: fail at least one of the necessary conditions
     • Deadlock avoidance: processes provide information regarding their resource usage. Make sure that the system always stays in a "safe" state!

Methods for Handling Deadlocks
  2. Do recovery if the system is deadlocked.
     • Deadlock detection
     • Recovery
  3. Ignore the possibility of deadlock occurrences!
     • Restart the system "manually" if the system "seems" to be deadlocked or stops functioning.
     • Note that the system may be "frozen" temporarily!

Deadlock Prevention
• Observation:
  • Try to fail any one of the necessary conditions!
    ∵ ¬ (∧ i-th condition) → ¬ deadlock
• Mutual Exclusion
  ?? Some resources, such as a printer, are intrinsically non-sharable ??

Deadlock Prevention
• Hold and Wait
  • Acquire all needed resources before execution.
  • Release allocated resources before requesting additional resources!
  • Example: a job that first copies [ Tape Drive → Disk ] and then uses [ Disk & Printer ] can either hold them all, or request Tape Drive & Disk first, release them, and then request Disk & Printer.
• Disadvantages:
  • Low resource utilization
  • Starvation

Deadlock Prevention
• No Preemption
  • Resource preemption causes the release of resources.
  • Related protocols are applied only to resources whose states can be saved and restored, e.g., CPU registers & memory space, instead of printers or tape drives.
• Approach 1:
  • Resource request → satisfied?
    • Yes → granted.
    • No → allocated resources are released.

Deadlock Prevention
• Approach 2:
  • Resource request → satisfied?
    • Yes → granted.
    • No → are the requested resources held by "waiting" processes?
      • Yes → preempt those resources.
      • No → "wait", and its allocated resources may be preempted.

Deadlock Prevention
• Circular Wait
  • A resource-ordering approach: F : R → N
  • Resource requests must be made in an increasing order of enumeration.
• Type 1 – strictly increasing order of resource requests.
  • Initially, order any # of instances of Ri
  • Following requests of any # of instances of Rj must satisfy F(Rj) > F(Ri), and so on.
  * A single request must be issued for all needed instances of the same resource.

Deadlock Prevention
• Type 2
  • Processes must release all Ri's when they request any instance of Rj if F(Ri) ≥ F(Rj)
• F : R → N must be defined according to the normal order of resource usage in a system, e.g.,
    F(tape drive) = 1
    F(disk drive) = 5
    F(printer) = 12
  ?? feasible ??

Deadlock Avoidance
• Motivation:
  • Deadlock-prevention algorithms can cause low device utilization and reduced system throughput!
    ⇒ Acquire additional information about how resources are to be requested, and make better resource-allocation decisions!
  • Each process declares the maximum number of resources of each type that it may need.

Deadlock Avoidance
• A simple model
  • A resource-allocation state
• A deadlock-avoidance algorithm dynamically examines the resource-allocation state and makes sure that it is safe.
  • e.g., the system never satisfies the circular-wait condition.
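The safety test that such an algorithm performs can be sketched as a search for an order in which every process's remaining need fits into the currently available resources plus whatever the processes ahead of it will release. The code below is a minimal illustration with fixed-size arrays; the bounds `MAX_PROC` and `MAX_RES`, the function name, and the numbers in the usage note are assumptions, not values from the slides.

```c
#include <stdbool.h>

#define MAX_PROC 8
#define MAX_RES  4

/* Looks for a safe sequence among n processes and m resource types.
 * need[p][r]      : what process p may still request of type r
 * allocated[p][r] : what process p currently holds of type r
 * On success, returns true and stores the sequence in order[0..n-1]. */
bool find_safe_sequence(int n, int m,
                        const int available[MAX_RES],
                        int allocated[MAX_PROC][MAX_RES],
                        int need[MAX_PROC][MAX_RES],
                        int order[MAX_PROC]) {
    int work[MAX_RES];                   /* resources free at this point of the sequence */
    bool finished[MAX_PROC] = { false };
    for (int r = 0; r < m; r++) work[r] = available[r];

    int count = 0;
    while (count < n) {
        bool progressed = false;
        for (int p = 0; p < n; p++) {
            if (finished[p]) continue;
            bool fits = true;
            for (int r = 0; r < m; r++)
                if (need[p][r] > work[r]) { fits = false; break; }
            if (!fits) continue;
            /* p can run to completion and then releases everything it holds. */
            for (int r = 0; r < m; r++) work[r] += allocated[p][r];
            finished[p] = true;
            order[count++] = p;
            progressed = true;
        }
        if (!progressed) return false;   /* nobody can finish: the state is unsafe */
    }
    return true;
}
```

As an illustration with one resource type and 3 free units: if three processes hold 5, 2, and 2 units and may still need 5, 2, and 7 more, the search finds the sequence (process 1, process 0, process 2). If the third process instead holds 3 with a remaining need of 6 and only 2 units are free, no safe sequence exists.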

Deadlock Avoidance
• Safe sequence
  • A sequence of processes <P1, P2, …, Pn> is a safe sequence if

    $\forall P_i:\ \mathit{need}(P_i) \le \mathit{Available} + \sum_{j < i} \mathit{allocated}(P_j)$

[…]

… 10 spare frames
• Most of the time, the average memory usage is close to the physical memory size if we increase a system's multiprogramming level!

Page Replacement
• Q: Should we run the 7th process?
  • What if the six processes start to ask for their shares?
• What to do if all memory is in use, and more memory is needed?
• Answers
  • Kill a user process!
    • But shouldn't paging be transparent to users?
  • Swap out a process!
  • Do page replacement!

Page Replacement
• A page-fault service
  • Find the desired page on the disk!
  • Find a free frame
    • Select a victim and write the victim page out when there is no free frame!
  • Read the desired page into the selected frame.
  • Update the page and frame tables, and restart the user process.

Page Replacement
[Figure: page replacement example. Logical memories of P1 (pages H, Load M, J; the PC points at "Load M") and P2 (pages A, B, D, E), their page tables with frame numbers and valid/invalid bits, physical memory (frames holding OS, OS, D, H, M/B, J, A, E), and the disk holding B and M.]

Page Replacement
• Two page transfers per page fault if no frame is available!

Page table
Frame #   Valid-invalid bit   Modify bit
6         V                   N
4         V                   N
3         V                   Y
7         V                   Y

• Modify (dirty) bit: set by the hardware automatically!
• To "eliminate" the "swap out" => reduce I/O time by one-half

Page Replacement
• Two major pieces for demand paging
  • Frame-allocation algorithms
    • How many frames are allocated to a process?
  • Page-replacement algorithms
    • When page replacement is required, select the frame that is to be replaced!
• Goal: a low page-fault rate!
• Note that a bad replacement choice does not cause any incorrect execution!

Page Replacement Algorithms
• Evaluation of algorithms
  • Calculate the number of page faults on strings of memory references, called reference strings, for a set of algorithms
• Sources of reference strings
  • Reference strings are generated artificially.
  • Reference strings are recorded as system traces:
    • How to reduce the amount of data?

Page Replacement Algorithms
• Two observations to reduce the amount of data:
  • Consider only the page numbers if the page size is fixed.
    • Reduce memory references into page references
  • If a page p is referenced, any immediately following references to page p will never cause a page fault.
    • Reduce consecutive page references of page p into one page reference.

Page Replacement Algorithms
• Example (each address XX XX = page# offset)
  Memory references: 0100, 0432, 0101, 0612, 0103, 0104, 0101, 0611
  Page references:   01, 04, 01, 06, 01, 01, 01, 06
  Reduced string:    01, 04, 01, 06, 01, 06

Does the number of page faults decrease when the number of available page frames increases?
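Both reductions in the example (address to page number, then collapsing immediate repeats) fit in one pass. A minimal sketch, assuming decimal addresses and a page size of 100 as in the example; `reduce_references` is a hypothetical name.

```c
/* Convert addresses into page numbers and drop immediately repeated pages.
 * Writes the reduced reference string to pages[] (capacity >= n) and
 * returns its length. */
int reduce_references(const int *addr, int n, int page_size, int *pages) {
    int len = 0;
    for (int k = 0; k < n; k++) {
        int page = addr[k] / page_size;  /* keep only the page number */
        if (len == 0 || pages[len - 1] != page)
            pages[len++] = page;         /* skip consecutive duplicates */
    }
    return len;
}
```

For the addresses 100, 432, 101, 612, 103, 104, 101, 611 (written without the leading zeros, which C would read as octal), the result is the six-entry string 1, 4, 1, 6, 1, 6.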

FIFO Algorithm
• A FIFO implementation
  1. Each page is given a time stamp when it is brought into memory.
  2. Select the oldest page for replacement!

Reference string: 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1 (3 page frames)

FIFO queue after each page fault (oldest page first):
7 | 7 0 | 7 0 1 | 0 1 2 | 1 2 3 | 2 3 0 | 3 0 4 | 0 4 2 | 4 2 3 | 2 3 0 | 3 0 1 | 0 1 2 | 1 2 7 | 2 7 0 | 7 0 1

FIFO Algorithm
• The idea behind FIFO
  • The oldest page is unlikely to be used again.
  ?? Should we save the page which will be used in the near future ??
• Belady's anomaly
  • For some page-replacement algorithms, the page-fault rate may increase as the number of allocated frames increases.

FIFO Algorithm
Run the FIFO algorithm on the following reference string:

Reference   1   2    3      4        1        2        5        1        2        3        4        5
3 frames    1   1 2  1 2 3  2 3 4    3 4 1    4 1 2    1 2 5    1 2 5    1 2 5    2 5 3    5 3 4    5 3 4
4 frames    1   1 2  1 2 3  1 2 3 4  1 2 3 4  1 2 3 4  2 3 4 5  3 4 5 1  4 5 1 2  5 1 2 3  1 2 3 4  2 3 4 5

(Each entry is the FIFO queue after the reference, oldest page first: 9 page faults with 3 frames, 10 with 4 frames.)

Push out pages that will be used later!
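Belady's anomaly on this reference string can be checked with a short FIFO simulation. The sketch below keeps the resident pages in a circular buffer; `fifo_page_faults` and `MAX_FRAMES` are illustrative names and limits, not part of the slides.

```c
#define MAX_FRAMES 16

/* Counts page faults for FIFO replacement with `frames` page frames.
 * Returns -1 if `frames` is outside 1..MAX_FRAMES. */
int fifo_page_faults(const int *refs, int n, int frames) {
    int frame[MAX_FRAMES];
    int used = 0;    /* frames currently holding a page */
    int oldest = 0;  /* index of the page that was loaded first */
    int faults = 0;
    if (frames < 1 || frames > MAX_FRAMES) return -1;
    for (int k = 0; k < n; k++) {
        int hit = 0;
        for (int f = 0; f < used; f++)
            if (frame[f] == refs[k]) { hit = 1; break; }
        if (hit) continue;
        faults++;
        if (used < frames) {
            frame[used++] = refs[k];          /* a free frame is available */
        } else {
            frame[oldest] = refs[k];          /* replace the oldest page */
            oldest = (oldest + 1) % frames;   /* the next-oldest is now first in line */
        }
    }
    return faults;
}
```

On the string 1 2 3 4 1 2 5 1 2 3 4 5 this counts 9 faults with 3 frames and 10 faults with 4 frames, which is the anomaly; on the 20-reference string used in the FIFO example it counts 15 faults with 3 frames.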

Optimal Algorithm (OPT)
• Optimality
  • One with the lowest page-fault rate.
• Replace the page that will not be used for the longest period of time. ↔ Future prediction

Reference string: 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1 (3 page frames)

Page frames after each page fault:
7 | 7 0 | 7 0 1 | 2 0 1 | 2 0 3 | 2 4 3 | 2 0 3 | 2 0 1 | 7 0 1

[Figure annotations: "next 0", "next 7", and "next 1" mark the next use of the resident pages when a victim is chosen.]

Least-Recently-Used Algorithm (LRU)
• The idea:
  • OPT concerns when a page is to be used!
  • "Don't have knowledge about the future"?!
• Use the history of page referencing in the past to predict the future!
  S ? S^R (S^R is the reverse of S!)

LRU Algorithm
• Example

Reference string: 7 0 1 2 0 3 0 4 2 3 0 3 2 1 2 0 1 7 0 1 (3 page frames)

LRU queue after each reference (most recently used page first):
7 | 0 7 | 1 0 7 | 2 1 0 | 0 2 1 | 3 0 2 | 0 3 2 | 4 0 3 | 2 4 0 | 3 2 4 | 0 3 2 | 3 0 2 | 2 3 0 | 1 2 3 | 2 1 3 | 0 2 1 | 1 0 2 | 7 1 0 | 0 7 1 | 1 0 7

[Figure annotation: "a wrong prediction!"]

Remark: LRU is like OPT which "looks backward" in time.
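The LRU example can be replayed with a simulation that records the time of last use for every resident page, which is essentially the counter-based implementation idea. This is a sketch with hypothetical names (`lru_page_faults`, `MAX_FRAMES`); it searches linearly for the victim rather than using any hardware support.

```c
#define MAX_FRAMES 16

/* Counts page faults for LRU replacement with `frames` page frames.
 * Returns -1 if `frames` is outside 1..MAX_FRAMES. */
int lru_page_faults(const int *refs, int n, int frames) {
    int page[MAX_FRAMES];      /* page held by each frame */
    int last_use[MAX_FRAMES];  /* index of the most recent reference to that page */
    int used = 0, faults = 0;
    if (frames < 1 || frames > MAX_FRAMES) return -1;
    for (int t = 0; t < n; t++) {
        int slot = -1;
        for (int f = 0; f < used; f++)
            if (page[f] == refs[t]) { slot = f; break; }
        if (slot < 0) {                      /* page fault */
            faults++;
            if (used < frames) {
                slot = used++;               /* a free frame is available */
            } else {
                slot = 0;                    /* victim: smallest time of last use */
                for (int f = 1; f < used; f++)
                    if (last_use[f] < last_use[slot]) slot = f;
            }
            page[slot] = refs[t];
        }
        last_use[slot] = t;                  /* update the "time-of-use" field */
    }
    return faults;
}
```

On the 20-reference string above with 3 frames this counts 12 page faults. On the string 1 2 3 4 1 2 5 1 2 3 4 5 it counts 10 faults with 3 frames and 8 with 4 frames, so adding a frame does not hurt here.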

LRU Implementation – Counters
[Figure: address translation from a logical address (p, d) issued by the CPU through the page table for Pi (frame #, v/i, time tag) to a physical address (f, d) in physical memory, with the disk as backing store. A logical clock (cnt++) supplies the time, and the "time-of-use" field (time of last use!) is updated on each reference.]

LRU Implementation – Counters ƒ Overheads ƒ The logical clock is incremented for every memory reference. ƒ Update the “time-of-use” field for each page reference. ƒ Search for the LRU page at replacement time. ƒ Overflow prevention of the clock & the maintenance of the “time-of-use” field of each page table.
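The counter scheme and its overheads can be sketched as follows. A dictionary stands in for the “time-of-use” fields of the page table, and clock overflow is ignored; both are simplifying assumptions, and the function name is hypothetical.

```python
def lru_with_counters(refs, n_frames):
    """Count page faults for LRU implemented with a logical clock."""
    clock = 0
    time_of_use = {}      # resident page -> clock value at its last reference
    faults = 0
    for page in refs:
        clock += 1        # overhead 1: the clock ticks on every reference
        if page not in time_of_use:
            faults += 1
            if len(time_of_use) == n_frames:
                # overhead 3: search for the smallest time-of-use value
                victim = min(time_of_use, key=time_of_use.get)
                del time_of_use[victim]
        time_of_use[page] = clock   # overhead 2: update the time-of-use field
    return faults

refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
counter_faults = lru_with_counters(refs, 3)
print(counter_faults)
```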

LRU Implementation – Stack

[Figure: address translation with an LRU stack. The CPU issues a logical address (p, d), and the page table maps p to frame f (entry fields: frame #, v/i, …). On each reference, the entry of the referenced page is moved to the head of the LRU stack; the tail of the stack is the LRU page. The disk is the backing store.]

Overheads: Stack maintenance per memory reference ~ no search for page replacement!

A Stack Algorithm ƒ The set of memory-resident pages with n frames available ⊆ the set of memory-resident pages with (n + 1) frames available.

ƒ Need hardware support for efficient implementations. ƒ Note that LRU maintenance needs to be done for every memory reference.
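The inclusion property can be checked empirically. The sketch below compares the resident sets for n and n + 1 frames after every reference; FIFO is included only as a contrast, and all names are hypothetical.

```python
def resident_sets(refs, n_frames, policy):
    """Return the set of resident pages after each reference."""
    pages = []                     # eviction order: pages[0] is the next victim
    sets = []
    for page in refs:
        if page in pages:
            if policy == "lru":    # LRU reorders on a hit, FIFO does not
                pages.remove(page)
                pages.append(page)
        else:
            if len(pages) == n_frames:
                pages.pop(0)
            pages.append(page)
        sets.append(set(pages))
    return sets

def has_inclusion_property(refs, n, policy):
    small = resident_sets(refs, n, policy)
    large = resident_sets(refs, n + 1, policy)
    return all(s <= l for s, l in zip(small, large))

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
lru_ok = has_inclusion_property(refs, 3, "lru")
fifo_ok = has_inclusion_property(refs, 3, "fifo")
print(lru_ok, fifo_ok)
```

On this string FIFO breaks the inclusion after the reference to page 5, which is consistent with it showing Belady's anomaly, while LRU keeps it.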

LRU Approximation Algorithms ƒ Motivation ƒ No sufficient hardware support ƒ Most systems provide only a “reference bit”, which indicates only whether a page has been used, not the order of use.

ƒ Additional-Reference-Bits Algorithm
ƒ Second-Chance Algorithm
ƒ Enhanced Second-Chance Algorithm
ƒ Counting-Based Page Replacement

Additional-Reference-Bits Algorithm ƒ Motivation ƒ Keep a history of reference bits

reference bit   history register (one byte per page in memory)
1               01101101
0               10100011
…               …
0               11101010
1               00000001

OS shifts all history registers right by one bit at each regular interval!!

Additional-Reference-Bits Algorithm ƒ History Registers

00000000   LRU (smaller value!): not used for 8 intervals
00000001
…
11111110
11111111   MRU: used at least once in every interval

ƒ But, how many bits per history register should be used? ƒ Fast but cost-effective! ƒ The more bits, the better the approximation is.
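A minimal sketch of the history registers, assuming 8-bit registers and that the reference bit is shifted into the high-order bit at each interval. The page names "A" and "B" are hypothetical; the initial register values are the first two from the figure.

```python
def age_history(history, reference_bits):
    """One timer interval: shift right, insert the reference bit on the left."""
    for page in history:
        history[page] = (history[page] >> 1) | (reference_bits[page] << 7)
        reference_bits[page] = 0          # start a fresh interval

def pick_victim(history):
    """The page with the smallest register value approximates the LRU page."""
    return min(history, key=history.get)

history = {"A": 0b01101101, "B": 0b10100011}
reference_bits = {"A": 1, "B": 0}         # A was used in this interval, B was not
age_history(history, reference_bits)
print(format(history["A"], "08b"), format(history["B"], "08b"))
print(pick_victim(history))
```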

Second-Chance (Clock) Algorithm

ƒ Motivation ƒ Use the reference bit only

ƒ Basic Data Structure: ƒ Circular FIFO Queue

ƒ Basic Mechanism ƒ When a page is selected … ƒ Take it as a victim if its reference bit = 0 ƒ Otherwise, clear the bit and advance to the next page

[Figure: the circular queue of pages with their reference bits, before and after the scan for a victim.]
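The basic mechanism can be sketched over a list used as a circular queue. The page names and the initial reference bits are made up for illustration.

```python
def second_chance_victim(ref_bits, hand):
    """Return (index of the victim, new position of the clock hand)."""
    n = len(ref_bits)
    while True:
        if ref_bits[hand] == 0:
            return hand, (hand + 1) % n   # victim found, hand moves past it
        ref_bits[hand] = 0                # give the page a second chance
        hand = (hand + 1) % n

pages = ["A", "B", "C", "D"]              # hypothetical resident pages
ref_bits = [1, 1, 0, 1]
victim, hand = second_chance_victim(ref_bits, 0)
print(pages[victim], hand, ref_bits)
```

The loop always terminates: in the worst case every bit is cleared during one full sweep, and the page the scan started from is then chosen.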

Enhanced Second-Chance Algorithm ƒ Motivation: ƒ Consider the cost of “swapping out” pages.

ƒ 4 Classes (reference bit, modify bit), ordered from low priority to high priority:
ƒ (0,0) – not recently used and not “dirty”
ƒ (0,1) – not recently used but “dirty”
ƒ (1,0) – recently used but not “dirty”
ƒ (1,1) – recently used and “dirty”

Enhanced Second-Chance Algorithm ƒ Use the second-chance algorithm to replace the first page encountered in the lowest nonempty class. => May have to scan the circular queue several times before finding the right page.

ƒ Example: Macintosh Virtual Memory Management
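A simplified sketch of the class-based selection: it scans the circular queue once per class, starting at the clock hand, and returns the first page found in the lowest nonempty class. Clearing reference bits during the scan is deliberately left out, and the names are hypothetical.

```python
def enhanced_victim(bits, hand):
    """bits[i] is the (reference bit, modify bit) pair of page i."""
    n = len(bits)
    for wanted in [(0, 0), (0, 1), (1, 0), (1, 1)]:   # lowest class first
        for step in range(n):                          # one scan per class
            i = (hand + step) % n
            if bits[i] == wanted:
                return i
    return None                                        # only if bits is empty

bits = [(1, 1), (0, 1), (1, 0), (0, 1)]
victim = enhanced_victim(bits, 2)
print(victim)
```

Here no page is in class (0,0), so a second scan is needed before a (0,1) page is found, which illustrates why several passes over the queue may be required.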

Counting-Based Algorithms ƒ Motivation:

ƒ Count the # of references made to each page, instead of their referencing times.

ƒ Least Frequently Used Algorithm (LFU) ƒ LFU pages are less actively used pages! ƒ Potential Hazard: Some pages that were heavily used in the past may no longer be used, yet they keep a large count! ƒ A Solution – Aging

ƒ Shift counters right by one bit at each regular interval.

Counting-Based Algorithms ƒ Most Frequently Used Algorithm (MFU) ƒ Pages with the smallest number of references were probably just brought in and have yet to be used!

ƒ LFU & MFU replacement schemes can be fairly expensive! ƒ They do not approximate OPT very well!
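The LFU hazard and the aging remedy can be sketched with two hypothetical pages: "old_hot" was heavily used in the past and is now idle, while "current" receives three references per interval. The numbers are made up for illustration.

```python
def lfu_victim(counts):
    """LFU replaces the page with the smallest reference count."""
    return min(counts, key=counts.get)

def age(counts):
    """Aging: shift every counter right by one bit at each interval."""
    for page in counts:
        counts[page] >>= 1

counts = {"old_hot": 40, "current": 0}
victims = []
for interval in range(6):
    counts["current"] += 3            # only "current" is referenced now
    victims.append(lfu_victim(counts))
    age(counts)
print(victims)
```

At first LFU would evict the page that is actually in use; after a few intervals of aging the stale count has decayed and the idle page becomes the victim.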

Page Buffering ƒ Basic Idea a. Systems keep a pool of free frames. b. The desired page is first “swapped in” to a frame from the pool. c. When the selected page (victim) is later written out, its frame is returned to the pool.

ƒ Variation 1 a. Maintain a list of modified pages. b. Whenever the paging device is idle, a modified page is written out and its “modify bit” is reset.

Page Buffering ƒ Variation 2 a. Remember which page was in each frame of the pool. b. When a page fault occurs, first check whether the desired page is there already. ƒ Pages which were in frames of the pool must be “clean”. ƒ “Swapping-in” time is saved!

ƒ VAX/VMS with the FIFO replacement algorithm adopts it to improve the performance of the FIFO algorithm.

Frame Allocation – Single User ƒ Basic Strategy: ƒ User process is allocated any free frame. ƒ User process requests free frames from the free-frame list. ƒ When the free-frame list is exhausted, page replacement takes place. ƒ All allocated frames are released by the ending process.

ƒ Variations ƒ O.S. can share with users some free frames for special purposes. ƒ Page Buffering - Frames to save “swapping” time

Frame Allocation – Multiple Users ƒ Fixed Allocation a. Equal Allocation m frames, n processes → m/n frames per process

b. Proportional Allocation 1. Ratios of Frames ∝ Size $S = \sum_i S_i$, $A_i \propto (S_i / S) \times m$, where (sum of the $A_i$ ≤ m) & ($A_i$ ≥ minimum # of frames required)

2. Ratios of Frames ∝ Priority $S_i$ : relative importance

3. Combinations, or others.
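Proportional allocation by size can be sketched as below. The process names, sizes, and frame count are illustrative values, and rounding down with a per-process minimum is one possible policy, not necessarily the one intended on the slide.

```python
def proportional_allocation(sizes, m, minimum=1):
    """Give process i about (S_i / S) * m frames, but at least `minimum`."""
    total = sum(sizes.values())                        # S = sum of S_i
    # Note: raising small shares to `minimum` could push the sum above m;
    # a full allocator would have to rebalance in that case.
    return {p: max(minimum, (s * m) // total) for p, s in sizes.items()}

sizes = {"P1": 10, "P2": 127}                          # sizes in pages
alloc = proportional_allocation(sizes, m=62)
equal = {p: 62 // len(sizes) for p in sizes}           # equal allocation: m/n
print(alloc, equal)
```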

Frame Allocation – Multiple Users ƒ Dynamic Allocation a. Allocated frames ∝ the multiprogramming level b. Allocated frames ∝ Others

ƒ The minimum number of frames required for a process is determined by the instruction-set architecture. ƒ ADD A,B,C → 4 frames needed ƒ ADD (A), (B), (C) → 1+2+2+2 = 7 frames, where (A) denotes indirect addressing.

Frame Allocation – Multiple Users ƒ Minimum Number of Frames (Continued) ƒ How many levels of indirect addressing should be supported?

[Figure: a 16-bit word holding an address and a one-bit flag (0 = direct, 1 = indirect).]

ƒ It may touch every page in the logical address space of a process => Virtual memory is collapsing!

ƒ A long instruction may cross a page boundary. MVC X, Y, 256 → 2 + 2 + 2 = 6 frames

ƒ The spanning of the instruction and the operands.

Frame Allocation – Multiple Users ƒ Global Allocation ƒ Processes can take frames from others. For example, high-priority processes can increase their frame allocation at the expense of the low-priority processes!

ƒ Local Allocation ƒ Processes can only select frames from their own allocated frames → Fixed Allocation ƒ The set of pages in memory for a process is affected by the paging behavior of only that process.

Frame Allocation – Multiple Users ƒ Remarks a. Global replacement generally results in a better system throughput. b. Under global replacement, processes cannot control their own page-fault rates, so one process can easily affect another.

Thrashing ƒ Thrashing – A High Paging Activity: ƒ A process is thrashing if it is spending more time paging than executing.

ƒ Why thrashing? ƒ Too few frames allocated to a process!

[Figure: CPU utilization versus degree of multiprogramming. Utilization rises with the degree of multiprogramming and then drops sharply once thrashing sets in. Under a global page-replacement algorithm: low CPU utilization → dispatch a new process → thrashing.]

Thrashing ƒ Solutions: ƒ Decrease the multiprogramming level → Swap out processes! ƒ Use local page-replacement algorithms ƒ Only limit thrashing effects “locally” ƒ Page-fault servicing of other processes also slows down.

ƒ Give processes as many frames as they need! ƒ But, how do you know the right number of frames for a process?

Locality Model

$\text{locality}_i = \{P_{i,1}, P_{i,2}, \ldots, P_{i,n_i}\}$
$\text{locality}_j = \{P_{j,1}, P_{j,2}, \ldots, P_{j,n_j}\}$
(the control flow moves from locality i to locality j)

ƒ A program is composed of several different (overlapped) localities. ƒ Localities are defined by the program structures and data structures (e.g., an array, hash tables)

ƒ How do we know that we allocate enough frames to a process to accommodate its current locality?

Working-Set Model

Page references: … 2 6 1 5 7 7 7 7 5 1 6 2 3 4 1 2 3 4 4 4 3 4 3 4 4 4 …

Each working-set window covers the most recent Δ page references:
working-set(t1) = {1,2,5,6,7} (window of size Δ ending at t1)
working-set(t2) = {3,4} (window of size Δ ending at t2)

ƒ The working set is an approximation of a program’s locality.
ƒ Choice of Δ: a small Δ gives the minimum allocation; with a very large Δ, all touched pages may cover several localities.
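The two working sets on the slide can be recomputed with a short sketch. It assumes Δ = 10 and that t1 and t2 fall after the 10th and the 26th reference of the string shown; the slide does not state these values, so they are assumptions chosen to match the sets it lists.

```python
def working_set(refs, t, delta):
    """Pages referenced in the last `delta` references up to time t (1-based)."""
    return set(refs[max(0, t - delta):t])

refs = [2, 6, 1, 5, 7, 7, 7, 7, 5, 1,
        6, 2, 3, 4, 1, 2, 3, 4, 4, 4,
        3, 4, 3, 4, 4, 4]
DELTA = 10                                # assumed window size
ws_t1 = working_set(refs, 10, DELTA)
ws_t2 = working_set(refs, 26, DELTA)
print(sorted(ws_t1), sorted(ws_t2))
```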

Working-Set Model

$D = \sum_i \text{working-set-size}_i \le M$, where M is the total number of available frames.

ƒ D > M: Suspend some processes and swap out their pages.
ƒ D ≤ M: “Safe”. Extra frames are available, and new processes can be initiated.

Working-Set Model ƒ The maintenance of working sets is expensive! ƒ Approximation by a timer and the reference bit

[Figure: at each timer interrupt, the reference bit of every page is shifted or copied into an in-memory history.]

ƒ Accuracy vs. Timeout Interval!

Page-Fault Frequency ƒ Motivation ƒ Control thrashing directly through the observation on the page-fault rate!

[Figure: page-fault rate versus number of frames. Above the upper bound: increase # of frames! Below the lower bound: decrease # of frames!]

* Processes are suspended and swapped out if the number of available frames falls below the minimum needed.
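The control rule in the figure can be sketched as a small function. The bounds, the step size, and the minimum are hypothetical parameters, not values from the slides.

```python
def adjust_frames(frames, fault_rate, lower, upper, step=1, minimum=1):
    """Return the new frame allocation for one process."""
    if fault_rate > upper:
        return frames + step              # too many faults: give more frames
    if fault_rate < lower and frames - step >= minimum:
        return frames - step              # very few faults: reclaim a frame
    return frames                         # within bounds: leave it alone

grown = adjust_frames(10, fault_rate=0.30, lower=0.05, upper=0.20)
shrunk = adjust_frames(10, fault_rate=0.01, lower=0.05, upper=0.20)
steady = adjust_frames(10, fault_rate=0.10, lower=0.05, upper=0.20)
print(grown, shrunk, steady)
```

A full policy would also need the suspension step from the footnote: when a process needs more frames and none are available, some process is suspended and swapped out.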

OS Examples – NT ƒ Virtual Memory – Demand Paging with Clustering ƒ Clustering brings in more pages surrounding the faulting page!

ƒ Working Set ƒ Min and max bounds for each process ƒ Local page replacement when the max number of frames is allocated.

ƒ Automatic working-set trimming reduces allocated frames of a process to its min when the system threshold on the available frames is reached.

OS Examples – Solaris ƒ Process pageout first clears the reference bit of all pages to 0 and then later returns all pages with the reference bit = 0 to the system (handspread).

[Figure: scan rate versus amount of free memory. The scan rate ranges from slowscan (100) to fastscan (8192); the thresholds on free memory are minfree, desfree, and lotsfree.]

ƒ The pageout frequency goes from 4HZ → 100HZ when desfree is reached!

ƒ Swapping starts when free memory cannot be kept at desfree for 30s.

ƒ pageout runs for every request to a new page when minfree is reached.

Other Considerations ƒ Pre-Paging ƒ Bring into memory at one time all the pages that will be needed!

[Figure: ready processes are swapped out to become suspended processes and are later resumed. Do pre-paging if the working set is known!]

ƒ Issue: Pre-Paging Cost vs. Cost of Page Fault Services. Not every page in the working set will be used!

Other Considerations ƒ Page Size (the logical address is split into p and d)

small page size, 512B (2^9): better resolution for locality & less internal fragmentation
large page size, 16,384B (2^14): smaller page table size & better I/O efficiency

ƒ Trends - Large Page Size, because the CPU speed and the memory capacity grow much faster than the disk speed!

Other Considerations ƒ TLB Reach ƒ TLB-Entry-Number * Page-Size

ƒ Wish ƒ The working set is stored in the TLB! ƒ Solutions ƒ Increase the page size ƒ Have multiple page sizes – UltraSparc II (8KB - 4MB) + Solaris 2 (8KB or 4MB)
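The TLB-reach formula can be worked through with the two page sizes named above. The entry count of 64 is a hypothetical value chosen only for the arithmetic.

```python
KB = 1024
MB = 1024 * KB

def tlb_reach(entries, page_size):
    """TLB reach = number of TLB entries * page size (in bytes)."""
    return entries * page_size

ENTRIES = 64                                  # assumed TLB size
reach_small = tlb_reach(ENTRIES, 8 * KB)      # with 8KB pages
reach_large = tlb_reach(ENTRIES, 4 * MB)      # with 4MB pages
print(reach_small // KB, "KB;", reach_large // MB, "MB")
```

Under this assumption the same TLB covers 512KB with 8KB pages and 256MB with 4MB pages, which is why larger or multiple page sizes help the working set fit within the TLB reach.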

Other Considerations ƒ Inverted Page Table ƒ The objective is to reduce the amount of physical memory for page tables, but conventional (per-process) page tables are still needed when a page fault occurs! ƒ More page faults for page tables will occur!!!

Other Considerations ƒ Program Structure ƒ Motivation – Improve the system performance by an awareness of the underlying demand paging.

    var A: array [1..128,1..128] of integer;
    for j := 1 to 128 do
      for i := 1 to 128 do
        A[i,j] := 0;

[Figure: the array is stored row by row, one row of 128 words per page: A(1,1) … A(1,128) in the first page, A(2,1) … A(2,128) in the second, …, A(128,1) … A(128,128) in the last; 128 pages in total.]

128x128 page faults if the process has fewer than 128 frames!!
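The effect of the loop order can be simulated. The sketch assumes that each row of the array occupies exactly one page, that the process has 127 frames for the array, and that LRU replacement is used; the names are hypothetical, and pages for code or other data are ignored.

```python
from collections import OrderedDict

def lru_faults(page_refs, n_frames):
    """Count page faults under LRU for a sequence of page numbers."""
    resident = OrderedDict()                  # least recently used first
    faults = 0
    for page in page_refs:
        if page in resident:
            resident.move_to_end(page)
        else:
            faults += 1
            if len(resident) == n_frames:
                resident.popitem(last=False)  # evict the LRU page
            resident[page] = True
    return faults

N = 128
# A[i][j] lives in page i (one row per page).
column_order = [i for j in range(N) for i in range(N)]   # j outer, i inner
row_order = [i for i in range(N) for j in range(N)]      # i outer, j inner
column_faults = lru_faults(column_order, N - 1)
row_faults = lru_faults(row_order, N - 1)
print(column_faults, row_faults)
```

Swapping the two loops so that the inner loop walks along a row brings the fault count down from 128 x 128 to 128 under these assumptions.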

Other Considerations ƒ Program Structures: ƒ Data Structures ƒ Locality: stack, hash table, etc. ƒ Search speed, # of memory references, # of pages touched, etc.

ƒ Programming Language ƒ Lisp, PASCAL, etc.

ƒ Compiler & Loader ƒ Separate code and data ƒ Pack inter-related routines into the same page ƒ Routine placement (across page boundary?)

I/O Interlock

[Figure: a drive transferring data by DMA to or from a buffer in physical memory.]

• DMA gets the following information of the buffer: • Base Address in Memory • Chunk Size
• Could the buffer-residing pages be swapped out?

I/O Interlock ƒ Solutions ƒ I/O Device ↔ System Memory ↔ User Memory ƒ Extra Data Copying!!

ƒ Lock pages into memory ƒ The lock bit of a page-faulting page is set until the faulting process is dispatched! ƒ Lock bits might never be turned off! ƒ Multi-user systems usually take locks as “hints” only!

Real-Time Processing ƒ Predictable Behavior ƒ Virtual memory introduces unexpected, long delays in the execution of a program.

ƒ Solution: ƒ Go beyond locking hints ⇒ Allow privileged users to require that pages be locked into memory!

Demand Segmentation ƒ Motivation ƒ Segmentation better captures the logical structure of a process! ƒ Demand paging needs a significant amount of hardware!

ƒ Mechanism ƒ Like demand paging! ƒ However, compaction may be needed! ƒ Considerable overheads!