This is the printout version of my lecture slides for the OS course. It includes more details (quotations from books, references, etc.) than the slides version.
8 Input/Output 150
8.1 Principles of I/O Hardware . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
8.1.1 Programmed I/O . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
8.1.2 Interrupt-Driven I/O . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
8.1.3 Direct Memory Access (DMA) . . . . . . . . . . . . . . . . . . . . . . . . . 157
8.2 I/O Software Layers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
8.2.1 Interrupt Handlers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
8.2.2 Device Drivers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
8.2.3 Device-Independent I/O Software . . . . . . . . . . . . . . . . . . . . . . . 161
8.2.4 User-Space I/O Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
8.3 Disks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
8.3.1 Disk Scheduling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
8.3.2 RAID Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
References
[1] M.J. Bach. The design of the UNIX operating system. Prentice-Hall software series.
Prentice-Hall, 1986.
[2] D.P. Bovet and M. Cesatı́. Understanding The Linux Kernel. 3rd ed. O’Reilly, 2005.
[3] Randal E. Bryant and David R. O’Hallaron. Computer Systems: A Programmer’s Per-
spective. 2nd ed. USA: Addison-Wesley Publishing Company, 2010.
[4] Rémy Card, Theodore Ts’o, and Stephen Tweedie. “Design and Implementation of
the Second Extended Filesystem”. In: Dutch International Symposium on Linux (1996).
[5] Allen B. Downey. The Little Book of Semaphores. greenteapress.com, 2008.
[6] M. Gorman. Understanding the Linux Virtual Memory Manager. Prentice Hall, 2004.
[7] Research Computing Support Group. “Understanding Memory”. In: University of
Alberta (2010). http://cluster.srv.ualberta.ca/doc/.
[8] Sandeep Grover. “Linkers and Loaders”. In: Linux Journal (2002).
[9] Intel. INTEL 80386 Programmer’s Reference Manual. 1986.
[10] John Levine. Linkers and Loaders. Morgan-Kaufman, Oct. 1999.
[11] R. Love. Linux Kernel Development. Developer’s Library. Addison-Wesley, 2010.
[12] W. Mauerer. Professional Linux Kernel Architecture. John Wiley & Sons, 2008.
[13] David Morgan. Analyzing a filesystem. 2012.
[14] Abhishek Nayani, Mel Gorman, and Rodrigo S. de Castro. Memory Management in
Linux: Desktop Companion to the Linux Source Code. Free book, 2002.
[15] Dave Poirier. The Second Extended File System Internal Layout. Web, 2011.
[16] David A Rusling. The Linux Kernel. Linux Documentation Project, 1999.
[17] Silberschatz, Galvin, and Gagne. Operating System Concepts Essentials. John Wiley
& Sons, 2011.
[18] Wiliam Stallings. Operating Systems: Internals and Design Principles. 7th ed. Pren-
tice Hall, 2011.
[19] Andrew S. Tanenbaum. Modern Operating Systems. 3rd. Prentice Hall Press, 2007.
[20] K. Thompson. “Unix Implementation”. In: Bell System Technical Journal 57 (1978),
pp. 1931–1946.
3
4. REFERENCES
[21] Wikipedia. Assembly language — Wikipedia, The Free Encyclopedia. [Online; ac-
cessed 11-May-2015]. 2015.
[22] Wikipedia. Compiler — Wikipedia, The Free Encyclopedia. [Online; accessed 11-
May-2015]. 2015.
[23] Wikipedia. Dining philosophers problem — Wikipedia, The Free Encyclopedia. [On-
line; accessed 11-May-2015]. 2015.
[24] Wikipedia. Directed acyclic graph — Wikipedia, The Free Encyclopedia. [Online;
accessed 12-May-2015]. 2015.
[25] Wikipedia. Dynamic linker — Wikipedia, The Free Encyclopedia. 2012.
[26] Wikipedia. Executable and Linkable Format — Wikipedia, The Free Encyclopedia.
[Online; accessed 12-May-2015]. 2015.
[27] Wikipedia. File Allocation Table — Wikipedia, The Free Encyclopedia. [Online; ac-
cessed 12-May-2015]. 2015.
[28] Wikipedia. File descriptor — Wikipedia, The Free Encyclopedia. [Online; accessed
12-May-2015]. 2015.
[29] Wikipedia. Inode — Wikipedia, The Free Encyclopedia. [Online; accessed 21-February-
2015]. 2015.
[30] Wikipedia. Linker (computing) — Wikipedia, The Free Encyclopedia. [Online; ac-
cessed 11-May-2015]. 2015.
[31] Wikipedia. Loader (computing) — Wikipedia, The Free Encyclopedia. 2012.
[32] Wikipedia. Open (system call) — Wikipedia, The Free Encyclopedia. [Online; ac-
cessed 12-May-2015]. 2014.
[33] Wikipedia. Page replacement algorithm — Wikipedia, The Free Encyclopedia. [On-
line; accessed 11-May-2015]. 2015.
[34] Wikipedia. Process (computing) — Wikipedia, The Free Encyclopedia. [Online; ac-
cessed 21-February-2015]. 2014.
[35] Hengming Zou (邹恒明). The Mind of the Computer: Philosophical Principles of Operating Systems (计算机的心智：操作系统之哲学原理). China Machine Press (机械工业出版社), 2009.
1 Introduction
Course Web Site
Course web site: http://cs2.swfu.edu.cn/moodle
Lecture slides: http://cs2.swfu.edu.cn/~wx672/lecture_notes/os/slides/
Source code: http://cs2.swfu.edu.cn/~wx672/lecture_notes/os/src/
Sample lab report: http://cs2.swfu.edu.cn/~wx672/lecture_notes/os/sample-report/
1.1 What’s an Operating System
What’s an Operating System?
• “Everything a vendor ships when you order an operating system”
• It’s the program that runs all the time
• It’s a resource manager
- Each program gets time with the resource
- Each program gets space on the resource
• It’s a control program
- Controls execution of programs to prevent errors and improper use of the computer
• No universally accepted definition
[Figure: the UNIX kernel architecture. At user level, user programs call libraries and trap into the system call interface. Inside the kernel sit the file subsystem (VFS, with NFS, Ext2, and VFAT underneath; the buffer cache; character and block device drivers) and the process control subsystem (inter-process communication, the scheduler, and memory management). Below both lies hardware control, and below that the hardware itself.]
Abstractions
To hide the complexity of the actual implementations.

[Figure 1.18 in [3]: Some abstractions provided by a computer system. A major theme in computer systems is to provide abstract representations at different levels to hide the complexity of the actual implementations. The instruction set architecture abstracts the processor; processes, virtual memory, and files abstract the operating system, main memory, and I/O devices; the virtual machine abstracts the entire computer.]

... a processor that performs just one instruction at a time. The underlying hardware is far
more elaborate, executing multiple instructions in parallel, but always in a way
that is consistent with the simple, sequential model. By keeping the same execu-
tion model, different processor implementations can execute the same machine
code, while offering a range of cost and performance.
On the operating system side, we have introduced three abstractions: files as
an abstraction of I/O, virtual memory as an abstraction of program memory, and
processes as an abstraction of a running program. To these abstractions we add
a new one: the virtual machine, providing an abstraction of the entire computer,
including the operating system, the processor, and the programs. The idea of a
virtual machine was introduced by IBM in the 1960s, but it has become more
prominent recently as a way to manage computers that must be able to run
programs designed for multiple operating systems (such as Microsoft Windows,
MacOS, and Linux) or different versions of the same operating system.
We will return to these abstractions in subsequent sections of the book.
1.10 Summary
A computer system consists of hardware and systems software that cooperate
to run application programs. Information inside the computer is represented as
groups of bits that are interpreted in different ways, depending on the context.
Programs are translated by other programs into different forms, beginning as
ASCII text and then translated by compilers and linkers into binary executable
files.
Processors read and interpret binary instructions that are stored in main
memory. Since computers spend most of their time copying data between memory,
I/O devices, and the CPU registers, the storage devices in a system are arranged
in a hierarchy, with the CPU registers at the top, followed by multiple levels
of hardware cache memories, DRAM main memory, and disk storage. Storage
devices that are higher in the hierarchy are faster and more costly per bit than
those lower in the hierarchy. Storage devices that are higher in the hierarchy serve
as caches for devices that are lower in the hierarchy. Programmers can optimize
the performance of their C programs by understanding and exploiting the memory
hierarchy.
See also: [3, Sec. 1.9.2, The Importance of Abstractions in Computer Systems]
System Goals
Convenient vs. Efficient
• Convenient for the user — for PCs
• Efficient — for mainframes, multiusers
• UNIX
- Started with keyboard + printer, with no attention paid to convenience
- Now, still concentrating on efficiency, with GUI support
History of Operating Systems
Fig. 1-2. An early batch system. (a) Programmers bring cards to 1401. (b) 1401 reads batch of jobs onto tape. (c) Operator carries input tape to 7094. (d) 7094 does computing. (e) Operator carries output tape to 1401. (f) 1401 prints output.
1945 - 1955 First generation
- vacuum tubes, plug boards
1955 - 1965 Second generation
- transistors, batch systems
1965 - 1980 Third generation
- ICs and multiprogramming
1980 - present Fourth generation
- personal computers
Multi-programming is the first instance where the OS must make decisions for
the users
Job scheduling — decides which job should be loaded into the memory.
Memory management — because several programs in memory at the same time
CPU scheduling — choose one job among all the jobs that are ready to run
Process management — make sure processes don’t offend each other
Fig. 1-4. A multiprogramming system with three jobs in memory.
The Operating System Zoo
• Mainframe OS
• Server OS
• Multiprocessor OS
• Personal computer OS
• Real-time OS
• Embedded OS
• Smart card OS
1.2 OS Services
OS Services
Like a government
Helping the users:
• User interface
• Program execution
• I/O operation
• File system manipulation
• Communication
• Error detection
Keeping the system efficient:
• Resource allocation
• Accounting
• Protection and security
A Computer System
Fig. 1-1. A computer system consists of hardware, system programs, and application programs.
1.3 Hardware
CPU Working Cycle
Fetch unit → Decode unit → Execute unit

1. Fetch the first instruction from memory
2. Decode it to determine its type and operands
3. Execute it
Special CPU Registers
Program counter (PC): keeps the memory address of the next instruction to be fetched
Stack pointer (SP): points to the top of the current stack in memory
Program status (PS): holds
- condition code bits
- processor state
System Bus
Fig. 1-5. Some of the components of a simple personal computer.
Address Bus: specifies the memory locations (addresses) for the data transfers
Data Bus: holds the data transferred; bidirectional
Control Bus: contains various lines used to route timing and control signals throughout the system
Controllers and Peripherals
• Peripherals are real devices controlled by controller chips
• Controllers are processors like the CPU itself, have control registers
• The device driver writes to these registers, thus controlling the device
• Controllers are connected to the CPU and to each other by a variety of buses
Fig. 1-11. The structure of a large Pentium system.
Motherboard Chipsets
[Figure: an Intel Core 2 system. The CPU talks to the north bridge chip over the Front Side Bus; the north bridge connects the DDR2 system RAM and the graphics card, and links via the DMI interface to the south bridge chip, which handles the Serial ATA ports, clock generation, the BIOS, USB ports, power management, and the PCI bus.]
See also: Motherboard Chipsets And The Memory Map1
• The CPU doesn’t know what it’s connected to
- CPU test bench? network router? toaster? brain implant?
• The CPU talks to the outside world through its pins
- some pins to transmit the physical memory address
- other pins to transmit the values
• The CPU’s gateway to the world is the front-side bus
Intel Core 2 QX6600
• 33 pins to transmit the physical memory address
- so there are 2^33 choices of memory locations
• 64 pins to send or receive data
- so the data path is 64-bit wide, or 8-byte chunks
This allows the CPU to physically address 64GB of memory (2^33 × 8B)
1http://duartes.org/gustavo/blog/post/motherboard-chipsets-memory-map
See also: Datasheet for Intel Core 2 Quad-Core Q6000 Sequence2
Some physical memory addresses are mapped away!
• only the addresses, not the spaces
• Memory holes
- 640KB ∼ 1MB
- /proc/iomem
Memory-mapped I/O
• BIOS ROM
• video cards
• PCI cards
• ...
This is why 32-bit OSes have problems using 4 gigs of RAM.
0xFFFFFFFF +--------------------+ 4GB
Reset vector | JUMP to 0xF0000 |
0xFFFFFFF0 +--------------------+ 4GB - 16B
| Unaddressable |
| memory, real mode |
| is limited to 1MB. |
0x100000 +--------------------+ 1MB
| System BIOS |
0xF0000 +--------------------+ 960KB
| Ext. System BIOS |
0xE0000 +--------------------+ 896KB
| Expansion Area |
| (maps ROMs for old |
| peripheral cards) |
0xC0000 +--------------------+ 768KB
| Legacy Video Card |
| Memory Access |
0xA0000 +--------------------+ 640KB
| Accessible RAM |
| (640KB is enough |
| for anyone - old |
| DOS area) |
0 +--------------------+ 0
What if you don’t have 4G RAM?
the northbridge
1. receives a physical memory request
2. decides where to route it
- to RAM? to video card? to ...?
- decision made via the memory address map
• When is the memory address map built? setup().
1.4 Bootstrapping
Bootstrapping
Can you pull yourself up by your own bootstraps?
A computer cannot run without first loading software but must be running before any
software can be loaded.
[Boot time line: BIOS Initialization → MBR Boot Loader → Early Kernel Initialization → Full Kernel Initialization → First User Mode Process. The early stages run with the CPU in Real Mode and rely on BIOS services; the kernel then switches the CPU to Protected Mode and provides kernel services from that point on.]
2http://download.intel.com/design/processor/datashts/31559205.pdf
Intel x86 Bootstrapping
1. BIOS (0xfffffff0)
« POST « HW init « Find a boot device (FD,CD,HD...) « Copy sector zero (MBR) to RAM
(0x00007c00)
2. MBR – the first 512 bytes, contains
• Small code (< 446 Bytes), e.g. GRUB stage 1, for loading GRUB stage 2
• the primary partition table (= 64 Bytes)
• its job is to load the second-stage boot loader.
3. GRUB stage 2 — load the OS kernel into RAM
4. startup
5. init — the first user-space program
|<-------------Master Boot Record (512 Bytes)------------>|
0 439 443 445 509 511
+----//-----+----------+------+------//---------+---------+
| code area | disk-sig | null | partition table | MBR-sig |
| 440 | 4 | 2 | 16x4=64 | 0xAA55 |
+----//-----+----------+------+------//---------+---------+
$ sudo hd -n512 /dev/sda
1.5 Interrupt
Why Interrupt?
While a process is reading a disk file, can we do...
while(!done_reading_a_file())
{
let_CPU_wait();
// or...
lend_CPU_to_others();
}
operate_on_the_file();
Modern OSes are Interrupt Driven
HW INT by sending a signal to CPU
SW INT by executing a system call
Trap (exception) is a software-generated INT caused by an error or by a specific request from a user program
Interrupt vector is an array of pointers pointing to the memory addresses of interrupt handlers. This array is indexed by a unique device number
$ less /proc/devices
$ less /proc/interrupts
Programmable Interrupt Controllers
[Figure 2-33. Interrupt processing hardware on a 32-bit Intel PC: devices raise IRQ lines (IRQ 0 clock, IRQ 1 keyboard, IRQ 3 tty 2, IRQ 4 tty 1, IRQ 5 XT Winchester, IRQ 6 floppy, IRQ 7 printer, IRQ 8 real time clock, IRQ 9 redirected IRQ 2, IRQ 13 FPU exception, IRQ 14 AT Winchester) into a slave interrupt controller cascaded into a master interrupt controller; the master interrupts the CPU over the system data bus, and the CPU answers with an interrupt acknowledge (INTA).]
Interrupt Processing
Fig. 1-10. (a) The steps in starting an I/O device and getting an interrupt. (b) Interrupt processing involves taking the interrupt, running the interrupt handler, and returning to the user program.
Detailed explanation: in [19, Sec. 1.3.5, I/O Devices].
Interrupt Timeline

[Figure 1.3 in [17]: Interrupt time line for a single process doing output. The CPU alternates between executing the user process and processing I/O interrupts; the I/O device alternates between idle and transferring, driven by I/O requests and transfer-done interrupts.]

... the interrupting device. Operating systems as different as Windows and UNIX
dispatch interrupts in this manner.
The interrupt architecture must also save the address of the interrupted
instruction. Many old designs simply stored the interrupt address in a
fixed location or in a location indexed by the device number. More recent
architectures store the return address on the system stack. If the interrupt
routine needs to modify the processor state—for instance, by modifying
register values—it must explicitly save the current state and then restore that
state before returning. After the interrupt is serviced, the saved return address
is loaded into the program counter, and the interrupted computation resumes
as though the interrupt had not occurred.
1.6 System Calls
System Calls
A System Call
• is how a program requests a service from an OS kernel
• provides the interface between a process and the OS
Program 1 Program 2 Program 3
+---------+ +---------+ +---------+
| fork() | | vfork() | | clone() |
+---------+ +---------+ +---------+
| | |
+--v-----------v-----------v--+
| C Library |
+--------------o--------------+ User space
-----------------|------------------------------------
+--------------v--------------+ Kernel space
| System call |
+--------------o--------------+
| +---------+
| : ... :
| 3 +---------+ sys_fork()
o------>| fork() |---------------.
| +---------+ |
| : ... : |
| 120 +---------+ sys_clone() |
o------>| clone() |---------------o
| +---------+ |
| : ... : |
| 289 +---------+ sys_vfork() |
o------>| vfork() |---------------o
+---------+ |
: ... : v
+---------+ do_fork()
System Call Table
Figure 1-16. How a system call can be made: (1) User program traps to the kernel. (2) Operating system determines service number required. (3) Operating system calls service procedure. (4) Control is returned to user program.
The 11 steps in making the system call read(fd,buffer,nbytes)
[Fig. 1-17. The 11 steps in making the system call read(fd, buffer, nbytes): the user program pushes nbytes, buffer, and fd on the stack and calls the read library procedure; the library procedure puts the code for read in a register and traps to the kernel; the kernel's dispatch routine invokes the system call handler; control then returns to the library procedure and from there to the caller, which increments SP to clean the parameters off the stack.]
Process management
Call Description
pid = fork( ) Create a child process identical to the parent
pid = waitpid(pid, statloc, options) Wait for a child to terminate
s = execve(name, argv, environp) Replace a process’ core image
exit(status) Terminate process execution and return status
File management
Call Description
fd = open(file, how, ...) Open a file for reading, writing or both
s = close(fd) Close an open file
n = read(fd, buffer, nbytes) Read data from a file into a buffer
n = write(fd, buffer, nbytes) Write data from a buffer into a file
position = lseek(fd, offset, whence) Move the file pointer
s = stat(name, buf) Get a file’s status information
Directory and file system management
Call Description
s = mkdir(name, mode) Create a new directory
s = rmdir(name) Remove an empty directory
s = link(name1, name2) Create a new entry, name2, pointing to name1
s = unlink(name) Remove a directory entry
s = mount(special, name, flag) Mount a file system
s = umount(special) Unmount a file system
Miscellaneous
Call Description
s = chdir(dirname) Change the working directory
s = chmod(name, mode) Change a file’s protection bits
s = kill(pid, signal) Send a signal to a process
seconds = time(seconds) Get the elapsed time since Jan. 1, 1970
Fig. 1-18. Some of the major POSIX system calls. The return code
s is −1 if an error has occurred. The return codes are as follows:
pid is a process id, fd is a file descriptor, n is a byte count, position
is an offset within the file, and seconds is the elapsed time. The
parameters are explained in the text.
System Call Examples
fork()
#include <stdio.h>
#include <unistd.h>

int main ()
{
    printf("Hello World!\n");
    fork();
    printf("Goodbye Cruel World!\n");
    return 0;
}
$ man 2 fork
exec()
#include <stdio.h>
#include <unistd.h>

int main ()
{
    printf("Hello World!\n");
    if (fork() != 0)
        printf("I am the parent process.\n");
    else {
        printf("A child is listing the directory contents...\n");
        execl("/bin/ls", "ls", "-al", NULL);
    }
    return 0;
}
$ man 3 exec
Hardware INT vs. Software INT
(a) How a hardware interrupt is processed:
Device: send electrical signal to interrupt controller.
Controller: 1. Interrupt CPU. 2. Send digital identification of interrupting device.
Kernel: 1. Save registers. 2. Execute driver software to read I/O device. 3. Send message. 4. Restart a process (not necessarily the interrupted process).

(b) How a system call is made:
Caller: 1. Put message pointer and destination of message into CPU registers. 2. Execute software interrupt instruction.
Kernel: 1. Save registers. 2. Send and/or receive message. 3. Restart a process (not necessarily the calling process).

Figure 2-34. (a) How a hardware interrupt is processed. (b) How a system call is made.
References
[1] Wikipedia. Interrupt — Wikipedia, The Free Encyclopedia. [Online; accessed 21-
February-2015]. 2015.
[2] Wikipedia. System call — Wikipedia, The Free Encyclopedia. [Online; accessed 21-
February-2015]. 2015.
2 Process And Thread
2.1 Processes
2.1.1 What’s a Process
Process
A process is an instance of a program in execution
Processes are like human beings:
« they are generated
« they have a life
« they optionally generate one or more child processes,
and
« eventually they die
A small difference:
• sex is not really common among processes
• each process has just one parent

[Figure: process address space layout, from low address 0000 to high address FFFF: Text at the bottom, then Data, a Gap, and the Stack at the top.]
The term ”process” is often used with several different meanings. In this book, we stick
to the usual OS textbook definition: a process is an instance of a program in execution.
You might think of it as the collection of data structures that fully describes how far the
execution of the program has progressed[2, Sec. 3.1, Processes, Lightweight Processes,
and Threads].
Processes are like human beings: they are generated, they have a more or less significant life, they optionally generate one or more child processes, and eventually they die. A small difference is that sex is not really common among processes; each process has just one parent.
From the kernel’s point of view, the purpose of a process is to act as an entity to which
system resources (CPU time, memory, etc.) are allocated.
In general, a computer system process consists of (or is said to ’own’) the following
resources[34]:
• An image of the executable machine code associated with a program.
• Memory (typically some region of virtual memory); which includes the executable
code, process-specific data (input and output), a call stack (to keep track of active
subroutines and/or other events), and a heap to hold intermediate computation data
generated during run time.
• Operating system descriptors of resources that are allocated to the process, such as
file descriptors (Unix terminology) or handles (Windows), and data sources and sinks.
• Security attributes, such as the process owner and the process’ set of permissions
(allowable operations).
• Processor state (context), such as the content of registers, physical memory address-
ing, etc. The state is typically stored in computer registers when the process is exe-
cuting, and in memory otherwise.
The operating system holds most of this information about active processes in data struc-
tures called process control blocks.
Any subset of resources, but typically at least the processor state, may be associated with each of the process’ threads in operating systems that support threads or ’daughter’ processes.
The operating system keeps its processes separated and allocates the resources they
need, so that they are less likely to interfere with each other and cause system failures
(e.g., deadlock or thrashing). The operating system may also provide mechanisms for
inter-process communication to enable processes to interact in safe and predictable ways.
2.1.2 PCB
Process Control Block (PCB)
Implementation
A process is the collection of data structures that fully describes
how far the execution of the program has progressed.
• Each process is represented by a PCB
• task_struct in Linux
+-------------------+
| process state |
+-------------------+
| PID |
+-------------------+
| program counter |
+-------------------+
| registers |
+-------------------+
| memory limits |
+-------------------+
| list of open files|
+-------------------+
| ... |
+-------------------+
To manage processes, the kernel must have a clear picture of what each process is
doing. It must know, for instance, the process’s priority, whether it is running on a CPU or
blocked on an event, what address space has been assigned to it, which files it is allowed to
address, and so on. This is the role of the process descriptor: a task_struct type structure whose fields contain all the information related to a single process. As the repository of so
much information, the process descriptor is rather complex. In addition to a large number
of fields containing process attributes, the process descriptor contains several pointers to
other data structures that, in turn, contain pointers to other structures[2, Sec. 3.2, Process
Descriptor].
2.1.3 Process Creation
Process Creation
[Figure: process life cycle. The parent calls fork() to create the child and wait() to collect it; the child calls exec(), runs anything(), and terminates with exit().]
• When a process is created, it is almost identical to its parent
– It receives a (logical) copy of the parent’s address space, and
– executes the same code as the parent
• The parent and child have separate copies of the data (stack and heap)
When a process is created, it is almost identical to its parent. It receives a (logical) copy
of the parent’s address space and executes the same code as the parent, beginning at the
next instruction following the process creation system call. Although the parent and child
may share the pages containing the program code (text), they have separate copies of the
data (stack and heap), so that changes by the child to a memory location are invisible to
the parent (and vice versa) [2, Sec. 3.1, Processes, Lightweight Processes, and Threads].
While earlier Unix kernels employed this simple model, modern Unix systems do not.
They support multi-threaded applications: user programs having many relatively independent execution flows sharing a large portion of the application data structures. In such
systems, a process is composed of several user threads (or simply threads), each of which
represents an execution flow of the process. Nowadays, most multi-threaded applications
are written using standard sets of library functions called pthread (POSIX thread) libraries.
Traditional Unix systems treat all processes in the same way: resources owned by
the parent process are duplicated in the child process. This approach makes process
creation very slow and inefficient, because it requires copying the entire address space
of the parent process. The child process rarely needs to read or modify all the resources
inherited from the parent; in many cases, it issues an immediate execve() and wipes out
the address space that was so carefully copied [2, Sec. 3.4, Creating Processes].
Modern Unix kernels solve this problem by introducing three different mechanisms:
• Copy On Write
• Lightweight processes
• The vfork() system call
Forking in C
#include <stdio.h>
#include <unistd.h>

int main ()
{
    printf("Hello World!\n");
    fork();
    printf("Goodbye Cruel World!\n");
    return 0;
}
$ man fork
exec()
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main()
{
    pid_t pid;
    /* fork another process */
    pid = fork();
    if (pid < 0) { /* error occurred */
        fprintf(stderr, "Fork Failed");
        exit(-1);
    }
    else if (pid == 0) { /* child process */
        execlp("/bin/ls", "ls", NULL);
    }
    else { /* parent process */
        /* wait for the child to complete */
        wait(NULL);
        printf("Child Complete");
        exit(0);
    }
    return 0;
}
$ man 3 exec
2.1.4 Process State
Process State Transition
1. Process blocks for input
2. Scheduler picks another process
3. Scheduler picks this process
4. Input becomes available
Fig. 2-2. A process can be in running, blocked, or ready state. Transitions between these states are as shown.
See also [2, Sec. 3.2.1, Process State].
2.1.5 CPU Switch From Process To Process
CPU Switch From Process To Process
See also: [2, Sec. 3.3, Process Switch].
2.2 Threads
2.2.1 Processes vs. Threads
Process vs. Thread
a single-threaded process = resource + execution
a multi-threaded process = resource + executions
Fig. 2-6. (a) Three processes each with one thread. (b) One process with three threads.
A process = a unit of resource ownership, used to group resources together;
A thread = a unit of scheduling, scheduled for execution on the CPU.
Process vs. Thread
Multiple threads running in one process: share an address space and other resources. There is no protection between threads. Protection between threads is
- impossible, because a process is the minimum unit of resource management;
- unnecessary, because a process is owned by a single user.

Multiple processes running in one computer: share physical memory, disk, printers, ...
Threads
+------------------------------------+
| code, data, open files, signals... |
+-----------+-----------+------------+
| thread ID | thread ID | thread ID |
+-----------+-----------+------------+
| program | program | program |
| counter | counter | counter |
+-----------+-----------+------------+
| register | register | register |
| set | set | set |
+-----------+-----------+------------+
| stack | stack | stack |
+-----------+-----------+------------+
2.2.2 Why Thread?
A Multi-threaded Web Server
Fig. 2-10. A multithreaded Web server. A dispatcher thread hands incoming network connections to worker threads, which serve pages from the Web page cache.
(a) Dispatcher thread:

while (TRUE) {
    get_next_request(&buf);
    handoff_work(&buf);
}

(b) Worker thread:

while (TRUE) {
    wait_for_work(&buf);
    look_for_page_in_cache(&buf, &page);
    if (page_not_in_cache(&page))
        read_page_from_disk(&buf, &page);
    return_page(&page);
}

Fig. 2-11. A rough outline of the code for Fig. 2-10. (a) Dispatcher thread. (b) Worker thread.
A Word Processor With 3 Threads
Fig. 2-9. A word processor with three threads (one interacting with the keyboard, one reformatting the document, one writing the document to disk).
• Responsiveness
– Good for interactive applications.
– A process with multiple threads makes a great server (e.g. a web server):
Have one server process, many ”worker” threads – if one thread blocks (e.g.
on a read), others can still continue executing
• Economy – Threads are cheap!
– Cheap to create – only need a stack and storage for registers
– Use very little resources – don’t need new address space, global data, program
code, or OS resources
– switches are fast – only have to save/restore PC, SP, and registers
• Resource sharing – Threads can pass data via shared memory; no need for IPC
• Can take advantage of multiprocessors
2.2.3 Thread Characteristics
Thread States Transition
Same as process states transition
Fig. 2-2. A process can be in running, blocked, or ready state. Transitions between these states are as shown.
Each Thread Has Its Own Stack
• A typical stack stores local data and call information for (usually nested) procedure
calls.
• Each thread generally has a different execution history.
Fig. 2-8. Each thread has its own stack.
Thread Operations

[Figure: a process starts with one thread running; thread_create() starts thread1, thread2, and thread3; a thread can call thread_yield() to release the CPU, thread_exit() to terminate, and thread_join() to wait for a specific thread to exit.]
2.2.4 POSIX Threads
POSIX Threads
IEEE 1003.1c is the standard for writing portable threaded programs. The threads package it defines is called Pthreads; it includes over 60 function calls and is supported by most UNIX systems.
Some of the Pthreads function calls
Thread call Description
pthread_create Create a new thread
pthread_exit Terminate the calling thread
pthread_join Wait for a specific thread to exit
pthread_yield Release the CPU to let another thread run
pthread_attr_init Create and initialize a thread’s attribute structure
pthread_attr_destroy Remove a thread’s attribute structure
Pthreads
Example 1
25
26. 2.2 Threads
1 #include <pthread.h>
2 #include <stdlib.h>
3 #include <unistd.h>
4 #include <stdio.h>
5
6 void *thread_function(void *arg){
7     int i;
8     for( i=0; i<20; i++ ){
9         printf("Thread says hi!\n");
10         sleep(1);
11     }
12     return NULL;
13 }
14
15 int main(void){
16     pthread_t mythread;
17     if(pthread_create(&mythread, NULL, thread_function, NULL)){
18         printf("error creating thread.");
19         abort();
20     }
21
22     if(pthread_join(mythread, NULL)){
23         printf("error joining thread.");
24         abort();
25     }
26     exit(0);
27 }
See also:
• IBM Developworks: POSIX threads explained3
• stackoverflow.com: What is the difference between exit() and abort()?4
Pthreads
pthread_t, defined in pthread.h, is often called a "thread id" (tid);
pthread_create() returns zero on success and a non-zero value on failure;
pthread_join() returns zero on success and a non-zero value on failure;
How to use pthread?
1. #include <pthread.h>
2. $ gcc thread1.c -o thread1 -pthread
3. $ ./thread1
3http://www.ibm.com/developerworks/linux/library/l-posix1/index.html
4http://stackoverflow.com/questions/397075/what-is-the-difference-between-exit-and-abort
26
27. 2.2 Threads
Pthreads
Example 2
1 #include <pthread.h>
2 #include <stdio.h>
3 #include <stdlib.h>
4
5 #define NUMBER_OF_THREADS 10
6
7 void *print_hello_world(void *tid)
8 {
9     /* prints the thread's identifier, then exits. */
10     printf("Thread %d: Hello World!\n", (int)(long)tid);
11     pthread_exit(NULL);
12 }
13
14 int main(int argc, char *argv[])
15 {
16     pthread_t threads[NUMBER_OF_THREADS];
17     int status, i;
18     for (i=0; i<NUMBER_OF_THREADS; i++)
19     {
20         printf("Main: creating thread %d\n", i);
21         status = pthread_create(&threads[i], NULL, print_hello_world, (void *)(long)i);
22
23         if(status != 0){
24             printf("Oops. pthread_create returned error code %d\n", status);
25             exit(-1);
26         }
27     }
28     exit(0);
29 }
Pthreads
With or without pthread_join()? Check it by yourself.
2.2.5 User-Level Threads vs. Kernel-level Threads
User-Level Threads vs. Kernel-Level Threads
Fig. 2-13. (a) A user-level threads package, with the thread table kept by a run-time system in user space. (b) A threads package managed by the kernel, with the thread table in kernel space.
User-Level Threads
27
28. 2.2 Threads
User-level threads provide a library of functions to allow user processes to create and
manage their own threads.
No need to modify the OS;
Simple representation
– each thread is represented simply by a PC, regs, stack, and a small TCB, all stored
in the user process’ address space
Simple Management
– creating a new thread, switching between threads, and synchronization between
threads can all be done without intervention of the kernel
Fast
– thread switching is not much more expensive than a procedure call
Flexible
– CPU scheduling (among threads) can be customized to suit the needs of the al-
gorithm – each process can use a different thread scheduling algorithm
User-Level Threads
Lack of coordination between threads and OS kernel
– Process as a whole gets one time slice
– Same time slice, whether process has 1 thread or 1000 threads
– Also – up to each thread to relinquish control to other threads in that process
Requires non-blocking system calls (i.e. a multithreaded kernel)
– Otherwise, the entire process blocks in the kernel, even if there are runnable
threads left in the process
– Part of the motivation for user-level threads was to avoid having to modify the OS
If one thread causes a page fault (an interrupt!), the entire process blocks
See also: More about blocking and non-blocking calls5
Kernel-Level Threads
Kernel-level threads: the kernel provides system calls to create and manage threads
Kernel has full knowledge of all threads
– Scheduler may choose to give a process with 10 threads more time than process
with only 1 thread
Good for applications that frequently block (e.g. server processes with frequent in-
terprocess communication)
Slow – thread operations are 100s of times slower than for user-level threads
Significant overhead and increased kernel complexity – kernel must manage and
schedule threads as well as processes
– Requires a full thread control block (TCB) for each thread
5http://www.daniweb.com/software-development/computer-science/threads/384575/synchronous-vs-asynchronous-blocking-vs-non-blo
28
29. 2.2 Threads
Hybrid Implementations
Combine the advantages of the two: multiple user-level threads are multiplexed onto each kernel-level thread.
Fig. 2-14. Multiplexing user-level threads onto kernel-level
threads.
Programming Complications
• fork(): should the child have all the threads that its parent has?
• What happens if one thread closes a file while another is still reading from it?
• What happens if several threads notice that there is too little memory?
And sometimes, threads fix the symptom, but not the problem.
2.2.6 Linux Threads
Linux Threads
To the Linux kernel, there is no concept of a thread
• Linux implements all threads as standard processes
• To Linux, a thread is merely a process that shares certain resources with other pro-
cesses
• Some OS (MS Windows, Sun Solaris) have cheap threads and expensive processes.
• Linux processes are already quite lightweight
On a 75MHz Pentium
thread: 1.7µs
fork: 1.8µs
[2, Sec. 3.1, Processes, Lightweight Processes, and Threads] Older versions of the Linux
kernel offered no support for multithreaded applications. From the kernel point of view,
a multithreaded application was just a normal process. The multiple execution flows of a
multithreaded application were created, handled, and scheduled entirely in User Mode,
usually by means of a POSIX-compliant pthread library.
However, such an implementation of multithreaded applications is not very satisfac-
tory. For instance, suppose a chess program uses two threads: one of them controls the
graphical chessboard, waiting for the moves of the human player and showing the moves
of the computer, while the other thread ponders the next move of the game. While the
first thread waits for the human move, the second thread should run continuously, thus
exploiting the thinking time of the human player. However, if the chess program is just
a single process, the first thread cannot simply issue a blocking system call waiting for
29
30. 2.2 Threads
a user action; otherwise, the second thread is blocked as well. Instead, the first thread
must employ sophisticated nonblocking techniques to ensure that the process remains
runnable.
Linux uses lightweight processes to offer better support for multithreaded applications.
Basically, two lightweight processes may share some resources, like the address space,
the open files, and so on. Whenever one of them modifies a shared resource, the other
immediately sees the change. Of course, the two processes must synchronize themselves
when accessing the shared resource.
A straightforward way to implement multithreaded applications is to associate a light-
weight process with each thread. In this way, the threads can access the same set of
application data structures by simply sharing the same memory address space, the same
set of open files, and so on; at the same time, each thread can be scheduled independently
by the kernel so that one may sleep while another remains runnable. Examples of POSIX-
compliant pthread libraries that use Linux’s lightweight processes are LinuxThreads, Na-
tive POSIX Thread Library (NPTL), and IBM’s Next Generation Posix Threading Package
(NGPT).
POSIX-compliant multithreaded applications are best handled by kernels that support
”thread groups”. In Linux a thread group is basically a set of lightweight processes that
implement a multithreaded application and act as a whole with regards to some system
calls such as getpid(), kill(), and _exit().
Linux Threads
clone() creates a separate process that shares the address space of the calling process. The
cloned task behaves much like a separate thread.
In user space, programs call fork(), vfork(), and clone() through the C library. Each entry goes through the system call table (e.g. entries 3, 120, 289) into kernel space as sys_fork(), sys_clone(), and sys_vfork(), all of which are implemented in terms of do_fork().
clone()
1 #include <sched.h>
2 int clone(int (*fn)(void *), void *child_stack,
3           int flags, void *arg, ...);
arg 1 the function to be executed, i.e. fn(arg), which returns an int;
30
31. 2.2 Threads
arg 2 a pointer to a (usually malloced) memory space to be used as the stack for the new thread;
arg 3 a set of flags used to indicate how much the calling process is to be shared. In fact,
clone(0) == fork()
arg 4 the arguments passed to the function.
It returns the PID of the child process or -1 on failure.
$ man clone
The clone() System Call
Some flags:
flag Shared
CLONE_FS File-system info
CLONE_VM Same memory space
CLONE_SIGHAND Signal handlers
CLONE_FILES The set of open files
In practice, one should try to avoid calling clone() directly.
Instead, use a threading library (such as pthreads), which uses clone() when starting
a thread (e.g. during a call to pthread_create()).
clone() Example
31
33. 3.1 IPC
3.1 IPC
Interprocess Communication
Example:
ps | head -2 | tail -1 | cut -f2 -d' '
IPC issues:
1. How one process can pass information to another
2. Be sure processes do not get into each other’s way
e.g. in an airline reservation system, two processes compete for the last seat
3. Proper sequencing when dependencies are present
e.g. if A produces data and B prints them, B has to wait until A has produced some
data
Two models of IPC:
• Shared memory
• Message passing (e.g. sockets)
Producer-Consumer Problem
• compiler → assembly code → assembler → object module → loader
• a process that wants to print a file → printer daemon
3.2 Shared Memory
Process Synchronization
Producer-Consumer Problem
• Consumers don’t try to remove objects from Buffer when it is empty.
• Producers don’t try to add objects to the Buffer when it is full.
Producer
1 while(TRUE){
2 while(FULL);
3 item = produceItem();
4 insertItem(item);
5 }
Consumer
1 while(TRUE){
2 while(EMPTY);
3 item = removeItem();
4 consumeItem(item);
5 }
How to define full/empty?
33
34. 3.2 Shared Memory
Producer-Consumer Problem
— Bounded-Buffer Problem (Circular Array)
Front(out): the first full position
Rear(in): the next free position
Full or empty when front == rear?
Producer-Consumer Problem
Common solution:
Full: when (in+1)%BUFFER_SIZE == out
Actually, this declares "full" with one slot still free
Empty: when in == out
Can only use BUFFER_SIZE-1 elements
Shared data:
1 #define BUFFER_SIZE 6
2 typedef struct {
3 ...
4 } item;
5 item buffer[BUFFER_SIZE];
6 int in = 0; //the next free position
7 int out = 0;//the first full position
Bounded-Buffer Problem
34
35. 3.3 Race Condition and Mutual Exclusion
Producer:
1 while (true) {
2 /* do nothing -- no free buffers */
3 while (((in + 1) % BUFFER_SIZE) == out);
4
5 buffer[in] = item;
6 in = (in + 1) % BUFFER_SIZE;
7 }
Consumer:
1 while (true) {
2 while (in == out); // do nothing
3 // remove an item from the buffer
4 item = buffer[out];
5 out = (out + 1) % BUFFER_SIZE;
6     consumeItem(item);
7 }
3.3 Race Condition and Mutual Exclusion
Race Conditions
Now, let’s have two producers
Spooler directory: slots 4–6 hold file names (abc, prog.c, prog.n); out = 4, in = 7. Process A and Process B both read in = 7 and race to write their file names into the same slot.
Fig. 2-18. Two processes want to access shared memory at the
same time.
Race Conditions
Two producers
1 #define BUFFER_SIZE 100
2 typedef struct {
3 ...
4 } item;
5 item buffer[BUFFER_SIZE];
6 int in = 0;
7 int out = 0;
35
36. 3.3 Race Condition and Mutual Exclusion
Process A and B do the same thing:
1 while (true) {
2 while (((in + 1) % BUFFER_SIZE) == out);
3 buffer[in] = item;
4 in = (in + 1) % BUFFER_SIZE;
5 }
Race Conditions
Problem: Process B started using one of the shared variables before Process A was fin-
ished with it.
Solution: Mutual exclusion. If one process is using a shared variable or file, the other
processes will be excluded from doing the same thing.
Critical Regions
Mutual Exclusion
Critical region: a piece of code accessing a common resource.
A enters its critical region at T1. At T2, B attempts to enter but is blocked, since A is inside. At T3, A leaves and B enters its critical region; B leaves at T4.
Fig. 2-19. Mutual exclusion using critical regions.
Critical Region
A solution to the critical region problem must satisfy three conditions:
Mutual Exclusion: No two processes may be simultaneously inside their critical regions.
Progress: No process running outside its critical region may block other processes.
Bounded Waiting: No process should have to wait forever to enter its critical region.
Mutual Exclusion With Busy Waiting
Disabling Interrupts
1 {
2 ...
3 disableINT();
4 critical_code();
5 enableINT();
6 ...
7 }
36
37. 3.3 Race Condition and Mutual Exclusion
Problems:
• It's not wise to give user processes the power of turning off INTs.
– Suppose one did it, and never turned them on again
• Useless on multiprocessor systems
Disabling INTs is often a useful technique within the kernel itself but is not a general
mutual exclusion mechanism for user processes.
Mutual Exclusion With Busy Waiting
Lock Variables
1 int lock=0; //shared variable
2 {
3 ...
4 while(lock); //busy waiting
5 lock=1;
6 critical_code();
7 lock=0;
8 ...
9 }
Problem:
• What if an interrupt occurs right after the check at line 4, before line 5 sets the lock? Another process can then grab the lock as well.
• Check the lock again when returning from the interrupt?
Mutual Exclusion With Busy Waiting
Strict Alternation
Process 0
1 while(TRUE){
2 while(turn != 0);
3 critical_region();
4 turn = 1;
5 noncritical_region();
6 }
Process 1
1 while(TRUE){
2 while(turn != 1);
3 critical_region();
4 turn = 0;
5 noncritical_region();
6 }
Problem: violates condition 2 (progress)
• One process can be blocked by another not in its critical region.
• Requires the two processes strictly alternate in entering their critical region.
Mutual Exclusion With Busy Waiting
Peterson’s Solution
int interest[2] = {0, 0};
int turn;
P0
1 interest[0] = 1;
2 turn = 1;
3 while(interest[1] == 1
4       && turn == 1);
5 critical_section();
6 interest[0] = 0;
P1
1 interest[1] = 1;
2 turn = 0;
3 while(interest[0] == 1
4       && turn == 0);
5 critical_section();
6 interest[1] = 0;
37
38. 3.3 Race Condition and Mutual Exclusion
References
[1] Wikipedia. Peterson’s algorithm — Wikipedia, The Free Encyclopedia. [Online; ac-
cessed 23-February-2015]. 2015.
Mutual Exclusion With Busy Waiting
Hardware Solution: The TSL Instruction
Lock the memory bus
enter region:
TSL REGISTER,LOCK | copy lock to register and set lock to 1
CMP REGISTER,#0 | was lock zero?
JNE enter region | if it was non zero, lock was set, so loop
RET | return to caller; critical region entered
leave region:
MOVE LOCK,#0 | store a 0 in lock
RET | return to caller
Fig. 2-22. Entering and leaving a critical region using the TSL
instruction.
See also: [19, Sec. 2.3.3, Mutual Exclusion With Busy Waiting, p. 124].
Mutual Exclusion Without Busy Waiting
Sleep Wakeup
1 #define N 100 /* number of slots in the buffer */
2 int count = 0; /* number of items in the buffer */
1 void producer(){
2 int item;
3 while(TRUE){
4 item = produce_item();
5 if(count == N)
6 sleep();
7 insert_item(item);
8 count++;
9 if(count == 1)
10 wakeup(consumer);
11 }
12 }
1 void consumer(){
2 int item;
3 while(TRUE){
4 if(count == 0)
5 sleep();
6 item = rm_item();
7 count--;
8 if(count == N - 1)
9 wakeup(producer);
10 consume_item(item);
11 }
12 }
Producer-Consumer Problem
Race Condition
Problem
1. Consumer is going to sleep upon seeing an empty buffer, but INT occurs;
2. Producer inserts an item, increasing count to 1, then call wakeup(consumer);
3. But the consumer is not asleep, though count was 0. So the wakeup() signal is lost;
4. Consumer is back from INT remembering count is 0, and goes to sleep;
5. Producer sooner or later will fill up the buffer and also goes to sleep;
6. Both will sleep forever, each waiting to be woken up by the other. Deadlock!
38
39. 3.4 Semaphores
Producer-Consumer Problem
Race Condition
Solution: Add a wakeup waiting bit
1. The bit is set when a wakeup is sent to a process that is still awake;
2. Later, when the process wants to sleep, it checks the bit first. Turns it off if it’s set,
and stays awake.
What if many processes try going to sleep?
3.4 Semaphores
What is a Semaphore?
• A locking mechanism
• An integer or ADT
that can only be operated with:
Atomic Operations
P() V()
Wait() Signal()
Down() Up()
Decrement() Increment()
... ...
1 down(S){
2     while(S <= 0);
3     S--;
4 }
1 up(S){
2 S++;
3 }
More meaningful names:
• increment_and_wake_a_waiting_process_if_any()
• decrement_and_block_if_the_result_is_negative()
Semaphore
How to ensure atomicity?
1. For single CPU, implement up() and down() as system calls, with the OS disabling all
interrupts while accessing the semaphore;
2. For multiple CPUs, to make sure only one CPU at a time examines the semaphore, a
lock variable should be used with the TSL instructions.
Semaphore is a Special Integer
A semaphore is like an integer, with three differences:
1. You can initialize its value to any integer, but after that the only operations you are
allowed to perform are increment (S++) and decrement (S--).
2. When a thread decrements the semaphore, if the result is negative (S < 0), the thread
blocks itself and cannot continue until another thread increments the semaphore.
3. When a thread increments the semaphore, if there are other threads waiting, one of
the waiting threads gets unblocked.
39
40. 3.4 Semaphores
Why Semaphores?
We don’t need semaphores to solve synchronization problems, but there are some ad-
vantages to using them:
• Semaphores impose deliberate constraints that help programmers avoid errors.
• Solutions using semaphores are often clean and organized, making it easy to demon-
strate their correctness.
• Semaphores can be implemented efficiently on many systems, so solutions that use
semaphores are portable and usually efficient.
The Simplest Use of Semaphore
Signaling
• One thread sends a signal to another thread to indicate that something has happened
• it solves the serialization problem
Signaling makes it possible to guarantee that a section of code in one thread will
run before a section of code in another thread
Thread A
1 statement a1
2 sem.signal()
Thread B
1 sem.wait()
2 statement b1
What’s the initial value of sem?
Semaphore
Rendezvous Puzzle
Thread A
1 statement a1
2 statement a2
Thread B
1 statement b1
2 statement b2
Q: How to guarantee that
1. a1 happens before b2, and
2. b1 happens before a2
a1 → b2; b1 → a2
Hint: Use two semaphores initialized to 0.
40
41. 3.4 Semaphores
Solution 1:
    Thread A: statement a1; sem1.wait();   sem2.signal(); statement a2
    Thread B: statement b1; sem1.signal(); sem2.wait();   statement b2
Solution 2:
    Thread A: statement a1; sem2.signal(); sem1.wait();   statement a2
    Thread B: statement b1; sem1.signal(); sem2.wait();   statement b2
Solution 3:
    Thread A: statement a1; sem2.wait();   sem1.signal(); statement a2
    Thread B: statement b1; sem1.wait();   sem2.signal(); statement b2
Solution 3 has deadlock!
Mutex
• A second common use for semaphores is to enforce mutual exclusion
• It guarantees that only one thread accesses the shared variable at a time
• A mutex is like a token that passes from one thread to another, allowing one thread at
a time to proceed
Q: Add semaphores to the following example to enforce mutual exclusion to the shared
variable i.
Thread A: i++ Thread B: i++
Why? Because i++ is not atomic.
i++ is not atomic in assembly language
1 LOAD [i], r0 ;load the value of 'i' into
2 ;a register from memory
3 ADD r0, 1 ;increment the value
4 ;in the register
5 STORE r0, [i] ;write the updated
6 ;value back to memory
Interrupts might occur in between. So, i++ needs to be protected with a mutex.
Mutex Solution
Create a semaphore named mutex that is initialized to 1
1: a thread may proceed and access the shared variable
0: it has to wait for another thread to release the mutex
Thread A
1 mutex.wait()
2 i++
3 mutex.signal()
Thread B
1 mutex.wait()
2 i++
3 mutex.signal()
41
42. 3.4 Semaphores
Multiplex — Without Busy Waiting
1 typedef struct{
2 int space; //number of free resources
3 struct process *P; //a list of queueing producers
4 struct process *C; //a list of queueing consumers
5 } semaphore;
6 semaphore S;
7 S.space = 5;
Producer
1 void down(S){
2     S.space--;
3     if(S.space == 4){
4         rmFromQueue(S.C);
5         wakeup(S.C);
6     }
7     if(S.space < 0){
8         addToQueue(S.P);
9         sleep();
10     }
11 }
Consumer
1 void up(S){
2     S.space++;
3     if(S.space > 5){
4         addToQueue(S.C);
5         sleep();
6     }
7     if(S.space <= 0){
8         rmFromQueue(S.P);
9         wakeup(S.P);
10     }
11 }
if S.space < 0,
    -S.space == the number of queueing producers
if S.space > 5,
    S.space - 5 == the number of queueing consumers
The work flow: There are several processes running simultaneously. They all need to
access some common resources.
1. Assuming S.space == 3 in the beginning
2. Process P1 comes and take one resource away. S.space == 2 now.
3. Process P2 comes and take the 2nd resource away. S.space == 1 now.
4. Process P3 comes and take the last resource away. S.space == 0 now.
5. Process P4 comes and sees nothing left. It has to sleep. S.space == -1 now.
6. Process P5 comes and sees nothing left. It has to sleep. S.space == -2 now.
7. At this moment, there are 2 processes (P4 and P5) sleeping. In other words,
they are queueing for resources.
8. Now, P1 finishes using the resource and releases it. After it does S.space++, it
finds that S.space <= 0. So it wakes up a process (say P4) in the queue.
9. P4 wakes up, and resumes at the instruction right after sleep().
10. P4 (or P2/P3) finishes using the resource and releases it. After it does S.space++,
it finds that S.space <= 0. So it wakes up P5 in the queue.
11. The queue is empty now.
Barrier
Barrier
Barrier
Barrier
A A A
B B B
C C
D D D
Time Time Time
Process
(a) (b) (c)
C
Fig. 2-30. Use of a barrier. (a) Processes approaching a barrier.
(b) All processes but one blocked at the barrier. (c) When the last
process arrives at the barrier, all of them are let through.
1. Processes approaching a barrier
42
43. 3.4 Semaphores
2. All processes but one blocked at the barrier
3. When the last process arrives at the barrier, all of them are let through
Synchronization requirement:
specific_task()
critical_point()
No thread executes critical_point() until after all threads have executed specific_task().
Barrier Solution
1 n = the number of threads
2 count = 0
3 mutex = Semaphore(1)
4 barrier = Semaphore(0)
count: keeps track of how many threads have arrived
mutex: provides exclusive access to count
barrier: is locked (≤ 0) until all threads arrive
When barrier.value < 0,
    -barrier.value == the number of queueing processes
Solution 1
1 specific_task();
2 mutex.wait();
3 count++;
4 mutex.signal();
5 if (count < n)
6 barrier.wait();
7 barrier.signal();
8 critical_point();
Solution 2
1 specific_task();
2 mutex.wait();
3 count++;
4 mutex.signal();
5 if (count == n)
6 barrier.signal();
7 barrier.wait();
8 critical_point();
Only one thread can pass the barrier!
Barrier Solution
Solution 3
1 specific_task();
2
3 mutex.wait();
4 count++;
5 mutex.signal();
6
7 if (count == n)
8 barrier.signal();
9
10 barrier.wait();
11 barrier.signal();
12
13 critical_point();
Solution 4
1 specific_task();
2
3 mutex.wait();
4 count++;
5
6 if (count == n)
7 barrier.signal();
8
9 barrier.wait();
10 barrier.signal();
11 mutex.signal();
12
13 critical_point();
Blocking on a semaphore while holding a mutex!
barrier.wait();
barrier.signal();
43
44. 3.4 Semaphores
Turnstile
This pattern, a wait and a signal in rapid succession, occurs often enough to have a
name; it is called a turnstile, because
• it allows one thread to pass at a time, and
• it can be locked to bar all threads
Semaphores
Producer-Consumer Problem
Whenever an event occurs
• a producer thread creates an event object and adds it to the event buffer. Concur-
rently,
• consumer threads take events out of the buffer and process them. In this case, the
consumers are called “event handlers”.
Producer
1 event = waitForEvent()
2 buffer.add(event)
Consumer
1 event = buffer.get()
2 event.process()
Q: Add synchronization statements to the producer and consumer code to enforce the
synchronization constraints
1. Mutual exclusion
2. Serialization
See also [5, Sec. 4.1, Producer-Consumer Problem].
Semaphores — Producer-Consumer Problem
Solution
Initialization:
mutex = Semaphore(1)
items = Semaphore(0)
• mutex provides exclusive access to the buffer
• items:
+: number of items in the buffer
−: number of consumer threads in queue
44
45. 3.4 Semaphores
Semaphores — Producer-Consumer Problem
Solution
Producer
1 event = waitForEvent()
2 mutex.wait()
3 buffer.add(event)
4 items.signal()
5 mutex.signal()
or,
Producer
1 event = waitForEvent()
2 mutex.wait()
3 buffer.add(event)
4 mutex.signal()
5 items.signal()
Consumer
1 items.wait()
2 mutex.wait()
3 event = buffer.get()
4 mutex.signal()
5 event.process()
or,
Consumer
1 mutex.wait()
2 items.wait()
3 event = buffer.get()
4 mutex.signal()
5 event.process()
Danger: any time you wait for a semaphore while holding a mutex!
items.signal()
{
    items++;
    if(items <= 0)
        wakeup(consumer);
}
items.wait()
{
    items--;
    if(items < 0)
        sleep();
}
Producer-Consumer Problem With Bounded-Buffer
Given:
semaphore items = 0;
semaphore spaces = BUFFER_SIZE;
Can we?
if (items == BUFFER_SIZE)
producer.block();
if: the buffer is full
then: the producer blocks until a consumer removes an item
No! We can’t check the current value of a semaphore, because
! the only operations are wait and signal.
? But...
Why can't we check the current value of a semaphore? We have seen:
void S.down(){
    S.value--;
    if(S.value < 0){
        addToQueue(S.L);
        sleep();
    }
}
void S.up(){
    S.value++;
    if(S.value <= 0){
        rmFromQueue(S.L);
        wakeup(S.L);
    }
}
Notice that the checking happens inside down() and up(); it is not available for user processes
to use directly.
45
46. 3.4 Semaphores
Producer-Consumer Problem With Bounded-Buffer
1 semaphore items = 0;
2 semaphore spaces = BUFFER_SIZE;
1 void producer() {
2 while (true) {
3 item = produceItem();
4 down(spaces);
5 putIntoBuffer(item);
6 up(items);
7 }
8 }
1 void consumer() {
2 while (true) {
3 down(items);
4 item = rmFromBuffer();
5 up(spaces);
6 consumeItem(item);
7 }
8 }
This works fine when there is only one producer and one consumer, but not with more of
either, because putIntoBuffer() is not atomic.
putIntoBuffer() could contain two actions:
1. determining the next available slot
2. writing into it
Race condition:
1. Two producers decrement spaces
2. One of the producers determines the next empty slot in the buffer
3. Second producer determines the next empty slot and gets the same result as the first
producer
4. Both producers write into the same slot
putIntoBuffer() needs to be protected with a mutex.
With a mutex
1 semaphore mutex = 1; //controls access to c.r.
2 semaphore items = 0;
3 semaphore spaces = BUFFER_SIZE;
1 void producer() {
2 while (true) {
3 item = produceItem();
4 down(spaces);
5 down(mutex);
6 putIntoBuffer(item);
7 up(mutex);
8 up(items);
9 }
10 }
1 void consumer() {
2 while (true) {
3 down(items);
4 down(mutex);
5 item = rmFromBuffer();
6 up(mutex);
7 up(spaces);
8 consumeItem(item);
9 }
10 }
46
47. 3.5 Monitors
3.5 Monitors
Monitors
Monitor a high-level synchronization object for achieving mutual exclusion.
• It’s a language concept, and C does not have it.
• Only one process can be active in a monitor at
any instant.
• It is up to the compiler to implement mutual ex-
clusion on monitor entries.
– The programmer just needs to know that by
turning all the critical regions into monitor
procedures, no two processes will ever exe-
cute their critical regions at the same time.
1 monitor example
2 integer i;
3 condition c;
4
5 procedure producer();
6 ...
7 end;
8
9 procedure consumer();
10 ...
11 end;
12 end monitor;
Monitor
The producer-consumer problem
1 monitor ProducerConsumer
2 condition full, empty;
3 integer count;
4
5 procedure insert(item: integer);
6 begin
7 if count = N then wait(full);
8 insert_item(item);
9 count := count + 1;
10 if count = 1 then signal(empty)
11 end;
12
13 function remove: integer;
14 begin
15 if count = 0 then wait(empty);
16 remove = remove_item;
17 count := count - 1;
18 if count = N - 1 then signal(full)
19 end;
20 count := 0;
21 end monitor;
1 procedure producer;
2 begin
3 while true do
4 begin
5 item = produce_item;
6 ProducerConsumer.insert(item)
7 end
8 end;
9
10 procedure consumer;
11 begin
12 while true do
13 begin
14 item = ProducerConsumer.remove;
15 consume_item(item)
16 end
17 end;
3.6 Message Passing
Message Passing
• Semaphores are too low level
• Monitors are not usable except in a few programming languages
• Neither monitor nor semaphore is suitable for distributed systems
• No conflicts, easier to implement
Message passing uses two primitives, send and receive system calls:
- send(destination, message);
- receive(source, message);
47
48. 3.6 Message Passing
Message Passing
Design issues
• Message can be lost by network; — ACK
• What if the ACK is lost? — SEQ
• What if two processes have the same name? — socket
• Am I talking with the right guy? Or maybe a man in the middle? — authentication
• What if the sender and the receiver on the same machine? — Copying messages is
always slower than doing a semaphore operation or entering a monitor.
Message Passing
TCP Header Format
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Source Port | Destination Port |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Acknowledgment Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data | |U|A|P|R|S|F| |
| Offset| Reserved |R|C|S|S|Y|I| Window |
| | |G|K|H|T|N|N| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Checksum | Urgent Pointer |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Options | Padding |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| data |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Message Passing
The producer-consumer problem
1 #define N 100 /* number of slots in the buffer */
2 void producer(void)
3 {
4     int item;
5     message m; /* message buffer */
6     while (TRUE) {
7         item = produce_item(); /* generate something to put in buffer */
8         receive(consumer, &m); /* wait for an empty to arrive */
9         build_message(&m, item); /* construct a message to send */
10         send(consumer, &m); /* send item to consumer */
11     }
12 }
13
14 void consumer(void)
15 {
16     int item, i;
17     message m;
18     for (i=0; i<N; i++) send(producer, &m); /* send N empties */
19     while (TRUE) {
20         receive(producer, &m); /* get message containing item */
21         item = extract_item(&m); /* extract item from message */
22         send(producer, &m); /* send back empty reply */
23         consume_item(item); /* do something with the item */
24     }
25 }
48
49. 3.7 Classical IPC Problems
3.7 Classical IPC Problems
3.7.1 The Dining Philosophers Problem
The Dining Philosophers Problem
1 while True:
2 think()
3 get_forks()
4 eat()
5 put_forks()
How to implement get_forks() and put_forks() to ensure
1. No deadlock
2. No starvation
3. Allow more than one philosopher to eat at the same time
The Dining Philosophers Problem
Deadlock
#define N 5                     /* number of philosophers */
void philosopher(int i)         /* i: philosopher number, from 0 to 4 */
{
    while (TRUE) {
        think();                /* philosopher is thinking */
        take_fork(i);           /* take left fork */
        take_fork((i+1) % N);   /* take right fork; % is modulo operator */
        eat();                  /* yum-yum, spaghetti */
        put_fork(i);            /* put left fork back on the table */
        put_fork((i+1) % N);    /* put right fork back on the table */
    }
}
Fig. 2-32. A nonsolution to the dining philosophers problem.
• Put down the left fork and wait for a while if the right one is not available? If all
philosophers retry in lockstep (similar to CSMA/CD without random backoff) — Starvation
The Dining Philosophers Problem
With One Mutex
#define N 5
semaphore mutex = 1;

void philosopher(int i)
{
    while (TRUE) {
        think();
        wait(mutex);
        take_fork(i);
        take_fork((i+1) % N);
        eat();
        put_fork(i);
        put_fork((i+1) % N);
        signal(mutex);
    }
}
• Only one philosopher can eat at a time.
• How about 2 mutexes? 5 mutexes?
The Dining Philosophers Problem
AST Solution (Part 1)
A philosopher may only move into eating state if neither neighbor is eating
#define N 5                 /* number of philosophers */
#define LEFT (i+N-1)%N      /* number of i's left neighbor */
#define RIGHT (i+1)%N       /* number of i's right neighbor */
#define THINKING 0          /* philosopher is thinking */
#define HUNGRY 1            /* philosopher is trying to get forks */
#define EATING 2            /* philosopher is eating */
typedef int semaphore;
int state[N];               /* state of everyone */
semaphore mutex = 1;        /* for critical regions */
semaphore s[N];             /* one semaphore per philosopher */

void philosopher(int i)     /* i: philosopher number, from 0 to N-1 */
{
    while (TRUE) {
        think();
        take_forks(i);      /* acquire two forks or block */
        eat();
        put_forks(i);       /* put both forks back on table */
    }
}
The Dining Philosophers Problem
AST Solution (Part 2)
void take_forks(int i)      /* i: philosopher number, from 0 to N-1 */
{
    down(mutex);            /* enter critical region */
    state[i] = HUNGRY;      /* record fact that philosopher i is hungry */
    test(i);                /* try to acquire 2 forks */
    up(mutex);              /* exit critical region */
    down(s[i]);             /* block if forks were not acquired */
}

void put_forks(int i)       /* i: philosopher number, from 0 to N-1 */
{
    down(mutex);            /* enter critical region */
    state[i] = THINKING;    /* philosopher has finished eating */
    test(LEFT);             /* see if left neighbor can now eat */
    test(RIGHT);            /* see if right neighbor can now eat */
    up(mutex);              /* exit critical region */
}

void test(int i)            /* i: philosopher number, from 0 to N-1 */
{
    if (state[i] == HUNGRY && state[LEFT] != EATING && state[RIGHT] != EATING) {
        state[i] = EATING;
        up(s[i]);
    }
}
Starvation!
Step by step:
50
51. 3.7 Classical IPC Problems
1. If 5 philosophers take_forks(i) at the same time, only one can get mutex.
2. The one who gets mutex sets his state to HUNGRY. And then,
3. test(i); try to get 2 forks.
(a) If his LEFT and RIGHT are not EATING, he succeeds in getting both forks.
i. sets his state to EATING
ii. up(s[i]); The initial value of s[i] is 0.
Now, his LEFT and RIGHT will fail to get 2 forks, even if they could grab mutex.
(b) If either LEFT or RIGHT is EATING, he fails to get the forks.
4. release mutex
5. down(s[i]);
(a) block if forks are not acquired
(b) eat() if 2 forks are acquired
6. After eat()ing, the philosopher doing put_forks(i) has to get mutex first.
• because state[i] can be changed by more than one philosopher.
7. After getting mutex, set his state to THINKING
8. test(LEFT); see if LEFT can now eat?
(a) If LEFT is HUNGRY, and LEFT’s LEFT is not EATING, and LEFT’s RIGHT (me) is not EATING
i. set LEFT’s state to EATING
ii. up(s[LEFT]);
(b) If LEFT is not HUNGRY, or LEFT’s LEFT is EATING, or LEFT’s RIGHT (me) is EATING, LEFT
fails to get 2 forks.
9. test(RIGHT); see if RIGHT can now eat?
10. release mutex
The Dining Philosophers Problem
More Solutions
• If there is at least one leftie and at least one rightie, then deadlock is not possible
• Wikipedia: Dining philosophers problem
See also: [23, Dining philosophers problem]
3.7.2 The Readers-Writers Problem
The Readers-Writers Problem
Constraint: no process may access the shared data for reading or writing while another
process is writing to it.
semaphore mutex = 1;
semaphore noOther = 1;
int readers = 0;

void writer(void)
{
    while (TRUE) {
        wait(noOther);
        writing();
        signal(noOther);
    }
}
void reader(void)
{
    while (TRUE) {
        wait(mutex);
        readers++;
        if (readers == 1)
            wait(noOther);    /* first reader locks out writers */
        signal(mutex);
        reading();
        wait(mutex);
        readers--;
        if (readers == 0)
            signal(noOther);  /* last reader lets writers in */
        signal(mutex);
        anything();
    }
}
Starvation The writer could be blocked forever if someone is always reading.
The Readers-Writers Problem
No starvation
semaphore mutex = 1;
semaphore noOther = 1;
semaphore turnstile = 1;
int readers = 0;

void writer(void)
{
    while (TRUE) {
        wait(turnstile);      /* a waiting writer blocks new readers here */
        wait(noOther);
        writing();
        signal(noOther);
        signal(turnstile);
    }
}

void reader(void)
{
    while (TRUE) {
        wait(turnstile);      /* pass through unless a writer is waiting */
        signal(turnstile);

        wait(mutex);
        readers++;
        if (readers == 1)
            wait(noOther);
        signal(mutex);
        reading();
        wait(mutex);
        readers--;
        if (readers == 0)
            signal(noOther);
        signal(mutex);
        anything();
    }
}
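The turnstile variant above can be sketched in runnable form. A minimal Python translation, not from the lecture — the shared data is a single counter, and the thread/iteration counts are arbitrary demo values; the test that no writer update is lost is what the `noOther` semaphore guarantees:

```python
import threading

mutex = threading.Semaphore(1)      # protects the readers counter
no_other = threading.Semaphore(1)   # held by the writer, or by the group of readers
turnstile = threading.Semaphore(1)  # a waiting writer blocks new readers here
readers = 0
shared = {"value": 0}
log = []

def writer(n):
    for _ in range(n):
        with turnstile:             # queue up; new readers must wait behind us
            with no_other:
                shared["value"] += 1    # writing(): exclusive, so no lost update

def reader(n):
    global readers
    for _ in range(n):
        with turnstile:             # pass straight through unless a writer waits
            pass
        with mutex:
            readers += 1
            if readers == 1:
                no_other.acquire()  # first reader locks out writers
        log.append(shared["value"])     # reading(): concurrent with other readers
        with mutex:
            readers -= 1
            if readers == 0:
                no_other.release()  # last reader lets writers in

ws = [threading.Thread(target=writer, args=(100,)) for _ in range(2)]
rs = [threading.Thread(target=reader, args=(100,)) for _ in range(3)]
for t in ws + rs: t.start()
for t in ws + rs: t.join()
print(shared["value"])              # 200: every one of the 2*100 writes survived
```

Because a waiting writer holds the turnstile, newly arriving readers queue up behind it, so the writer is admitted as soon as the current readers drain — no writer starvation.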
3.7.3 The Sleeping Barber Problem
The Sleeping Barber Problem
4 CPU Scheduling
4.1 Process Scheduling Queues
Scheduling Queues
Job queue consists of all the processes in the system
Ready queue A linked list of the processes in main memory that are ready to execute
Device queue Each device has its own queue of the processes waiting for it
Figure 3.6 The ready queue and various I/O device queues. [Figure omitted: each queue
header holds head and tail pointers to a linked list of PCBs — one queue for the ready
queue, and one per device (disk unit 0, terminal unit 0, mag tape units 0 and 1).]
The system also includes other queues. When a process is allocated the
CPU, it executes for a while and eventually quits, is interrupted, or waits for
the occurrence of a particular event, such as the completion of an I/O request.
Suppose the process makes an I/O request to a shared device, such as a disk.
Since there are many processes in the system, the disk may be busy with the
I/O request of some other process. The process therefore may have to wait for
the disk. The list of processes waiting for a particular I/O device is called a
device queue. Each device has its own device queue (Figure 3.6).
A common representation of process scheduling is a queueing diagram,
such as that in Figure 3.7. Each rectangular box represents a queue. Two types
of queues are present: the ready queue and a set of device queues. The circles
represent the resources that serve the queues, and the arrows indicate the flow
of processes in the system.
A new process is initially put in the ready queue. It waits there until it is
selected for execution, or is dispatched. Once the process is allocated the CPU
and is executing, one of several events could occur:
• The process could issue an I/O request and then be placed in an I/O queue.
• The process could create a new subprocess and wait for the subprocess’s
termination.
• The process could be removed forcibly from the CPU as a result of an
interrupt, and be put back in the ready queue.
• The tail pointer — when adding a new process to the queue, we don't have to traverse
the list to find the tail
Queueing Diagram
Figure 3.7 Queueing-diagram representation of process scheduling. [Figure omitted:
processes flow from the ready queue to the CPU; they leave the CPU via an I/O request
(into an I/O queue), an expired time slice, forking a child, or waiting for an interrupt,
and eventually return to the ready queue.]
In the first two cases, the process eventually switches from the waiting state
to the ready state and is then put back in the ready queue. A process continues
this cycle until it terminates, at which time it is removed from all queues and
has its PCB and resources deallocated.
3.2.2 Schedulers
A process migrates among the various scheduling queues throughout its
lifetime. The operating system must select, for scheduling purposes, processes
from these queues in some fashion. The selection process is carried out by the
appropriate scheduler.
Often, in a batch system, more processes are submitted than can be executed
immediately. These processes are spooled to a mass-storage device (typically a
disk), where they are kept for later execution. The long-term scheduler, or job
scheduler, selects processes from this pool and loads them into memory for
execution. The short-term scheduler, or CPU scheduler, selects from among
the processes that are ready to execute and allocates the CPU to one of them.
The primary distinction between these two schedulers lies in frequency
of execution. The short-term scheduler must select a new process for the CPU
frequently. A process may execute for only a few milliseconds before waiting
for an I/O request. Often, the short-term scheduler executes at least once every
100 milliseconds. Because of the short time between executions, the short-term
scheduler must be fast. If it takes 10 milliseconds to decide to execute a process
for 100 milliseconds, then 10/(100 + 10) = 9 percent of the CPU is being used
(wasted) simply for scheduling the work.
The long-term scheduler executes much less frequently; minutes may separate the creation
of one new process and the next. The long-term scheduler controls the degree of
multiprogramming (the number of processes in memory). If the degree of multiprogramming
is stable, then the average rate of process creation must be equal to the average
departure rate of processes leaving the system.
4.2 Scheduling
Scheduling
• Scheduler uses scheduling algorithm to choose a process from the ready queue
• Scheduling doesn’t matter much on simple PCs, because
1. Most of the time there is only one active process
2. The CPU is too fast to be a scarce resource any more
• Scheduler has to make efficient use of the CPU because process switching is expensive
1. User mode → kernel mode
2. Save process state, registers, memory map...
3. Selecting a new process to run by running the scheduling algorithm
4. Load the memory map of the new process
5. The process switch usually invalidates the entire memory cache
Scheduling Algorithm Goals
All systems
Fairness giving each process a fair share of the CPU
Policy enforcement seeing that stated policy is carried out
Balance keeping all parts of the system busy
Batch systems
Throughput maximize jobs per hour
Turnaround time minimize time between submission and termination
CPU utilization keep the CPU busy all the time
Interactive systems
Response time respond to requests quickly
Proportionality meet users’ expectations
Real-time systems
Meeting deadlines avoid losing data
Predictability avoid quality degradation in multimedia systems
See also: [19, Sec. 2.4.1.5, Scheduling Algorithm Goals, p. 150].
4.3 Process Behavior
Process Behavior
CPU-bound vs. I/O-bound
Types of CPU bursts:
• long bursts – CPU bound (e.g. batch work)
• short bursts – I/O bound (e.g. emacs)
Fig. 2-37. Bursts of CPU usage alternate with periods of waiting for I/O. (a) A CPU-bound
process (long CPU bursts). (b) An I/O-bound process (short CPU bursts). [Figure omitted.]
As CPUs get faster, processes tend to get more I/O-bound.
4.4 Process Classification
Process Classification
Traditionally
CPU-bound processes vs. I/O-bound processes
Alternatively
Interactive processes responsiveness
• command shells, editors, graphical apps
Batch processes no user interaction, run in background, often penalized by the scheduler
• programming language compilers, database search engines, scientific computations
Real-time processes video and sound apps, robot controllers, programs that collect data
from physical sensors
• should never be blocked by lower-priority processes
• should have a short guaranteed response time with a minimum variance
The two classifications we just offered are somewhat independent. For instance, a
batch process can be either I/O-bound (e.g., a database server) or CPU-bound (e.g., an
image-rendering program).
While real-time programs are explicitly recognized as such by the scheduling algorithm
in Linux, there is no easy way to distinguish between interactive and batch programs. The
Linux 2.6 scheduler implements a sophisticated heuristic algorithm based on the past
behavior of the processes to decide whether a given process should be considered as
interactive or batch. Of course, the scheduler tends to favor interactive processes over
batch ones.
4.5 Process Schedulers
Schedulers
Long-term scheduler (or job scheduler) selects which processes should be brought into
the ready queue.
Short-term scheduler (or CPU scheduler) selects which process should be executed
next and allocates CPU.
Medium-term scheduler swapping.
• The long-term scheduler is responsible for a good mix of I/O-bound and CPU-bound
processes, leading to best performance.
• Time-sharing systems, e.g. UNIX, often have no long-term scheduler.
Nonpreemptive vs. preemptive
A nonpreemptive scheduling algorithm lets a process run as long as it wants until it
blocks (I/O or waiting for another process) or until it voluntarily releases the CPU.
A preemptive scheduling algorithm will forcibly suspend a process after it has run for
some time — this requires a clock interrupt.
4.6 Scheduling In Batch Systems
Scheduling In Batch Systems
First-Come First-Served
• nonpreemptive
• simple
• also has a disadvantage
What if a CPU-bound process (e.g. one that runs 1 s at a time) is followed by many
I/O-bound processes (e.g. each needing 1000 disk reads to complete)?
* In this case, preemptive scheduling is preferred.
Scheduling In Batch Systems
Shortest Job First
Fig. 2-39. An example of shortest job first scheduling. (a) Running four jobs in the
original order: A(8), B(4), C(4), D(4). (b) Running them in shortest job first order:
B(4), C(4), D(4), A(8).
Average turnaround time
(a) (8 + 12 + 16 + 20) ÷ 4 = 14
(b) (4 + 8 + 12 + 20) ÷ 4 = 11
How to know the length of the next CPU burst?
• For long-term (job) scheduling, the user provides an estimate
• For short-term scheduling, there is no way to know it in advance — it can only be
estimated from past bursts
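The turnaround averages above can be checked mechanically. A small sketch (the function name `avg_turnaround` is mine, not from the book); with all jobs arriving at time 0, each job's turnaround time equals its completion time:

```python
def avg_turnaround(burst_times):
    """Jobs run back to back from time 0; turnaround = completion time."""
    t, total = 0, 0
    for b in burst_times:
        t += b          # this job completes at cumulative time t
        total += t
    return total / len(burst_times)

original = [8, 4, 4, 4]          # A, B, C, D in arrival order
sjf = sorted(original)           # shortest job first: B, C, D, A
print(avg_turnaround(original))  # (8 + 12 + 16 + 20) / 4 = 14.0
print(avg_turnaround(sjf))       # (4 + 8 + 12 + 20) / 4 = 11.0
```

Sorting the bursts in nondecreasing order is provably optimal for average turnaround when all jobs are available simultaneously.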
4.7 Scheduling In Interactive Systems
Scheduling In Interactive Systems
Round-Robin Scheduling
Fig. 2-41. Round-robin scheduling. (a) The list of runnable processes: B (current), F, D,
G, A. (b) The list of runnable processes after B uses up its quantum: F (current), D, G,
A, B.
• Simple, and most widely used;
• Each process is assigned a time interval, called its quantum;
• How long should the quantum be?
– too short — too many process switches, lower CPU efficiency;
– too long — poor response to short interactive requests;
– usually around 20 ∼ 50ms.
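The quantum-expiry rotation of Fig. 2-41 can be sketched with a deque. The job names match the figure; the burst times and quantum are illustrative values of mine:

```python
from collections import deque

def round_robin(jobs, quantum):
    """jobs: dict name -> remaining burst time. Returns completion order."""
    run_list = deque(jobs.items())
    order = []
    while run_list:
        name, remaining = run_list.popleft()   # current process gets the CPU
        if remaining <= quantum:
            order.append(name)                 # finishes within its quantum
        else:
            # quantum expired: back to the end of the list, like B in Fig. 2-41
            run_list.append((name, remaining - quantum))
    return order

print(round_robin({"B": 30, "F": 10, "D": 60, "G": 20, "A": 40}, 20))
# ['F', 'G', 'B', 'A', 'D']: short jobs finish early, long ones keep cycling
```

A shorter quantum would reorder completions at the cost of more rotations (process switches), which is exactly the trade-off in the bullets above.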
Scheduling In Interactive Systems
Priority Scheduling
Fig. 2-42. A scheduling algorithm with four priority classes. [Figure omitted: four
queues of runnable processes, from priority 4 (highest) down to priority 1 (lowest).]
• SJF is a priority scheduling;
• Starvation — low priority processes may never execute;
– Aging — as time progresses increase the priority of the process;
$ man nice
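Aging can be demonstrated with a toy pick-and-age loop. This is a sketch of mine, not any real scheduler: the -2 penalty for running and the +1 boost for waiting are arbitrary illustrative constants, chosen so that a low-priority process cannot starve:

```python
def pick_and_age(processes):
    """processes: list of [name, priority]; larger number = higher priority.
    Picks the highest-priority process, penalizes it for running,
    and ages every waiting process so none can starve forever."""
    processes.sort(key=lambda p: -p[1])
    name = processes[0][0]
    processes[0][1] -= 2          # the running process loses priority
    for p in processes[1:]:
        p[1] += 1                 # aging: waiting processes gain priority
    return name

procs = [["low", 1], ["high", 4], ["mid", 2]]
schedule = [pick_and_age(procs) for _ in range(8)]
print(schedule)                   # even "low" gets the CPU within a few rounds
```

Without the aging step, "low" would never be selected; with it, any process that waits long enough eventually outranks the others.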
4.8 Thread Scheduling
Thread Scheduling
Fig. 2-43. (a) Possible scheduling of user-level threads with a 50-msec process quantum
and threads that run 5 msec per CPU burst — the kernel picks a process, then the runtime
system picks a thread. Possible: A1, A2, A3, A1, A2, A3. Not possible: A1, B1, A2, B2,
A3, B3. (b) Possible scheduling of kernel-level threads with the same characteristics —
the kernel picks a thread directly. Possible: A1, A2, A3, A1, A2, A3. Also possible: A1,
B1, A2, B2, A3, B3.
• With kernel-level threads, sometimes a full context switch is required
• Each process can have its own application-specific thread scheduler, which usually
works better than the kernel can
4.9 Linux Scheduling
• [17, Sec. 5.6.3, Example: Linux Scheduling].
• [17, Sec. 15.5, Scheduling].
• [2, Chap. 7, Process Scheduling].
• [11, Chap. 4, Process Scheduling].
Call graph:
cpu_idle()
schedule()
context_switch()
switch_to()
Process Scheduling In Linux
A preemptive, priority-based algorithm with two separate priority ranges:
1. real-time range (0 ∼ 99), for tasks where absolute priorities are more important than
fairness
2. nice value range (100 ∼ 139), for fair preemptive scheduling among multiple processes
In Linux, Process Priority is Dynamic
The scheduler keeps track of what processes are doing and adjusts their priorities
periodically
• Processes that have been denied the use of a CPU for a long time interval are boosted
by dynamically increasing their priority (usually I/O-bound)
• Processes running for a long time are penalized by decreasing their priority (usually
CPU-bound)
• Priority adjustments are performed only on user tasks, not on real-time tasks
Tasks are determined to be I/O-bound or CPU-bound based on an interactivity heuristic:
a task's interactiveness metric is calculated based on how much time the task executes
compared to how much time it sleeps
Problems With The Pre-2.6 Scheduler
• an algorithm with O(n) complexity
• a single runqueue for all processors
– good for load balancing
– bad for CPU caches, when a task is rescheduled from one CPU to another
• a single runqueue lock — only one CPU working at a time
The scheduling algorithm used in earlier versions of Linux was quite simple and
straightforward: at every process switch the kernel scanned the list of runnable
processes, computed their priorities, and selected the "best" process to run. The main
drawback of that algorithm is that the time spent in choosing the best process depends
on the number of runnable processes; therefore, the algorithm is too costly, that is,
it spends too much time in high-end systems running thousands of processes [2, Sec. 7.2,
The Scheduling Algorithm].
Scheduling In Linux 2.6 Kernel
• O(1) — Time for finding a task to execute depends not on the number of active tasks
but instead on the number of priorities
• Each CPU has its own runqueue, and schedules itself independently; better cache
efficiency
• The job of the scheduler is simple — Choose the task on the highest priority list to
execute
How to know there are processes waiting in a priority list?
A priority bitmap (5 32-bit words for 140 priorities) is used to define when tasks are on a
given priority list.
• find-first-bit-set instruction is used to find the highest priority bit.
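The bitmap lookup can be sketched as follows; Python's bit arithmetic stands in for the hardware find-first-bit-set instruction, and bit 0 is taken as the highest priority, matching Linux's convention that a lower number means higher priority (the task names are illustrative):

```python
# 140 priorities; bit i is set when priority list i is non-empty (0 = highest)
NR_PRIO = 140
bitmap = 0
queues = [[] for _ in range(NR_PRIO)]

def enqueue(prio, task):
    global bitmap
    queues[prio].append(task)
    bitmap |= 1 << prio              # mark this priority list as non-empty

def pick_next():
    """Find the first set bit = highest-priority non-empty list, in O(1):
    the cost depends on the number of priorities, not on the number of tasks."""
    global bitmap
    if bitmap == 0:
        return None
    prio = (bitmap & -bitmap).bit_length() - 1   # index of the lowest set bit
    task = queues[prio].pop(0)
    if not queues[prio]:
        bitmap &= ~(1 << prio)       # list drained: clear its bit
    return task

enqueue(120, "batch job")
enqueue(5, "rt task")
enqueue(100, "shell")
print(pick_next())   # "rt task": priority 5 beats 100 and 120
```

However many tasks sit in the lists, choosing the next one is a single bit scan plus a list pop — the essence of the O(1) design.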
Scheduling In Linux 2.6 Kernel
Each runqueue has two priority arrays — active and expired; when the active array runs
empty, the two are swapped
4.9.1 Completely Fair Scheduling
Completely Fair Scheduling (CFS)
Linux’s Process Scheduler
up to 2.4: simple, scaled poorly
• O(n)
• non-preemptive
• single run queue (cache? SMP?)
from 2.5 on: O(1) scheduler
• 140 priority lists — scaled well
• one run queue per CPU — true SMP support
• preemptive
• ideal for large server workloads
• showed latency on desktop systems
from 2.6.23 on: Completely Fair Scheduler (CFS)
• improved interactive performance
up to 2.4: at every process switch the kernel scanned the list of runnable processes,
computed their priorities, and selected the "best" process to run — a cost that grows
with the number of runnable processes [2, Sec. 7.2, The Scheduling Algorithm].
No true SMP all processes share the same run-queue
Cold cache if a process is re-scheduled to another CPU
Completely Fair Scheduler (CFS)
For a perfect (unreal) multitasking CPU
• n runnable processes can run at the same time
• each process should receive 1/n of the CPU power
For a real world CPU
• it can run only a single task at once — unfair: while one task is running, the others
have to wait
• p->wait_runtime is the amount of time the task should now run on the CPU for things
to become completely fair and balanced
– on the ideal CPU, the p->wait_runtime value would always be zero
• CFS always tries to run the task with the largest p->wait_runtime value
See also: Discussing the Completely Fair Scheduler6
CFS
In practice it works like this:
• While a task is using the CPU, its wait_runtime decreases:
wait_runtime = wait_runtime - time_running
– as soon as its wait_runtime is no longer the largest among all runnable tasks, it
gets preempted
• Newly woken tasks (wait_runtime = 0) are put into the tree more and more to the right
• slowly but surely giving a chance for every task to become the "leftmost task" and
thus get on the CPU within a deterministic amount of time
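The wait_runtime bookkeeping can be sketched as a toy simulation. This is a simplification of mine, not the real CFS (which orders tasks by vruntime in a red-black tree); the function name `cfs_pick_order` and the even redistribution rule are assumptions for the demo:

```python
def cfs_pick_order(tasks, slice_ms, steps):
    """tasks: dict name -> wait_runtime (ms). Returns the order tasks run."""
    order = []
    for _ in range(steps):
        name = max(tasks, key=tasks.get)     # run the most "owed" task
        order.append(name)
        tasks[name] -= slice_ms              # running decreases wait_runtime
        for other in tasks:
            if other != name:                # waiting tasks accrue entitlement
                tasks[other] += slice_ms / (len(tasks) - 1)
    return order

print(cfs_pick_order({"A": 30, "B": 20, "C": 10}, 10, 6))
# ['A', 'B', 'A', 'C', 'B', 'A']: the most-owed task always runs next
```

Even starting from very unequal wait_runtime values, every task keeps drifting toward the front of the line while it waits, so each one gets the CPU within a bounded number of slices — the "completely fair" property in miniature.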
6 http://kerneltrap.org/node/8208