(http://www.engin.umd.umich.edu/CIS/course.des/cis450/mcfadyen/chapter4.htm)

Chapter 4

PROCESSES

 

Early computer systems allowed only one program to be executed at a time.  This program had complete control of the system, and had access to all of the system’s resources.  Current-day computer systems allow multiple programs to be loaded into memory and to be executed concurrently.  This evolution requires firmer control and more compartmentalization of various programs.

 

 

 

PROCESS CONCEPT

 

·        Process is a program in execution.

·        A batch system executes jobs

·        A time-shared system has user programs, or task

·        On a single-user system one interactive and several batch programs can be ran at one time.  (note jobs and processes are used interchangeably)

 

 

THE PROCESS

 

·        A process is sometimes known as the text section.

·        A program counter and the contents of the processor’s registers represent the current activity.

·        A stack is included which holds the temporary data (such as temp variables, return addresses, subroutine parameter)

·        A data section is included containing global variables

·        A program is a passive entity such as the contents of a file stored on disk

·        A process is an active entity with a program counter specifying the next instruction to execute and a set of associated resources. (diagram of process state on page 90)

 

PROCESS STATE 

 

·        As processes executes it changes states

·        Processes may be in one of the following states 

1.      New: The process is being created

2.      Running: Instructions are being executed

3.      Waiting:  The process is waiting for some event to occur (such as an I/O completion or reception of a signal)

4.      Ready:  The process is waiting to be assigned to a processor.

5.      Terminated:  The process has finished execution.

 

These names are arbitrary and vary between operating systems.  The states that they represent are found on all systems, however.  Many processes may be ready and waiting.

PROCESS CONTROL BLOCK

 

·        A PCB represents each process in the operating system.

·        A process control block (PCB) is also called a task control block. (diagram of PCB on page 91)

·        The components of the PCB are as followed:

1.      Process state: The state may be new, ready, running, waiting, halted, and so on.

2.      Program counter: The counter indicates the address of the next instruction to be executed for this process.

3.      CPU: The registers vary in number and type, depending on the computer architecture. They include accumulators, index registers, stack pointers, and general-purpose registers plus any condition-code information.  Along with the program counter, this state information must be saved when an interrupt occurs, to allow the process to be continued correctly afterward (figure 4.3 pg. 92 is the CPU switch from process to process)

4.      Memory-management information: This information may include such information as the value of the base and limit registers, the page table, or the segment tables depending on the memory system used by the operating system.

5.      Accounting information: This information includes the amount of CPU and real time used, time limits, account numbers, job or process numbers, and so on.

6.      I/O status information: The information includes the list of I/O devices (such as tape drives) allocated to this process, a list of open files, and so on.

 

·        The PCB serves as a repository for any information that may vary from process to process.

 

 

PROCESS SCHEDULING

 

·        The objective of multiprogramming is to have some process running all the time, and

      To maximize CPU utilization.

·        The objective of time-sharing is to switch the CPU among processes so frequently that users can interact with each program while it is running.

 

 

SCHEDULING QUEUES

 

·        Processes enter the system and are put into a job queue.

·        The processes that are residing in main memory and are ready and waiting to execute are kept on a list called the ready queue.

·        The queue is generally stored as a linked list

·        The list of processes waiting for a particular I/o device is called a device queue

·        A new process is initially put in the ready queue.  It waits in the ready queue until it is selected for execution or dispatch and is given to the CPU. (ready-queue diagram on pg. 94)

·        After the process is allocated to the CPU one of the following events can happen

1.      The process could issue an I/O request, and then be placed in an I/O queue

2.      The process could create a new sub-process and wait for its termination

3.      The process could be removed forcibly from the CPU, as a result of an interrupt, and be put back in the ready queue

 

SCHEDULERS

 

·        A process migrates between the various scheduling queues throughout its lifetime.

·        The operating system must select, for scheduling purposes, processes from these queues in some fashion.

·        The appropriate scheduler carries out the selection process.

·        The long term scheduler (or job scheduler) select processes from this pool and loads them into memory for execution.

·        The long-term scheduler controls the degree of multi-programming.

·        The short- term scheduler (or CPU scheduler) selects from among the processes that are ready to execute, and allocates the CPU to one of them. (a queuing-diagram pg. 95)

·        An I/o bound process is one that spends more of its time doing I/o then it spends doing computations

·        A CPU bound process is one that generates I/O request infrequently, using more of its time doing computation then an I/O bound process uses.

·        It is very important that a long -term scheduler select a good process mix of I/O bound and CPU bound processes.

·        The medium term scheduler was introduced by such operating systems as time-share. (medium term scheduler diagram pg. 96)

·        Swapping is when a process is reintroduced into memory and its execution can be continued where it left off.

·        Swapping may be necessary to improve the process mix because a change in memory requirements has over committed available memory

 

CONTEXT SWITCH

 

·        Context switch is switching the CPU to another process requires saving the state of the old process and loading the saved state for the new process.

·        Context switch time is pure overhead, because the system does no useful work while switching.

·        Context-switch times are highly dependent on hardware support.

·        A context switch simply include changing the pointer to the current register set.

 

 

 

OPERATION ON PROCESS

 

·        The processes in the system can be execute concurrently, and must be created and deleted dynamically. Thus, the operating system must provide a mechanism for process creation and termination

 

 

PROCESS CREATION

 

·        A process may create several new processes, via a create-process system call, during the course of execution. 

·        The creating process called parent process

·        The new processes are called the children processes

·        Each new process may in turn create other processes, forming a tree of processes (diagram pg. 98)

·        When a process creates a new process, two possibilities exist in terms of execution

1.      The parent continues to execute concurrently with its children

2.      The parent waits until some or all of its children have terminated

·        There are also two possibilities in terms of the address space of the new processes

1.      The child process is a duplicate of the parent process

2.      The child processes has a program loaded into it

·        Process identifier is a unique integer

·        A new process is created by the fork system call

·        Both processes (the parent and the child) continues execution at the instruction after the fork with one difference

·        The return code for the fork is zero for the new (child) process, whereas the (nonzero) process identifier of the child is returned to the parent.

·        The execve system called is used after a fork by one of the two processes to replace the process’ memory space with a new program.

·        The execve system call loads a binary file into memory (destroying the memory image of the program containing the execve system call and starts its execution

·        The parent can then create more children, or, if it has nothing else to do while the child runs, it can issue a wait system call to move itself off the ready queue until the termination of the child

 

 

 

 

 

 

 

 

 

 

 

 

CHAPTER 4 NOTES

PART 2

 

 

PROCESS TERMINATION

 

·        A process terminates when it finishes executing its last statement and asks the operating system to delete it by using the exit system call.

·        The process may return data (output) to its parent process (via the wait system call)

·        A process can cause the termination of another process via an appropriate system call (for example, abort)

·        A parent may terminate the execution of one of its children for a variety of reasons such as

1.      The child has exceeded its usage of some of the resources it has been allocated

2.      The task assigned to the child is no longer required.

3.      The parent is exiting, and the operating system does not allow a child to continue if its parent terminates.

·        Many systems including VMS do not allow a child to exist if its parent has terminated. In such systems, if a process terminates (either normally or abnormally), then all its children also are terminated, known as cascading termination.

·        The wait system call returns the process identifier of a terminated child, so that the parent can tell which of the possibly many children has terminated.

·        If the parent terminates, however, all the children are terminated by the operating system

 

 

COOPERATING PROCESSES

 

·        The concurrent processes executing in the operating system may be either independent processes or cooperating processes.

·        A process id independent if it cannot affect or be affected by the other processes executing in the system

·        A process is cooperating if it can affect or be affected by the other processes executing in the system

·        There are several reasons for providing an environment that allows process cooperation

1.      Information sharing:  Since several users may be interested in the same piece of information (for instance, a shared file) we must provide an environment to allow concurrent access to these types of resources

2.      Computation speedup: If we want a particular task to run faster, we must break it into subtasks, each of which will be executing in parallel with the others.  Notice that such a speedup can be achieved only if the computer has multiple processing elements (such as CPU’s or I/O channels)

3.      Modularity:  We may want to construct the system in a modular fashion, dividing the system function into processes.

4.      Convenience:  Even an individual user may have many tasks to work on at one time.  For instance, a user may be editing, printing, and compiling in parallel.

·        A producer process produces information that is consumed by a consumer process.

·        To allow producer and consumer processes to run concurrently we must have available a buffer of items that can be filled by the producer and emptied by the consumer

·        The unbound-buffer producer-consumer problem places no practical limit on the size of the buffer.

·        The bound-buffer producer-consumer problem assumes that there is a fixed buffer size

 

 

THREADS

 

·        A thread, sometimes called a lightweight process (LWP), is a basic unit of CPU utilization, and consists of a program counter, register set, and a stack space.

·        It shares with peer threads its code section, data section, and operating-system resources such as open files and signals, collectively known as a task.

·        A traditional or heavyweight process is equal to a task with one thread.

·        Some systems implement user-level threads in user-level libraries rather than systems calls.

·        Threads can be in one of several states: ready, blocked, running, or terminated

·        Threads provided a mechanism that allowed sequential processes to make blocking system calls while also achieving parallelism. (diagram pg. 104)

 

 

THREADS EXAMPLE IN SLOARIS 2

 

·        By request a thread can also be pinned to a processor

·        Solaris 2 supports user-level threads.  (diagram of example pg. 106-107)

·        The example was concluded by examining the resource need of each of the thread types

1.      A kernel thread has only a small data structure and a stack.  Switching between kernel threads doses not require changing memory access information, and therefore is relatively fast.

2.      An LWP contains a process control block with register data accounting information, and memory information.  Switching between LWP’s therefore requires quite a bit of work and is relatively slow.

3.      A user-level thread needs only a stack and a program counter: no kernel resources are required.  The kernel is not involved in scheduling these user-level threads; therefore, switching among them is fast.  There may be LWP’s in the process that supports these user-level threads.

 

 

 

INTERPROCESS COMMUNICATION

 

·        INTERPROCEES COMMUNICATION (IPC) provides a mechanism to allow processes to communicate and to synchronize their actions.

·        IPC is best provided by a message system

 

 

BASIC STRUCTURE

 

·        An IPC facility provides at least two operations send (message) and receive (message)

·        If processes P and Q want to communicate, they must send messages to and receive messages from each other; a communication link must exist between them.

·        Questions asked regarding the logic behind link implementation are

1.      How are links established

2.      Can a link be associated with more than two processes

3.      How many links can there be between every pair of processes

4.      What is the capacity of a link?  That is, does the link have some buffer space?  If it does, how much

5.      Is a link unidirectional or bi-directional?  That is, if a link exists between P and Q, can messages flow in only one direction (such as only from P to Q) or in both directions?

·        A link is unidirectional only if each process connected to the link can either send or receive, but not both, and each link has at least one receiver process connected to it

·        There are also methods for logically implementing a link and send/receive operation such as

1.      Direct or indirect communication

2.      Symmetric or asymmetric communication

3.      Automatic or explicit buffering

4.      Send by copy or send by reference

5.      Fixed sized or variable-sized messages

 

 

NAMING

 

·        Processes that want to communicate must have a way to refer to each other

·        They can use either direct or indirect communication

 

DIRECT COMMUNICATION

 

·        In the direct communication discipline each process that wants to communicate must explicitly name the recipient or sender of the communication

·        The send and receive primitives are defined as followed (ASYMMECTRIC)

send(P,message). Send a message to process P

receive(Q,message).  Receive a message from process Q

 

·        A communication link in this scheme has the following properties

1.      A link is established automatically between every pair of processes that want to communicate.  The process need to know only each other’s identity to communicate

2.      A link is associated with exactly two processes

3.      Between each pair of processes, there exist exactly one link

4.      The link may be unidirectional, but is usually bi-directional

·        The producer process is defined

repeat

   

     produce an item in nextp

   

     send(consumer,nextp);

until false

·        The consumer process is defined

repeat

    receive(producer,nextc);

    

    consumer the item in nextc

until false

·        Another scheme of direct communication is defined below (SYMMECTRIC)

send(P,message). Send a message to process P

receive(id, message).  Receive a message from any process; the variable id is set to the name of the process with which communication has taken place

·        Disadvantages of the direct communication

1.      Limited modularity of the resulting process definitions

2.      Changing the name of a process may necessitate examining all other process definitions

3.      All references to the old name must be found, so that they can be modified to new name

 

INDIRECT COMMUNICATION

 

·        With indirect communication, the messages are sent to and received from mailboxes (also referred to as ports)

·        A mailbox can be viewed abstractly as an object into which messages can be removed.

·        Each mailbox has a unique identification

·        Two processes can communicate only if the processes have a shared mailbox

·        The send and receive primitives are defined as followed

send(A,message). Send a message to mailbox A

receive(A,message). Receive a message from mailbox A

 

 

 

 

·        A communication link has the following properties

1.      A link is established between a pair of processes only if they have a shared mailbox

2.      A link may be associated with more than two processes

3.      Between each pair of communicating processes, there may be a number of different links, each link corresponding to one mailbox

4.      A link may either be unidirectional or bi-directional

·        Now suppose that processes p1,p2, and p3 all share mailbox A.  P1 sends a message to A, while p2 and p3 each execute a receive from A.  Which process will receive the message sent by p1.  To answer this question one of the following can be done

1.      Allow a link to be associated with at most two processes

2.      Allow a most one processes at a time to execute a receive operation

3.      Allow the system to select arbitrarily which process will receive the message(that is, either p2 or p3 but not both, will receive  the message)  The system may identify the receiver to the sender.

·        A mailbox may be owned either by a process or by the system

·        If the mailbox is owned by a process (that is the mailbox is attached to or identified as part of the process) then we distinguish between the owner (who can only receive messages through this mailbox) and the user of the mailbox (who can only send messages to the mailbox)

·        If a mailbox is owned by the operating system then it has an existence of its own.  The mailbox is independent, and is not attached to any particular process.

·        The operating system provides a mechanism that allows a process

1.      To create a new mailbox

2.      To send and receive messages through the mailbox

3.      To destroy a mailbox

·        Garbage collection is when a separate operation occurs to search for and deallocate memory that is no longer in use. (will be discussed in section 10.3.5)

 

 

BUFFERING

 

·        A link has some capacity that determines the number of messages that can reside in it temporarily.

·        This property can be viewed as a queue of  messages attached to a link.

·        There are three ways that such a queue can be implemented

1.      Zero capacity:  The queue has a maximum length 0; thus the link cannot have any messages waiting in it.  In this case, the sender must wait until the recipient receives the message transfer to take place.  This is called a rendezvous

2.      Bounded capacity:  The queue has finite length n; thus at most n messages can reside in it.  If the queue is not full when a new message is sent the latter is placed in the queue (either the message is copied or a pointer to the message is kept) and the sender can continue execution without waiting.  The link has a finite capacity however.  If the link is full the sender must be delayed until space is available in the queue

3.      Unbounded capacity:  The queue has potentially infinite length thus any number of messages can wait in it. The sender is never delayed

 

 

PROCESS TERMINATES

 

·        Either a sender or a receiver may terminate before a message is processed.

1.      A receiver process P may wait for a message from a process Q that has terminated. If no action is taken P will be blocked forever. In this case the system may either terminate P or notify O that Q has terminated

2.      Process P may send a message to a process Q that has terminated. In the automatic-buffering scheme no harm is done P simply continues with its execution.  If P needs to know that Q has processed its message it must program explicitly for an acknowledgement. In the no-buffering case P will be blocked forever.  As in case 1 the system may either terminate P or notify P that Q has terminated.

 

LOST MESSAGES

 

·        A message from process P to process Q may become lost somewhere in the communications network, due to a hardware or communication-line failure

·        There are 3 ways to handle such an issue

1.      The operating system is responsible for detecting this event and for resending the messages

2.      The sending process is responsible for detecting this event and for retransmitting the message if it wants to do so

3.      The operating system is responsible for detecting this event it then notifies the sending process that the message has been lost. The sending process can proceed as it chooses

·        Timeouts can handle message lost as well

 

SCRAMBLED MESSAGES

 

·        Noise in the communication channel

·        Error checking codes

 

 

AN EXAMPLE:  MACH

 

·        As an example of a message-based operating system, consider the MACH operating system developed at Carnegie Mellon University.  (pg. 116-117)

·        If a mailbox is full, the sending thread has four operation

1.      Wait indefinitely until there is room in the mailbox

2.      Wait at most n milliseconds

3.      Do not wait at all, but return immediately

4.      Temporarily cache a message. One message can given to the operating system to keep even though the mailbox to which it is being sent is full.  When the message can actually be put  in the mailbox  a message is sent back to the sender only one such message to a full mailbox can be pending at any time for a given sending thread.

·        A mailbox set is a collection of mailboxes as declared by the task, which can be grouped together and treated as one mailbox for the purpose of the task.

·        A port_status system call returns the number of messages in a given mailbox

 

 

AN EXAMPLE: WINDOWS NT

 

·        The Windows NT operating system is an example of modern design that employs modularity to increase functionality and decrease the time needed to implement new features.  (pg. 118-119)

·        NT provides support for multiple operating environments or subsystems which application programs communicate via a message passing mechanism

·        The application programs can be considered to be clients of the NT subsystem server.

·        The communication works as followed

1.      Client opens a handle to the subsystem’s connection port object

2.      Client sends a connection request

3.      Server creates two private communication ports and returns the handle to one of them to the client

4.      The client and sever use the corresponding port handle to send messages or call back and listen for replies

·        Quick LPC is another method of message passing