
(3) conduct benefits analysis, and

(4) test performance evaluation hypothesis.

III. OPERATING SYSTEMS FRAMEWORK

In order to conduct an effective performance evaluation, it is necessary to understand the operating system. Although the precise implementation of each operating system differs in many respects, operating systems have many facets in common. The identification and classification of these common functions into a framework for understanding operating systems is an important step in computer performance evaluation (CPE).

This framework for operating systems is not intended to be all-inclusive, since operating systems can be examined and classified from other meaningful points of view. Examining and classifying from the user's point of view is an example of another framework. The framework presented here isolates those portions that are important in CPE and allows various systems to be compared by means of this common classification.

Large-scale third-generation computers typically have more resources than any single program is likely to need. The rationale for an operating system is to manage the utilization of these resources, allowing many programs to execute at one time. This resource-manager concept of the operating system is the product of J. Donovan and S. Madnick of MIT.

In a computing system with finite resources and a demand for resources that periodically exceeds capacity, the resource manager (operating system) makes many policy decisions. Policy decisions are strategies for selecting a course of action from a number of alternatives. There is general agreement that as many relevant factors as possible should be included in policy decisions. For example, the core allocator algorithm might consider such factors as the amount requested, the amount available, the job priority, the estimated run time, other outstanding requests, and the availability of other requested peripherals. Disagreement arises as to how factors should be weighted and the strategies that are most appropriate for the installation workload.
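As a concrete illustration, the sketch below (in present-day Python, added here for exposition) scores a pending core request against the factors just named. The factor weights, field names, and figures are invented assumptions, not the policy of any particular operating system.

    from dataclasses import dataclass, field

    @dataclass
    class Request:
        job_id: str
        amount_k: int              # core requested, in K words
        priority: int              # larger = more urgent
        est_run_min: float         # operator-supplied run-time estimate
        devices: set = field(default_factory=set)

    def allocation_score(req, core_free_k, devices_free, queue_len):
        """Weigh the policy factors named above; higher score = allocate first.
        Returns None when the request cannot be satisfied right now."""
        if req.amount_k > core_free_k:        # amount requested vs. available
            return None
        score = 10.0 * req.priority           # job priority
        score -= 0.5 * req.est_run_min        # prefer short jobs
        score -= 1.0 * queue_len              # other outstanding requests
        if not req.devices <= devices_free:   # requested peripherals unavailable:
            score -= 100.0                    # core would sit idle waiting on them
        return score

    # Example: choose among two waiting requests for 64K of free core.
    waiting = [Request("A", 60, 2, 5.0, {"tape1"}), Request("B", 40, 1, 1.0)]
    for r in waiting:
        print(r.job_id, allocation_score(r, 64, {"tape1"}, len(waiting)))

How such a score should be weighted is precisely the point on which, as noted above, installations disagree.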

The component of the operating system that decides which jobs will be allowed to compete for the CPU is the job scheduler. The scheduling policy might be first-in, first-out (FIFO) or estimated run time with the smallest jobs scheduled first. The FIFO strategy is one of the simplest to design and implement, but it has many disadvantages. The order of arrival is not necessarily the order in which jobs should be selected for execution, nor can it be assured that a FIFO scheduling policy will provide a balanced job mix and adequate turnaround for priority jobs. Although general purpose algorithms can provide an acceptable level of service for a wide spectrum of situations, tuning to the needs of the particular data processing installation (DPI) can realize significant gains in efficiency and ease of use.
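The difference between the two policies can be made concrete. The following sketch, with an invented job list and run-time estimates for illustration only, computes mean turnaround under FIFO and under smallest-estimated-run-time-first:

    # Hypothetical job queue: (job_id, estimated run minutes), in arrival order.
    jobs = [("A", 30.0), ("B", 2.0), ("C", 12.0), ("D", 1.0)]

    fifo_order = list(jobs)                          # serve in order of arrival
    sjf_order = sorted(jobs, key=lambda j: j[1])     # smallest jobs first

    def mean_turnaround(order):
        """Mean completion time if jobs run back to back from time zero."""
        clock, total = 0.0, 0.0
        for _, run_min in order:
            clock += run_min                         # this job finishes here
            total += clock
        return total / len(order)

    print("FIFO:", mean_turnaround(fifo_order), "min")   # 37.75
    print("SJF: ", mean_turnaround(sjf_order), "min")    # 16.0

Smallest-first improves mean turnaround here, but it exhibits exactly the weaknesses noted above for simple policies: it ignores priority and job mix, and a long job can wait indefinitely.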

Once a job has been allocated and is eligible for execution, the process scheduler (system dispatcher) decides when jobs gain control of the CPU and how long they maintain control. The process scheduler queue may be maintained in order by priority, by the ratio of I/O time to CPU time, or in a number of other ways. The job scheduler decides which jobs will be eligible to compete for CPU time, while the process scheduler decides which jobs will receive CPU time. A number of other decision points and important queues are found in operating systems. These include I/O supervisors, output queues, interrupt processing supervisors, and data management supervisors.
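A minimal sketch of the two queue orderings just mentioned appears below; the field names and figures are assumptions for illustration only.

    import heapq

    # Hypothetical ready list: (process, priority, cpu_ms used, io_ms used).
    ready = [("P1", 3, 400, 900), ("P2", 5, 800, 100), ("P3", 1, 50, 2000)]

    # Policy 1: dispatch strictly by priority (largest first).
    by_priority = [(-prio, name) for name, prio, _, _ in ready]
    heapq.heapify(by_priority)

    # Policy 2: dispatch by I/O-to-CPU ratio, favoring I/O-bound processes
    # so the peripherals are kept busy while compute-bound work waits.
    by_ratio = [(-(io / max(cpu, 1)), name) for name, _, cpu, io in ready]
    heapq.heapify(by_ratio)

    print("next by priority: ", heapq.heappop(by_priority)[1])  # P2
    print("next by I/O ratio:", heapq.heappop(by_ratio)[1])     # P3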

Another important facet of operating systems is the non-functional software, which carries out the policy decisions and comprises the bulk of the operating system software. The gains to be effected in the areas of policy making are primarily the result of tuning to the particular DPI, while the gains from the non-functional software are primarily the result of improvements in efficiency. For example, the routine that physically allocates core has a number of minor housekeeping decisions to make, and any improvements to be realized in this area will come from an improved chaining procedure, a faster way of searching tables, or a similar gain in efficiency.
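One classic efficiency gain of this kind is replacing a linear table search with a keyed lookup. The sketch below, using an invented table, times the two approaches; it illustrates the category of improvement, not any actual system routine.

    import timeit

    entries = [("job%04d" % i, i) for i in range(2000)]
    table_list = list(entries)      # searched front to back
    table_dict = dict(entries)      # searched by key (hashed)

    def linear_lookup(key):
        for k, v in table_list:
            if k == key:
                return v
        return None

    key = "job1999"                 # worst case for the linear scan
    t_lin = timeit.timeit(lambda: linear_lookup(key), number=1000)
    t_hash = timeit.timeit(lambda: table_dict.get(key), number=1000)
    print("linear %.4fs  hashed %.4fs" % (t_lin, t_hash))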

Once this framework for computer systems has been adopted, systems are no longer viewed as a collection of disparate components, but as a collection of resources. The operating system is the manager of these resources and attempts to allocate them in an efficient manner through policy decisions and non-functional software to carry out these decisions. System bottlenecks are not mysterious problems, but are the result of excess demand for a limited resource. By stating performance problems in terms of the relevant resources instead of in terms of the effects of the problem, the areas in need of investigation are clearly delineated.

IV. SURVEY THE ENVIRONMENT

Understanding and classifying the subject operating system is one facet of understanding the total system. Computers do not operate in a vacuum. Their performance is influenced by the attitudes and abilities of the operations personnel, programmers, and managers. Although improving the performance of a system by streamlining the administrative procedures is not as dramatic as discovering a previously unknown inefficiency in the operating system, the net gain can be just as worthwhile.

Computer Operations

Parts of the following survey are based on the Rand document, "Computer Performance Analysis: Framework and Initial Phases for a Performance Improvement Effort," by T. E. Bell.


There are several measures historically used to gauge the effectiveness and productivity of computer installations. These indicators of performance include the number of jobs processed per day, the average CPU utilization, the average turnaround time, and the system availability. Although these measures apply in varying degrees to other areas, they are most appropriate for computer operations, which affects them most directly. Since examples can be found of systems delivering poor overall performance that nonetheless show high CPU utilization or process many jobs a day, these indicators should be considered in the context of the total installation.
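These measures are ordinarily derived from accounting data. The sketch below computes all four from a hypothetical day's accounting log; the record layout and figures are invented for illustration.

    DAY_HOURS = 24.0
    DOWNTIME_HOURS = 1.5

    # (job_id, submit hr, start hr, end hr, cpu hr) -- assumed log layout.
    log = [("J1", 0.0, 0.5, 2.0, 1.0),
           ("J2", 1.0, 2.0, 3.0, 0.4),
           ("J3", 2.5, 3.0, 6.0, 2.1)]

    jobs_per_day = len(log)
    cpu_utilization = sum(r[4] for r in log) / DAY_HOURS
    avg_turnaround = sum(r[3] - r[1] for r in log) / len(log)   # end - submit
    availability = (DAY_HOURS - DOWNTIME_HOURS) / DAY_HOURS

    print("jobs/day=%d  cpu=%.0f%%  turnaround=%.1fh  avail=%.0f%%"
          % (jobs_per_day, 100 * cpu_utilization,
             avg_turnaround, 100 * availability))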


Procedures are established to manage and monitor the computer configuration. This might include down time and error rate per device, utilization per device, and projections of system changes based on workload.


DPI, the training and education of the data processing staff, and the level of documentation for computer systems. Initial analysis performed on data gathered by monitoring tools will not indicate that administrative problems are limiting performance, but further analysis, supplemented by the experience of the analyst, will usually narrow the potential list of causes to a few; through the process of elimination, administrative problems can then be isolated.

In a system of any kind, and especially in computer systems, each of the components should work harmoniously with the others. To some degree the operator can compensate for inadequate documentation, and the job scheduler can provide a reasonable job mix even when not tuned to the DPI workload, but beyond that point the effectiveness and performance of the total system will be degraded. When the operators spend a disproportionate amount of time trying to decipher documentation, other areas will be neglected; and when the job scheduler makes assumptions about the DPI workload that are incorrect, the total throughput will be reduced.

Software

Software problems are of two kinds: system software problems, as in operating systems or access methods, and application software problems, those associated with a particular set of user-developed programs. System problems relating to performance are potentially more critical since they can affect all the programs run on the system, while application program problems usually do not have a system-wide impact. If, however, the application program is a real-time application that runs 24 hours a day, then its performance assumes a high degree of importance.


While the tuning of system software has a system-wide benefit, the tuning of application programs benefits only that particular application. At first it seems that the investigation of performance problems for application programs would be an area of low payoff, since tuning the entire system has an inherently greater impact than tuning a subset of the system. In fact, the amount of payoff depends on the size of the application program and the amount of resources it uses; tuning a system dedicated to a single application is virtually equivalent to tuning the system software. The characteristics of the DPI workload will dictate the selection of application programs for tuning and performance evaluation. Care should be taken not to lose sight of the areas of high payoff while concentrating on areas of intrinsically limited payoff. An example is a system that was tuned in regard to record and block sizes and file placements on disk, and still delivered poor performance. What was overlooked was the fact that the system contained 12 sorts. Reducing the number of sorts would have equaled all the other performance gains so painstakingly achieved.
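The arithmetic behind this example is worth making explicit. The sketch below uses invented run-time figures to compare the payoff of the painstaking tuning with the payoff of consolidating the sorts:

    RUNS_PER_DAY = 4
    BLOCKING_SAVINGS_MIN = 6.0    # assumed gain per run from block-size/placement tuning
    SORT_TIME_MIN = 2.5           # assumed cost of one sort pass
    SORTS_REMOVED = 5             # suppose consolidation removes 5 of the 12 sorts

    tuning_gain = RUNS_PER_DAY * BLOCKING_SAVINGS_MIN          # 24 min/day
    sort_gain = RUNS_PER_DAY * SORTS_REMOVED * SORT_TIME_MIN   # 50 min/day
    print("tuning: %.0f min/day, sort consolidation: %.0f min/day"
          % (tuning_gain, sort_gain))

With these (invented) figures the overlooked sorts dominate; the point is that the high-payoff item can often be found by simple arithmetic before any tuning begins.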

The following list itemizes tuning techniques for application programs:

(a) Tuning programs to the special features and optimization capabilities of language processors.

VII. BENEFITS ANALYSIS

This section provides guidelines to be used in determining the benefits to be realized from a performance improvement. In almost every case a performance improvement will come at some cost, whether it is the time and resources necessary to implement it, the degradation that will result in some other area of system performance or the change in mode of operation. Certain changes will result in such a dramatic gain in system performance that a detailed benefits analysis is not necessary to determine their value and other changes will produce such a marginal gain in performance that they are obviously not worth the effort. The difficult decisions are the borderline cases where a number of factors must be weighed before a judgement is reached.

The cost to implement a proposed change can be expressed in time and resources, degradation in some other area of system performance, or changes in operating and administrative procedures. Different types of costs can be associated with CPE changes. The operating system efficiency might be improved by recoding a frequently used system module, in which case the cost would be the one-time expense to replace the module. A change in job log-in procedures might improve the ability to trace and account for jobs, and this would be a recurring cost (since it would be incurred each time a job was submitted). In general, one-time or nonrecurring costs are incurred in the areas of computer system hardware and software, while recurring costs are incurred in administrative and operating procedures.
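The trade-off between a one-time and a recurring cost reduces to simple arithmetic over a chosen planning horizon. The sketch below uses invented dollar figures for illustration:

    ONE_TIME_COST = 4000.00    # e.g., recoding a frequently used system module
    COST_PER_JOB = 0.25        # e.g., added log-in effort per submitted job
    JOBS_PER_MONTH = 3000
    HORIZON_MONTHS = 12

    recurring_total = COST_PER_JOB * JOBS_PER_MONTH * HORIZON_MONTHS
    print("one-time: $%.0f  recurring over %d months: $%.0f"
          % (ONE_TIME_COST, HORIZON_MONTHS, recurring_total))
    # With these figures the modest per-job cost ($9,000) exceeds the
    # one-time expense, so the horizon of the analysis matters as much
    # as the raw figures themselves.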

VIII. TEST HYPOTHESIS

In a structured approach starting with an understanding of the total system, an analysis of the problem types, and a well formulated hypothesis, the amount and type of data needed to test the hypothesis should be limited and relatively easy to obtain. Data to test the hypothesis can be derived from personal observations and accounting data or explicit performance monitoring tools such as hardware and software monitors, simulations, analytical models, or synthetic programs.

The hypothesis is intended to be no more than a tentative working solution to the problem and should be reformulated as necessary as additional data is collected. To test a specific hypothesis, data should be collected and experiments performed in a controlled environment.

Even when the data gathered tends to support the validity of the performance evaluation hypothesis, the net gain in performance may be surprisingly small or even negligible. The reason is that a second constraint on performance becomes the primary limiting factor once the most obvious one is removed. The performance analyst should be aware that there are many limitations on performance, and as one bottleneck is eliminated another will become evident. If, for example, the
