NBS Special Publication


The search for ways to measure the performance of computer systems has led to the development of sophisticated hardware and software monitoring techniques. These tools show the utilization of resources such as the CPU, channels, and devices and, with proper analysis, indicate internal bottlenecks within the computer system. By themselves, however, these utilization values tell a data processing manager very little about his current capacity, how much and what type of additional work can be processed, or whether the configuration can be reduced without suffering throughput degradation. The manager has a further problem in knowing when to apply the monitoring tools, in determining beforehand that the system is operating inefficiently, or in identifying those applications which are causing the problem.

supervisor calls (SVC's) such as obtain core, allocate devices, execute channel program (EXCP) and set timer. Modules to perform these functions either are already resident in memory and are executed or are loaded from the system device and executed. Other portions of the supervisor read the job card deck; schedule, initiate, and terminate the job; and print the output.

The capability to perform online dataset editing and online application program execution is provided via the Time Sharing Option (TSO). TSO is a special type of application program which is given the ability to execute some privileged instructions usually restricted to the supervisor.

The System Management Facility is so named because it is a series of system exit points that allow the user to program decisions peculiar to his system. For example, an exit is provided immediately after a JOB CARD is read. The computing facility may have decided to abort all jobs with invalid account numbers and make a check of the job's account number at this exit point. When an invalid account number is encountered, an appropriate message is written and the job is terminated.
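The account-number check described above can be sketched as a small decision routine. This is only an illustration in Python, not the form of an actual SMF exit: the `JobCard` fields, the `VALID_ACCOUNTS` set, the function name, and the message text are all assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class JobCard:
    """Hypothetical view of the fields read from a job card."""
    job_name: str
    account_number: str


# Assumed list of valid accounts; a real installation would keep its own.
VALID_ACCOUNTS = {"A1001", "B2002"}


def job_card_exit(card, valid_accounts=VALID_ACCOUNTS):
    """Decide, right after the job card is read, whether the job may proceed.

    Returns (accept, message). When accept is False the caller is
    expected to write the message and terminate the job.
    """
    if card.account_number not in valid_accounts:
        message = (
            f"JOB {card.job_name} TERMINATED: "
            f"INVALID ACCOUNT NUMBER {card.account_number}"
        )
        return False, message
    return True, ""
```

The routine returns a decision and leaves the termination itself to the caller, which mirrors the idea of an exit point: the system provides the hook and the installation supplies the policy.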

As the application program is processed, SMF data is collected and stored in memory buffers. Wall clock time data is collected on such items as when the job was read in, when the job step was initiated, when device allocation started, and when the job step was terminated. Resource utilization data is also collected, such as how much memory was allocated, the location and type of data sets or files allocated, and the number of I/O accesses (EXCP's) made to each data set. When the job step terminates, the SMF data collected on it is output to a file. At the completion of all job steps for a job, another type of record is written to the file to describe the job. Similar data is collected, causing different record types to be written at the completion of printing or of a TSO

Accounting data usually includes devices allocated, CPU time, wall clock time of critical events and memory allocated. This type of data can provide information on the work mix and on the resource utilizations of the application programs. The data has further advantages of (1) continual availability and (2) reporting in terms familiar to the computer manager.
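To make the kinds of data listed above concrete, the following Python sketch models a job-step record and rolls several of them up into a per-job summary. It does not reproduce any SMF record layout: the `StepRecord` fields, the `summarize_job` function, and the choice of summary values are assumptions made only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StepRecord:
    """Hypothetical job-step accounting record (not the SMF layout)."""
    job_name: str
    step_name: str
    initiated: float        # wall clock, seconds
    terminated: float       # wall clock, seconds
    cpu_seconds: float
    memory_kb: int
    excp_by_dataset: Dict[str, int] = field(default_factory=dict)


def summarize_job(steps: List[StepRecord]) -> dict:
    """Roll step records for one job up into a simple summary."""
    if not steps:
        raise ValueError("at least one step record is required")
    return {
        "job_name": steps[0].job_name,
        # Elapsed wall clock time from first initiation to last termination.
        "elapsed_seconds": max(s.terminated for s in steps)
        - min(s.initiated for s in steps),
        "cpu_seconds": sum(s.cpu_seconds for s in steps),
        "peak_memory_kb": max(s.memory_kb for s in steps),
        "total_excps": sum(sum(s.excp_by_dataset.values()) for s in steps),
    }
```

A summary of this kind is one way accounting data could describe the work mix and resource use of an application in terms a computer manager already knows.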

BCS EXTENSIONS TO SMF DATA

BCS began using SMF for performance reporting as soon as the data became available with IBM system releases. As weaknesses of data content became apparent, programs were written by BCS for various SMF exits and modifications were made to internal system code to supplement the data. This additional data included the start time of device allocation and the start time of problem program loading (this data is present in the latest system releases). It also included counts of tape mounts due to end of volumes, counts of disk mounts, time in roll-out status, and absolute memory address of region allocation.

Two significant additions to the SMF data were a calculation of the job step single stream run time, called Resource Utilization Time (RUT), and a calculation of the proportionate resources used by the job step, called Computer Resource Units (CRU's).

The RUT and CRU concepts were originated to provide repeatable billing to customers. A job can be processed on any BCS computer system and in any workload environment and be billed the same cost. RUT and CRU soon became important as performance indicators, as discussed in the following sections.

THE SARA SYSTEM

The BCS entries of RUT and CRU and the optional SMF extensions allow the SARA system to overcome the limitations of accounting data for system analysis. In addition, during the processing of SMF data, SARA provides the capability of calibrating the data to the computer configuration and workload so that EXCP's can be transformed into device and channel utilizations. The results of SMF data are now calibrated to unique computing environments and can be different for device locations as well as device types. Although the device/channel utilizations are estimated, they help to pinpoint trouble spots much more readily than do counts of EXCP's. The calibration constants of each BCS computer system are updated periodically.
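The text does not give the calibration formula SARA uses. As a rough sketch only, one simple way to turn EXCP counts into estimated device utilizations is to assume a linear model in which each device has a calibrated average busy time per EXCP. The function name, the parameter names, and the linear model itself are assumptions for illustration, not SARA's method.

```python
def estimate_device_utilization(excp_counts, seconds_per_excp, interval_seconds):
    """Estimate the busy fraction of each device over a reporting interval.

    excp_counts:      {device: number of EXCP's in the interval}
    seconds_per_excp: {device: assumed calibrated busy time per EXCP};
                      keyed per device so that constants can differ by
                      device location as well as device type
    interval_seconds: length of the reporting interval
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    utilization = {}
    for device, count in excp_counts.items():
        busy_seconds = count * seconds_per_excp[device]
        # Cap at 1.0: an estimate above 100% busy signals a stale constant.
        utilization[device] = min(1.0, busy_seconds / interval_seconds)
    return utilization
```

Under this assumed model, a device that handled 4000 EXCP's in a 400-second interval with a constant of 0.025 seconds per EXCP would be estimated as 25 percent busy. The same form could be applied per channel with its own constant.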


NBS Special Publication, Issues 401-405