
During the report program processing, the SMF data is analyzed by the machine states of the multiprogramming environment. The CPU and I/O activities of a job step are averaged over the period from program load time to job step termination. A snapshot of system processing at any time period would show a level of activity with a number of jobs processing at a specific CPU and I/O activity level with a specific amount of memory and devices allocated. These parameters represent a machine state until one of the job steps stops or another starts. The activities of all machine states are accounted and reported as illustrated in a later section which shows some of the report types.
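The division of the timeline into machine states can be sketched as a sweep over job step start and stop times. This is only an illustration under stated assumptions: the `JobStep` fields are hypothetical and do not reflect the SMF record layout, times are in hours, every step is assumed to have a nonzero duration, and CPU activity is averaged over each step's lifetime as described above.

```python
from dataclasses import dataclass

@dataclass
class JobStep:            # hypothetical record, not the SMF layout
    name: str
    start: float          # program load time, hours
    end: float            # job step termination, hours
    cpu_hours: float      # total CPU time used by the step
    memory_k: int         # memory allocated, K bytes

def machine_states(steps):
    """Split the timeline at every step start/stop and describe each interval."""
    boundaries = sorted({t for s in steps for t in (s.start, s.end)})
    states = []
    for t0, t1 in zip(boundaries, boundaries[1:]):
        # A step belongs to the state if it spans the whole interval.
        active = [s for s in steps if s.start <= t0 and s.end >= t1]
        # CPU activity is averaged over each step's whole lifetime.
        cpu_rate = sum(s.cpu_hours / (s.end - s.start) for s in active)
        states.append({
            "start": t0,
            "end": t1,
            "jobs": len(active),
            "cpu_fraction": cpu_rate,
            "memory_k": sum(s.memory_k for s in active),
        })
    return states

# Two overlapping steps yield three machine states.
example_steps = [
    JobStep("A", 0.0, 2.0, 0.5, 200),
    JobStep("B", 1.0, 3.0, 1.0, 300),
]
example_states = machine_states(example_steps)
```

A new state begins whenever a step starts or stops, so the two overlapping steps in the example produce three states with one, two, and one active jobs.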

Using CRU's, one can calculate CRU's per hour (the CRU's generated per hour of system active time) or the problem program CRU rate (the CRU's generated per hour of problem program run time). These figures quantify the machine capacity required to process the given workload. CRU can thus be used as a figure of merit for computer performance.
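The two CRU rates are simple ratios. A minimal sketch follows; the function names and the sample figures are hypothetical and are not taken from any SARA report.

```python
def cru_per_hour(total_cru, system_active_hours):
    """CRU's generated per hour of system active time."""
    return total_cru / system_active_hours

def problem_program_cru_rate(total_cru, problem_program_hours):
    """CRU's generated per hour of problem program run time."""
    return total_cru / problem_program_hours

# Hypothetical period: 3300 CRU's over 11 active hours,
# of which problem programs were running for 8.25 hours.
system_rate = cru_per_hour(3300, 11)
program_rate = problem_program_cru_rate(3300, 8.25)
```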

Using RUT, one can calculate RUT hours per system active hour (this is termed "throughput" in SARA and is the accumulated RUT of all job steps over an interval of system active time). One can also calculate a job lengthening factor by dividing the actual job step run time by the job step RUT. These figures indicate internal conflicts caused by poor machine loading or a poorly tuned system, and thus give an indication of multiprogramming efficiency. SARA also calculates "memory wait time" and "device wait time". Memory wait time is the time from termination of the last job step to start of device allocation for the current job step. (This wait time also includes some device deallocation and initiate time, but long periods indicate a wait for memory.) Device wait time is the time from start of device allocation to start of problem program load.
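The RUT-based figures and the two wait times reduce to a few subtractions and divisions. The sketch below assumes a hypothetical `StepTimes` record with all times in hours, and takes job step run time as the period from program load to step termination; none of these names come from SARA itself.

```python
from dataclasses import dataclass

@dataclass
class StepTimes:              # hypothetical record, times in hours
    prev_step_end: float      # termination of the previous job step
    alloc_start: float        # start of device allocation
    load_start: float         # start of problem program load
    step_end: float           # job step termination
    rut_hours: float          # RUT credited to the step

def throughput(rut_hours_per_step, system_active_hours):
    """Accumulated RUT of all job steps per system active hour."""
    return sum(rut_hours_per_step) / system_active_hours

def job_lengthening_factor(step):
    """Actual run time divided by RUT."""
    return (step.step_end - step.load_start) / step.rut_hours

def memory_wait(step):
    """Previous step termination to start of device allocation."""
    return step.alloc_start - step.prev_step_end

def device_wait(step):
    """Start of device allocation to start of program load."""
    return step.load_start - step.alloc_start

example_step = StepTimes(prev_step_end=1.0, alloc_start=1.25,
                         load_start=1.5, step_end=3.5, rut_hours=1.0)
example_factor = job_lengthening_factor(example_step)
example_throughput = throughput([1.0, 0.5], 3.0)
```

In the example, the step ran for 2.0 hours against 1.0 hour of RUT, so it took twice as long as its RUT alone would account for.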

TYPICAL SARA STUDIES

BCS makes a technical audit of its large scale computers at least once a year. The audit is performed by personnel who are familiar with hardware/software monitors and SARA. The audit establishes the CRU capacity of the machine and its performance in meeting priority demands, identifies bottlenecks, and makes recommendations for system and/or configuration changes. The following sample reports were taken from audits and are presented not because of major benefits or savings that resulted, but because they demonstrate some of the uses of SMF type data when organized properly.

JOB SUMMARY REPORT

Before proceeding to a system study, it may be well to introduce the type of information provided by SARA. The JOB SUMMARY REPORT, shown in Figure 2, gives an overview of system performance and shows total job resource requirements. Figure 2 summarizes an eleven-hour, second-shift period of operation for a 370/165, although the report period is selectable and may cover any number of hours, days and/or weeks.

This particular report (Figure 2) came from the main processor of a dual 370/165 configuration driven by a LASP operating system. The 24002 (seven-track tapes) and the 24003 (nine-track tapes) device types are unique to this facility, since they were so named during the calibration input phase of SARA. The device type notation extracted from the operating system by SMF denotes only that a tape drive is a 2400 (or 3400) device type and makes no distinction between different types of tape drives.


RESOURCE ALLOCATION DISTRIBUTION

To effectively manage a configuration, it is necessary to know the utilization of resources and specifically how that utilization is distributed. To configure the proper number of tape drives, one must know how much time the minimum number and subsequent numbers of tape drives are required. The "Resource Allocation Distribution" report, shown in Figure 3, illustrates this distributive data.

As the multiprogramming machine states are analyzed, SARA records the discrete number or amount of resources allocated or used at different machine states. Figure 3 is an example of this type of data for allocated nine-track tape drives. The report period was based on a predominantly scheduled production period when tape drive demand was the highest. The 24003 device type is unique to this facility since it was so named during the calibration input phase. The report shows:

• The discrete increments of the resource (in this example zero to twenty-eight 24003 tape drives)

• The time in hours that exactly 0, 1, 2, ..., 28 tape drives were allocated

• The percent of the total time period that is represented by the allocation time of the discrete increment

• The accumulated percent

• A graphic distribution to the nearest one percent of the discrete increment percentage.

The arithmetic mean number of tape drives allocated is reported as well as the "number of units or less" that are allocated 95% of the time.
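The columns of such a report can be derived from a list of machine states. The sketch below assumes each state is given as a hypothetical `(duration_hours, units_allocated)` pair and that at least one state is present; it takes the "number of units or less" figure to be the smallest unit count whose accumulated percent reaches 95, which is one plausible reading of the report rather than SARA's documented rule.

```python
from collections import defaultdict

def allocation_distribution(states):
    """states: iterable of (duration_hours, units_allocated) pairs."""
    hours_at = defaultdict(float)
    for duration, units in states:
        hours_at[units] += duration
    total = sum(hours_at.values())

    rows, cumulative, level_95 = [], 0.0, None
    for units in range(max(hours_at) + 1):
        hours = hours_at.get(units, 0.0)
        percent = 100.0 * hours / total
        cumulative += percent
        if level_95 is None and cumulative >= 95.0:
            level_95 = units          # "this many units or less" 95% of the time
        bar = "*" * round(percent)    # graphic column, nearest one percent
        rows.append((units, hours, percent, cumulative, bar))

    # Time-weighted arithmetic mean of units allocated.
    mean = sum(u * h for u, h in hours_at.items()) / total
    return rows, mean, level_95

# Hypothetical ten-hour period with zero to three drives allocated.
example_input = [(2.0, 0), (4.0, 1), (3.0, 2), (1.0, 3)]
rows, mean_units, level_95 = allocation_distribution(example_input)
```

For the hypothetical input, the accumulated percent runs 20, 60, 90, 100, so three drives or less cover 95% of the period and the mean allocation is 1.3 drives.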

This particular distribution is from the same main processor of a dual 370/165 configuration as was discussed in Figure 2. The total configuration had 84 tape drives: 28 nine-track and 20 seven-track on the main processor, and 20 nine-track and 16 seven-track on the support processor. In addition, the configuration had thirty-two 3330 disk spindles and sixteen 2314 disk spindles. Facility management wanted to decrease their dependence on tape and add more 3330 modules, but they were constrained to maintain the same configuration costs with no schedule slides. A logical tradeoff was: How many tape drives can be released to recoup the costs of additional 3330 modules?


The support processor that must handle the spooling activity (reading cards and printing output) and scheduling of jobs seems to have bogged down at more than 4 job streams as witnessed by the throughput and CRU's/hour (note that data points at less than 10% of the active time are usually discarded as too small a sample). In addition, LASP response to Remote Job Entry (RJE) and a BCS online system seems to have slowed down drastically during the demand workload periods when short jobs were running and much initiating, terminating, card reading and printing output was occurring. This was verified by timing the duration for LASP to refurnish the RJE print buffers and timing its response time to the online/

[Figure: MAIN PROCESSOR panel; tabulated values not recoverable from the scan]

The Interval Average Report shows chronological intervals of average activity throughout the report period. Figure 5 is an abbreviated example of this report showing problem program hourly averages of CPU percent, CRU's/hour, throughput, core allocated, number of job streams and calculated channel utilization for the time period from 0800 to 1800.

This data was collected from a 370/165 running HASP RJE and BCS TSO. The workload is primarily priority demand jobs which are submitted via RJE with up to 20 communication lines and TSO supporting 60 to 70 users. As jobs are submitted, a "job class" is calculated and assigned by the system based on job resource requirements.

Initiators are given job class assignments to best satisfy the priority demands (e.g. an initiator assigned job classes A, B, C must process all jobs of class A before B or C; if no class A jobs are in the queue, it may process class B; if no class A's or B's, it may process class C). The calculated job classes and the initiator class assignments are such that small jobs are given preferential treatment. The larger the job, the longer the committed time to process the job.
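The class-ordered selection rule in the example above can be sketched as follows. The function name and queue layout are hypothetical, and taking jobs of the same class in arrival order is an assumption, since the text does not say how ties within a class are broken.

```python
def next_job(initiator_classes, queue):
    """Pick the next job for an initiator.

    initiator_classes: class letters in priority order, e.g. "ABC".
    queue: list of (job_name, job_class) tuples in arrival order.
    Returns the chosen tuple, or None if no queued job is eligible.
    """
    for job_class in initiator_classes:
        for job in queue:
            if job[1] == job_class:
                return job   # first arrival of the highest-ranked class present
    return None

# No class A job is queued, so an "ABC" initiator takes the first class B job.
example_queue = [("J1", "C"), ("J2", "B"), ("J3", "B")]
chosen = next_job("ABC", example_queue)
```

An initiator assigned only class A would find nothing to run against this queue and would stay idle.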

This facility had been experiencing unacceptable TSO response at 70 users. To reduce internal conflicts and/or TSO degradation, the number of active initiators was reduced from six to four during peak TSO periods from 0800 to 1800. Thus, four problem programs would be the most that could run concurrently. The two deducted initiators had been primarily assigned to processing large jobs.

The reduction in active initiators did result in reduced TSO response time. SARA showed a reduced job lengthening factor, and the BCS software monitor showed a significant reduction in device/channel queuing. There was also a reduction in processing CRU's/hour.

Figure 5 is typical of processing with four active initiators. It is apparent from the Interval Average Report that:

• the job class assignments were far too weighted to small jobs, as witnessed by the poor core utilization, which in turn produced few CRU's

• the input workload could not keep the initiators active with their present class assignments, per the low number of job streams

• the deducted initiators should be activated at 1600, per the reduction in job streams (this is about the time that terminal users quit submitting small jobs, since they won't be processed by quitting time, and submit large overnight jobs; this is also the time of highest job backlog).

The initiator job class assignments were reworked (this is a separate study in itself but relies heavily on job class and job priority resource requirements as reported by SARA). The result was an increase in CRU's/hour generation from less than 300 to 350, better satisfaction of priority demands and little effect on the improved TSO response times. The system with four initiators was doing better than the previous operation with six initiators (although the new job class assignments were primarily responsible, a concurrent effort to reorganize the system packs was also beneficial).


Also, the studies make mention of throughput, CRU's per hour and the job lengthening factor as performance indicators. With experience and a variable number of computer configurations to investigate, BCS performance analysts have come to recognize the performance capability of a given machine configuration. This allows an almost instantaneous analysis of a "machine in trouble", and if the problem cannot be found, the hardware/software monitors are used for a thorough analysis.

VIRTUAL MEMORY SYSTEM

Virtual memory systems provide up to 16 megabytes of virtual core, a faster and larger real memory, and a more integrated operating system with internal processing algorithms that can be tailored to the workload. The concept of job processing does not change. The following discussion is based on OS/VS2 Release 1.6 unless specific details are known on Release 2.0. For example, Release 2 offers a System Activity Measurement Facility (MF1), a monitoring function that sounds like a limited software monitor, but little is known about it at present.

Figure 6 is a simplified overview of the OS/VS2 system operation. The primary functional features and differences over OS/MVT are:

• the Dynamic Address Translator (DAT), a hardware device that translates CPU "virtual memory" accesses into "real memory" address locations

• the additional IO supervisor software function, which translates IO virtual memory addresses into real memory address locations

• virtual memory allocation to job steps in 64K byte blocks

• real memory utilization by job steps in 4K byte pages.
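The arithmetic implied by these sizes can be sketched as follows. This is only an illustration built from the 16 megabyte, 64K and 4K figures in the text; the `page_table` dictionary is a hypothetical stand-in for the translation tables and does not represent the actual DAT table formats.

```python
SEGMENT_BYTES = 64 * 1024           # virtual allocation unit (64K block)
PAGE_BYTES = 4 * 1024               # real memory unit (4K page)
VIRTUAL_BYTES = 16 * 1024 * 1024    # 16 megabytes of virtual core

def blocks_needed(region_bytes):
    """Number of 64K blocks needed to hold a job step region (rounded up)."""
    return -(-region_bytes // SEGMENT_BYTES)

def split_address(virtual_address):
    """Split a virtual address into (64K block, 4K page within block, offset)."""
    if not 0 <= virtual_address < VIRTUAL_BYTES:
        raise ValueError("address outside the 16 megabyte virtual range")
    segment, within = divmod(virtual_address, SEGMENT_BYTES)
    page, offset = divmod(within, PAGE_BYTES)
    return segment, page, offset

def translate(virtual_address, page_table):
    """Map a virtual address to a real address, or None if the page is not resident.

    page_table: hypothetical dict {(segment, page): real_frame_number}.
    """
    segment, page, offset = split_address(virtual_address)
    frame = page_table.get((segment, page))
    if frame is None:
        return None                  # a page-in would be required
    return frame * PAGE_BYTES + offset

example_parts = split_address(0x012345)
example_real = translate(0x012345, {(1, 2): 7})
```

With these sizes, the virtual range holds 256 blocks of 64K, each containing sixteen 4K pages, and a 100K region occupies two blocks.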

The virtual system philosophy then is to increase the user's memory size (a bottleneck heretofore) by giving him up to 16 megabytes of usable core and to decrease the real memory needed by a particular job by making the job's most active 4K pages reside in real memory. This along with the internal priority schemes should allow the user to load up the system with jobs and let the system work them off as efficiently as possible.

Under OS/VS2, SMF records for job steps and TSO sessions include page ins and page outs. In addition, for TSO, the
