Bibliography

1. Abrams, Marshall D., George E. Lindamood, and Thomas N. Pyke, Jr., Measuring and Modelling Man-Computer Interaction, in Association for Computing Machinery, Proceedings of the SIGME Symposium, February 1973, pp. 136-142, 11 refs. (6430166)

The paper describes the Dialogue Monitor developed at the Institute for Computer Sciences and Technology as a tool for the measurement of computer services, particularly those provided by interactive systems. The model of service used here is concerned with performance measurement external both to the user and to the computer, and focuses on the dialogue which takes place between the two.

External performance measurement completes the analytic and stimulus approaches used by Karush; service may be measured as delivered to actual users or to an artificial stimulus.

The character, with two associated descriptors, is the unit of measurement. The first descriptor is the identity of the character's source; the second descriptor specifies the time of occurrence of the character. The character itself is explicit; its source is implicit in the communications discipline; and an external clock provides the time of occurrence.

With these simple data several models of the man-computer dialogue have been developed. The models differ primarily in the degree of interactiveness of the dialogue. The paper presents two models and the terms used to describe each. The data stream model is described in terms of idle time, think time, computer burst, user burst, computer interburst time, user interburst time, computer burst segment, and user burst segment. The stimulus-acknowledgement-response model is described in terms of acknowledgement delay, acknowledgement time, acknowledgement character count, system response time, system transmit time, system character count, user think time, user transmit time, and user character count.
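To make the data-stream terms concrete, here is a minimal sketch assuming a stream of characters tagged with source and time of occurrence, as the paper describes; the burst grouping and the think-time definition below are illustrative assumptions, not the authors' exact formulations.

```python
# Minimal sketch (not the paper's implementation) of deriving data-stream-model
# quantities from the Dialogue Monitor's unit of measurement: a character tagged
# with its source and its time of occurrence.
from dataclasses import dataclass
from itertools import groupby

@dataclass
class Char:
    source: str   # "user" or "computer" (implicit in the communications discipline)
    time: float   # seconds, from an external clock

def bursts(chars):
    """Group consecutive characters from one source into (source, start, end) bursts."""
    out = []
    for source, run in groupby(chars, key=lambda c: c.source):
        run = list(run)
        out.append((source, run[0].time, run[-1].time))
    return out

def think_times(chars):
    """Assumed definition: gap between the end of a computer burst and the
    start of the following user burst."""
    b = bursts(chars)
    return [b[i + 1][1] - b[i][2]
            for i in range(len(b) - 1)
            if b[i][0] == "computer" and b[i + 1][0] == "user"]

# Usage: a toy dialogue of timestamped characters.
dialogue = [Char("user", 0.0), Char("user", 0.4),
            Char("computer", 1.1), Char("computer", 1.3),
            Char("user", 4.8)]
print(bursts(dialogue))       # three bursts
print(think_times(dialogue))  # [3.5]
```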

The operation of the monitor is described briefly and some analysis of the data is provided. (JLW)

Category: 1.2

Key words: Dialogue monitor; idle time; man-computer interaction; measurement tools; models; think time.

2. Arbuckle, R. A., Computer Analysis and Thruput Evaluation, Computers and Automation, 15:1 (January 1966) pp. 12-15, 19. (6430197)

"The real criterion for measuring system performance is thruput. Yet many evaluations. use only internal comparisons to rate a system's overall performance." Add-time, instruction time, instruction-mix, and kernel problem comparisons may provide relative internal performance figures for specific cases, but these are generally applicable to comparable computer families. In thruput evalua tion, the power of a system must be measured in terms of how fast it can perform the complete job.

In this connection, two types of benchmark problems are suggested: one which estimates time, another which reports actual running times. How well either can evaluate thruput depends essentially on two major considerations: how well the benchmarks reflect actual jobs, and how well they characterize the total workload. Production job runs are identified as the "best way" to measure a system's performance. The widespread use of generalized compilers provides the capability to run actual production jobs on systems with entirely different organizations. The problem of selecting jobs that reflect the total system load still remains, even with this approach. Hardware monitors can be used to evaluate and tune system components. The article concludes with an example of the use of a hardware monitor to improve the performance of an IBM-7094 system. (JLW)
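As a rough illustration of the thruput idea (not taken from the article), benchmark run times can be weighted by the share of the workload each benchmark is meant to represent; the job names, times, and weights below are invented.

```python
# Hedged illustration: rate systems by elapsed time over a benchmark mix chosen to
# represent the total workload, rather than by internal figures such as add time.
benchmark_minutes = {          # measured elapsed time per benchmark job, by system
    "system_A": {"payroll": 12.0, "inventory": 30.0, "linear_prog": 18.0},
    "system_B": {"payroll": 10.0, "inventory": 36.0, "linear_prog": 15.0},
}
workload_share = {"payroll": 0.5, "inventory": 0.3, "linear_prog": 0.2}  # fraction of total load

for system, times in benchmark_minutes.items():
    # Weighted elapsed time: lower means more of the real workload done per unit time.
    weighted = sum(times[job] * share for job, share in workload_share.items())
    print(f"{system}: weighted benchmark time = {weighted:.1f} minutes")
```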

Categories: 3.0; 1.2

Key words: Application benchmarks; computer systems; hardware monitors; IBM-7094; throughput.

3. Bell, T. E., Computer Performance Analysis: Measurement Objectives and Tools, The RAND Corp., Santa Monica, Calif., Rept. No. R-584-NASA/PR, February 1971, 32 pp., 27 refs. (6430172)

This report suggests a number of objectives for computer system measurement and analysis beyond the commonly accepted one of tuning. Objectives in computer operations (identifying operational problems and improving operational control) mean that personnel in this area should become familiar with new tools and techniques. Computer system simulators should be concerned with model validation as well as model development. Installation managers need results from this field in order to select equipment, trade man-time for machine-time, and tune installed equipment.

Data collection tools for use in measurement and analysis are necessary to fulfill these objectives. These tools range from simple, inexpensive ones (audio and visual indicators, operator opinions, and logs) to the more sophisticated hardware and software monitors. Each of the simple tools can provide initial indications of performance, but hardware and software monitors are usually necessary for a thorough analysis. Five binary characteristics can describe a monitor: (1) implementation medium, (2) separability, (3) sample portion, (4) analysis concurrency, and (5) data presentation. An analyst should determine the characteristics his analysis requires before choosing a product.
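As a small sketch of how these five characteristics could support product selection, the following encodes each monitor as a record of the five attributes and filters by the characteristics an analysis is assumed to require; the attribute encodings and the example monitors are invented for illustration, not taken from the report.

```python
# Each monitor is described by five characteristics; an analyst states the
# characteristics the analysis requires and filters candidate products.
from typing import NamedTuple

class Monitor(NamedTuple):
    name: str
    medium: str        # "hardware" or "software"   (implementation medium)
    separable: bool    # runs apart from the measured system (separability)
    samples: bool      # samples rather than capturing every event (sample portion)
    concurrent: bool   # analyzes while collecting (analysis concurrency)
    presentation: str  # "raw" or "reduced"          (data presentation)

candidates = [
    Monitor("probe-1", "hardware", True, True, False, "raw"),
    Monitor("trace-2", "software", False, False, True, "reduced"),
]

# The analysis at hand is assumed to need concurrent analysis and reduced output.
required = {"concurrent": True, "presentation": "reduced"}
suitable = [m for m in candidates
            if all(getattr(m, k) == v for k, v in required.items())]
print([m.name for m in suitable])   # ['trace-2']
```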

Recognizing objectives and choosing measurement tools are two important steps in a performance analysis study. This report deals with these two topics so that analysts can proceed to four more difficult and critical topics. Modeling, choosing a data collection mode, experimental design, and data analysis deserve at least as much attention as examining data collection tools. (Author)

Category: 1.2

Key words: Computer performance analysis; computer performance measurement; measurement tools; simulators; validation.

4. Bell, T. E., B. W. Boehm and R. A. Watson, Computer System Performance Improvement: Framework and Initial Phases, The RAND Corp., Santa Monica, Calif., Rept. No. R-549-PR, August 1971, 55 pp., 4 refs. (6430187)

This report distills selected RAND experience and research in the measurement and evaluation of computer system performance into a set of practical guidelines for organizing the initial phases of an effort to improve the performance of a general-purpose computer system. The report is designed primarily as an aid for "getting started" and provides a procedural framework which consists of seven phases. Only the initial three phases are discussed in detail in this report.

Phase 1 is "understanding the system," and a Preliminary Questionnaire is suggested for this purpose. The Questionnaire asks general, descriptive questions about organization, workload, hardware and software, and the accounting system. For Phase 2, which is "analyzing operations," a Detailed Questionnaire is suggested as a guide to the kind of data gathering which must be undertaken in order to analyze computer system performance. The details required identify characteristics of operations, of jobs, and of the system. A Questionnaire on current measurement and evaluation activities is also suggested. Phase 3 is an aid to installations in developing performance improvement hypotheses; methods of analysis are suggested that provide a transition from analyzing operations to formulating hypotheses, and a number of general hypotheses appropriate to particular problem situations are presented. (Modified author)

Category: 1.2

Key words: Computer performance analysis; guidelines; questionnaires.

5. Boehm, B. W., Computer Systems Analysis Methodology: Studies in Measuring, Evaluating, and Simulating Computer Systems, The RAND Corp., Santa Monica, Calif., Rept. No. R-520-NASA, September 1970, 42 pp., 17 refs. (6430186)

The report is a summary of the results of four studies on computer systems analysis and simulation performed under contract to NASA. One of these studies was on measurement and evaluation of computer systems. Among the critical areas cited in this context is that of "a strong instability in gross measures of multiprogrammed system performance (central processing unit utilization, throughput, etc.) with respect to changes in load characteristics, disk data set allocation, and scheduling algorithms. Small changes in load characteristics, etc., can easily produce large changes in multiprogrammed system performance. This phenomenon has the following significant operational implications: (1) significant improvements in CPU utilization or throughput (usually at least 30 percent; sometimes over 300 percent) can be realized from investments in tuning multiprogrammed computer systems; (2) computer systems selected and procured because of their performance on a series of benchmark jobs can lead to disastrous mismatches if great care is not taken to assure that the benchmarks are fully representative; and (3) as workload characteristics change with time, the maintenance of a well-tuned computer requires a continuous rather than a one-shot effort."

Thus considerable study of the pertinent interactions is necessary before the key contributing factors are isolated. In situations arising from several dominant factors, use of the simplest explanation as a basis for decision can lead to "highly dysfunctional" results.

A good example of this phenomenon is provided in one of the studies. Performance measures of an IBM-360/65 (in terms of the percentage of CPU cycles productively utilized) indicated an increase after the addition of 50 percent more core memory and several additional disk drives. However, a more detailed analysis of the data indicated that the increase actually correlated with a decrease in the average number of jobs resident in the increased memory, and was primarily due to an "otherwise undetected" increase in average CPU usage by individual user jobs. Analysis also indicated that the increased performance was due as much to decreases in the I/O characteristics of the workload as it was to configuration changes. (JLW)
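The confound can be pictured with a toy calculation, sketched below with invented numbers: when the configuration change and the workload change arrive together, a gross measure such as CPU utilization correlates with both and cannot by itself say which deserves the credit.

```python
# Toy numeric illustration (all data invented) of the pitfall Boehm describes.
# Requires Python 3.10+ for statistics.correlation.
import statistics

weeks = [
    # (core_memory_units, avg_resident_jobs, avg_cpu_sec_per_job, cpu_utilization)
    (1.0, 6, 20, 0.52),
    (1.0, 6, 21, 0.54),
    (1.5, 4, 30, 0.68),   # after adding 50 percent more core memory
    (1.5, 4, 31, 0.70),
]

util = [w[3] for w in weeks]
print("memory vs. utilization: ", statistics.correlation([w[0] for w in weeks], util))
print("cpu/job vs. utilization:", statistics.correlation([w[2] for w in weeks], util))
# Both correlations are high, which is why Boehm argues for studying the
# interactions in detail before attributing the gain to the configuration change.
```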

Category: 1.2

Key words: Computer performance evaluation; computer performance measurement; multiprogrammed computer systems; system tuning; workload representation.

6. Boksenbaum, Melvin, Results of Benchmark Comparison of the Performance of IBM-360/85 and 370/165, Memorandum Rept., April 2, 1971, 11 pp. (6430146)

The author states that the benchmark is a valid and useful tool in a comparison of the 360/85 with the 370/165. The reasons given for this are several. The performance of the 360/85 is known so that the relative performance of the 370/165 will be meaningful. Both machines are available with similar peripheral equipment and operating systems software. The programs selected for the benchmark are representative of the installation workload.

The benchmark consists of three parts: (1) a compute-bound linear programming problem run standalone; (2) an I/O-bound monthly financial closing job also run standalone; and (3) a job stream of 29 programs taken from the normal installation workload.

The first two programs cited above represent the two extreme types of processing by a computer system and their performance by the 370/165 is a good measure of that computer's capabilities. The 29 programs comprising the job stream were carefully selected in order to represent every type of job run on the 360/85. Programs from each user department were selected on the basis of departmental monthly computer usage. The package includes a daily financial run which spans the job stream, six linear programming jobs of various sizes, and 22 other programs whose elapsed running times vary from one to fifteen minutes and which are coded in PL/I, COBOL, assembly language, and object code. The benchmark has a run time of approximately one hour on the 360/85. Summary data are presented in tabular form.

The results indicate that performance of the 370/165 should be approximately 90 percent of the 360/85. Turnaround time, peripheral utilization, and overtime usage will not be significantly affected. However, the replacement is expected to effect a cost reduction of more than 20 percent, so that the net increase in performance per dollar is significant. (JLW)
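One way to read that trade-off, sketched below using the stated bounds (90 percent relative performance, a cost reduction taken at the conservative 20 percent figure), is as performance per dollar.

```python
# Back-of-the-envelope reading (not from the memorandum) of the replacement trade-off.
relative_performance = 0.90   # 370/165 throughput relative to the 360/85
relative_cost = 0.80          # "more than 20 percent" cost reduction, taken at 20 percent

performance_per_dollar = relative_performance / relative_cost
print(f"performance per dollar vs. 360/85: {performance_per_dollar:.3f}x")  # about 1.125x
```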

Category: 3.3

Key words: Application benchmarks; benchmark run analysis; computer performance measurement; computer systems; IBM-370/165; IBM-360/85; workload representation.

7. Bookman, Philip G., Barry A. Brotman and Kurt L. Schmitt, Use Measurement Engineering for Better System Performance, Computer Decisions, 4:14 (April 1972) pp. 28-32. (6430167)

The article suggests, in the absence of a discipline of "measurement engineering," an engineering-like approach which will provide a methodology for the application of measurement tasks. This approach is illustrated by a case study of a data center at Allied Chemical which had an IBM-360/50 and a 360/40 processing a workload consisting of local batch, remote job entry, and online systems. Through the use of the Configuration Utilization Evaluator (CUE) and of the Data Set Optimizer (DSO) the system was tuned and optimized so that the net results were the dropping of the 360/40 system and an annual saving of about a quarter of a million dollars. The authors consider hardware and software monitors necessary tools for a measurement methodology, without which it would be impossible to obtain information on what parts of the system are in need of improvement or which ones are not operating at optimal capability. (JLW)

Category: 1.2

Key words: Computer performance measurement; configuration evaluators; data optimizers; measurement engineering; software monitors; system optimizing; system tuning.

8. Brocato, Louis J., Getting the Best Computer System for Your Money, Computer Decisions, 3:9 (September 1971) pp. 12-16. (6430198)

The article describes a method for evaluating vendor proposals, based on weighting all of the required system elements and dividing the score by dollar costs. The "best" system then is benchmarked. This is a departure from current practice in which all vendors are required to perform the benchmark. If the benchmark run of the "best" system is successful, then that system is selected for procurement. If the benchmark fails, then the "next best" system is benchmarked, and so on, if necessary, until a contract is awarded. (JLW)
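A minimal sketch of that procedure, with invented weights, scores, costs, and a stand-in benchmark result, might look like this:

```python
# Score each proposal by weighting the required system elements, divide by dollar
# cost, benchmark only the top-ranked system, and fall back to the next one if its
# benchmark fails. All figures here are illustrative assumptions.
def value_per_dollar(element_scores, weights, cost):
    return sum(element_scores[e] * w for e, w in weights.items()) / cost

weights = {"cpu": 0.4, "storage": 0.3, "software": 0.3}
proposals = {
    "vendor_A": ({"cpu": 8, "storage": 6, "software": 7}, 1_000_000),
    "vendor_B": ({"cpu": 7, "storage": 9, "software": 6}, 900_000),
}

def benchmark_passes(vendor):
    # Stand-in for running the live benchmark; assume the top-ranked proposal fails.
    return vendor == "vendor_A"

ranked = sorted(proposals,
                key=lambda v: value_per_dollar(proposals[v][0], weights, proposals[v][1]),
                reverse=True)
selected = next((v for v in ranked if benchmark_passes(v)), None)
print(ranked, "->", selected)   # ['vendor_B', 'vendor_A'] -> vendor_A
```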
