Research and Development in the Computer and Information Sciences: Overall system design considerations; a selective literature review

hensive survey of computer simulation languages and applications, with tables of comparative characteristics, as of 1966.7.26 In addition, IBM has provided a bibliography on simulation, also as of 1966.

Again, as in the area of graphic input manipulation and output, the field of effective simulation has specific R & D requirements for improved and more versatile machine models and programming languages. Clancy and Fineberg suggest that “the very number and diversity of languages suggests that the field [of digital simulation languages] suffers from a lack of perspective and direction.” (1965, p. 23).

The area of improved simulation languages is one that has a multiple interaction between software and hardware, especially where a computer is to be used to simulate another computer, perhaps one whose design is not yet complete 7.27 or to simulate many different scheduling, queuing and storage allocation alternatives in time-shared systems (see, for example, Blunt 1965). Such problems are also discussed by Scherr (1965) and by Larsen and Mano (1965), among others, while Parnas (1966) describes a modification of ALGOL (SFD-ALGOL, for "System Function Description") applicable to the simulation of synchronous systems.
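As a rough illustration of what simulating "scheduling alternatives" can mean, the following minimal sketch compares two queue disciplines on a single simulated server. It is not drawn from any of the cited systems; the job format, the function names (mean_wait, fcfs, sjf), and the three-job workload are all assumptions made for the example.

```python
from typing import Callable, List, Tuple

# Hypothetical workload format: (arrival_time, service_time).
Job = Tuple[float, float]


def mean_wait(jobs: List[Job], pick: Callable[[List[Job]], Job]) -> float:
    """Simulate one server; `pick` chooses the next job from the waiting queue."""
    pending = sorted(jobs)      # jobs not yet arrived, ordered by arrival time
    queue: List[Job] = []       # jobs that have arrived and are waiting
    clock = 0.0
    total_wait = 0.0
    while pending or queue:
        # Admit every job that has arrived by the current clock time.
        while pending and pending[0][0] <= clock:
            queue.append(pending.pop(0))
        if not queue:
            # Server idle: advance the clock to the next arrival.
            clock = pending[0][0]
            continue
        job = pick(queue)
        queue.remove(job)
        total_wait += clock - job[0]   # time spent waiting before service
        clock += job[1]                # serve the job to completion
    return total_wait / len(jobs)


def fcfs(queue: List[Job]) -> Job:
    """First come, first served: earliest arrival goes next."""
    return min(queue, key=lambda j: j[0])


def sjf(queue: List[Job]) -> Job:
    """Shortest job first: smallest service time goes next."""
    return min(queue, key=lambda j: j[1])


# Assumed illustrative workload: one long job followed by two short ones.
jobs = [(0.0, 8.0), (1.0, 2.0), (2.0, 1.0)]
print("FCFS mean wait:", mean_wait(jobs, fcfs))
print("SJF mean wait: ", mean_wait(jobs, sjf))
```

On this small workload the shortest-job-first rule yields the lower mean wait; a realistic study would of course use many more jobs and a measured or stochastic arrival pattern.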

However, there are difficult current problems in that languages such as SIMSCRIPT do not take advantage of the modularity of many processing systems, that conditional scheduling of sequences of events is extremely difficult 7.28 and that "we are still plagued by our inability to program for simultaneous action, even for the scheduling of large units in a computing system." (Gorn, 1966, p. 232).

In addition, for simulation and similar applications, heuristic or ad hoc programming facilities may be required. Thus, "a computer program which is to serve as a model must be able to have well-organized yet manipulatable data storage, easily augmentable and modifiable. The program must be self-modifying in a similarly organized way. It should be able to handle large blocks of data or program routines by specification of merely a name." (Strom, 1965, p. 114.)

For simulations or testings with controls, and without discernible interruption or reallocation of normal servicing of client processing requests, compilers must be available that will transform queries expressed in one or more commonly available customer languages to the language(s) most effectively used by the substituted experimental system and to the format(s) available in a master data base.

Then there are problems in the development of an appropriate "scenario", or sequence of events to be simulated.7.29 Burdick and Naylor (1966) provide a survey account of the problems of design and analysis of computer simulation experiments.

The problems of effective simulation of complex, interdependent processes are another area of increasing concern. Suppose, for example, that we are seeking to simulate a process in which many separate operations are carried out concurrently or in parallel, and that the simulation technique requires a serial sequencing of these operations. Depending upon the choice of which one of the theoretically concurrent operations is processed first in the sequentializing procedure, the results of the simulation may be significantly different in one case than in another.7.30

For example, the SL/1 language being developed at the University of Pisa under Caracciolo di Forino (1965) is based in part on SOL (Simulation-Oriented Language, see Knuth and McNeley, 1964) and in part on SIMULA (the ALGOL extension developed by O. J. Dahl, of the Norwegian Computing Center, Oslo).7.31 A second version, SL/2, now under development, will provide self-adapting features to optimize the system. Caracciolo emphasizes that, for any set of deterministic processes that are to be applied simultaneously, but where problems of incompatibility may arise, the problems can be reduced to a set of probabilistic processes. Otherwise, if one sequentializes parallel, concurrent processes actually dependent upon the order of sequentialization, then hidden problems of incompatibility may vitiate the results obtained.
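The order-dependence problem described above can be made concrete with a very small sketch. The two operations (double, withdraw) and the starting value are hypothetical, chosen only because they do not commute; the randomized ordering at the end is a loose illustration of treating the choice of sequence probabilistically, and is not claimed to be the method used in SL/1.

```python
import itertools
import random
from collections import Counter


def double(state: int) -> int:
    """Hypothetical operation A: double the shared quantity."""
    return state * 2


def withdraw(state: int) -> int:
    """Hypothetical operation B: remove 30 units from the shared quantity."""
    return state - 30


def run(order, start: int = 100) -> int:
    """Apply nominally concurrent operations one after another in `order`."""
    state = start
    for op in order:
        state = op(state)
    return state


ops = [double, withdraw]

# Every possible serialization of the two "simultaneous" operations.
outcomes = {
    tuple(op.__name__ for op in perm): run(perm)
    for perm in itertools.permutations(ops)
}
print(outcomes)


def replicate(n: int, seed: int = 0) -> Counter:
    """Pick the serialization at random in each of n replications."""
    rng = random.Random(seed)
    counts: Counter = Counter()
    for _ in range(n):
        order = ops[:]
        rng.shuffle(order)
        counts[run(order)] += 1
    return counts


print(replicate(200))
```

The two serializations give 170 and 140 respectively, so a simulator that silently fixes one order reports only one of two equally defensible results. Replicating with a random order exposes the spread instead of hiding it.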

Despite difficulties, however, progress has been and is being made. Thus computer simulation has been investigated as a means of system simulation for determination of probable costs and benefits in advance of major investments in equipment or procedures.7.32 Then, as reported by Gibson (1967), simulation studies have been used to determine that block transfers of 4 to 16 words will facilitate reduction of effective internal access times to a few nanoseconds. Other programs to simulate digital data processing, time-shared system performance, and the like, are discussed by Larsen and Mano (1965) and by Scherr (1965). Simulation studies in terms of multiprocessor systems are represented by Lindquist et al. (1966) 7.33 and by Huesmann and Goldberg (1967).7.34

Other advantages from research and development efforts to be anticipated from computer simulation experiments are those of transfer of applications from a given computer to another not yet installed or available, 7.35 advancements in techniques of pictorial data processing and transmission,7.36 advance appraisals of performance of time-shared systems,7.37 and investigations of probable performance of adaptive recognition systems.

Finally, we note prospects for system simulation as a means of evaluation and of redesign, including the alteration of scheduling priorities to meet changing requirements of the system's clientele. Three examples from the literature are as follows: (1) "Use of a simulator permits the installation to continue running its programs as reprogramming proceeds on a reasonable schedule." (Trimble, 1965, p. 18).

(2) "Effective response time simulation can be easily modified to provide operating costs of retrieval." (Blunt, 1965, p. 9). (3) "When large systems are being developed another set of programs is involved to perform a function not required for simpler situations. These are the simulation and analysis programs for system evaluation and, for semiautomated systems having a human component, system training." (Steel, 1965, p. 232). On the other hand, as Davis warns: "It is obvious that there is some threshold beyond which the real environment is too complex to permit meaningful simulation." (1965, p. 82). For the future, therefore, a system of multiple-working-hypotheses might well be developed: "The benefits and drawbacks of empirical data gathering vs. simulation vs. mathematical analysis are well documented. What we would really like to be able to do is a little of all three, back and forth, until our gradually increasing comprehension of the problem becomes the desired solution." (Greenberger, 1966, p. 347). Similarly, it may be claimed that "simulation models are often cumbersome and difficult to adapt to new configurations, with results of somewhat uncertain interpretation due to statistical sampling variability. Ideally, simulation and analytic techniques should supplement each other, for each approach has its advantages." (Gaver, 1967, p. 423).

8. Conclusions

As we have seen, major trends in input/output, storage, processor, and programming design relate to multiple access, multiprogrammed, and multiprocessor systems. On-line simulation, instrumentation, and performance evaluation capabilities are necessary in order to effectively measure and test proposed techniques, systems, and networks of broad future significance to improved utilization of automatic data processing techniques.

We may therefore close this report on overall system design considerations with the following quotations:

(1) "In rating the completeness, clarity, and simplicity of the system diagnostics, command language and keyboard procedures, we found their 'goodness' was inversely related to the running efficiency of the system. System developers should examine this condition to determine whether inefficient execution is an inherent feature of system[s] supplying complete and easily understood diagnostics, or a function of the specific interests and prejudices of the developers." (O'Sullivan, 1967, p. 170).

(2) "An engineer who wishes to concern himself with performance criteria in the synthesis of new systems is frustrated by the weakness of measurement of computer system behavior." (Estrin et al., 1967, p. 645.)

(4) "Today, and to an even greater extent tomorrow, the use of multiple functional units within the information processing system, the multiplexing of input and output messages, and the increased use of software to permit multiprogramming will require more subtle measures to evaluate a particular system's performance." (Nisenoff, 1966, p. 1828.)

(5) "Broad areas for further research are indicated . . . Comparative experimental studies of computer facility performance, such as online, offline, and hybrid installations, systematically permuted against broad classes of program languages (machine-oriented, procedure-oriented, and problem-oriented languages), and representative classes of programming tasks." (Sackman et al., 1968, p. 10), and

(6) "Improved methods of simulation, optimizing techniques, scheduling algorithms, methods of dealing with stochastic variables, these are the important developments that are pushing back the limits of our ability to deal with very large systems." (Harder, 1968, p. 233.)

Finally we note that the problems of the information processing system designer, then, are today aggravated not only by networking, time-sharing, time-slicing, multiprocessor and multiprogramming potentialities, but also by critical questions involving the values and the costs of maintaining the integrity of privileged files. By the terminology "privileged files", we suggest the interpretation of all data stored in a machine-useful system that may have varying degrees of privacy, confidentiality, or security restrictions placed upon unauthorized access. Some of the background considerations affecting both policy and design factors will be discussed in the next report in this series.

Appendix A. Background Notes on Overall System Design Requirements

In this Appendix we present further discussion and background material intended to highlight currently identifiable research and development requirements in the broad field of the computer and information sciences, with emphasis upon overall system design considerations with respect to information processing systems. A number of illustrative examples, pertinent quotations from the literature, and references to current R and D efforts have been assembled. These background notes have been referenced, as appropriate, in the summary text.

1. Introduction

1.1 There are certain obvious difficulties with respect to the organization of material for a series of reports on research and development requirements in the computer and information sciences and technologies. These problems stem from the overlaps between functional areas in which man-machine interactions of both communication and control are sought; the techniques, tools, and instrumentation available to achieve such interactions, and the wide variety of application areas involved.

The material that has been collected and reviewed to date is so multifaceted and so extensive as to require organization into reasonably tractable (but arbitrary) subdivisions. Having considered some of the R and D requirements affecting specific Boxes shown in Figure 1 (p. 2) in previous reports, we will discuss here some of the overall system design considerations affecting more than one of the processes or functions shown in Figure 1.

Other topics to be covered in separate reports in this series will include specific problems of information storage, selection and retrieval systems and the questions of maintaining the integrity of privileged files (i.e., some of the background considerations with respect to the issues of privacy, confidentiality, and/or security in the case of multiply-accessed, machine-based files, data banks, and computer-communication networks).

In general, the plan of attack in each individual report in the series will be to outline in relatively short discursive text the topics of concern, supplemented by background notes and quotations and by an appendix giving the bibliographic citations of quoted references. It is planned, however, that there will be a comprehensive summary, bibliography, and index for the series as a whole.

Since problems of organization, terminology, and coverage have all been difficult in the preparation of this series of reports, certain disclaimers and observations with respect to the purpose and scope of this report, its necessary selectivity, and the problems of organization and emphasis are to be noted. Obviously, the reviewer's interests and limitations will emerge at least indirectly in terms of the selectivity that has been applied.

In general, controversial opinions expressed or implied in any of the reports in this series are the sole responsibility of the author(s) of that report and are not intended in any way to represent the official policies of the Center for Computer Sciences and Technology, the National Bureau of Standards, or the Department of Commerce. However, every effort has been made to buttress potentially controversial statements or implications either with direct quotations or with illustrative examples from the pertinent literature in the field.

It is especially to be noted that the references and quotations included in the text of this report, in the corroborative background notes, or in the bibliography, are necessarily highly selective. Neither inclusion nor citation is intended in any way to represent an endorsement of any specific commercially available device or system, of any particular investigator's results with respect to those of others, or of the objectives of projects that are mentioned. Conversely, omissions are often inadvertent and are in no sense intended to imply adverse evaluations of products, materials and media, equipment, systems, project goals and project results, or of bibliographic references not included.

There will be quite obvious objections to this necessary selectivity from readers who are also R & D workers in the fields involved as to the representativeness of cited contributions from their own work or that of others. Such criticisms are almost inevitable. Nevertheless, these reports are not intended to be state-of-the-art reviews as such, but, rather, they are intended to provide provocative suggestions for further R & D efforts. Selectivity must also relate to a necessarily arbitrary cut-off date in terms of the literature covered.

These reports, subject to the foregoing caveats, are offered as possible contributions to the understanding of the general state of the art, especially with respect to long-range research possibilities in a variety of disciplines that are potentially applicable to information processing problems. The reports are therefore directed to a varied audience among whom are those who plan, conduct, and support research in these varied disciplines. They are also addressed to applications specialists who may hope eventually to profit from the results of current research efforts. Inevitably, there must be some repetitions of the obvious or over-simplifications of certain topics for some readers, and there must also be some too brief or inadequately explained discussions on other topics for these and other readers. What is at best tutorial for one may be difficult for another to follow. It is hoped, however, that the notes and bibliographic citations will provide sufficient clues for further follow-up as desired. The literature survey upon which this report is based generally covered the period from mid-1962 to mid-1968, although a few earlier and a few later references have also been included as appropriate.

1.2 Certain features of the information flow and process schema of Figure 1 are to be noted. It is assumed, first, that the generalized information processing system should provide for automatic access from and to many users at many locations. This implies multiple inputs in parallel, system interruptibility, and interlacings of computer programs. It is assumed, further, that the overall scheme involves hierarchies of systems, devices and procedures, that processing involves multistep operations, and that multimode operation is possible, depending on job requirements, prior or tentative results, accessibility, costs, and the like. It should be noted, next, that techniques suggested for a specific system may apply to more than one operational box or function shown in the generalized diagram of Figure 1. Similarly, in a specific system, the various operations or processes may occur in different sequences (including iterations) and several different ones may be combined in various ways. Thus, for example, questions of remote console design may affect original item input, input of processing service requests, output, and entry of feedback information from the user or the system client. The specific solutions adopted may be implemented in each of these operational areas, or combined into one, e.g., by requiring all inputs and outputs to flow through the same hardware.

2. Requirements and Resources Analysis

2.1 "The single information flow concept is input-oriented. The system is organized so that essential data are inserted into a common reservoir through point-of-origin input/output devices. User requirements are then satisfied from this reservoir of fundamental data about transactions.

"Thus, the single information flow concept is characterized by random entry of data, direct access to data in the system, and complete real-time processing . . . fast response, a high degree of reliability, and an easily expansible system." (Moravec, 1965, p. 173).

2.2 "In a highly distributed system, however, information on inputs to the organization flow directly to relatively low-level way stations where all possible processing is done and all actions are taken that are allowed by the protocol governing that level. In addition to the direct actions that it takes, the lowest, or reflexive, level of information processing ordinarily generates two classes of information. These are, first, summaries of actions taken or anticipated and, second, summaries of information inputs that, because of their type, salience, or criticality, fall outside the range of action that policy has established as appropriate for that level..

"In computer terms, a highly distributed system involves a primary executive program that adds and subtracts subroutines to various primary libraries from which alternative subroutines are to be drawn and combined. Secondary executive programs, responding to separate inputs and conditions, select and organize subroutines from each of these primary libraries and add and subtract subroutines to various secondary libraries from which tertiary executive programs select alternative subroutines for use at their level and for controlling the library one level down, and so forth. The flexibility of a distributed system is an outgrowth of the ability of each of the lower executive programs to organize its program on the basis of separate inputs reaching it directly." (Bennett, 1964, pp. 104-106).
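The escalation pattern in the first paragraph of the Bennett quotation (each level acts on what its protocol allows and passes the remainder upward) may be sketched as follows. This is only a schematic reading of the passage: the Level class, the numeric criticality scale, the threshold values, and the level names are all assumptions introduced for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Level:
    """One processing level in a hypothetical distributed hierarchy."""
    name: str
    max_criticality: int                  # assumed policy limit for this level
    parent: Optional["Level"] = None
    actions: List[str] = field(default_factory=list)    # summaries of actions taken
    escalated: List[str] = field(default_factory=list)  # summaries passed upward

    def handle(self, item: str, criticality: int) -> str:
        """Act on the input here if policy allows; otherwise refer it upward."""
        if criticality <= self.max_criticality or self.parent is None:
            self.actions.append(item)
            return self.name
        self.escalated.append(item)
        return self.parent.handle(item, criticality)


# Assumed three-level organization; names and thresholds are illustrative only.
top = Level("policy", max_criticality=10)
middle = Level("regional", max_criticality=5, parent=top)
lowest = Level("reflexive", max_criticality=2, parent=middle)

print(lowest.handle("routine order", 1))
print(lowest.handle("large claim", 4))
print(lowest.handle("crisis", 9))
```

Inputs always enter at the lowest level, so routine items never reach the upper levels at all, while each level retains a record both of what it did and of what it referred upward.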

"By a distributed implementation of an information service system we mean that the data processing activity is carried out by several or many installations... The data base is now distributed among the installations making up the information network for this service system . .

"The distributed information network should offer considerable advantage in reducing the cost of terminal communications by permitting installations to be located near concentrations of terminals." (Dennis, 1968, p. 373).

2.3 "A large number of factors (user communities, document forms, subject disciplines, desired services, to name but a few) compete for the attention of the designer of information service systems. A methodology for the careful organization of these factors and the orderly consideration of their relationships is essential if intelligent decisions are to be made." (Sparks et al., 1965, pp. 1-2).

"The lack of recognition of the nature and even, in some cases, the existence of the problems facing the information systems designer has meant that there has been little or no orderly development of generally agreed upon system methodology." (Hughes Dynamics, 1964, p. 1-7).

"To the best of our knowledge, no one has yet developed a completely satisfactory theory of information processing. Because there is no strong theoretical basis for the field, we must rely on intuition, experience and the application of heuristic
