system without having designed both . . . 'Integrated Hardware/Software Design' requires not only a knowledge of hardware design and program design, but a thorough understanding of the total relationships between hardware and software in a working system. Effective hardware/software design means consideration of the many potential tradeoffs between capability resident in the machine configuration and capability resident in the supporting software." (Constantine, 1968, p. 50).

5.2 For example, at the [ACM-SUNY Conf.], "Dr. McCann talked about the need for 'Language Escalation'" (Commun. ACM 9, 645 (1966)), while Perlis (1965), Clippinger (1965), and Burkhardt (1965), among others, discuss the general problems of hierarchies of language.

5.3 "Very large programs, roughly defined, are those that (1) demand many times the available primary storage for retention of code and immanent data, and (2) are sufficiently complex structurally to require more than ten coders for implementation." (Steel, 1965, p. 231).

"For the hypothetical air traffic control system used as an example, approximately forty volumes of several hundred pages each would be necessary to specify the program subsystems fully. In addition to detailing the many functions that must be performed by the several programs, limits on environmental behavior and transfer functions between the environment and the communication network must be described . . . The amount of interrelated, precise and not easily obtainable data required is staggering." (Steel, 1965, p. 233).

"The point upon which the mind should focus in the foregoing description of current large-scale programming procedures is that many people work a long time to prepare a computer to do quickly what it would take many people a long time to do. The number of people involved in a large programming task is very great, and the time it takes to prepare the program successfully is very long. For example, the programming of the entire Semi-Automatic Ground Environment system took roughly a thousand man-years, and the average output for the entire staff was between one and ten 'debugged' and final machine instructions per day. Despite the fact that almost any programmer can write a valid subroutine of ten instructions in ten minutes, the figure for the Semi-Automatic Ground Environment system does not indicate that anyone was lazy or inept. It represents the extreme difficulty of verylarge-scale programming and the essential incoherence, insofar as complex and highly interactive. procedures are concerned, of large, hierarchical staffs of people." (Licklider, 1964, p. 121).

5.4 How can very large, very complex problems be effectively segmented for programmer attack? "For the practical application of programming languages to large problems, efficient segmentation features are desirable so that parts of problems can be handled independently." (Burkhardt, 1965, p. 5). "Automatic segmenting of code is a critical and difficult problem. To be efficient, a program should not loop from segment to segment and further, all code not normally executed should be separate from the main flow of the program. Since program segments will be retrieved on demand, this reduces the program accesses." (Perry, 1965, p. 247).

"Since running programs will have access to only a portion of the memory, frequent ‘page turning' will be necessary as the program goes through its major operating pieces. Segmenting of the program can be very difficult if it must be done manually by the programmer. Assemblers or compilers with automatic segmentation or semi-automatic segmentation are needed." (Bauer, 1965, p. 23).

5.5 "We must now master, organize, systematize and document a whole new body of technical experience - that pertaining to programming systems. (Brooks, 1965, p. 88).

"When a programmer is asked to change a record format, for example, in an unfamiliar program, or when a systems analyst must estimate the time required for such a change, difficulties may arise which are out of proportion to the routine nature of the problem . . .

"Inadequate documentation may be regarded as one of the most serious problems confronting computer users." (Fisher, 1966, p. 26).

"One major concern is the program library. There are many needs here. One is an indexing system to permit users to retrieve a program from among the many. Another is adequate documentation of library programs. Documentation must specify what the program will do and under what conditions. Still another need is a linking mechanism: That is, a means for passing data from one library program to another without manual intervention so that library programs can become segments of composite, larger programs. These requirements make it difficult today to provide a convenient library of computer processes even in a single computer, not to mention the problems raised by machine-tomachine incompatibility. Today it is difficult to stand on the shoulders of previous programmers." (David, 1966, p. 2).

"The importance of a library to provide a repository for programs was recognized early, but full exploitation has been impeded by poor program documentation, lack of interest on the part of programmers, and language problems. Many program libraries now consist of programs in languages either dead or destined for an early demise. Limitations result from the dearth of program 'readers' and the serious practical difficulties in translation between machine languages." (Barton, 1963, p. 170).

5.6 "One of the most significant elements in the orderly, rapid assimilation of multiaccess system technology is adequate and appropriate documentation. 'Appropriate' is the operative word here, since the historical norms for system and program docu

mentation are probably inappropriate for the future. The time is rapidly approaching when 'professional' programmers will be among the least numerous and least significant system users." (Mills, 1967, p. 227).

5.7 "However, there is another means of reducing programming cost-making use of program development already done by others. At present this is difficult to do because of poor documentation and maintenance of programs by their authors. The primary reason for this sad state of affairs is the absence of any clear incentive for program authors to provide good documentation and to tailor their programs to the demands of prospective users rather than their own private whims." (Dennis, 1968, p. 376).

5.8 "The importance of documentation in the management of large programming developments is generally accepted. A number of groups have found a formal system of documentation the most effective management tool at their disposal. In its most advanced implementation, such a system of documentation is on-line to a time sharing system available to all participating members of the system programming project.

"Difficulties often are related to the programmer's resistance to documentation which may be due to several reasons:

• Lack of tangible evidence of benefit to his own activity.

• The inaccessibility of his colleagues' documentation because of sheer quantity, lack of organization and common format and out of date status.

• Rejection of standards, imposed for reasons he does not appreciate.

• Belief (often confirmed) that he can get along without, and in fact feel at his creative 'best' when free to improvise.

"Putting a documentation system on-line appears to have overcome this resistance in a manner acceptable to the programmer.

• The system itself can help him by rejecting certain types of inconsistencies.

• He has instant access to the latest version of his colleagues' work.

• Standards have been translated into formatting conventions with which he is familiar.

• He understands that the system must safeguard itself and his programs from unauthorized change. Thus, he more readily accepts the need for authorization to change and implement." (Kay, 1969, p. 431).

5.9 "The greatest difficulty in programming still concerns the language to be used and the fact that any given program is relatively noninterchangeable on another machine unless it has been rather wastefully written in, say ALGOL or FORTRAN." (Duncan, 1967, p. x).

Machine language translation programs are therefore of interest, since they allow programs written for one machine to be run on another and they provide a bootstrap for changeovers from one equipment system to another. Examples of such programs are Control Data's computer-aided translation system to transfer 7090 programs into their own 3600 Compass language (Wilson and Moss, 1965) and a system developed to reprogram Philco 2000 codes into IBM 7094 language (Olsen, 1965).

5.10 For example, "Any software system would become increasingly useful if it could be adapted to a variety of I/O configurations." (Salton, 1966, p. 209).

"If the performance of input/output functions requires specialized coding in the master control program of a system, then altering the set of peripherals or changing its i/o functions requires modifications of the master control programs, leading to the . . . problem of coping with evolution." (Dennis and Glaser, 1965, p. 9).

...

"If systems design can be automated, i.e., programmed for a computer, ultimately the configuration can be selected by systematically designing the systems for a variety of configurations and selecting the configuration which will run these applications at lowest cost." (Greenberger, 1965, p. 278).

5.11 "Several apparently mutually exclusive features of programming languages all have their advantages . . . How are we to resolve these issues?" (Raphael, 1966, p. 71).

...

5.12 "Walter F. Bauer, president of Informatics Incorporated, . . . predicted that in ten years all computer systems will be online systems and that 90 percent of all work on computers will involve online interaction." (Commun. ACM 9, No. 8, 645 (Aug. 1966).)

5.13 "Multiprogramming: That operation of a (serial) processor which permits the execution of a number of programs in such a way that none of the programs need be completed before another is started or continued." (Collila, 1966, p. 51.)

"A subfield of multiprogramming is concerned with the problems of computer system organization which arise specifically because of the multiplicity of input-output devices which interface with the system. The problems in this area are referred to as problems of multiaccessing." (Wegner, 1967, p. 135.)

"The requirements for console languages will pose a formidable problem for facility designers of the future." (Wagner and Granholm, 1965, p. 287).

"The whole system is multi-programmed, there being a number of object programs in core at once. Undoubtedly, we shall see such systems in operation and undoubtedly they will work. In the present state of knowledge, however, the construction of a supervisor for such a system is an immense task, and when constructed it has severe run-time overheads." (Wilkes and Needham, 1968, p. 315).

5.14 Brooks says further, "the new systems concepts of today and tomorrow are most keenly programming systems concepts: efficient time-sharing, fail-softly multiprocessing, effective mass information retrieval, algorithms for storage allocation, nationwide real-time Teleprocessing systems." (1965, p. 90).

5.15 "Most individuals accustomed to scientific computation or commercial data processing fail to appreciate the magnitude of the programming effort required for real-time control system implementation." (Steel, 1965, p. 231).

"Dr. Saul Rosen of Purdue University mentioned several fallacies of current time-sharing systems of which the most important is the belief that manufacturers who have great difficulty producing relatively simple software systems will somehow be better able to produce the very complex systems required for time sharing." (Commun. ACM 9, No. 8, 645 (Aug. 1966).)

5.16 "Typical functions of such executive systems include: priority scheduling, interrupt handling, error recovery, communications switching and the important and relevant area of cataloging, accessing and manipulating information and program files." (Weissman, 1967, p. 30).

5.17 "Today's fastest machine cannot be loaded down and will be idle most of the time unless it is coupled to a large number of high speed channels and peripheral units. . . In order to distribute input-output requests in a balanced flow, it must also be controlled by a complex monitor that chooses wisely among the jobs in its job queue." (Clippinger, 1965, p. 207).

"In some systems, more than one single program is processed with simultaneity . . . Inefficiency often results because the mix of individual programs, each written for sole occupancy of a computer, is unlikely to demand equal loading of each parallel element." (Opler, 1965, p. 306).

"System overhead includes scheduling and the continuous processing of console input. These functions are almost uniformly distributed, degrading the processor's execution rate by almost a constant." (Scherr, 1965, p. 14).

"The executive is usually multipurpose. It must be designed with a balance between the conflicting requirements of (1) continuous flow or batch processing, and (2) control for a demand processor in case time-sharing consoles should be attached. In addition, it usually has facilities for on-line control-in particular for communications switching." (Wagner and Granholm, 1965, p. 287).

"The utility programs provide three basic functions: the movement of data within the system required by time sharing or pooled procedure, the controlling of the printout of information on a pooled basis, and the controlling of accesses to auxiliary memory." (Bauer, 1965, p. 22).

"From the system designers' point of view, in time-sharing systems the most important thing is the supervisory program. Gallenson & Weissman pursue this subject in considerable detail and highlight other features such as 'memory protection, error checking circuitry for hardware, software and

able logical modules [and] dynamic relocation mechanism' as being essential for time-sharing." (Davis, 1966, p. 225).

"From a programmer's point of view, one of the most important features of the second generation. of computers is the way it is possible to exploit their automatic interruption facilities to provide control programs and operating systems. A typical computer will have stored in it, more or less permanently, a control program (which may be called the 'Director', 'Master', 'Supervisor', 'Executive', etc.) whose functions are usually to arrange the loading and unloading of independent 'object' programs (the programs which actually do the work) and keep a record of the sequence of jobs they perform, to allocate input-output devices to these programs, and to enable the computer's operators to exercise the necessary control over its operation. It may also provide facilities for performing various kinds of input-output operation. The control program may be able to arrange for several object programs to be stored in the computer at once, and to 'time-share' the use of the instruction-sequencing unit of the computer between all these programs....

"The relationship between a control program and the object programs it controls in many ways resembles that between a deity and mere mortals – the analogy extends to the permanence, privileges, independence and ‘infallability' of control programs. Perhaps because of this, a misconception seems to have grown up about the extent of their activities. Although in a computer equipped with one instruction-sequencing unit the control program only 'comes to life' following an interruption of the object program, and effectively expires when the latter is resumed, it seems to be half-believed that, all the time the object program is active, the control program is leading some kind of independent existence; like an all-seeing presence, keeping a close watch on all the activities of the object program. This myth probably springs from experience of the behaviour of the control program when the object program is caught obeying an illegal instruction: but in fact this occurrence is detected by hardware, not by the control program itself." (Wetherfield, 1966, p. 161).

"Built-in accounting and analysis of system logs are used to provide a history of system performance as well as establish a basis for charging users." (Estrin et al., 1967, p. 645).

5.18 "A very significant development in software and one which must be given serious attention by the facility system designer, is the relatively new concept of Data Base Management." (Wagner and Granholm, 1965, p. 287).

"The layout and structuring of files to facilitate the efficient use of a common data base for a wide range of purposes requires careful analyses of the applications, the devices which store the files, and the file organization and processing. The criteria for efficiency are, as usual, maximum

throughput and minimum requirement for storage space. Ease of programming, program size, and running time are also important considerations." (Bonn, 1966, p. 1866).

"The most widespread information retrieval systems are those for data base file management, which process records organized into fields, each containing a type of data in the record." (Hayes, 1968, p. 23).

"Data Management Problems. It is interesting to review a few of the problems raised by G. H. Dobbs at the summary session of the first symposium on data management systems mentioned above. One of these was the diverse terminology and points of view, which make it difficult to extract any basic principles. Another was lack of concern to input quality control. Still another was lack of appreciation for the real-life data base problems as the user sees them. At the second symposium, two years later, Galentine described the relative lack of progress as 'apalling.' Dobbs, at this second session, identified several specific technical areas needing further development - among these, the ability to allow an unsophisticated user to describe data structures, capability to change data and file organization, ability to share files among simultaneous users with adequate file security 'lockout' procedures, and the need for more flexible report formatting." (Climenson, 1966, p. 128).

5.19 One example of a developmental system claiming to incorporate these features is the Catalog Input/Output System at RAND Corporation. More specifically: "Computer applications in linguistics, library science, and social science are creating a need for very large, intricately structured, and in some cases tentatively organized files of data. The catalog - a generalized format for data structures - is designed to meet that need . . . The computer programs will:

a) Facilitate partitioning, rearranging, and converting data from any source in preparation for writing the catalog.

b) Format and convert data for printing on one of a variety of printers.

c) Sort the data elements within a catalog and merge data from two or more separate catalogs.

d) Restructure a file by rearranging the order of classes of data - catalog transformations.

e) Address nodes in the structure, retrieve data from the structure, and add to or delete from the structure - file maintenance." (Kay et al., 1966, pp. 1-2).

5.20 "In recent years there has been a rapid growth in the use of so-called 'formatted file systems'. These systems are general-purpose data storage, maintenance and retrieval systems designed to provide the user with a maximum amount of flexibility. They feature the use of a single set of programs to handle a variety of demands on a group of large files. Each file may possess a different

format, but all records within a file must be identical in format. New files may be created or old files changed to meet new requirements. Data can be added to files, or changes can be made to correct errors in existing files." (Baker and Triest, 1966, p. 5-1).

"The Formatted File System (FFS) developed for the Defense Intelligence Agency is a generalpurpose data management system for the IBM 1410 which is coupled to the 1410/7010 Operating System. It is oriented to a set of users (technicians) who can maintain an intimate knowledge of the structure of their files and the query language to access them. It employs both tapes and disc to define, maintain, and query a set of independent files. A table of contents and cross index can be defined and maintained on tape or disc. An FFS file must have a unique key field group in each record. A single level of embedded files (periodic sets) is permitted in the record. Except for the last field of a record, all fields are fixed in length. The query language permits general logical conditions and relations and provides several geographical and statistical operators. FFS is one of the few general-purpose data management systems which are operational." (Minker and Sable, 1967, p. 148).

"The users of non-numeric systems had requirements for very long alphanumeric records. Some of the records were formatted as were unit records but the fields were not all of predetermined length. To cope with this, the formatted file concept was developed. It had the ability to handle records of variable length by referring to a data definition which described the permissible record contents, context, and internal structure. The data definition could be carried within each record but was more normally separated into a data definition table to eliminate redundant entries. The formatted file could handle variable length records but could not interpret completely free form text. Special techniques were developed to handle free text which, in general, relied on the usual delimiters in the text, such as periods and commas, to identify the end of each structural unit. Free text then could be interpreted by scanning it as though the computer were reading it from left to right." (Aron, 1968, p. 7).

5.21 "Univac's B-O or Flow-Matic, which was running in 1956, was probably the first true DataProcessing compiler. It introduced the idea of file descriptions, consisting of detailed record and item descriptions, separate from the description of program procedures. It also introduced the idea of using the English language as a programming language." (Rosen, 1964, p. 8).

5.22 "General Electric has announced GECOS III (General Comprehensive Operating Supervisor III), an advanced operating system for large-scale computers. GECOS III integrates requirements for on-line batch, remote batch, and time-sharing into one system using a common data base. The 'heart' of the GECOS III is a centralized file system of hierarchical, tree-structured design which provides

multiprocessor access to a common data base, full file protection, and access control." (Commun. ACM 11, 71 (Jan. 1968).)

5.23 "The Integrated Data Store (IDS), developed by Bachman of the General Electric Company, is a data processing programming system that relies on linkage of all types for its retrieval and maintenance strategies. Through extensions to the COBOL language and compiler, IDS permits the programmer to use mass random-access storage as an extension of memory." (Minker and Sable, 1967, p. 126).

"The IDS file structure allows a linked-list structure in which the last item on every list is linked back to the parent item that started the list. Thus, it is possible to return to the parent item without a recursive list of return points. In IDS, each record is an element in a linked list. A file of records may be subordinated to a master record by linking it to the first member of the subordinate file and chaining from that point, through each record in the subordinate file, through the last one, and back to the master record. There is no inherent limit in IDS to the number of records that may exist in a chain or to the number of detailed chains that may be linked to a given record with a single master record. There is also no inherent limit to the depth of nesting that is permitted; i.e., a record in a chain that is subordinate to a given record may, in turn, have subordinate record chains." (Minker and Sable, 1967, pp. 126-127).

"The G.E. Integrated Data Store is an example of a linked file organization. Master and detail items are organized in a series of linked chains to form records. Each chain at least one master item and one or more detail items. Each item contains linking or chaining information which contains the addresses of the next item and the previous item in the chain. An item may belong to several chains, and linking information to all chains is included in the record." (Bonn, 1966, p. 1867).

5.24 "Franks describes the SDC Time-Shared Data Management System (TDMS), whose design draws upon ADAM and the earlier LUCID. TDMS employs an interesting data structure involving only a single appearance of each item of data with appropriately organized pointers to represent order, multiple instances, etc. This is a muchdiscussed idea that has needed exploration in a large system. TDMS, like too many similar systems, lacks means by which the system can 'learn' frequently traversed paths through the data, a mechanism that would permit subsequent identical or similar searches to be handled more efficiently." (Mills, 1967, p. 240).

"Williams & Bartram have developed a report generator as part of the TDMS. The object of this program is to give a nonprogrammer the ability, while he is on-line with the system, to access a large file of data for the purpose of de

scribing and generating a report. Another work in this area is by Roberts." (Minker and Sable, 1967, p. 137).

"The file organization of TDMS is an inverted tree structure with self-defining entries. This organization has made it possible for TDMS to meet its goal of providing rapid responses to unpredictable queries in a time-shared environment. Although this organization requires more on-line, random-access storage than most other file organizations, the benefits obtained far outweigh this storage cost." (Bleier and Vorhaus, 1968, p. F97).

5.25 "IBM has developed a Generalized Information System based on experience gained with military file applications. Because of IBM's intention to provide this system as part of its applications library for the System/360 series, this system undoubtedly will be examined quite closely by a variety of potential users. The system has two basic modules: one for defining, maintaining, and retrieving files of 'formatted' data, and one for text processing and concordance-type retrieval." (Climenson, 1966, p. 126).

"The text-processing module of GIS includes three basic files: (1) a dictionary ordered on key word, each record containing: pointers to synonyms and equivalents, key word frequency data, and a pointer to (2) the inverted file, which can contain a variable number of document numbers indexed by the given key word. Finally, (3) the master file can contain bibliographic data and all words stored for that document. Given the document number, the bibliographic data, and key words from the document, the system can automatically generate the above files." (Climenson, 1966, p. 127).

5.26 "GRED, the Generalized Random Extract Device developed at Thiokol Chemical Corporation and described by Heiner & Leishman, is written in COBOL. Files within the system follow the COBOL restrictions of fixed-record size and fixedfield length size for each field; the files are also restricted to tape. File definitions are provided at run time by the user, who specifies the file and record description. A file definition library option is provided and can be maintained by input request. The system, developed for the IBM 7010 computer, has the ability to sort and output data, and is useful for small files." (Minker and Sable, 1967, p. 147).

5.27 "At a second symposium held in September 1965, a benchmark problem was used to organize the discussion of specific systems. The problem involved a management data base—that is, an organization table and personnel files. Five systems were given the same file data and asked to create the file(s) and perform several kinds of operations on it. The five systems were: COLINGO of MITRE Corporation, the Mark III File Management System of Informatics, Inc., the on-line data management system by Bolt, Beranek, and Newman, Inc., the BEST System of National Cash Register, and the
