Test Programs As Test Data, Not Algorithms

The test programs do not embody some definitive algorithm by which the question of processor conformance can be answered yes or no. There is an important sense in which it is only accidental that they are programs at all; indeed, some of them, syntactically, are not. Rather their primary function is as test data. It is readily apparent, for instance, that the majority of BASIC test programs are algorithmically trivial; some consist only of a series of PRINT statements. Viewed as test data, however, i.e., a series of inputs to a system whose behavior we wish to probe, the underlying motivation for their structure becomes intelligible. Simply put, it is the goal of the tests to exercise at least one representative of every meaningfully distinct type of syntactic structure or semantic behavior provided for in the language standard. This strategy is characteristic of testing in general: all one can do is submit a representative subset of the typically infinite number of possible inputs to the system under investigation (the implementation) and see whether the results are in accord with the specifications for that system (the language standard).

Thus, successful results of the tests are necessary, but not sufficient, to show that the specifications are met. A failed test shows that a language implementation is not standard. A passed test shows that it may be. A long series of passed tests which seem to cover all the various aspects of the language gives us a large measure of confidence that the implementation conforms to the standard.
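The asymmetry between failed and passed tests can be made concrete with a short sketch. It is an illustration only: the names `Verdict`, `overall_verdict`, and `failed_tests`, and the idea of recording results in a dictionary, are assumptions introduced here and are not part of the test system.

```python
from enum import Enum


class Verdict(Enum):
    NON_CONFORMING = "implementation is not standard"
    POSSIBLY_CONFORMING = "implementation may be standard"


def overall_verdict(test_passed):
    """Combine individual results into one overall judgment.

    test_passed maps a (hypothetical) test name to True or False,
    as judged by the user after interpreting each test's output.
    """
    # One failed test is enough to show non-conformance.
    if not all(test_passed.values()):
        return Verdict.NON_CONFORMING
    # Passing every test is necessary but not sufficient, so the
    # strongest claim available is "may be standard".
    return Verdict.POSSIBLY_CONFORMING


def failed_tests(test_passed):
    """List the names of the failed tests, for the user's follow-up."""
    return sorted(name for name, ok in test_passed.items() if not ok)
```

Note that no outcome of this function asserts conformance outright; a longer series of passed tests raises confidence but does not change the verdict.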

It can scarcely be stressed too strongly that the test programs do not represent some self-sufficient algorithm which will automatically deliver correct results to a passive observer. Rather they are best seen as one component in a larger system comprising not only the programs, but the documentation of the programs, the documentation of the processor under test, and, not least, a reasonably well-informed user who must actively interpret the results of the tests in the context of some broad background knowledge about the programs, the processor, and the language standard. If, for example, a processor rejects a standard program, it certainly fails to conform to the standard; yet this is a type of behavior which can hardly be detected by the program itself: only a human observer who knows that the processor must accept standard programs, and that this program is standard, is capable of the proper judgment that this processor therefore violates the language standard.

3.2 Special Issues Raised By The Standard Requirements

3.2.1 Implementation-defined Features

At several points in the standard, processors are given a choice about how to implement certain features. These subjects of choice are listed in Appendix C of the standard. In order to conform, implementations must be accompanied by documentation describing their treatment of these features (see section 1.4.2(7) of the standard). Many of these choices, especially those concerning numeric precision, string and numeric overflow, and uninitialized variables, can have a marked effect on the result of executing even standard programs. A given program, for instance, might execute without exceptions on one standard implementation, and cause overflow on another, with a notably different numeric result. The programs that test features in these areas call for especially careful interpretation by the user.
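As an illustration of how one computation can behave differently on two conforming processors, the sketch below models numeric range as a single parameter. The function `multiply`, the two range limits, and the way overflow is signalled are all assumptions made for this example; they do not describe any actual implementation or the standard's own recovery rules.

```python
def multiply(a, b, machine_infinity):
    """Multiply on a hypothetical processor whose largest representable
    magnitude is machine_infinity.

    Returns (value, overflowed); value is None when overflow occurred.
    """
    product = a * b
    if abs(product) > machine_infinity:
        return None, True
    return product, False


# Two assumed, implementation-defined numeric ranges.
SMALL_RANGE = 1e38
LARGE_RANGE = 1e308

# The same operands overflow on one processor and not on the other.
print(multiply(1e20, 1e20, SMALL_RANGE))
print(multiply(1e20, 1e20, LARGE_RANGE))
```

Both processors in this sketch could conform, provided each documents its range, which is why the user must read the results of such tests against the implementation's own documentation.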

Another class of implementation-defined features is that associated with language enhancements. If an implementation executes non-standard programs, it also must document the meaning it assigns to the non-standard constructions within them. For instance, if an implementation allows comparison of strings with a less-than operator, it must document its interpretation of this comparison.


The standard for BASIC, in view of its intended user base of beginning and casual programmers, attempts to specify what a conforming processor must do when confronted with non-standard circumstances. There are two ways in which this can happen: 1) a program submitted to the processor might not conform to the standard syntactic rules, or 2) the executing program might attempt some operation for which there is no reasonable semantic interpretation, e.g., division by zero, assignment to a subscripted variable outside of the array. In the BASIC standard, the first case is called an error, and the second an exception, and in order to conform, a processor must take certain actions upon encountering either sort of anomaly.

Given a program with a syntactically non-standard construction, the processor must either reject the program with a message to the user noting the reason for rejection, or, if it accepts the program, be accompanied by documentation which describes its interpretation of the construction.

If a condition defined as an exception arises in the course of execution, the processor is obliged first to report the exception, and then to do one of two things, depending on the type of exception: either it must apply a so-called recovery procedure and continue execution, or it must terminate execution.

Note that it is the user, not the program, who must determine whether there has been an adequate error or exception report, or whether appropriate documentation exists. The pseudo-code in Figure 1 describes how conforming implementations must treat errors. It may be thought of as an algorithm which the user (not the programs) must execute in order to interpret the results of the tests.

The procedure for error handling in Figure 1 speaks of a processor accepting or rejecting a program. The glossary (sec. 19) of the standard defines accept as "to acknowledge as being valid". A processor, then, is said to reject a program if it in some way signifies to the user that an invalid construction (and not just an exception) has been found, whenever it encounters the presumably non-standard construction, or if the processor simply fails to execute the program at all. A processor implicitly accepts a program if the processor encounters all constructions within the program with no indication to the user that the program contains constructions ruled out by the standard or the implementation's documentation.

In like manner, we can construct pseudo-code operating instructions to the user, which describe how to determine whether an exception has been handled in conformance with the standard; this is also shown in Figure 1.

As a point of clarification, it should be understood that these categories of error and exception apply to all implementations, both compilers and interpreters, even though they are more easily understood in terms of a compiler, which first does all the syntax checking and then all the execution, than of an interpreter. There is no requirement, for instance, that error reports precede exception reports. It is the content, rather than the timing, of the message that the standard implies. Messages to reject errors should stress the fact of ill-formed source code. Exception reports should note the conditions, such as data values or flow of control, that are abnormal, without implying that the source code per se is invalid.

Error Handling

if program is standard
   if program accepted by processor
      if correct results and behavior
         processor PASSES
      else
         processor FAILS (incorrect interpretation)
      endif
   else
      processor FAILS (rejects standard program)
   endif
else (program non-standard)
   if program accepted by processor
      if non-standard feature correctly documented
         processor PASSES
      else
         processor FAILS (incorrect/missing documentation
                          for non-standard feature)*
      endif
   else (non-standard program rejected)
      if appropriate error message
         processor PASSES
      else
         processor FAILS (did not report reason for rejection)
      endif
   endif
endif

Exception Handling

if processor reports exception
   if procedure is specified for exception
         and host system capable of procedure
      if processor follows specified procedure
         processor PASSES
      else
         processor FAILS (recovery procedure not followed)
      endif
   else (no procedure specified or unable to handle)
      if processor terminates program
         processor PASSES
      else
         processor FAILS (non-termination on fatal exception)
      endif
   endif
else
   processor FAILS (fail to report exception)
endif

Figure 1
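For readers who prefer an executable form, the two procedures of Figure 1 can be transcribed roughly as follows. This is a sketch, not part of the test system: the function and parameter names are assumptions, and each boolean argument stands for a judgment that the user, not the program, must supply.

```python
def judge_error_handling(program_is_standard, accepted,
                         results_correct=False,
                         feature_documented=False,
                         error_message_adequate=False):
    """Return the Figure 1 verdict for the error-handling procedure."""
    if program_is_standard:
        if not accepted:
            return "FAILS (rejects standard program)"
        if results_correct:
            return "PASSES"
        return "FAILS (incorrect interpretation)"
    # From here on the program is non-standard.
    if accepted:
        if feature_documented:
            return "PASSES"
        return ("FAILS (incorrect/missing documentation "
                "for non-standard feature)")
    if error_message_adequate:
        return "PASSES"
    return "FAILS (did not report reason for rejection)"


def judge_exception_handling(reported,
                             procedure_specified=False,
                             host_capable=False,
                             followed_procedure=False,
                             terminated=False):
    """Return the Figure 1 verdict for the exception-handling procedure."""
    if not reported:
        return "FAILS (fail to report exception)"
    if procedure_specified and host_capable:
        if followed_procedure:
            return "PASSES"
        return "FAILS (recovery procedure not followed)"
    # No procedure specified, or the host is unable to carry it out.
    if terminated:
        return "PASSES"
    return "FAILS (non-termination on fatal exception)"
```

For example, `judge_error_handling(True, False)` yields the verdict for a processor that rejects a standard program, the case that no test program can detect by itself.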


The design of the test programs is an attempt to harmonize several disparate goals: 1) exercise all the individual parts of the standard, 2) test combinations of features where it seems likely that the interaction of these features is vulnerable to incorrect implementation, 3) minimize the number of tests, 4) make the tests easy to use and their results easy to interpret, and 5) give the user helpful information about the implementation even, if possible, in the case of failure of a test. The rest of this section describes the strategy we ultimately adopted, and its relationship to conformance and to interpretation by the user of the programs.


Perhaps the most difficult problem of design is to find some organizing principle which suggests a natural sequence to the programs. In many ways, the most natural and simple approach is simply to test the language features in the order they appear in the standard itself. The major problem with this strategy is that the tests must then use untested features in order to exercise the features of immediate interest.

This raises the possibility that the feature ostensibly being tested might wrongly pass the test because of a flaw in the implementation of the feature whose validity is implicitly being assumed.

Furthermore, when a test does report a failure, it is not clear whether the true cause of the failure was the feature under test or one of the untested features being used.
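One way to avoid both difficulties is to arrange the tests so that every feature is exercised by its own test before any other test relies on it. The sketch below shows such an ordering computed as a topological sort. The test names and the dependencies among PRINT, LET, and IF are hypothetical, chosen only for illustration, and the sketch assumes that no two tests depend on each other.

```python
from graphlib import TopologicalSorter  # Python 3.9 or later


def order_tests(tests):
    """Order tests so each feature is tested before it is used.

    tests maps a test name to (feature_tested, features_used).
    """
    # Which test is responsible for each feature.
    tester_of = {feature: name for name, (feature, _) in tests.items()}
    # Each test must come after the tests of the features it uses.
    graph = {
        name: {tester_of[f] for f in used
               if f in tester_of and tester_of[f] != name}
        for name, (_, used) in tests.items()
    }
    return list(TopologicalSorter(graph).static_order())


# Hypothetical tests, deliberately listed out of order.
example = {
    "if_test": ("IF", {"PRINT", "LET"}),
    "let_test": ("LET", {"PRINT"}),
    "print_test": ("PRINT", set()),
}
print(order_tests(example))
```

With these assumed dependencies the only valid order is `print_test`, `let_test`, `if_test`, which no longer follows any section order of a standard; that loss of correspondence is the cost of this approach.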

These considerations seemed compelling enough that we decided to order the tests according to the principle of testing features before using them. This approach is not without its own problems, however. First and most importantly, it destroys any simple correspondence between the tests and sections of the standard. The testing of a given section may well be scattered throughout the entire test sequence and it is not a trivial task to identify just those tests whose results pertain to the section of interest. To ameliorate this problem, we have been careful to note at the beginning of each test just which sections of the standard it applies to, and have compiled a cross-reference listing (see section 6.3), so that you may quickly find the tests relevant to a particular section. A second problem is that occasionally the programming of a test becomes artificially awkward because the language feature appropriate for a certain task hasn't been tested yet. While the programs generally abide by the test-before-use rule, there are some cases in which the price in programming efficiency and convenience is simply too high and therefore a few of the programs do employ untested features. When this happens, however, the program always generates a message telling you which untested feature it is depending on. Furthermore, we were careful to use the untested
