Interview questions: Syntax Testing

In syntax testing, the test-cases, i.e. the input to the software, are created based on the specifications of languages understood by the interfaces of a software. Interfaces have many formats: command-line prompts, files, environment variables, pipes, sockets, etc. An interface has a language which defines what legal input to the interface is and what is not. This language may be hidden or open. The existence of a hidden language is not explicitly realised by developers of the software, but a piece of software might nevertheless read some input data, parse it, and act accordingly. In a broader sense, hidden languages exist also in data structures used to transfer information from a software module to another. An open language is appropriately specified in the software documentation.
The motivation for syntax testing springs from the fact that each interface has a language, whether it is hidden or open, from which effective tests can be created with a relatively small effort. Syntax testing is more likely to find faults from the portions of software responsible for hidden language handling, because open languages must have been explicitly considered by the programmer and thus the input handling portion is likely to be better.
Automated syntax testing requires a formal description of the input language in machine readable format. If the language is hidden, the tester must create a specification for it. Common choices are BNF (Backus-Naur form) and regular expressions. Both are notations to define context-free grammar languages. A sentence is a sequence of bytes which are arranged according to the rules of the language. Context-free languages have traditionally been used in compilers of various sorts to create parsers for input sentences. In syntax testing, a context-free language is the base used to generate sentences. These sentences are then fed, or injected, into the software being tested to see if it accepts them, rejects them or fails to process them altogether.
The selection of test-cases in syntax testing could start with single-error sentences. This is likely to reveal most faults assuming the faults are mutually independent and a fault is triggered by one error in a sentence. After all the sentences with one error are tried, the testing proceeds to pairs of errors, three error combinations, and so on. The number of test-cases grows exponentially by the number of combined errors. Several different kinds of errors can be produced in syntax testing:
Syntax errors
Syntax errors violate the grammar of the underlying language. They are created by removing an element, adding an extra element and providing the elements in wrong order. Syntax errors can exist on different levels in the grammar hierarchy: top-level, intermediate-level and field-level.
Delimiter errors
Delimiters mark the separation of fields in a sentence. In ASCII-coded languages the fields are normally characters and letters, and delimiter are white space characters (space, tab, line-feed, etc.), or other delimiters characters (commas, semicolons, etc.) or their combinations. Delimiters can be omitted, multiplied or replaced by other unusual characters. Paired delimiters, such as braces, can be left unbalanced.
Field-value errors
A field-value error is an illegal field in a sentence. Normally, a field value has a range or many disjoint ranges of allowable values. Field errors can include values which are one-below, one-above and totally out-of-range. Values exactly at the range boundary should also be checked.
Context-dependent errors
A context-dependent error violates some property of a sentence which cannot, in practice, be described by context-free grammar.
State dependency error
Not all sentences are acceptable in every possible state of a software component. A state dependency error is generated by inputting a correct sentence during an incorrect state.
Automatic generation of sentences leads to automatic test design where test-cases are designed by a computer. It is suitable, for example, for stress testing, where software is fed with a large amount of input data.

Interview questions

Thursday, April 17, 2008

Syntax Testing

No comments:

Blog Archive