Geoff Chappell - Software Analyst
[ ]#[ ][directive[arguments]]
From the start of the line, there may be any amount of white space, including none, before the # sign. There may then be any amount of white space, including none, before the optional identifier that is here labelled directive. Except for a coding oversight discussed below, it is an error (C2019) if the first character that follows the # sign and is not white space is not valid for starting an identifier.
Though directive is an identifier, it is not subject to macro expansion. It is a fatal error (C1021) if directive is not the name of a supported directive (as listed below). Interpretation of whatever follows directive varies from one directive to another. Details are left to separate notes for the individual directives.
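As an illustration only, the syntax just described can be sketched as a small classifier in C. This is not Microsoft's code. The names classify_line, line_kind and known are hypothetical, the table holds only the standard C directives rather than the full list for this compiler, white space is assumed to mean spaces and tabs, and identifiers are assumed to be letters, digits and underscores. Trigraphs, line splices, comments and the coding oversight with ?, \ and / are all ignored.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes. C2019 and C1021 are the errors named in the text. */
enum line_kind {
    NOT_DIRECTIVE,   /* first non-white-space character is not a # sign */
    NULL_DIRECTIVE,  /* a # sign with only white space after it */
    BAD_START,       /* C2019: character after # cannot start an identifier */
    UNKNOWN_NAME,    /* C1021: identifier does not name a supported directive */
    KNOWN_DIRECTIVE
};

/* Assumed table for illustration: the standard C directives only. */
static const char *const known[] = {
    "define", "elif", "else", "endif", "error", "if", "ifdef",
    "ifndef", "include", "line", "pragma", "undef"
};

/* Assumption: white space on the line means space or tab. */
static int is_blank(char c) { return c == ' ' || c == '\t'; }

/* Classify one line. If a directive name is read and args is not NULL,
   *args is set to the first character after the name. */
enum line_kind classify_line(const char *s, const char **args)
{
    size_t i, len;
    const char *name;

    while (is_blank(*s)) s++;               /* [ ] before the # sign */
    if (*s != '#') return NOT_DIRECTIVE;
    s++;
    while (is_blank(*s)) s++;               /* [ ] after the # sign */
    if (*s == '\0' || *s == '\n') return NULL_DIRECTIVE;
    if (!(isalpha((unsigned char)*s) || *s == '_')) return BAD_START;

    name = s;                               /* directive, read as an identifier */
    while (isalnum((unsigned char)*s) || *s == '_') s++;
    len = (size_t)(s - name);
    if (args != NULL) *args = s;

    for (i = 0; i < sizeof known / sizeof known[0]; i++) {
        if (strlen(known[i]) == len && memcmp(known[i], name, len) == 0)
            return KNOWN_DIRECTIVE;
    }
    return UNKNOWN_NAME;
}
```

Note that the name is matched as written, with no macro expansion, which is the behaviour the text describes for directive.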
In the following list of preprocessor directives supported by Microsoft Visual C++ version 13.00.9466, those that seem to be omitted from the product documentation are highlighted yellow.
In addition, #bimport is recognised but only to be rejected, so that it produces the fatal error C1021, just as for an unrecognised directive.
For the # sign and for white space before and after, the input stream is interpreted as if trigraphs, line splices and comments have been translated. For the composition of directive, interpretation of the input stream follows the usual rules for identifiers.
If where directive is expected there is instead a question mark, backslash or forward slash that does not introduce a trigraph, line splice or comment (respectively), then the compiler proceeds as if it has not only found an identifier to name the directive but has already processed it. The identifier that was processed most recently becomes the directive and the characters that follow become arguments for that directive. For example, compiling
int pragma;
# / message ("This surely ought not work.")
displays the quoted message, just as for
# pragma message ("This surely ought not work")
Some directives may in some circumstances require that the preprocessor scan ahead for another directive that is deemed to match the first, and discard whatever it finds along the way. Details vary with the directives and are presented in the notes for those directives. However, the scanning is common to all and is therefore described here.
When scanning for the matching directive, each line that is not enclosed in quotes (single or double) is tested quickly for whether it is a preprocessor directive, and if so, which one. Except for a coding oversight discussed below, the quick test conforms to the syntax above but interpretation proceeds only as far as identifying the directive and there are no errors: a line that does not scan as a supported preprocessor directive is simply discarded.
NOT QUITE TRUE
Note that each line scanned, whether discarded or not, counts for line numbering. If the /E option (including as implied by /EP or /P) is active, then each line is represented in the preprocessor output as an empty line.
While scanning, the preprocessor tracks a nesting level of conditional blocks. Each #if, #ifdef or #ifndef opens a block.
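The tracking of a nesting level can be sketched in C as below. This is a minimal model under stated assumptions, not the compiler's own routine. The names is_directive and find_block_end are hypothetical, the input is taken as an array of lines that follow an already accepted opening directive, and #endif is assumed to close a block as in standard C. Quotes, comments, line splices, trigraphs and the handling of #else and #elif are left out.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Non-zero if the line is a # sign, optional blanks, then exactly the given name. */
static int is_directive(const char *s, const char *name)
{
    size_t n = strlen(name);

    while (*s == ' ' || *s == '\t') s++;
    if (*s != '#') return 0;
    s++;
    while (*s == ' ' || *s == '\t') s++;
    if (strncmp(s, name, n) != 0) return 0;
    s += n;
    /* The identifier must end here, so that "ifdef" is not taken for "if". */
    return !(isalnum((unsigned char)*s) || *s == '_');
}

/* Scan the lines that follow an opening directive whose block is being
   discarded. Return the index of the line that closes that block, or -1
   if the input ends first. */
int find_block_end(const char *const lines[], int count)
{
    int depth = 0;   /* nesting level of blocks opened while scanning */
    int i;

    for (i = 0; i < count; i++) {
        if (is_directive(lines[i], "if") ||
            is_directive(lines[i], "ifdef") ||
            is_directive(lines[i], "ifndef")) {
            depth++;                       /* a nested block opens */
        } else if (is_directive(lines[i], "endif")) {
            if (depth == 0) return i;      /* closes the outer block */
            depth--;                       /* closes a nested block */
        }
        /* Any other line is simply discarded. */
    }
    return -1;
}
```

A result of -1 corresponds to the case in which the preprocessor reaches the end of the input while still scanning.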
The quick test neglects to translate trigraphs on the way to finding the directive. Consider for example, the fragment
??=if 0
whatever
??=endif
The first line is interpreted fully and accepted as a #if directive. If the conditional expression for this directive evaluated as non-zero, then the preprocessor would interpret fully the lines of whatever. The last line, too, would be interpreted fully and accepted as a #endif directive, specifically as the directive that closes the conditional block. However, with 0 as the conditional expression, the preprocessor is to discard the lines of whatever, with no more interpretation than to scan for a line to accept as the directive that closes the block. The last line, with its # sign made as a trigraph, is not even a candidate. The preprocessor continues discarding input, still scanning for a directive to close the block.
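The two readings of the last line can be contrasted with a small model in C. Both function names, starts_directive_quick and starts_directive_full, are hypothetical, and the sketch claims only to mirror the behaviour described, not the way the compiler is written. The question marks in the string literal are escaped because an unescaped ??= in C source is itself a trigraph.

```c
#include <string.h>

static const char *skip_blanks(const char *s)
{
    while (*s == ' ' || *s == '\t') s++;
    return s;
}

/* Quick test, as described: only a literal # sign makes the line a candidate. */
int starts_directive_quick(const char *line)
{
    return *skip_blanks(line) == '#';
}

/* Full interpretation: the trigraph for the # sign is accepted as well. */
int starts_directive_full(const char *line)
{
    const char *s = skip_blanks(line);
    return *s == '#' || strncmp(s, "\?\?=", 3) == 0;
}
```

For a line that begins with the trigraph, the full test accepts the line and the quick test rejects it, which is why the closing directive in the fragment is never found once the block is being skipped.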
This neglect of trigraph translation in the quick scan for preprocessor directives is presumably by oversight, not design. Note that Microsoft documents the problem as BUG: Trigraph Statements May Produce End-of-File Error. The KB number (120668), being low, dates Microsoft’s first awareness of the problem to long, long ago. It is good of Microsoft to keep the description up to date, so that the article lists so many versions in which Microsoft is content to leave the problem unfixed. In some sense, this is fair enough: surely nobody nowadays uses trigraphs in real-world code. But if Microsoft means not to support trigraphs, can’t Microsoft actually say so?
As an aside (perhaps of value only for Microsoft’s programmers should they care to fix the problem), note that the problem is not directly with Microsoft’s code that scans for preprocessor directives, but is instead with a routine that this code calls for skipping white space. More precisely, this routine gets the next character from the input stream as if line splices and comments have been translated and white space discarded, but with trigraphs left alone. This routine is used fairly widely in the interpretation of individual preprocessor directives, so that there are rather many cases where trigraphs are not translated. These cases are described in the notes for the relevant directives.
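A routine of the kind described might look something like the following sketch in C. It is a guess at the shape of such a routine, written only from the description above. The name next_significant is hypothetical, white space is assumed to mean spaces and tabs, and only block comments are handled. What matters is the omission: line splices and comments are stepped over, but nothing translates a trigraph.

```c
#include <stddef.h>

/* Return a pointer to the next character that is not white space,
   stepping over line splices and block comments on the way.
   Trigraphs are deliberately left untranslated, as in the text. */
const char *next_significant(const char *s)
{
    for (;;) {
        if (*s == ' ' || *s == '\t') {
            s++;                                /* white space */
        } else if (s[0] == '\\' && s[1] == '\n') {
            s += 2;                             /* line splice */
        } else if (s[0] == '/' && s[1] == '*') {
            s += 2;                             /* block comment opens */
            while (*s != '\0' && !(s[0] == '*' && s[1] == '/')) s++;
            if (*s != '\0') s += 2;             /* step over the closing pair */
        } else {
            return s;                           /* includes a raw '?' */
        }
    }
}
```

Given input that starts with the trigraph for the # sign, the sketch returns a pointer to the first question mark, so a caller that tests for a # sign sees none.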