The File Preprocessor
Main Features
The preprocessor allows you to declare variables and evaluate expressions. It also possesses some programming language capability, with branching control to skip or loop over a selected block of lines.
The preprocessor is built into the source code rdfiln.f, which most Questaal executables use. Comments at the beginning of rdfiln.f document the directives it can process. Here we use rdfiln to refer to the preprocessor built into a number of Questaal executables.
There is an executable rdfiln which reads a file, runs it through the preprocessor rdfiln, and prints the result to stdout. It is a convenient way to try out some of the examples given below.
Curly brackets contain expressions
rdfiln treats curly brackets {…} specially: it replaces the brackets and their contents with something else. Typically {…} contains an algebraic expression, which is evaluated as a binary number and rendered back as an ASCII representation of that number.
Thus the line
talk {4/2} me
becomes
talk 2 me
{…} can contain other kinds of syntax as well; see below.
Note: The substitution {...}→string may increase the length of the line. If the modified line exceeds the maximum size it is truncated. This maximum length is controlled by parameter recln0 in the main program. Source codes are distributed with recln0=120.
Variables
The preprocessor permits three kinds of variables: floating-point scalars, floating-point vectors, and strings. They can be declared with preprocessor directives.
- Separate symbol tables are maintained for each of the three kinds of variables.
- Scalar variables and vector elements can be used in algebraic expressions.
- Character variables can be used in string expressions (see below).
Scalar and character variables can also be declared on the command line, e.g.:
-vnam=expr ← creates variable nam and assigns it a numerical value
-csnam=string ← creates character variable snam and assigns it a string as a value
Note that variables declared on the command line are created before the file is run through the preprocessor, and take precedence.
Note: As rdfiln parses a file, it may create new variables, thus enlarging the symbols table. Variables created by preprocessor directives are temporary: rdfiln destroys them when it has finished with whatever it is reading. You can preserve variables for future use with the % save directive.
Branching and looping constructs
You can conditionally read certain lines of a file, or loop over lines multiple times.
Expression Substitution
Enclosing a string in curly brackets, viz {strn}, instructs the preprocessor to parse the contents of {strn} and substitute it with something else.
Note: To suppress expression substitution, prepend {strn} with a backslash, viz \{strn}. The preprocessor will remove the backslash but leave {strn} unaltered.
strn must take one of the following syntactical forms. If rdfiln cannot match the first form, it tries the second, and so on. (It is an error if no form can be matched.) These four forms are as follows, arranged by the precedence they take in parsing:
1. (string substitution): strn is the name of a character variable. The value of the variable is substituted. The variable may be followed by a qualification (see 1a and 1b below).
2. (conditional substitution): strn begins with a "?". An expression is evaluated, which determines what string is substituted. See 2 below.
3. (vector substitution): strn is the name of a vector. The result is the contents of the vector. See 3 below.
4. (expression substitution): strn is an algebraic expression. Expressions use C-like syntax. See 4 below.
In more detail, the four rules are as follows:
1. (string substitution) strn consists of (or begins with) a character variable, say mychar.
   a. strn is a character variable. rdfiln replaces {mychar} with the contents of mychar.
      Example: If mychar='foo bar', {mychar} → foo bar.
   b. strn is a character variable followed by a qualifier (…), which must be one of the following:
      - (integer1,integer2) (substring of strn). {mychar(n1,n2)} is replaced by the (n1:n2) substring of {mychar}.
        Example: If mychar="foo bar", {mychar(2,3)} → oo.
      - (charlst,n) (index to charlst). {mychar(charlst,n)} returns an index into {mychar}. {mychar} is scanned for characters in charlst, and the index of the nth occurrence is returned; charlst is a sequence of characters. n is optional: if omitted, the preprocessor uses n=1.
        Example: Let mychar="foo bar" and charlst='abc'. Note that "foo bar" contains characters 'b' and 'a'. {mychar('abc',2)} → 6, because the 6th character is the second occurrence of a character in [abc].
      - (:e) returns the index of the last nonblank character in {mychar}.
        Example: If mychar='foo bar', {mychar(:e)} → 7.
      - (/'strn1'/'strn2'/,n1,n2) substitutes strn2 for strn1. Substitutions are made for the n1th to n2th occurrences of strn1. n1 and n2 are optional, as are the quotation marks.
        Example: If mychar="foo bar", then {mychar(/'foo'/'boo'/)} → "boo bar".
2. (conditional substitution) strn takes the form {?~expr~strn1~strn2}. (Note: the '~' can be any character.)
   expr is an algebraic expression; strn1 and strn2 are strings. rdfiln returns either strn1 or strn2, depending on the result of expr.
   If expr evaluates to nonzero, {…} is replaced by strn1; otherwise {…} is replaced by strn2.
   Example: {?~(n<2)~n is less than 2~n is at least 2}
   {…} becomes "n is less than 2" if n<2; otherwise it becomes "n is at least 2".
3. (vector substitution) strn is the name of a vector variable, say myvec. rdfiln replaces {myvec} with a sequence of numbers separated by one space, which are the contents of myvec.
   Example: Suppose myvec has been declared as a 5-element quantity in the following way:
   % vec myvec[5] 6-1 6-2 5-2 5-3 4-3
   {myvec} will be turned into 5 4 3 2 1
   A single element of a vector acts like a scalar. Thus {3*myvec(2)-2} becomes 10.
4. (expression substitution) strn is an algebraic expression composed of numbers combined with unary and binary operators. The syntax is very similar to the C programming language. rdfiln parses strn to obtain a binary number, renders the result in ASCII form, and substitutes the result.
   Note: strn may consist of a sequence of expressions, separated by commas. rdfiln returns the value of the last expression. Each intermediate expression should be assigned to a variable. Assignment may be simple (=) or involve an arithmetic operation.
   Examples:
   {x=3} ← assigns 3 to x and returns '3'
   {x=3,y=4} ← assigns 3 to x and 4 to y, and returns '4'
   {x=3,y=4,x*=y} ← assigns 3*4 to x and 4 to y, and returns '12'
   {x=3,y=4,x*=y,x*2} ← assigns 3*4 to x and 4 to y, and returns '24'
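The comma-separated assignment rule can be mimicked outside rdfiln. The following Python sketch is not the a2bin.f parser: `eval_sequence` and `evaluate` are hypothetical names, and Python's own `eval` stands in for the expression evaluator (with `^` mapped to `**`). It only illustrates that each item updates the table and that the value of the last item is returned.

```python
import re

# name, assignment operator, right-hand side; "==" is not treated as assignment
ASSIGN = re.compile(r"^([A-Za-z_]\w*)(\*=|/=|\+=|-=|\^=|=)(?!=)(.+)$")

def evaluate(expr, table):
    """Stand-in for the real expression parser (assumption): Python arithmetic."""
    return eval(expr.replace("^", "**"), {"__builtins__": {}}, dict(table))

def eval_sequence(strn, table):
    """Process 'a=..,b=..,expr' left to right; return the last value."""
    value = None
    for item in strn.split(","):
        m = ASSIGN.match(item)
        if m is None:                      # plain expression, no assignment
            value = evaluate(item, table)
            continue
        name, op, rhs = m.groups()
        value = evaluate(rhs, table)
        if op != "=":                      # e.g. x*=y means x = x*y
            value = evaluate(f"({table[name]!r}){op[0]}({value!r})", table)
        table[name] = value
    return value

table = {}
print(eval_sequence("x=3,y=4,x*=y,x*2", table))   # 24
print(table)                                       # {'x': 12, 'y': 4}
```

Because the table persists between calls, a later call such as `eval_sequence("x+y", table)` sees the values assigned earlier, which mirrors how variables set inside {…} remain in the symbols table.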
Further properties of curly brackets
Brackets may be nested. rdfiln will work recursively through deeper levels of bracketing, substituting {..} at each level with a result before returning to the higher level.
Example: Suppose {foo} evaluates to 2. Then:
{my{foo}bar}
will be transformed into
{my2bar}
and finally the result of {my2bar} evaluated.
If rdfiln cannot evaluate {my2bar} it will abort with a message similar to this one:
rdfile: bad expression in line { ... my2bar}
Note: there is a syntactical difference between {expr} and the value of expr itself, because {expr} returns an ASCII representation of expr, and precision is lost. Thus {pi-3} is replaced by .141592654
Syntax of Algebraic Expressions
The general syntax for an expression is a sequence of one or more expressions of the form
{name=expr[,name=expr...]}
Commas separate declarations. Arithmetic assignment operators can be used in place of simple assignment (=), for example {x=3,y=4,x*=y,x*2}. The final expression may (and typically does) consist of an expression only, omitting name=.
Note: expr may not contain any whitespace.
expr has a syntax very similar to C. It is composed of numbers, scalar variables, elements of vector variables, and macros, combined with unary and binary operators.
Unary operators take first precedence:
1. - arithmetic negative
   ~ logical negative (.not.)
   functions abs(), exp(), log(), sin(), asin(), sinh(), cos(), acos(), cosh(), tan(), atan(), tanh(), flor(), ceil(), erfc(), sqrt()
   Note: flor() rounds to the next lowest integer; ceil() rounds up.
The remaining operators are binary, listed here in order of precedence with associativity:
2. ^ (exponentiation)
3. * (times), / (divide), % (modulus)
4. + (add), - (subtract)
5. < (.lt.); > (.gt.); = (.eq.); <> (.ne.); <= (.le.); >= (.ge.)
6. & (.and.)
7. | (.or.)
8&9. ?: conditional operators, used as: test?expr1:expr2
10&11. () parentheses
The ?: pair of operators follows a C-like syntax: test, expr1, and expr2 are all algebraic expressions. If test is nonzero, expr1 is evaluated and becomes the result. Otherwise expr2 is evaluated and becomes the result.
Assignment Operators
The following are the allowed assignment operators:
assignment-op   function
'='             simple assignment
'*='            replace 'var' by var*expr
'/='            replace 'var' by var/expr
'+='            replace 'var' by var+expr
'-='            replace 'var' by var-expr
'^='            replace 'var' by var^expr
Examples of expressions
Suppose that the variables table looks like:
Var  Name        Val
1    t           1.0000
2    f           0.00000
3    pi          3.1416
4    a           2.0000
...
Vec  Name        Size  Val[1..n]
1    firstnums   5     1.0000  5.0000
2    nextnums    5     6.0000  10.000
...
char symbol  value
1    c       half
2    a       whole
3    blank
Note: You can print out the current variables table with the % show directive. As described in more detail below, such a variables table can be created with the following directives:
% const a=2
% char c half a whole blank " "
% vec firstnums[5] 1 2 3 4 5
% vec nextnums[5] 6 7 8 9 10
Then the line
{c} of the {a} {pi} is {pi/2}
is turned into the following:
half of the whole 3.14159265 is 1.57079633
whereas the line
one quarter is {1/(nextnums(4)-5)}
becomes
one quarter is .25
Character Substrings Example:
% char c half a whole
To {c(1,3)}ve a cave is to make a {a(2,5)}!
becomes
To halve a cave is to make a hole!
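The string qualifiers described under string substitution can be checked with a few lines of Python. This is only a sketch: the function names are hypothetical, and two details are assumptions not stated in the text, namely that a missing nth occurrence yields 0 and that omitting n1 and n2 in the /strn1/strn2/ form replaces every occurrence.

```python
def substring(s, n1, n2):
    """{mychar(n1,n2)}: 1-based, inclusive substring."""
    return s[n1 - 1:n2]

def index_of(s, charlst, n=1):
    """{mychar(charlst,n)}: 1-based index of the nth character of s found in charlst.
    Returning 0 when there is no such occurrence is an assumption."""
    count = 0
    for i, ch in enumerate(s, start=1):
        if ch in charlst:
            count += 1
            if count == n:
                return i
    return 0

def last_nonblank(s):
    """{mychar(:e)}: 1-based index of the last nonblank character."""
    return len(s.rstrip(" "))

def substitute(s, old, new, n1=1, n2=None):
    """{mychar(/old/new/,n1,n2)}: replace the n1th..n2th occurrences of old.
    The defaults for n1 and n2 are assumptions."""
    if not old:
        return s
    out, pos, k = [], 0, 0
    while True:
        hit = s.find(old, pos)
        if hit < 0:
            break
        k += 1
        out.append(s[pos:hit])
        out.append(new if k >= n1 and (n2 is None or k <= n2) else old)
        pos = hit + len(old)
    out.append(s[pos:])
    return "".join(out)

c, a = "half", "whole"
print(f"To {substring(c, 1, 3)}ve a cave is to make a {substring(a, 2, 5)}!")
```

With mychar="foo bar" the same helpers reproduce the earlier examples: `substring` gives oo for (2,3), `index_of` gives 6 for ('abc',2), `last_nonblank` gives 7, and `substitute` turns foo into boo.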
Vector Substitution Example:
{firstnums}, I caught a hare alive, {nextnums} I let him go again ...
becomes
1 2 3 4 5, I caught a hare alive, 6 7 8 9 10 I let him go again ...
Nesting Example: The following illustrates nesting to three levels. The innermost block is substituted first. Beginning with
% const xx{1{2+{3+4}1}} = 2
substitution takes place in three passes:
% const xx{1{2+71}} = 2
% const xx{173} = 2
% const xx173 = 2
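The innermost-first order of the passes above can be reproduced with a short Python sketch. It is an illustration only, not how rdfiln.f is written: `expand` and `arithmetic` are hypothetical names, and Python's `eval` stands in for the expression evaluator.

```python
import re

# A brace pair that contains no further braces, i.e. an innermost {...}
INNERMOST = re.compile(r"\{([^{}]*)\}")

def arithmetic(expr):
    """Stand-in evaluator (assumption): plain Python arithmetic."""
    return eval(expr, {"__builtins__": {}})

def expand(line, evaluate=arithmetic):
    """Replace innermost {...} groups one at a time until none remain."""
    passes = [line]
    while True:
        m = INNERMOST.search(line)
        if m is None:
            return passes
        line = line[:m.start()] + str(evaluate(m.group(1))) + line[m.end():]
        passes.append(line)

for step in expand("% const xx{1{2+{3+4}1}} = 2"):
    print(step)
```

The loop prints the starting line followed by one line per pass, ending with `% const xx173 = 2`.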
Example of {?~expr~strn1~strn2} syntax:
MODE={?~k~B~C}3
evaluates to, if k is nonzero:
MODE=B3
or, if k is zero:
MODE=C3
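A minimal Python sketch of the conditional form follows. The function name `conditional` is hypothetical and Python's `eval` stands in for the expression evaluator. It takes the character after `?` as the delimiter, which is how the remark that '~' can be any character is interpreted here.

```python
def conditional(strn, variables):
    """Evaluate the body of {?~expr~strn1~strn2}; strn excludes the braces."""
    if len(strn) < 2 or not strn.startswith("?"):
        raise ValueError("not a conditional substitution")
    delim = strn[1]                       # any character may act as delimiter
    expr, strn1, strn2 = strn[2:].split(delim, 2)
    test = eval(expr, {"__builtins__": {}}, dict(variables))
    return strn1 if test != 0 else strn2

for k in (1, 0):
    print("MODE=" + conditional("?~k~B~C", {"k": k}) + "3")
```

The same function handles the earlier example: with n=1, `conditional("?~(n<2)~n is less than 2~n is at least 2", {"n": 1})` selects the first string.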
Note: the scalar variables table is always initialized with predefined variables t=1, f=0 and pi=π. It is STRONGLY ADVISED that you never alter any of these variables.
Preprocessor Directives
- Lines beginning with % keyword are interpreted as preprocessor directives. Such lines are not part of the post-processed input.
- Lines which begin with # are comment lines and are ignored. (More generally, text following a # in any line is ignored).
Recognized keywords are
const cconst cvar udef var vec ← allocate and assign numerical variables
char char0 cchar getenv vfind ← allocate and assign character variables
if ifdef ifndef iffile else elseif elseifd endif ← branching construct
while repeat end exit ← looping and terminating constructs
echo include includo macro save show stop trace udef ← miscellaneous
Variable declarations and assignments
Keywords : const cconst cvar udef var vec char char0 cchar getenv vfind
- const and var load or alter the variables table. Example:
% const myvar=expr
does two things:
- adds myvar to the scalar variables symbols table if it is not there already. const and var are equivalent in this respect.
- assigns the result of expr to it, if either
- you use the var directive or
- you use the const directive and the variable had not yet been created.
In other words, if myvar already exists prior to the directive, const will not alter its value but var will. Thus the lines
% const a=2
% const a=3
incorporate a into the symbols table with value 2, while
% const a=2
% var a=3
does the same but assigns 3 to a.
Note: if myvar exists, you can multiply, divide, add to, subtract from, or exponentiate it with expr, using one of the following C-like forms:
myvar*=expr
myvar/=expr
myvar+=expr
myvar-=expr
myvar^=expr
These operators modify myvar for both const and var directives.
- cconst and cvar conditionally load or alter the variables table. Example:
% cconst test-expr myvar=expr
test-expr is an algebraic expression (e.g., i==3) that evaluates to zero or nonzero.
If test-expr evaluates to nonzero, the remainder of the directive proceeds as const or var do.
Otherwise, no further action is taken.
Example: the input segment
% const a=2 b=3 c=4 d=5
A={a} B={b} C={c} D={d}
% const a=3
% var d=-1
% const b*=2 c+=3
A={a} B={b} C={c} D={d}
% cconst b==6 b+=3 c-=3
A={a} B={b} C={c} D={d}
% cconst b==6 b+=3 c-=3
A={a} B={b} C={c} D={d}
generates four lines:
A=2 B=3 C=4 D=5
A=2 B=6 C=7 D=-1
A=2 B=9 C=4 D=-1
A=2 B=9 C=4 D=-1
a is unchanged from its initial assignment while d changes.
Compare the two cconst directives. b and c are altered in the first instance, since the condition b==6 evaluates to 1, while they do not change in the second instance, since now b==6 evaluates to zero.
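The const, var and cconst rules used in this example can be summarised in a short Python model. It is a sketch under stated assumptions, not the symvar.f implementation: `directive`, `value_of` and `demo` are hypothetical names, a dict plays the role of the scalar symbols table, and Python's `eval` replaces the expression parser.

```python
import re

ASSIGN = re.compile(r"^(\w+)(\*=|/=|\+=|-=|\^=|=)(.+)$")

def value_of(expr, table):
    """Stand-in evaluator (assumption): Python arithmetic, ^ treated as power."""
    return eval(expr.replace("^", "**"), {"__builtins__": {}}, dict(table))

def directive(table, keyword, args):
    """Apply one const / var / cconst / cvar directive to the table."""
    items = args.split()
    if keyword in ("cconst", "cvar"):
        test, items = items[0], items[1:]
        if value_of(test, table) == 0:
            return                         # condition is zero: nothing happens
        keyword = keyword[1:]              # proceed as const or var
    for item in items:
        name, op, rhs = ASSIGN.match(item).groups()
        rhs = value_of(rhs, table)
        if op == "=":
            # const only assigns when the variable does not exist yet
            if keyword == "var" or name not in table:
                table[name] = rhs
        else:
            # compound operators modify the variable for const and var alike
            table[name] = value_of(f"({table[name]!r}){op[0]}({rhs!r})", table)

def demo():
    table, out = {}, []
    show = lambda: out.append("A={a} B={b} C={c} D={d}".format(**table))
    directive(table, "const", "a=2 b=3 c=4 d=5"); show()
    directive(table, "const", "a=3")
    directive(table, "var", "d=-1")
    directive(table, "const", "b*=2 c+=3"); show()
    directive(table, "cconst", "b==6 b+=3 c-=3"); show()
    directive(table, "cconst", "b==6 b+=3 c-=3"); show()
    return out

print("\n".join(demo()))
```

Running `demo()` reproduces the four generated lines shown above.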
char loads or alters the character table. Example:
% char c half a whole blank
loads the character table as follows:
char symbol  value
1    c       half
2    a       whole
3    blank
The last declaration may omit its associated string, in which case the variable's value is a blank, as is the case for blank here.
Note: Re-declaration of any previously defined variable will change the contents of the variable.
char0 is the same as char, except that re-assignment of an existing variable is ignored. Thus char0 is to const as char is to var.
- cchar is similar to char but tests are made to enable different strings to be loaded depending on the results of the tests. The syntax is
% cchar nam expr1 str1 expr2 str2 ...
nam is the name of the character variable; expr1 expr2 etc are algebraic expressions.
nam takes the value str1 if expr1 evaluates to nonzero, the value str2 if expr2 evaluates to nonzero, etc.
- getenv has a function similar to char, except that the contents of the variable are read from the Unix environment variables table. Thus
% getenv myhome HOME
puts the string of your home directory into variable myhome.
vec loads or alters elements in the table of vector variables.
% vec v[n] ← creates a vector variable of length n
% vec v[n] n1 n2 n3 ... ← does the same, also setting the first elements
Once v has been declared, individual elements of v may be set with the following syntax
% vec v(i) n ← assigns n to v(i)
% vec v(i1:in) n1 n2 ... nn ← assigns range of elements i1..in to n1 n2 ... nn
There must be exactly in−i1+1 elements n1 … nn .
Note: if v is already declared, it is an error to re-declare it.
- vfind finds which element of a vector matches a specified value. The syntax is
% vfind v(i1:i2) svar match-value
svar is a scalar variable and match-value a number or expression. Elements v(i1:i2) are parsed. svar is assigned the first index i for which v(i)=match-value. If no match is found, svar is set to zero.
Example:
% vec a[3] 101 2002 30003
% vfind a(1:3) k 2002 ← sets k=2
% vfind a(1:3) k 10 ← sets k=0
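The vfind rule amounts to a linear search with 1-based indices. A minimal Python sketch follows; the function name `vfind` mirrors the directive, but the code is an illustration, and exact equality as the match criterion is an assumption (the text does not say whether a tolerance is used).

```python
def vfind(v, i1, i2, match_value):
    """Return the first 1-based index i in i1..i2 with v(i) == match_value, else 0."""
    for i in range(i1, i2 + 1):
        if v[i - 1] == match_value:   # exact comparison is an assumption
            return i
    return 0

a = [101, 2002, 30003]
print(vfind(a, 1, 3, 2002))   # 2
print(vfind(a, 1, 3, 10))     # 0
```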
Branching constructs
Keywords : if ifdef ifndef iffile else elseif elseifd endif
Branching constructs have a function similar to the C constructs.
if expr, elseif expr, else and endif are conditional read blocks. Lines between these directives are read or not, depending on the value of expr. Example:
% if Quartz
is clear
% elseif Ag
is bright
% else
neither is right
% endif
generates this line if Quartz evaluates to nonzero:
is clear
otherwise this line if Ag evaluates to nonzero:
is bright
and otherwise
neither is right
ifdef is similar to if, but has a more general idea of what constitutes an expression.
- if expr requires that expr be a valid expression, while ifdef expr evaluates expr as false if it is invalid (e.g. it contains an undefined variable).
- expr can be an algebraic expression, or a sequence of expressions separated by & or | (AND or OR binary operators), viz:
% ifdef expr1 | expr2 | expr3 ...
If any of expr1, expr2, ... evaluate to nonzero, the result is nonzero, whether or not preceding expressions are valid.
Note the syntactical significance of the spaces. expr1|expr2 cannot be evaluated unless both expr1 and expr2 are valid expressions, while expr1 | expr2 may be nonzero if either is valid.
- ifdef allows a limited use of character variables in expressions. Either of the following is a permissible expression:
char-variable ← T if char-variable exists, otherwise F
char-variable=='string' ← T if char-variable has the value string
Example:
% ifdef x1==2 & atom=='Mg' | x1==1
is nonzero if scalar x1 is 2 and if character variable atom is equal to "Mg", or if scalar x1 is 1. Note binary operators & and | are evaluated left to right: & does not take precedence over |.
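The left-to-right rule and the treatment of invalid terms can be modelled in Python. The sketch below is a simplified illustration, not the rdfiln.f logic: `ifdef` is a hypothetical function, Python's `eval` stands in for the expression evaluator, and it assumes that terms and operators are separated by single spaces and that no term contains a space.

```python
def ifdef(condition, variables):
    """Evaluate an ifdef-style condition; returns True or False."""
    def term(expr):
        # A bare character-variable name is true if the variable exists.
        if isinstance(variables.get(expr), str):
            return True
        try:
            return bool(eval(expr, {"__builtins__": {}}, dict(variables)))
        except Exception:
            # Invalid expressions (e.g. undefined variables) count as false.
            return False

    tokens = condition.split()
    result = term(tokens[0])
    # Fold strictly left to right: & has no precedence over |.
    for op, expr in zip(tokens[1::2], tokens[2::2]):
        if op == "&":
            result = result and term(expr)
        elif op == "|":
            result = result or term(expr)
        else:
            raise ValueError(f"unexpected operator {op!r}")
    return result

cond = "x1==2 & atom=='Mg' | x1==1"
print(ifdef(cond, {"x1": 2, "atom": "Mg"}))   # True
print(ifdef(cond, {"x1": 1}))                 # True, although atom is undefined
print(ifdef("t | f & f", {"t": 1, "f": 0}))   # False: evaluated as (t | f) & f
```

The last call shows the difference from C-like precedence, under which `t | (f & f)` would be true.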
elseifd is to elseif as ifdef is to if.
ifndef expr … is the mirror image of ifdef expr. Lines following this construct are read only if expr evaluates to 0.
iffile filename is a construct analogous to %if or %ifdef for conditional reading of input lines. The test condition is set not by an expression, but whether file filename exists or not.
Note: if, ifdef, and ifndef constructs may be nested to a depth of mxlev. The codes are distributed with mxlev=6.
Looping constructs
Keywords : while repeat end
- while and end mark the beginning and end of a looping construct. Lines inside the loop are repeatedly read until a test expression evaluates to 0.
% while [expr1 expr2 ...] test-expr ← skip to `% end' if test-expr is 0
... ← these lines become part of the input while test-expr is nonzero
% end ← return to the `% while' directive until test-expr is 0
The (optional) expressions [expr1 expr2 …] follow the rules of the const directive:
- Each of expr1, expr2, … takes the form nam=expr or nam op= expr.
- A simple assignment nam=expr has effect only when nam has not yet been loaded into the variables table. Thus it has effect on the first pass through the while loop (provided nam isn’t declared yet) but not subsequent passes.
These rules make it very convenient to construct loops, as the following example shows.
% udef -f db ← removes db from symbols table, if it already exists
% while db=-1 db+=2 db<=3 ← db is initialized to -1 only once
this is db={db} ← the body of the loop that becomes the input
% end ← return file pointer to %while until test db<=3 is 0
generates
this is db=1
this is db=3
Pass 1: db is created and assigned the value −1, then incremented to 1. Condition db<=3 evaluates to 1 and the loop proceeds.
Pass 2: db already exists so db=-1 has no effect. db+=2 increments db to 3.
Pass 3: db increments to 5, causing the condition db<=3 to become 0. The loop terminates.
- % repeat … % end is another looping construct with the syntax
% repeat varnam list
... ← lines parsed for each element in list
% end
As with the while construct, multiple passes are made through the input lines. list generates a sequence of integers (see the integer list syntax manual). For each member of the sequence, varnam takes its value and the body of the loop is passed through. list can be just an integer (e.g. 7) or define a more complex sequence, e.g. 1:3,6,2 generates the sequence 1 2 3 6 2.
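As an illustration of the list example just given, the Python sketch below expands comma-separated entries and lo:hi ranges. It covers only this subset; the full syntax handled by mkilst.f is described in the integer list syntax manual and may include forms not modelled here. `expand_list` is a hypothetical name.

```python
def expand_list(spec):
    """Expand e.g. '1:3,6,2' into [1, 2, 3, 6, 2] (subset of the real syntax)."""
    values = []
    for part in spec.split(","):
        if ":" in part:
            lo, hi = (int(x) for x in part.split(":"))
            values.extend(range(lo, hi + 1))   # ranges are inclusive
        else:
            values.append(int(part))
    return values

print(expand_list("1:3,6,2"))   # [1, 2, 3, 6, 2]
print(expand_list("2,7"))       # [2, 7]
```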
Example: nested while and repeat loops
% const nm=-3 nn=4
% while db=-1 db+=2 db<=3
% repeat k= 2,7
this is k={k} and db={db}
{db+k+nn+nm} is db + k + nn+nm, where nn+nm={nn+nm}
% end (loop over k)
% end (loop over db)
The nested loops are expanded into:
this is k=2 and db=1
4 is db + k + nn+nm, where nn+nm=1
this is k=7 and db=1
9 is db + k + nn+nm, where nn+nm=1
this is k=2 and db=3
6 is db + k + nn+nm, where nn+nm=1
this is k=7 and db=3
11 is db + k + nn+nm, where nn+nm=1
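The control flow of the nested example can be traced with an equivalent Python loop. This is a hand translation for illustration, not generated by rdfiln: the function name `nested_loops` is hypothetical, a dict stands in for the symbols table, and the sequence 2,7 is written out directly.

```python
def nested_loops():
    """Mimic the % while / % repeat example and return the generated lines."""
    table = {"nm": -3, "nn": 4}
    lines = []
    while True:
        # 'db=-1' only takes effect when db does not exist yet (first pass)
        if "db" not in table:
            table["db"] = -1
        table["db"] += 2                 # 'db+=2' runs on every pass
        if not table["db"] <= 3:         # test expression 'db<=3'
            break
        for k in (2, 7):                 # '% repeat k= 2,7'
            table["k"] = k
            lines.append(f"this is k={k} and db={table['db']}")
            total = table["db"] + k + table["nn"] + table["nm"]
            lines.append(f"{total} is db + k + nn+nm, "
                         f"where nn+nm={table['nn'] + table['nm']}")
    return lines

print("\n".join(nested_loops()))
```

The eight lines printed match the expansion shown above; on the third pass db reaches 5 and the outer loop ends without producing output.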
Other directives
Keywords : echo exit include includo macro save show stop trace udef
- echo contents echoes contents to standard output.
Example:
% echo hello world
prints
#rf line-no: hello world
line-no is the current line number.
- exit [expr] causes the program to stop parsing the input file, as though it encountered an end-of-file.
- If expr evaluates to nonzero, or if it is omitted, parsing ends.
- If expr evaluates to 0 the directive has no effect.
Note: compare to the stop directive.
- include filename causes rdfiln to include the contents of file filename in the input.
- If filename exists, rdfiln opens it and the file pointer is transferred to this file until no further lines are to be read. At that point the file pointer returns to the original file.
- If filename does not exist, the directive has no effect.
Notes: %include may be nested to a depth of 10. Looping and branching constructs must reside in the same file.
includo filename is identical to include, except that rdfiln aborts if filename does not exist.
- macro(arg1,arg2,..) expr defines a macro. arg1,arg2,… are substituted into expr before it is evaluated. Example :
% macro xp(x1,x2,x3,x4) x1+2*x2+3*x3+4*x4
The result of xp(1,2,3,4) is {xp(1,2,3,4)}
generates
The result of xp(1,2,3,4) is 30
Note: macros are not quite identical to function declarations. The following lines illustrate this:
% macro xp(x1,x2,x3,x4) x1+2*x2+3*x3+4*x4
The result of xp(1,2,3,4) is {xp(1,2,3,4)}
The result of xp(1,2,3,3+1) is {xp(1,2,3,3+1)}
The result of xp(1,2,3,(3+1)) is {xp(1,2,3,(3+1))}
generates
The result of xp(1,2,3,4) is 30
The result of xp(1,2,3,3+1) is 27
The result of xp(1,2,3,(3+1)) is 30
macro merely substitutes 1,2,3,… for x1,x2,x3,x4 in expr as follows:
1+2*2+3*3+4*4 ← xp(1,2,3,4)
1+2*2+3*3+4*3+1 ← xp(1,2,3,3+1)
1+2*2+3*3+4*(3+1) ← xp(1,2,3,(3+1))
Operator order matters, so 4 and 3+1 behave differently. By enclosing the fourth argument in parentheses, operator precedence is maintained.
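The textual nature of macro substitution is easy to reproduce. In the Python sketch below, `expand_macro` is a hypothetical helper that replaces whole-word parameter names in the macro body with the argument strings, and Python's `eval` then evaluates the resulting arithmetic. It illustrates the behaviour described above and is not taken from rdfiln.f.

```python
import re

def expand_macro(params, body, args):
    """Substitute argument text for parameter names in body (no evaluation)."""
    mapping = dict(zip(params, args))
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, params)) + r")\b")
    return pattern.sub(lambda m: mapping[m.group(1)], body)

params = ["x1", "x2", "x3", "x4"]
body = "x1+2*x2+3*x3+4*x4"

for args in (["1", "2", "3", "4"],
             ["1", "2", "3", "3+1"],
             ["1", "2", "3", "(3+1)"]):
    text = expand_macro(params, body, args)
    print(text, "=", eval(text, {"__builtins__": {}}))
```

The second case expands to 1+2*2+3*3+4*3+1, so the trailing +1 is added after the multiplication and the result is 27 rather than 30.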
- save preserves variables after the preprocessor exits. The syntax is:
% save ← preserves all variables defined to this point
% save name [name2 ...] ← saves only variables named
Only variables in the scalar symbols table are saved.
- show … prints various things to standard output:
% show vars ← prints out the state of the variables table
% show lines ← echoes each line generated to the screen until:
% show stop ← is encountered
Note: because the vector variables can have arbitrary length, show prints only the size of the vector and the first and last entries.
- stop [expr msg] causes the program to stop execution.
- If expr evaluates to nonzero, or if it is omitted, the program stops (msg, if present, is printed to standard output before aborting).
- If expr evaluates to 0 the directive has no effect.
Note: compare to the exit directive.
- trace turns on debugging printout. rdfiln prints to standard output information about what it is doing.
- trace 0 turns the tracing off
- trace 1 turns the tracing on at the lowest level.
rdfiln traces directives having to do with execution flow (if-else-endif, repeat/while-end).
- trace 2 prints some information about most directives.
- trace 4 is the most verbose
- trace (no argument) toggles whether it is on or off.
- udef [-f] name [name2 …] removes one or more variables from the symbols table. If -f is omitted, rdfiln aborts with an error if you try to remove a nonexistent variable. If -f is included, removing a nonexistent variable does not generate an error. Only scalar and character variables may be deleted.
Source codes
Source codes the preprocessor uses are found in the slatsm directory:
rdfiln.f The source code for the preprocessor. Subroutine rdfile, which parses an entire file and returns a preprocessed one, is found in rdfiln.f. The key subroutine is rdfiln, which parses one line of a file.
symvar.f Maintains the table of variables for floating-point scalars.
symvec.f Maintains the table of vector variables.
a2bin.f Evaluates ASCII representations of algebraic expressions using a C-like syntax, converting the result into a binary number. Expressions may include variables and vector elements.
bin2a.f Converts a binary number into a character string (inverse function to a2bin.f).
mkilst.f Generates a list of integers for looping constructs. The integer list syntax manual describes the syntax of integer lists.
rdfiln also maintains a table of character variables. It is kept in the character array ctbl, and is passed as an argument to rdfiln.
Note: the ASCII representation of a floating-point expression is represented to 8 or 9 decimal places; thus it has less precision than the binary form. For example, ‘{1.2345678987654e-8}’ is turned into 1.2345679e-8.
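The loss of precision can be demonstrated with ordinary floating-point formatting. The Python sketch below assumes that rendering to about 9 significant digits approximates what bin2a.f does. That assumption is consistent with the values quoted in this document but is not taken from the source code, and the exact output format differs in small ways (rdfiln prints .141592654 without a leading zero and e-8 rather than e-08). `render` is a hypothetical name.

```python
import math

def render(value):
    """Render a float with 9 significant digits (assumed precision)."""
    return format(value, ".9g")

for x in (math.pi, math.pi / 2, math.pi - 3, 1.2345678987654e-8):
    text = render(x)
    # Converting the text back to binary does not recover the original value.
    print(text, float(text) == x)
```

Each line prints the rendered text followed by False, showing that a value passed through {…} and read back differs from the binary value it came from.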