Appendix A - Symbol Table Format

The symbol tables produced by SPAG are designed both as a report for immediate use by programmers, and as a means of communicating information to other programs. The global cross check program GXCHK is one example of a program which uses symbol tables as input. It would be easy to devise others; for example, symbol tables could be used as input by a user-written program to check conformance to local naming conventions.

A symbol table consists of a header record, a series of data and comment records, and a trailer record. The header record has the form

      **++ Symbol table for subprogram TEST1 in file TEST1.FOR

where the subprogram and file names are, of course, variable. The subprogram name starts at character 34 and is terminated by a space. The file name starts 10 characters after the last letter of the subprogram name.
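The column rules above can be sketched in Python. This is a minimal illustration rather than part of SPAG: the function name `parse_header` is hypothetical, and the sketch assumes that the record itself begins at character 1 (the indentation shown in the example above is taken to be presentational).

```python
def parse_header(record):
    """Return (subprogram, filename) from a header record, or None."""
    if not record.startswith("**++"):
        return None
    # Character 34 (1-based) is index 33 (0-based).
    end = record.find(" ", 33)
    if end == -1:
        return None
    subprogram = record[33:end]
    # The file name starts 10 characters after the last letter of the
    # subprogram name; that letter sits at 0-based index end - 1.
    filename = record[end - 1 + 10:].rstrip()
    return subprogram, filename


header = "**++ Symbol table for subprogram TEST1 in file TEST1.FOR"
print(parse_header(header))  # ('TEST1', 'TEST1.FOR')
```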

The trailer record has the form

      **-- END OF SYMBOL TABLE

There may be any number of symbol tables in a file, provided that each has appropriate header and trailer records.
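A program that reads such a file might group records into tables by watching for the header and trailer markers. The following is a sketch only; `split_symbol_tables` is a hypothetical name, and the choice to ignore records that fall outside any header/trailer pair is an assumption made for this example.

```python
def split_symbol_tables(lines):
    """Group records into tables, each running from header to trailer."""
    tables = []
    current = None
    for line in lines:
        record = line.rstrip("\n")
        if record.startswith("**++"):
            current = [record]          # header opens a new table
        elif record.startswith("**--"):
            if current is not None:     # trailer closes the open table
                current.append(record)
                tables.append(current)
                current = None
        elif current is not None:
            current.append(record)      # data or comment record
        # Records outside any table are ignored (an assumption).
    return tables


sample = [
    "**++ Symbol table for subprogram TEST1 in file TEST1.FOR\n",
    "     a comment record\n",
    "**-- END OF SYMBOL TABLE\n",
    "**++ Symbol table for subprogram TEST2 in file TEST1.FOR\n",
    "**-- END OF SYMBOL TABLE\n",
]
print(len(split_symbol_tables(sample)))  # 2
```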

Any record with spaces in characters 1-4 is a comment. The first record after the header record is always a comment record of the form

           produced by SPAG     5.00C -40 at 11:45:28 on 15 May 1995

The two-digit integer starting at character 34 (40 in the above example) specifies the length of the information field on data records. This number may be varied in future releases of SPAG.
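As an illustration, the length could be read as follows. The function name `info_field_length` is hypothetical, and the sample record is rebuilt from the example above on the assumption that it carries five leading spaces, which places the integer at character 34 as described.

```python
def info_field_length(comment_record):
    """Return the information field length given at characters 34-35."""
    if comment_record[:4].strip():
        raise ValueError("not a comment record: characters 1-4 must be spaces")
    # Characters 34-35 (1-based) are indices 33-34 (0-based).
    return int(comment_record[33:35])


# Sample rebuilt from the example; the leading five spaces are assumed.
sample = (" " * 5 + "produced by SPAG" + " " * 5
          + "5.00C -40 at 11:45:28 on 15 May 1995")
print(info_field_length(sample))  # 40
```

Reading the length from the record, rather than hard-coding 40, keeps a reader program working if a later release changes the width.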

Data records consist of an information field, a name field, and possibly a comment field. The length of the information field is specified on the comment record immediately after the header. The name field begins immediately after the information field, and is of indefinite length. It is terminated by the end of record, or by a '(', '*', '#' or '=' character. If the first character of the name field is '/', then the name field is terminated by a second '/'. The comment field, if present, consists of all characters between the end of the name field and the end of the record.
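The splitting rules can be sketched as below. `split_data_record` is a hypothetical helper; treating the closing '/' as part of the name field is an assumption made here so that COMMON block names keep both delimiters.

```python
def split_data_record(record, info_len=40):
    """Split a data record into (information, name, comment) fields."""
    info = record[:info_len]
    rest = record[info_len:]
    if rest.startswith("/"):
        # COMMON block name: runs up to and including the second '/'.
        close = rest.find("/", 1)
        end = len(rest) if close == -1 else close + 1
    else:
        # Otherwise the name ends at the first '(', '*', '#' or '=',
        # or at the end of the record.
        end = len(rest)
        for i, ch in enumerate(rest):
            if ch in "(*#=":
                end = i
                break
    return info, rest[:end], rest[end:]


# The information field is left blank here purely for illustration.
record = " " * 40 + "TEXT*(MAXLEN)(N+40,14)"
print(split_data_record(record)[1:])  # ('TEXT', '*(MAXLEN)(N+40,14)')
```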

Each data record contains information about a single symbol which appears within the current sub-program. The name field contains the name of the symbol. COMMON block names are enclosed within '/' characters (e.g. /COMM1/). INCLUDEd file names are preceded by a '+' character (e.g. +global.inc), and the source file name by a space character. Other symbol names are converted to upper case, but are otherwise unaltered. Data records are sorted by their name fields; this ensures that file names and COMMON blocks appear first.

The information field contains the following parts:

Characters 1- 4

The symbol number. An integer assigned to each symbol. Symbol 0 is the source file.

Characters 5- 9

The number of the father of the current symbol. If the father is 0, spaces are inserted.

Characters 10-13

The position of the current symbol. If the position is undefined, spaces are inserted.

The father symbol and position are defined as follows:

  • For a dummy argument, the father is the current sub-program, and the position is the position in the argument list of this dummy argument.
  • For a COMMON variable or array, the father is the COMMON block name, and the position is the position in the COMMON block of this component. In this context, each variable or array occupies one position, regardless of size.
  • For a COMMON block, the father is the file in which the COMMON block is defined. This may be the source file or an INCLUDEd file. The position is undefined.
  • For an INCLUDEd file, the father is the file containing the INCLUDE statement.
  • For all other symbols (e.g. PARAMETERs, local variables), the father is the file in which the symbol is defined (which may be the source file or an INCLUDE file) and the position is undefined.
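The first three parts of the information field might be decoded as in the sketch below. The name `parse_links` is hypothetical, and the sample fields are synthetic: they are formatted to the documented widths rather than copied from real SPAG output.

```python
def parse_links(info):
    """Return (symbol number, father, position) from characters 1-13."""
    def to_int(text, default):
        text = text.strip()
        return int(text) if text else default

    number = to_int(info[0:4], None)     # characters 1-4
    father = to_int(info[4:9], 0)        # characters 5-9; blank means 0
    position = to_int(info[9:13], None)  # characters 10-13; blank = undefined
    return number, father, position


# Synthetic fields: symbol 12 is at position 2 within its father, symbol 3;
# symbol 7 belongs to the source file and has no position.
print(parse_links(f"{12:4d}{3:5d}{2:4d}"))  # (12, 3, 2)
print(parse_links(f"{7:4d}" + " " * 9))     # (7, 0, None)
```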

Character 14

A single character which specifies what the symbol is used for. Possible values are:

      blank   a variable
      B       a BLOCKDATA name (the current subprogram)
      C       a COMMON block name
      D       the name of a file containing only declarations
      E       an ENTRY name in the current subprogram
      F       a FUNCTION name (the current subprogram)
      f       the name of a FUNCTION used by the current sub-program
      g       an overloaded generic name for a set of SUBROUTINEs or FUNCTIONs
      G       a statement function name
      I       the name of an INTRINSIC FUNCTION used by the current sub-program
      j       the name of a SUBROUTINE or FUNCTION specified in a MODULE PROCEDURE statement
      k       the name of a FUNCTION whose interface is specified in an INTERFACE block
      K       the name of a SUBROUTINE whose interface is specified in an INTERFACE block
      M       a main PROGRAM name (the current subprogram)
      N       a NAMELIST name
      P       a PARAMETER
      r       the variable specified in the RESULT clause of a FUNCTION definition
      R       a derived TYPE or RECORD name (VAX Fortran)
      S       a SUBROUTINE name (the current subprogram)
      s       the name of a SUBROUTINE called by the current sub-program
      U       a MODULE name (the current subprogram)
      u       the name of a MODULE USEd by the current sub-program
      X       the name of a file containing executable statements
      #       a symbol whose name has been changed by SPAG

Character 15

A single character which specifies the data type of the symbol. Possible values are:

      blank   untyped (e.g. COMMON block name)
      B       BYTE (VAX Fortran extension)
      C       CHARACTER
      D       DOUBLE PRECISION
      F       a structure FIELD (symbol is prefixed by ".")
      I       INTEGER
      L       LOGICAL
      R       REAL
      S       Derived TYPE or VAX Fortran STRUCTURE
      X       COMPLEX
      Y       DOUBLE COMPLEX (VAX Fortran extension)
      ?       ambiguous (e.g. generic form of INTRINSIC function)

Characters 16-19

The size of each data element (as specified using a '*' qualifier in the declaration statement - e.g. CHARACTER*11 or REAL*8). Character 16 is set to '*', and the size is left justified in characters 17 to 19 (e.g. '*4'). If the size exceeds 999 (e.g. CHARACTER*1024), the '*' is omitted and character 16 is used for the extra digit. If the size is specified using a bracketed PARAMETER or asterisk (e.g. CHARACTER*(*) or CHARACTER*(linlen) ), characters 16-19 are set to *(*). For a derived type variable, the symbol number of the type is inserted. If no size is specified, or if none is appropriate, characters 16-19 are blank.
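The cases above can be told apart as in the following sketch. `decode_size` is a hypothetical name. Two assumptions are made: a derived-type symbol number is written as a plain integer with no '*', and it is distinguished from a four-digit size by checking that character 15 (the data type) is 'S'.

```python
def decode_size(field, type_char=" "):
    """Decode characters 16-19; type_char is character 15."""
    if len(field) != 4:
        raise ValueError("expected exactly 4 characters")
    if not field.strip():
        return ("none", None)             # no size given or appropriate
    if field == "*(*)":
        return ("bracketed", None)        # CHARACTER*(*) or *(PARAMETER)
    if field[0] == "*":
        return ("size", int(field[1:]))   # e.g. '*4  ' or '*11 '
    if type_char == "S":
        return ("type_symbol", int(field))  # assumed layout, see above
    return ("size", int(field))           # size above 999, '*' omitted


print(decode_size("*8  "))       # ('size', 8)
print(decode_size("1024"))       # ('size', 1024)
print(decode_size("*(*)"))       # ('bracketed', None)
```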

Characters 20-23

The number of dimensions, arguments or components in an array, subprogram or COMMON block. For an array or subprogram, character 20 is '(' and character 23 is ')'. For a COMMON block, both are set to '/'. The number is written in characters 21 and 22 (overflowing to character 20 if it exceeds 99). Note that for arrays, the number of dimensions (NOT the size of the array) is shown.
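A reader might decode this part as sketched below. `decode_count` is a hypothetical name; the sketch relies on character 23 to tell the two cases apart, on the assumption that the closing delimiter is still present when the number overflows into character 20.

```python
def decode_count(field):
    """Decode characters 20-23 into (kind, count), or None if blank."""
    if len(field) != 4:
        raise ValueError("expected exactly 4 characters")
    if not field.strip():
        return None
    kinds = {")": "array_or_subprogram", "/": "common"}
    kind = kinds.get(field[3])
    if kind is None:
        raise ValueError(f"unexpected closing character in {field!r}")
    # Character 20 is normally the opening delimiter, but holds a digit
    # when the count needs three characters.
    digits = field[:3].lstrip("(/")
    return kind, int(digits)


print(decode_count("( 6)"))  # ('array_or_subprogram', 6)
print(decode_count("/12/"))  # ('common', 12)
```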

Character 24

A single character which specifies the scope of the symbol. Possible values are:

      blank   a local variable
      A       a dummy argument
      C       a common variable or array
      D       a common variable or array which is initialised in the current subprogram. If this value is set, the corresponding COMMON block also has character 24 set to D.
      H       a variable inherited from a CONTAINing subprogram
      I       a local variable, initialised in a DATA statement
      i       a local variable, initialised in a type statement
      M       a variable imported from a MODULE by a USE statement
      N       an initialised variable imported from a MODULE by a USE statement
      O       a Fortran 95 style optional dummy argument
      S       a locally defined static variable

For all other types of symbol, character 24 is a space character.

Character 25

A single character which specifies whether the value associated with the symbol is modified by any statement in the current sub-program. Possible values are:

      blank   not modified by this sub-program
      M       modified by this sub-program
      ?       a variable or array which is not modified in the current sub-program, but is passed as an argument to another sub-program, and may be modified there

For an array, COMMON block, structure or INCLUDE file, this character is set if any component is modified by the current subprogram. For the current sub-program, this character is set if any dummy argument is modified.

Character 26

A single character which specifies whether the symbol is used by any statement in the current sub-program (i.e. if its value is read but not changed). Possible values are:

      blank   not used by this sub-program
      U       used by this sub-program
      ?       a variable or array which is not used in the current sub-program, but is passed as an argument to another sub-program, and may be used there
      D       the variable is used only as the implied DO loop variable in a DATA statement. Such a variable only exists at compile time.

For an array, COMMON block, structure or INCLUDE file, this character is set if any component is used by the current subprogram. For the current sub-program, this character is set if any dummy argument is used.

Character 27

A single character which specifies whether the symbol (being a variable, array or subprogram name) is used in an EQUIVALENCE, EXTERNAL, INTRINSIC or INTENT statement in the current subprogram. Possible values are:

      blank   not used in any of these statements
      Q       the symbol is used in an EQUIVALENCE statement
      X       the symbol is used in an EXTERNAL statement
      N       the symbol is used in an INTRINSIC statement
      I       the symbol is a dummy argument with INTENT(IN) specified
      O       the symbol is a dummy argument with INTENT(OUT) specified
      B       the symbol is a dummy argument with INTENT(INOUT) specified

For a COMMON block or INCLUDE file, this character is set to Q if any component is used in an EQUIVALENCE statement by the current subprogram.

Character 28

A single character which specifies how the symbol was assigned a data type (characters 15-19). Possible values are:

      blank   untyped symbol
      E       the type is assigned explicitly using a type declaration statement (INTEGER, REAL etc.)
      e       the symbol was previously typed IMPLICITly, but SPAG has now inserted an explicit type declaration
      I       the type is assigned according to the first letter of the symbol name using an IMPLICIT typing rule
      #       SPAG has deleted the references to the symbol because it was not used
      K       the symbol is an INCLUDEd file which contains an IMPLICIT statement
      M       the variable was imported from a module, and it is not known how it was originally assigned a type

Characters 29-37

An integer size associated with the symbol.

  • For an array, the total number of elements.
  • For a Fortran 95 style allocatable array, set to -1.
  • For a COMMON block, the size in bytes.
  • For an INTEGER PARAMETER, the value.
  • For a subprogram, the number of executable statements.

Character 38

A single character which specifies whether the symbol has any of the Fortran 95 PUBLIC, PRIVATE, POINTER and TARGET attributes. Possible values are:

      blank   none of these attributes apply
      A       PUBLIC
      B       PRIVATE
      C       POINTER
      D       PUBLIC POINTER
      E       PRIVATE POINTER
      F       TARGET
      G       PUBLIC TARGET
      H       PRIVATE TARGET

Character 39

A single character which specifies the Fortran 95 'KIND' associated with the symbol. Possible values are:

      0       no KIND specified
      1-9     KINDs 1 to 9
      A-Z     KINDs 10 to 35
      *       the KIND is compiler dependent

If a numeric KIND value is specified, plusFORT reads the data in f77kinds.f90 to deduce the element size (typically in bytes) so that it can check COMMON block and argument sizes.
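The KIND character follows a base-36 style encoding, which might be decoded as in this sketch. `decode_kind` is a hypothetical name, and returning the string "compiler-dependent" for '*' is simply a convention chosen for the example.

```python
def decode_kind(ch):
    """Decode character 39 into a KIND number, None, or a marker string."""
    if len(ch) != 1:
        raise ValueError("expected a single character")
    if ch == "0":
        return None                       # no KIND specified
    if ch == "*":
        return "compiler-dependent"
    if "1" <= ch <= "9":
        return int(ch)                    # KINDs 1 to 9
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10    # KINDs 10 to 35
    raise ValueError(f"unexpected KIND character {ch!r}")


print(decode_kind("4"))  # 4
print(decode_kind("A"))  # 10
print(decode_kind("Z"))  # 35
```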

Appended Information

If the information field length is 40 (the default in version 5 of SPAG), the symbol name is specified starting at column 41. If item 2 of the SPAG configuration data is set greater than 1, SPAG will, where possible, append further useful information as comments after the symbol name. The information which is appended is as follows:

For some symbols, several sets of information may be appended. For example, an array with a length specification might appear as:

      TEXT*(MAXLEN)(N+40,14)

Argument Checking

SPAG also inserts information in the symbol tables to allow subprogram arguments to be checked. These checks are applicable to Fortran 77 programs, but not, in general, to Fortran 95 programs containing subprograms with optional arguments, overloaded functions and operators etc. For such programs, argument checking can be suppressed by setting item 133 of the SPAG configuration data, and items 405, 406 and 441-448 of the GXCHK configuration data to 0.

The information inserted by SPAG takes the form of comment lines inserted after the line describing a CALLed SUBROUTINE or FUNCTION. For example

        455         s     ( 6)                 FOO101
           (I4E,I4V,I4E,L4V,I4E,I4E)

specifies that the subroutine FOO101 is called with 6 arguments. The type of each argument is specified in the bracketed list on the following line. The '(' character is at column 6, and if necessary the list is continued on subsequent lines. It is terminated by a ')'. Each argument is represented by a three-character mnemonic of the form R8E, I4L etc. For alternate returns, the abbreviation is 'ALT'; otherwise the three characters are interpreted as described below.

The first character indicates the data type ('I' for INTEGER, 'R' for REAL, 'L' for LOGICAL, 'C' for CHARACTER, 'D' for DOUBLE PRECISION, 'X' for COMPLEX, 'Y' for DOUBLE COMPLEX or 'B' for BYTE).

The second character indicates the element size (e.g. '8' for REAL*8 or DOUBLE PRECISION items). Letters are used for element sizes between 10 and 35. '+' indicates an element size greater than 35, and '*' that the size is variable.

The third character specifies what sort of entity the argument is ('V' for a variable, 'E' for a constant, expression or DO variable, 'L' for an array element, 'A' for an array name, or 'F' for a subprogram name). If a dummy argument is specified as 'E', this indicates that the subprogram does not change the value of the dummy argument, and that the actual argument may be an expression, constant or DO variable.
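The mnemonic list might be decoded as in the sketch below. All names here are hypothetical. The text says only that letters stand for element sizes 10 to 35; mapping 'A' to 10 through 'Z' to 35 is an assumption. A list continued over several comment lines would need to be joined by the caller before parsing.

```python
TYPES = {"I": "INTEGER", "R": "REAL", "L": "LOGICAL", "C": "CHARACTER",
         "D": "DOUBLE PRECISION", "X": "COMPLEX", "Y": "DOUBLE COMPLEX",
         "B": "BYTE"}
ENTITIES = {"V": "variable", "E": "constant, expression or DO variable",
            "L": "array element", "A": "array name", "F": "subprogram name"}


def decode_element_size(ch):
    """Decode the second character of a mnemonic."""
    if ch == "*":
        return "variable"
    if ch == "+":
        return "greater than 35"
    if ch.isdigit():
        return int(ch)
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10   # assumed mapping: A=10 ... Z=35
    raise ValueError(f"unexpected size character {ch!r}")


def parse_arguments(text):
    """Decode a bracketed list such as '(I4E,R8V,ALT)'."""
    body = text.strip()
    if not (body.startswith("(") and body.endswith(")")):
        raise ValueError("argument list must be enclosed in brackets")
    result = []
    for item in body[1:-1].split(","):
        item = item.strip()
        if not item:
            continue                      # tolerate an empty list '()'
        if item == "ALT":
            result.append(("alternate return", None, None))
            continue
        if len(item) != 3:
            raise ValueError(f"bad mnemonic {item!r}")
        result.append((TYPES[item[0]], decode_element_size(item[1]),
                       ENTITIES[item[2]]))
    return result


args = parse_arguments("(I4E,I4V,I4E,L4V,I4E,I4E)")
print(len(args))  # 6
print(args[3])    # ('LOGICAL', 4, 'variable')
```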