m4 -- macro processing language

SYNOPSIS

m4 [-s] [-D name[=value]] [-U name] [file ...]

DESCRIPTION

m4 is a macro processor that reads its input and replaces certain strings (macro calls) with their expansions. It is most often used to preprocess source code for a programming language. It processes the files on the command line in the order given. If no files are specified, or if - appears in place of a file name, input is read from the standard input. The processed output is written to the standard output.

Options

-D name[=value]: defines a symbol with the given name and value (where value is interpreted as a string). You may omit the =value part; in this case, m4 creates a symbol with the given name, but with no value.
-s: when producing source code for the C programming language, generates #line directives so that the line numbers in the generated source code are synchronized with the line numbers in the m4 input.
-U name: undefines an existing symbol with the given name. You can use this to get rid of a predefined symbol. Predefined symbols are discussed later.

Input Format

For the most part, m4 simply reads in text and writes it to the standard output. However, m4 also watches for macro calls. A macro call is a special construct that tells m4 to produce some special kind of output. m4 has a number of built-in macros and lets you define macros of your own. A discussion of such macros follows this look at macros in general.

A macro call has the form

name(arg1, arg2,...)

Macro names can consist of alphabetic characters, digits and the underscore (_). The first character of a name cannot be a digit. Alphabetic characters and digits are members of the current locale's [:alpha:] and [:digit:] character classes. m4 distinguishes uppercase letters from lowercase letters, so that NAME, Name and name are all different.

The arguments in a macro call are strings of text. When m4 reads the arguments in a macro call, it ignores leading white space (that is, blanks, tabs and newlines immediately after the open parenthesis or after a separating comma). If you want to have leading white space characters, you can enclose an argument in left and right quotes (that is, a grave accent at the beginning and an apostrophe at the end), as in

`  this has leading white space'

You should also quote an argument this way if it contains a comma or an unmatched closing parenthesis. m4 always removes one level of enclosing quotes before using the argument.
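As a small sketch of these quoting rules, the following input defines a made-up macro named show (not part of m4) and calls it with unquoted and quoted arguments:

```m4
dnl show is a hypothetical macro that brackets its first argument.
define(`show', `[$1]')dnl
show(   unquoted)
show(`   quoted')
show(`a, b')
```

The first call should print [unquoted], because the leading blanks are dropped. The second keeps its three leading blanks inside the brackets, and the third prints [a, b] as a single argument, because the quotes protect the comma.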

When m4 recognizes a macro name, it looks for an opening parenthesis, (, immediately following the name. If the character after the name is not an opening parenthesis, m4 assumes there are no arguments. If there is an opening parenthesis, m4 proceeds to collect the arguments for the macro call, up to a matching (unquoted) closing parenthesis. Empty parentheses denote a single empty argument.

When collecting arguments, m4 looks for macros inside the arguments. If m4 finds any macros, it expands them immediately. The expansion process may generate commas or closing parentheses; if so, these have the same effect that commas and closing parentheses did in the original text. For example, suppose a macro is called with a single argument, but that argument is a macro which expands to

A , B

Since this expansion looks like two arguments, m4 behaves as if you specified these two arguments in the original macro call. After it collects all the arguments and expands the macros inside the arguments, m4 rescans the entire macro call. It may find more macro calls, generated by macros inside the arguments. m4 keeps on expanding and rescanning until the arguments contain no more macros; then it expands the original macro.
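The effect of a comma produced during argument collection can be seen with two made-up macros, pair and nargs (both names are assumptions for this sketch):

```m4
dnl pair and nargs are hypothetical names used only for this sketch.
define(`pair', `A , B')dnl
define(`nargs', `$#')dnl
nargs(pair)
nargs(`pair')
```

The first call should print 2, since pair is expanded while the arguments are collected and its comma separates two arguments. The second should print 1, since the quoted name is passed as a single argument.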

m4 input may contain comments. By default, these begin with a # character and extend to the end of the line; however, see the description of the changecom macro for ways to change this. m4 copies comments to the output, without processing them for macros, quotes, or "nested" comments.

Note: We are talking about comments in the target language, which is why comments are copied to the output. To place text in the m4 input that you want to disappear from the output, use the dnl macro (see Other Built-In Macros).
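The difference between a comment and dnl can be sketched as follows; VERSION is a made-up macro name:

```m4
define(`VERSION', `1.0')dnl everything after dnl on this line is dropped
# VERSION is left alone inside a comment
release VERSION
```

The output should consist of the comment line exactly as written, followed by the line release 1.0. The text after dnl does not appear at all.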

Defining Your Own Macros

To define your own macro, use

define(name, string)

This defines a macro with the given name. From this point onward, when m4 sees a macro call to name in the input, it replaces the call with the given string. It is a good practice to always quote name, in case it has an old definition.

Inside string, the construct $n can be used in place of macro arguments, where $1 is replaced with the first argument to the macro, $2 is replaced with the second argument, and so on. n cannot exceed 9. For example, you might say

define(`p', `printf("%d",$1)')

Then the input

p(var);

is expanded to

printf("%d",var);

If you use a construct $n, but there were not n arguments specified in the macro call, m4 replaces the construct with a null string.

There are a few other special constructs that you can use inside string:

$0: stands for the name of the macro itself.
$#: is replaced with the number of arguments specified in the macro call.
$*: is replaced by a list of all the arguments, separated by commas.
$@: is the same as $*, except that all the arguments are quoted.
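A brief sketch of these constructs, using a made-up macro named info. The $0 is given an extra level of quotes so that the macro name in the result is not expanded a second time:

```m4
dnl info is a hypothetical macro; $0 is quoted so the name is not expanded again.
define(`info', ``$0' got $# args: $*')dnl
info(red, green, blue)
```

This should print info got 3 args: red,green,blue. The blanks after the commas in the call are leading white space and are therefore dropped from the arguments.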

You may use define with the name of one of the built-in macros; the new definition replaces the built-in macro.

Numeric Calculations

You can perform numeric calculations with macros. The `numeric value' of a macro is the value of the longest string of digits at the beginning of the macro's string value. For example, if you say

define(`val', `123abc')

the numeric value of val is 123. The following built-in macros let you perform arithmetic with other macros.

decr(name)

Expands to the current numeric value of name, minus one. Notice that decr does not change the value of name; it just expands to a value that is one less.

incr(name)

Expands to the current numeric value of name, plus one. Notice that incr does not change the value of name; it just expands to a value that is one greater.
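For instance, with a made-up macro n that holds a number:

```m4
dnl n is a hypothetical macro holding a number.
define(`n', `7')dnl
incr(n)
decr(n)
n
```

This should print 8, 6 and 7 on separate lines; the last line shows that n itself is unchanged.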

eval(string)

Evaluates string as an integer arithmetic expression. Operations include:

A+B   addition
A-B   subtraction
A*B   multiplication
A/B   A divided by B
A%B   remainder of A divided by B
A**B  A to the power B
A&B   bitwise AND
A|B   bitwise inclusive OR
A^B   bitwise exclusive OR
~A    bitwise complement
A>B   non-zero if A greater than B
A>=B  non-zero if A greater than or equal to B
A<B   non-zero if A less than B
A<=B  non-zero if A less than or equal to B
A==B  non-zero if A equal to B
A!=B  non-zero if A not equal to B

The usual order of operations applies; you can use parentheses to change the order.

Numbers that begin with a leading zero are assumed to be octal. Numbers that begin with 0x or 0X are hexadecimal. Other numbers are decimal integers. The expanded result of the macro is the result of the arithmetic, expressed as a decimal integer.
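A few sample expressions, restricted to operators from the table above:

```m4
eval(2+3*4)
eval((2+3)*4)
eval(17%5)
eval(010+0x10)
```

These should expand to 14, 20, 2 and 24 respectively. In the last line, 010 is octal 8 and 0x10 is hexadecimal 16.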

eval(string, base)

Is the same as the previous form, except that m4 expresses the result in the given base. Acceptable bases are in the range 2 to 36.

eval(string, base, width)

Is similar to the previous form, except that m4 always expresses the result with at least width characters.
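The two- and three-argument forms might be used like this:

```m4
eval(10, 2)
eval(64, 8)
eval(5, 2, 8)
```

The first two calls should expand to 1010 and 100. The third is padded to at least eight characters; common implementations pad with leading zeros, giving 00000101, but the padding character is an assumption here rather than something this page states.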

Diversions

There are situations in which you may not want to write output to the standard output in the order it appears in the input. For example, when using m4 to produce C source code, you may want all the variable declarations to appear at the beginning of the output text, even if these declarations are actually scattered throughout the input. You can deal with this sort of situation using diversions.

A diversion is a holding area for processed output. m4 lets you use up to ten diversions, numbered 0 through 9. The command

divert(n)

says that m4 is to save all subsequent output to diversion n. (If n is not a digit in the range 0 to 9, output written to the diversion is discarded.)

When you start using m4, output is written to diversion zero, which is the standard output. When m4 reaches the end of the input, it writes out the contents of all the diversions in numerical order.
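As a minimal sketch of this ordering, the input below writes to diversions 2, 1 and 0 in that order:

```m4
divert(2)dnl
footer
divert(1)dnl
body
divert(0)dnl
header
```

The output should be header, body and footer on separate lines: diversion 0 goes straight to the standard output, and diversions 1 and 2 are written out in numerical order at the end of the input, regardless of the order in which they were filled.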

The command

undivert(n)

takes all the text that is currently stored in diversion n and empties it into the current diversion without further processing. For example, suppose you have been using diversion 5 to collect a certain type of source code. When you reach the point where that source code is to actually be used, you can say

undivert(5)

to bring in that source code. Notice, however, that the code is brought into the current diversion. Thus you can construct code in one diversion by bringing in code from other diversions.

The undivert process clears out the given diversion. You can then use the diversion to hold new output.
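A short sketch of undivert, using diversion 5 to hold a C declaration until it is wanted:

```m4
divert(5)dnl
int counter;
divert(0)dnl
/* declarations collected so far */
undivert(5)dnl
/* rest of the program */
```

The output should be the first comment, then int counter;, then the second comment, each on its own line. After the undivert call, diversion 5 is empty again.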

The macro

divnum

expands to the number of the current diversion.

Other Built-In Macros

changecom(start, end)

changes the strings used to enclose comments. For example,

changecom(`/*',`*/')

sets things up so that

/* This is a comment */

is treated as a comment.

Calling changecom without arguments disables comments (nothing can be marked as a comment). If only one argument is specified, it is taken to be the start of a comment; the end is the newline (so that comments begin with start and end at the end of the input line). The start and end strings can be up to five characters long.
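The following sketch shows the effect of switching to C-style comment delimiters; NAME is a made-up macro:

```m4
define(`NAME', `value')dnl
changecom(`/*', `*/')dnl
/* NAME is not expanded here */
# NAME is expanded here
```

The first output line should be the C comment exactly as written. The second should read # value is expanded here, because # no longer starts a comment once changecom has been called.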

changequote(open, close)

changes the strings used to quote arguments. open and close may be strings of up to five characters. For example, with

changequote(<<,>>)

you can `quote' arguments as in

<< This is an argument >>

Calling changequote without arguments restores the quoting characters to the original ones (grave accent for opening quote, apostrophe for closing).
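A minimal sketch using the new quote strings; greet is a made-up macro:

```m4
changequote(<<, >>)dnl
define(<<greet>>, <<Hello, $1!>>)dnl
greet(<<world>>)
```

This should print Hello, world!. The comma inside the definition is protected by the << and >> quotes in the same way that the default quotes would protect it.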

defn(name, name, ...)

expands to a sequence of quoted definition strings of each of the named macros. Referring to the example in Defining Your Own Macros,

defn(`p')

expands to

printf("%d",$1)
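One common use of defn is to give an existing definition a new name. In this sketch, print_int is a made-up name:

```m4
define(`p', `printf("%d",$1)')dnl
dnl print_int is a hypothetical new name for the same definition.
define(`print_int', defn(`p'))dnl
undefine(`p')dnl
print_int(total);
p(total);
```

The first output line should be printf("%d",total);. The second should be p(total); unchanged, because p is no longer a macro after undefine.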

dnl

discards all input from this point to the end of the current line (including the newline on the end of the line). This macro is recommended for comments about m4 macros (as opposed to comments in the output text) and for trimming unwanted newlines from the lines defining macros.

dumpdef(name, name, ...)

Prints the names and current string values of the specified macros on the standard output. If no names are given, dumpdef displays every currently-defined macro and its value.

errprint(string)

displays the given string as a diagnostic message (on the standard error stream).

ifdef(name, string1)

expands to string1 if the given name is currently defined as a macro; otherwise, it expands to the null string.

ifdef(name, string1, string2)

expands to string1 if the given name is currently defined as a macro; otherwise, it expands to string2.
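Both outcomes of ifdef can be sketched with two made-up symbols, DEBUG and RELEASE:

```m4
dnl DEBUG is defined with an empty value; RELEASE is never defined.
define(`DEBUG', `')dnl
ifdef(`DEBUG', `debug build', `normal build')
ifdef(`RELEASE', `release build', `not a release build')
```

This should print debug build and then not a release build. A macro counts as defined even when its value is empty.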

ifelse(arg1, arg2, then-string)

expands to then-string if the string arg1 is the same as the string arg2; otherwise, the whole thing expands to the null string.

ifelse(arg1, arg2, then-string, else-string)

expands to then-string if the string arg1 is the same as the string arg2; otherwise, the macro expands to else-string.

ifelse(arg1, arg2, ...)

when called with more than four arguments, compares the first two arguments; if they are equal, the macro expands to argument 3; otherwise, the first three arguments are discarded and ifelse tries again, by comparing arguments 4 and 5. This process repeats until two arguments compare equal or there are four or fewer arguments, in which case the remaining arguments are treated as in the previous ifelse description.
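The multi-way form works like a chain of comparisons. In this sketch, kind is a made-up macro that maps a C type name to a description:

```m4
dnl kind is a hypothetical macro that maps a C type name to a description.
define(`kind', `ifelse(`$1', `int', `integer', `$1', `double', `floating point', `unknown')')dnl
kind(int)
kind(double)
kind(char)
```

The three calls should expand to integer, floating point and unknown. The seventh argument acts as the final else-string once the earlier pairs have failed to match.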

include(filename)

expands to the contents of filename. m4 rescans the text read from the file in the usual way, so that macro processing takes place. If the file cannot be read successfully, an error is issued. See also sinclude.

index(string, substring)

expands to an integer giving the position where substring is found in string. The beginning of the string is position 0. For example,

index(`abcdef', `def')

expands to 3. If substring cannot be found in string, index returns -1.

len(string)

expands to a decimal number giving the number of characters in string.

m4exit(status)

terminates m4 immediately. If you give an integer status, it is used as the exit status of m4; otherwise, the exit status is zero.
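Combined with ifdef and errprint, m4exit can stop processing when a required symbol is missing. TARGET is a made-up symbol assumed to come from the command line:

```m4
dnl TARGET is a hypothetical symbol expected from the command line.
ifdef(`TARGET', , `errprint(`TARGET is not defined
')m4exit(2)')dnl
building for TARGET
```

Run with -D TARGET=win64, this should print building for win64. Run without the -D option, it should write the message to the standard error stream and exit with status 2 before any output is produced.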

m4wrap(string)

specifies that m4 is to process string when it reaches end-of-file on its last input file. For example, string might be the name of a macro that performs clean-up operations. When m4wrap is called multiple times, all string arguments are processed when end-of-file is reached. These arguments are processed in the same order that the m4wrap calls were processed.
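A minimal sketch, in which the wrapped text is a literal trailer line rather than a macro name:

```m4
m4wrap(`-- end of generated text --
')dnl
first line
second line
```

The output should be first line and second line, followed by the trailer line, which m4 processes only after it reaches the end of the input.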

maketemp(string)

creates a file name for a temporary file. string must contain the characters XXXXX (and may contain other characters as well). The result of the macro call is string with the XXXXX replaced by the current process ID. (For more on process IDs, see ps.)

popdef(name)

gets rid of the current definition of the macro name and restores the one that pushdef saved (if any). See the description of pushdef.

pushdef(name, string)

creates a macro called name with the value string, just like define does; however, any existing definition of name is saved so that it can be restored later with popdef.
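The following sketch temporarily overrides a made-up macro named mode and then restores it:

```m4
define(`mode', `outer')dnl
pushdef(`mode', `inner')dnl
mode
popdef(`mode')dnl
mode
```

This should print inner and then outer.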

shift(arg1, arg2,...)

expands to its list of arguments, quoted and separated with commas. However, arg1 is omitted from this list. Thus you get

`arg2', `arg3',...

A common use of shift is in recursive macros that process $1 then invoke themselves with arguments of shift($@).
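A sketch of this recursive pattern follows. The macro name reverse is chosen for illustration; it emits its last argument first by recursing on shift($@) before emitting $1:

```m4
dnl reverse is a hypothetical recursive macro built on shift.
define(`reverse', `ifelse(`$#', `0', , `$#', `1', ``$1'', `reverse(shift($@)), `$1'')')dnl
reverse(`a', `b', `c')
```

This should print c, b, a. The ifelse tests on $# stop the recursion when one argument (or none) remains.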

sinclude(filename)

is similar to include except that no error message is issued if the file cannot be read successfully.

substr(string, pos, length)

returns the substring of string that begins at position pos and is length characters long. The beginning of string is position 0. For example,

substr(`abcdef', 2, 3)

returns the string cde. If you omit the third argument, the substring goes from the given position to the end of string.
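The string macros len, index and substr can be combined, as in this short sketch:

```m4
len(`abcdef')
index(`abcdef', `cd')
substr(`abcdef', 4)
substr(`abcdef', index(`abcdef', `cd'), 2)
```

The four lines should expand to 6, 2, ef and cd. In the last line, the position returned by index is passed directly to substr.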

syscmd(command)

executes the given command as a system command. For example,

syscmd(`cp file1 file2')

executes the given cp command. No value results from this macro.

sysval

expands to the return code from the last command executed by syscmd.

traceon(name, name, ...)

produces extra output tracing the use of the named macros. If no names are specified, all macros are traced.

traceoff(name, name, ...)

stops tracing uses of the named macros. Calling traceoff without arguments turns off the effects of a traceon without arguments; however, if you explicitly turn on tracing for a particular macro, a traceoff without arguments does not turn off tracing for that macro. For example, in

traceon(`mymacro')
traceon()
traceoff()

the traceoff does not turn off tracing on mymacro. You must explicitly use

traceoff(`mymacro')

translit(string, from, to)

Expands to a transliterated version of string. The from and to arguments are strings with the same number of characters. The result of the macro call is the characters of string, except that m4 replaces every character in string that appears in from with the corresponding character from to. For example,

translit(`example', `abcde', `ABCDE')

produces

ExAmplE

The action of this macro is similar to tr.

undefine(name)

gets rid of the specified macro. This works for macros created by pushdef as well as those created by define.

EXAMPLES

In a C program, suppose you have not decided whether a particular data object should be an integer or floating point type. You might write

define(`TYPE', `int')
define(`OUTTYPE', `%d')
     ...
TYPE variable;
     ...
printf("OUTTYPE\n",variable);

By running this through m4, you get valid C source code which contains an integer declaration for variable and the appropriate placeholder in the printf string. If you change your mind, you can change the define macros to

define(`TYPE', `double')
define(`OUTTYPE', `%f')

and run the source code through m4 again.

In practice, you usually put the definitions on the command line instead of in the file. Thus you might say

m4 -D TYPE=double -D OUTTYPE=%f infile.m4c >source.c

to process input from infile.m4c and write it to source.c.
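One possible way to make such a file usable with or without the -D options is to supply defaults with ifdef. This is only a sketch; the default values are the ones from the first example:

```m4
dnl Supply defaults only when TYPE and OUTTYPE were not given with -D.
ifdef(`TYPE', , `define(`TYPE', `int')')dnl
ifdef(`OUTTYPE', , `define(`OUTTYPE', `%d')')dnl
TYPE variable;
printf("OUTTYPE\n",variable);
```

With no -D options, this should produce int variable; followed by printf("%d\n",variable);. With the command line shown above, the same input should produce the double and %f versions instead.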

DIAGNOSTICS

Possible exit status values are:

0

Successful completion.

1

An error occurred.

Note: If you use the built-in m4exit macro, the exit status of m4 is the argument given to m4exit.

PORTABILITY

All UNIX systems. X/Open Portability Guide 4.0. Windows 10. Windows Server 2016. Windows Server 2019. Windows 11. Windows Server 2022. Windows Server 2025.

The BSD version of m4 supports substantially fewer built-in macros than this version and does not support comments or multi-character quotes.

AVAILABILITY

PTC MKS Toolkit for Power Users
PTC MKS Toolkit for System Administrators
PTC MKS Toolkit for Developers
PTC MKS Toolkit for Interoperability
PTC MKS Toolkit for Professional Developers
PTC MKS Toolkit for Professional Developers 64-Bit Edition
PTC MKS Toolkit for Enterprise Developers
PTC MKS Toolkit for Enterprise Developers 64-Bit Edition
