This document describes bspcomp, the BSP compiler program included in this project, and the version of the syntax that it uses. It does not document the BSP format itself or the instruction set; the specification covers both, and duplicating them here would be redundant. This document does, however, describe the overall syntax that the compiler uses and the pseudo-instructions that it supports.
The script compiler is invoked with two parameters: the source file and the target BSP filename. For instance, the following invocation:
./bspcomp patch.txt patch.bsp
would create a patch.bsp file from the source code in patch.txt.
Note that it is impossible to use more than one source file in the invocation. However, this can be remedied by having
the main source file include the rest; this way, multiple source files can freely be combined into a single resulting
BSP file. All inclusion paths are resolved relative to the current working directory (that is, the directory from which
bspcomp is invoked).
This is a sample source file:
    define result, 0xff
Main:
    checksha1 #result, .empty_file_hash
    jumpnz #result, Error
    seek 0
    subtract #result, MessageEnd, Message
    writedata Message, #result
    exit 0
.empty_file_hash
    hexdata da39a3ee5e6b4b0d3255bfef95601890afd80709 ;hash for an empty input (0 bytes)
Error:
    print .error
    menu #result, .options
    jumptable #result
    dw .go_ahead
    dw .exit
.options
    dw .go_ahead_string
    dw .exit_string
    dw -1
.error
    string "The input file is not empty. Continue?"
.go_ahead_string
    string "Yes"
.exit_string
    string "No (abort)"
.go_ahead
    truncate 0
    jump Main
.exit
    exit 1
Message:
    db "Hello world!"
MessageEnd:
This script would patch an empty file into a file saying "Hello world!". If the source file isn't empty, the script would give the option to the user of continuing anyway (deleting the data previously in the file in the process) or aborting the patching process.
Several features of the syntax can be shown in the above sample script:
- Indented lines contain instructions (or pseudo-instructions). Non-indented lines contain labels. (Indentation can be any amount of whitespace; that is, spaces and/or tabs.) A label can only contain letters (uppercase or lowercase; labels are case-sensitive), digits and underscores, and it must not begin with a digit. A label can also be preceded by a dot, in which case it is a local label. A label can be on its own on a line, or it can be followed by a colon and, optionally, an instruction on the same line.
- A label can be used in an instruction (or pseudo-instruction) as a value. The value of a label is the address at which the label will be located when the BSP file is compiled and built; the compilation process resolves the labels into constant values. A local label can only be used in the same scope it is defined, the scope being between one global (that is, non-local) label and the next; the .empty_file_hash label defined in the above sample script can only be used between the Main and Error labels.
- Labels must be unique; local labels only need to be unique within their scope.
- Blank lines are ignored. A semicolon marks a comment; anything between a semicolon and the end of the line is ignored as well.
- Arguments to instructions are given after the instruction itself. The first argument is separated from the name of the instruction simply by whitespace; further arguments are separated from prior ones by a comma surrounded by arbitrary amounts of whitespace.
- String arguments to pseudo-instructions that accept them are surrounded by double quotes. A " character itself inside a string argument may be escaped by duplicating it; there are no other escape sequences in strings. (In particular, the \ character has no useful effect other than representing itself in a string argument.)
- Numerical constants may be written explicitly (in decimal, in hexadecimal preceded by 0x, or in octal preceded by 0; negative constants will be converted to unsigned constants in the usual two's complement form), or they may be represented by a label or a definition. A label can only be used where a word constant (or a word-sized immediate argument) would fit; a definition may be used in any context where a number is required, including a variable number. Hence, #result is simply a clearer way of writing #255 in the above example, since result has been defined to represent 0xff (which is 255).
- Variables are written in the form #<number>, where <number> is either a number or a defined symbol representing a number. Therefore, #255 means variable number 255. Note that variable numbers are byte-sized; a number greater than 255 will be silently truncated to fit.
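The lexical rules in the list above can be modeled in a few lines of Python. This is only an illustrative sketch, not bspcomp's actual code: the function names (is_valid_label, parse_constant, parse_variable, unquote) are hypothetical, and wrapping negative constants to 32 bits is an assumption based on words being 32-bit.

```python
import re

# Label names: letters, digits and underscores, not starting with a digit;
# an optional leading dot marks a local label.
LABEL_RE = re.compile(r"\.?[A-Za-z_][A-Za-z0-9_]*")


def is_valid_label(name):
    """Return True if name follows the label naming rules."""
    return LABEL_RE.fullmatch(name) is not None


def parse_constant(text, definitions=None):
    """Parse a decimal, 0x-hexadecimal or 0-octal constant, or a defined name."""
    definitions = definitions or {}
    if text in definitions:
        return definitions[text]
    negative = text.startswith("-")
    digits = text[1:] if negative else text
    if digits.lower().startswith("0x"):
        value = int(digits[2:], 16)
    elif len(digits) > 1 and digits.startswith("0"):
        value = int(digits[1:], 8)
    else:
        value = int(digits, 10)
    if negative:
        value = -value
    # Assumption: constants wrap to 32 bits in two's complement form.
    return value & 0xFFFFFFFF


def parse_variable(text, definitions=None):
    """Parse '#<number>' and truncate the variable number to one byte."""
    if not text.startswith("#"):
        raise ValueError("not a variable: " + text)
    return parse_constant(text[1:], definitions) & 0xFF


def unquote(text):
    """Strip the surrounding quotes and undo the doubled-quote escape."""
    if len(text) < 2 or text[0] != '"' or text[-1] != '"':
        raise ValueError("not a quoted string: " + text)
    return text[1:-1].replace('""', '"')
```

Under these assumptions, parse_variable("#result", {"result": 0xff}) and parse_variable("#255") name the same variable, and a backslash passes through unquote unchanged.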
Pseudo-instructions are instructions for the compiler. They aren't actually instructions, since an engine won't be able to execute them (or know they were there in the first place); instead, they are used for various purposes other than writing code itself, such as adding data to the BSP file, or creating symbolic names for numbers.
The list of available pseudo-instructions that this compiler supports is the following:
db any[, any[, ...]]
dh any[, any[, ...]]
dw any[, any[, ...]]
These pseudo-instructions respectively insert bytes, halfwords (16-bit) or words (32-bit) into the BSP directly, given
by value. They can take any number of arguments, and the bytes, halfwords or words are inserted in the order they are
given; halfwords and words are inserted in little-endian format. Note that only dw can accept labels as arguments,
since labels are always 32-bit values; also, these pseudo-instructions will silently truncate arguments that are too
large for the data type they accept.
All of these pseudo-instructions also accept strings as arguments, given between quotes. They will insert the strings
as given, encoded in UTF-8; the dh and dw pseudo-instructions will zero-pad the strings to make their lengths
respectively multiples of 2 and 4 bytes. Strings and regular values can be mixed in the same line.
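To make the byte layout concrete, the following Python sketch models how db, dh and dw arguments could be turned into bytes. The function name encode_data is hypothetical, and the sketch assumes that each string argument is padded individually; it does not reproduce bspcomp's actual implementation and does not handle labels.

```python
def encode_data(width, args):
    """Encode db/dh/dw arguments; width is 1 (db), 2 (dh) or 4 (dw) bytes."""
    out = bytearray()
    mask = (1 << (8 * width)) - 1
    for arg in args:
        if isinstance(arg, str):
            raw = arg.encode("utf-8")
            # Assumption: every string is zero-padded on its own to a
            # multiple of the element width.
            raw += b"\x00" * (-len(raw) % width)
            out += raw
        else:
            # Oversized (or negative) values are silently truncated,
            # then written in little-endian order.
            out += (arg & mask).to_bytes(width, "little")
    return bytes(out)
```

For example, encode_data(2, [0x1234, "abc"]) models the line dh 0x1234, "abc": the halfword is stored low byte first, and the three-byte string gains one zero byte of padding.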
string "string"[, "string"[, ...]]
This pseudo-instruction inserts strings followed by null (0x00) bytes. It is provided purely as a convenience for
clearer code; the following:
    string "abc"
    string "123", "890"
is equivalent to:
    db "abc", 0
    db "123", 0, "890", 0
Just like the data pseudo-instructions (db, dh and dw), this pseudo-instruction can take any number of arguments.
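The equivalence with db can be expressed as a short Python sketch. The function name encode_string is hypothetical; UTF-8 encoding is assumed because the data pseudo-instructions encode their string arguments that way.

```python
def encode_string(*strings):
    """Model the string pseudo-instruction: each argument plus a null byte."""
    return b"".join(s.encode("utf-8") + b"\x00" for s in strings)
```

Here encode_string("123", "890") yields the same bytes that db "123", 0, "890", 0 would insert.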
hexdata hexstring[, hexstring[, ...]]
This pseudo-instruction inserts raw data given in hexadecimal, of arbitrary length. The arguments must be sequences of hexadecimal digits of any length (hexadecimal digits being 0-9, A-F, a-f); if any argument has an odd length, it is padded with an extra 0 digit at the end to make it even.
This pseudo-instruction inserts the data in the order it is given. Therefore, this:
    hexdata 0123456789abcdef
    hexdata fedc, 3210
is equivalent to:
    db 0x01, 0x23, 0x45, 0x67, 0x89, 0xab, 0xcd, 0xef
    db 0xfe, 0xdc, 0x32, 0x10
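The padding rule for odd-length arguments is easy to get backwards, so here is a small Python sketch of the conversion. The function name encode_hexdata is hypothetical and the code only models the behavior described above; it is not taken from bspcomp.

```python
import string


def encode_hexdata(*hexstrings):
    """Model hexdata: convert each hexadecimal argument to raw bytes, in order."""
    out = bytearray()
    for digits in hexstrings:
        if any(ch not in string.hexdigits for ch in digits):
            raise ValueError("not a hexadecimal string: " + digits)
        if len(digits) % 2:
            # An odd-length argument gets an extra 0 digit at the END.
            digits += "0"
        out += bytes.fromhex(digits)
    return bytes(out)
```

With this model, encode_hexdata("abc") produces the two bytes 0xab and 0xc0, since the padding digit is appended rather than prepended.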
align value
This pseudo-instruction aligns the next line's address to a multiple of the specified value, by padding the BSP file
with zeros. It is an error to do align 0.
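The amount of padding follows directly from the current address. The Python sketch below shows the arithmetic; align_padding is a hypothetical helper name, and the sketch assumes that any non-zero value is acceptable, not only powers of two.

```python
def align_padding(address, alignment):
    """Return how many zero bytes bring address up to a multiple of alignment."""
    if alignment == 0:
        raise ValueError("align 0 is an error")
    # Python's modulo of a negative number is non-negative, so this is the
    # distance to the next multiple (0 if the address is already aligned).
    return -address % alignment
```

For instance, at address 5 an align 4 line would need 3 bytes of padding, while at address 8 it would need none.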
define name, value
This pseudo-instruction defines a symbol to be equal to a certain value. The value must be a 32-bit constant, and the
name must be a valid name (the restrictions are the same as for label names, only that defined names cannot be local).
It is possible to change this value by using another define pseudo-instruction, which changes the value for further
lines of code; for instance, the following example:
    define value, 1
    set #1, value
    define value, 5
    set #2, value
    define value, 3
    set #value, 10
would set variable number 1 to 1, variable number 2 to 5 and variable number 3 to 10.
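The key point is that definitions are applied in source order, so each line sees the most recent value. The Python sketch below traces the example under simplifying assumptions: it understands only define and set lines, accepts decimal constants only, and uses a hypothetical function name (run_example). It illustrates the ordering rule and is not how bspcomp or a BSP engine is implemented.

```python
def run_example(lines):
    """Process define and set lines in order; return the final variable values."""
    definitions = {}  # symbol name -> current value
    variables = {}    # variable number -> value

    def value_of(token):
        # A defined name resolves to its most recent value; anything else is
        # assumed to be a decimal constant in this sketch.
        return definitions[token] if token in definitions else int(token)

    for line in lines:
        op, rest = line.split(None, 1)
        args = [arg.strip() for arg in rest.split(",")]
        if op == "define":
            definitions[args[0]] = int(args[1]) & 0xFFFFFFFF
        elif op == "set":
            if not args[0].startswith("#"):
                raise ValueError("expected a variable: " + args[0])
            number = value_of(args[0][1:]) & 0xFF  # variable numbers are byte-sized
            variables[number] = value_of(args[1])
        else:
            raise ValueError("unsupported line: " + line)
    return variables
```

Feeding the six lines of the example to run_example gives {1: 1, 2: 5, 3: 10}, matching the description above.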
include "filename"
incbin "filename"
These pseudo-instructions include another file into the current one. The filename must be passed as a string; if it is a relative filename, it will be resolved relative to the current working directory of the compiler.
The include pseudo-instruction will include the file as a source file, and therefore compile it as if it had been
transcribed into the original source file in the location the include pseudo-instruction is; on the other hand, the
incbin pseudo-instruction will simply insert the contents of the binary file as a blob, without any parsing.
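The difference between the two, and the fact that relative paths resolve against the working directory rather than against the including file, can be sketched in Python as follows. The names expand_includes and read_incbin are hypothetical, and the sketch makes several simplifying assumptions: source files are read as UTF-8, an include line carries no label or trailing comment, and quoted filenames contain no doubled quotes. It models the described behavior only.

```python
import re

# An indented include line with a quoted filename (no label, no comment).
INCLUDE_RE = re.compile(r'\s+include\s+"(.*)"\s*$')


def expand_includes(path):
    """Return the lines of a source file with include lines expanded in place."""
    lines = []
    # open() resolves relative paths against the current working directory,
    # which matches the rule for inclusion paths described above.
    with open(path, encoding="utf-8") as source:
        for line in source:
            line = line.rstrip("\n")
            match = INCLUDE_RE.match(line)
            if match:
                lines.extend(expand_includes(match.group(1)))
            else:
                lines.append(line)
    return lines


def read_incbin(path):
    """Return the raw contents of a file, as incbin would insert them."""
    with open(path, "rb") as blob:
        return blob.read()
```

One consequence of this rule: if sub/a.txt contains include "b.txt", the file that gets read is b.txt in the directory where the compiler was started, not sub/b.txt.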