golex is a flex-compatible lexical analyser generator, written for Go 1.
The below description has been pilfered from flex's description in Debian, adapted to describe golex:
golex is a tool for generating scanners: programs which recognize lexical patterns in text. It reads the given input files for a description of a scanner to generate. The description is in the form of pairs of regular expressions and Go code, called rules. golex generates as output a Go source file, which defines a routine yylex(). When the routine is run, it analyzes its input for occurrences of the regular expressions. Whenever it finds one, it executes the corresponding Go code.
golex supports all features for regular expression matching as described in flex's manual, except:
- character class set operations ([a-z]{-}[aeiou]), and
- matching EOF (<<EOF>>).
EOF-matching is intended to be added to a future release of golex. Character class operations, however, will not be, unless Go's own regular expression library (based on RE2) comes to support them.
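Until then, a set difference can usually be spelled out by hand as ordinary ranges. The sketch below uses only Go's regexp package (not golex itself) to check that a hand-expanded class matches the same letters as flex's [a-z]{-}[aeiou]. The names consonant and agreesWithSetDifference are made up for this example, and I'm assuming the same expanded class can be dropped into a golex rule.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// consonant spells out flex's [a-z]{-}[aeiou] by hand as plain ranges.
var consonant = regexp.MustCompile(`^[b-df-hj-np-tv-z]$`)

// agreesWithSetDifference checks every lowercase ASCII letter and reports
// whether the hand-written class matches exactly the non-vowels.
func agreesWithSetDifference() bool {
	for c := 'a'; c <= 'z'; c++ {
		isVowel := strings.ContainsRune("aeiou", c)
		if consonant.MatchString(string(c)) == isVowel {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(agreesWithSetDifference())
}
```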
A number of utility functions required for full flex emulation, mostly those that manipulate the buffer or the buffer stack, are also not yet available.
The full set of omissions (in regular expressions and otherwise) is detailed in the GitHub Issues for this repository.
golex and the scanners it generates are not fast (unlike those of flex). Rather than implementing its own regular expression engine and crafting a state machine based on that, golex simply defers to Go's built-in regular expressions, and matches character-by-character. Pull requests to right this wrong are gratefully accepted! :)
Self-contained examples, taken from throughout the flex manual, have been converted to Go and are included as *.l in this distribution. I invite you to compare them to the original flex examples to note how similar they are. A few of them can be found as username.l, counter.l, and toypascal.l in the golex distribution.
A test script for building and running an example is included. For example:
./test toypascal.l
will build golex, run golex on toypascal.l, build the resulting Go code, and then run the resulting lexer.
This is not the first attempt at writing a golex utility, though it might be the first with the aim of behaving as similarly to the original flex as possible.
Other golexen include (but are not limited to):
- Ben Lynn's Nex tool.
- CZ.NIC's package at git://git.nic.cz/go/lex.
- CZ.NIC's tool at git://git.nic.cz/go/golex (it's not like the name is terribly original!).
Copyright 2011-2022 Asherah Connor. Licensed under the 2-Clause BSD License.