-
-
Notifications
You must be signed in to change notification settings - Fork 1
/
init.gale
147 lines (131 loc) · 7.41 KB
/
init.gale
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
{
Copyright (C) 2023 Josh Klar aka "klardotsh" <[email protected]>
Permission to use, copy, modify, and/or distribute this software for any
purpose with or without fee is hereby granted.
THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH
REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND
FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT,
INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM
LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR
OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
PERFORMANCE OF THIS SOFTWARE.
}
// Welcome to init.gale, the bring-up file for the canonical Gale
// implementation, which is designed around using the host language (in our
// case, Zig) as little as possible, and self-hosting as much as possible in
// Gale itself. As such, this is almost certainly not the most performant
// possible implementation of Gale: whether I circle back to address this is
// left as a debate for my future self. That said, the desire to
// nearly-entirely self-host Gale led to many of its design decisions,
// especially the ability to be both a relatively high-level language and a
// (perhaps less performant than C/Zig/Rust) lower-ish level language at the
// same time. It is also where the concept of "Trusting Words", which we'll
// learn more about in a bit, came from. From here on, we'll take "Gale", "the
// Gale Nucleus", etc. to refer to this canonical implementation.
// Out of the box, the Gale Nucleus provides us with almost nothing: this file
// is read off the disk as a buffered stream and run through the exceptionally
// primitive word parser which understands:
//
// - // through end of line as a comment
// - signed and unsigned integers (42, -42, 42/u, 42/i, etc.)
// * TODO: suffixes impl
// - symbols, references, and bare-word references ('Foo, &Bar, swap)
//
// The default collection of words is also extremely sparse: we have some basic
// Trusting Word implementations of each of the Generic Stack Manipulation
// words described in `WORDS.md`, of each of the Memory Manipulation words from
// the same document, and then some words specific to this implementation of
// Gale to provide hooks to the runtime, and ways to override default
// behaviors. However, *not* included are pretty much any of the
// actually-interesting details of the language: ergonomic ways to define
// words, any concept of shapes or signatures or dynamic word dispatch,
// docstrings, etc. We'll build all of that right here. We'll be building this
// in "top-down" fashion: given a "userspace" concept as advertized in the
// documentation or example files, we'll build it, and any of its dependent
// concepts and structures that have yet to be built. Thus, the ordering of
// this file will be a *bit* more jumbly than, say, JonesFORTH if you've read
// that (which in general takes a more bottom-up approach, which is admittedly
// more readable even to me...).
// Before we get into defining our first bits of functionality, this is a great
// time to go over a bit of Style and Convention: both in this file, and for
// Gale as a language ecosystem:
//
// - Words generally are lower-kebab-cased (eg. `say-hello-world`)
// - Exception 1: Trusting Words are always screaming-snake-cased with a
// leading exclamation point (eg. `!ALLOCATE_SOMETHING`)
// - Exception 2: Nucleus Words are always screaming-snake-cased with a
// leading at-sign (eg. `@NUM_OBJECTS_ALLOCATED`)
// - Shapes are almost always Pascal-cased (eg. `FooBarAble`)
// - As a general guideline, the acting working stack for a word should quite
// rarely be deeper than four objects
// First, let's get some prettier syntax in here by creating our first word -
// :!, to create a Trusting Word. Since we have no type system yet (nor Shapes
// to populate it with), this is not yet the spec-compliant version of this
// word, just a low-level helper for bootstrapping.
//
// Usage: :! !ALLOCATE_BLOCK @SIZED_OPAQUE ;
// ^ This is not actually the implementation of !ALLOCATE_BLOCK we'll
// land on, read along for that.
// I really, really, do not want to implement Immediate Words in Gale, not even
// in this weird bootstrappy form of the language that will never see userspace
// light of day. If you write a Gale implementation, you're welcome to use
// Immediate Words to solve problems like this. I will instead abuse a "private
// space" (implementation and documentation found over in Zig-land), and start
// prescribing meaning to the empty data therein. Our first contestant: a u8
// byte dedicated to an enumeration of interpreter states, enabling us to move
// from "execution mode" into "symbol mode" (where bare words immediately
// become symbols, even if they exist in the Word or Shape dictionaries) or
// into "ref mode" (bare words become refs to whatever the identifier points
// to, if they are defined. At this low level, unresolveable refs simply panic
// the interpreter).
//
// Since enums in the Gale sense require a type system too, we'll just define
// words to put names to these enum members. See? We're not so unlike a FORTH
// after all :) Since ! and @ are taken, my non-kernel "private" words will be
// %-prefixed.
// @LIT ( @1 -> Word )
// Wraps any value type in an anonymous word which will return that value when
// called. Generally useful when defining words which need to refer to numbers,
// strings, symbols, etc. at runtime.
// Keep these in sync with runtime.zig
// :! %INTERP_MODE_EXEC 0/u ;
0/u @LIT '%INTERP_MODE_EXEC @DEFINE-WORD-VA1
// :! %INTERP_MODE_SYMBOL 1/u ;
1/u @LIT '%INTERP_MODE_SYMBOL @DEFINE-WORD-VA1
// :! %INTERP_MODE_REF 2/u ;
2/u @LIT '%INTERP_MODE_REF @DEFINE-WORD-VA1
// For convenience, let's make some toggle words for these states:
// :! %PS_INTERP_MODE 0/u ;
0/u @LIT '%PS_INTERP_MODE @DEFINE-WORD-VA1
// :! %>_ %PS_INTERP_MODE @PRIV_SPACE_SET_BYTE ;
&@PRIV_SPACE_SET_BYTE %PS_INTERP_MODE '%>_ @DEFINE-WORD-VA1
// :! %>EXEC %INTERP_MODE_EXEC %>_ ;
&%>_ %INTERP_MODE_EXEC '%>EXEC @DEFINE-WORD-VA3
// :! %>SYMBOL %INTERP_MODE_SYMBOL %>_ ;
&%>_ %INTERP_MODE_SYMBOL '%>SYMBOL @DEFINE-WORD-VA3
// :! %>REF %INTERP_MODE_REF %>_ ;
&%>_ %INTERP_MODE_REF '%>REF @DEFINE-WORD-VA3
// @PRIV_SPACE_SET_BYTE ( UInt8 UInt8 -> nothing )
// | |
// | +-> address to set
// +-------> value to set
// As discussed over in the Nucleus source, interpreter modes take effect for
// exactly one word by default. However, we can override this behavior using
// @BEFORE_WORD ( Word -> nothing ), where the word referenced is of the
// signature ( Symbol <- nothing ). The symbol represents the word about to be
// processed exactly as it was passed to the interpreter. It must be left on
// the stack for the word to be handled (if removed, the word is silently
// ignored. This can almost certainly be used for all sorts of insane stuff I
// haven't thought of yet, have fun.)
// @CONDJMP2 ( Word Word Boolean -> nothing )
//
// Immediately executes the near word if the boolean is truthy, and the far
// word otherwise. Effectively all userspace conditionals can be built with
// this primitive and its else-less counterpart,
// @CONDJMP ( Word Boolean -> nothing )
// :! %REF_WORDS_UNTIL_SEMICOLON '; @EQ! %>REF %>EXEC @CONDJMP2 ;
&@CONDJMP2
&%>EXEC &%>REF
&@ZAPN2 &@EQ ';
'%REF_WORDS_UNTIL_SEMICOLON @DEFINE-WORD-VA
// @EQ ( @2 @1 <- Boolean )