On Forth lexeme translator #75
Retro does this kind of thing also, and is very Forthy in design.
Also, Lisp does exactly the 'quote mechanism as well, and it's very
fundamental to the cool things Lisp can do... which also seem rolled into
Retro so it can do those cool code-processing things also.
Also related: J.
See jsoftware.com (I think). It's an ASCII re-do of the legendary APL
<https://en.wikipedia.org/wiki/APL_(programming_language)>, which required a
special keyboard covered in special symbols.
J (and/or APL) take this idea of prefixes to the extreme, even extending
it with the notion of special verb characters and composition words... but
at the same time, both aimed more at a Lisp-like 'execute only after all
code is known' mode, rather than Forth's immediacy, the 'execute *right now*'
style, a.k.a. "words should perform themselves". J is infix, with a
different rule for all 'order of operation' decisions -- they had to
abandon the traditional 'order of operations' stuff in favour of one simple
rule for the composition concept to make sense and work.
Anyhow, both concepts are really more 'high level language' oriented, and
pretty much fly directly in the face of Forth's KISS / "less is better"
principle, I feel.
-- Remy
On Thu, Sep 13, 2018 at 6:51 AM ruv wrote:
@rdrop-exit in #73, on September 3:
Yes, Forth already has a handful of simple prefix words that should only
be used judiciously and with care. That does not invalidate the point of *"Let
commands perform themselves"*, which extols the benefits of not being
syntax driven. The Forth approach is more akin to a sophisticated
interactive assembler than a traditional parsed language
compiler/interpreter.
What is "syntax driven", and why is it bad?
Whatever it is:
1. The prefix words (or the *parsing words*) are a kind of syntax (or
even grammar).
2. Forth already has such words and it cannot live without them (for
the moment).
3. Every time a prefix word is used, it does not allow the next
word to perform itself.
4. The programmers (the users of a Forth system) need the
functionality that is usually achieved via prefix words, but
prefix words bring a set of problems (see [ertl98]). Nevertheless,
programmers create new prefix words since they don't have (or haven't found)
another way to achieve the desired functionality.
If we remove the space between a prefix word and the next word (or replace
it by something), we will solve all these problems. This new "word" (as a
single lexeme) now performs itself. It is not a real word (in the same
way as numbers are not), and so it does not have the mentioned problems.
In place of ' something and ['] something (which break copy-paste of code
fragments from outside a definition into it, and vice versa) we can
always use 'something (it is a quoting: it prevents something from
being executed and returns its *xt*).
In place of S" abc" we can use "abc" and forget about the S" word altogether.
In place of to a we could use to->a or ->a or to:a.
Is to:a a special syntax? Perhaps yes, but to the same degree as to a is.
Regarding the Forth text interpreter loop (that is referenced by the "*Let
commands perform themselves*" tip): it becomes simpler. It doesn't need to
know anything even about words and numbers; it just calls the *lexeme
translator*. And handling of words, numbers, strings, quotings, etc. can be
added to the lexeme translator as simply as new words are added to a vocabulary.
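A minimal sketch of such a loop in standard Forth may help make the idea concrete. It handles interpretation state only and searches only FORTH-WORDLIST. The names try-word, try-number, translate-lexeme and run are hypothetical; they are not taken from the resolver API proposal, and a real translator would also deal with compilation state and the full search order.

```forth
\ Sketch only: hypothetical names, interpretation state only.

\ Try the lexeme as a word in FORTH-WORDLIST and execute it if found.
: try-word ( c-addr u -- i*x true | c-addr u false )
  2dup forth-wordlist search-wordlist
  if >r 2drop r> execute true else false then ;

\ Try the lexeme as an unsigned single-cell number in the current BASE.
: try-number ( c-addr u -- n true | c-addr u false )
  2dup 0 0 2swap >number        \ ( c-addr u ud c-addr' u' )
  nip 0= if drop nip nip true else 2drop false then ;

\ The lexeme translator: the first handler that accepts the lexeme wins.
: translate-lexeme ( c-addr u -- i*x )
  try-word   if exit then
  try-number if exit then
  2drop -13 throw ;             \ -13: undefined word

\ The loop itself knows nothing about words or numbers.
: run ( "rest of line" -- i*x )
  begin parse-name dup while translate-lexeme repeat 2drop ;
```

With these definitions, a line such as `run 2 3 +` is translated lexeme by lexeme. Supporting a new kind of lexeme then means adding one more `try-...` handler to translate-lexeme, while the loop stays unchanged.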
So now the discussion is not about whether it is necessary,
but about how best to implement it, and what API to choose. Many Forth
systems have supported this feature for more than ten years already, and we
need a single unified API now. Can anybody suggest some improvements in this
regard?
Here are my two cents: Lexeme resolver mechanism API
<https://github.com/ruv/forth-design-exp/blob/master/docs/resolver-api.md>.
*References*
[ertl98] M. Anton Ertl, "State-smartness: Why it is Evil and How to Exorcise it"
<https://www.complang.tuwien.ac.at/anton/euroforth/ef98/ertl98.pdf>
|
Recognizers, and similar proposals to add more parsing machinery to Forth, are not necessary. Excising state-smartness from a Forth in no way requires their adoption, thankfully so as even if accepted into "standard" Forth it would be as an optional (experimental?) extension. |
But the space between "prefix" words is to let the parser find the end of the word. Now what you're going to have to do is identify the prefix, scan all the way to the end of an arbitrarily long string, match what the word is, put the string somewhere to be worked on, and then do the word. This makes no sense. You've now made the parser way more complicated than it needs to be and removed the ability of programmers to start a word with a quote symbol, because that will start the parser off looking for a string to compile. That idea makes no sense. Why is it a good idea to make the parser harder to write and understand, just to lose a space between an uncommon class of words and their argument? The way it works right now is perfectly okay. |
In reply to @gordonjcp
It is not the Forth text interpreter's business. If you need it, you do it. If you don't need it, the capability itself does not bother you. Actually
Nothing is removed. Shadowing depends on the order. Usually you choose (and it is the default) to give higher precedence to the regular words. It is the same as with wordlists: you choose which is first in the search order.
The way it works right now will continue to work. Nothing is removed. Only a new capability is added; more precisely, this capability is already present in many Forth systems, but a common unified API should be designed and added. |
What is your way (as a Forth system user) to add support for floating point numbers without recompiling the Forth system? Or for hex numbers in the form 0x12DF4?
State-smartness becomes a problem when you try to postpone a state-smart immediate word. So the point is to provide another way, one that makes state-smart immediate words unnecessary.
Even if this capability is implemented in your Forth system, you (as the Forth system user) are not affected by it if you don't use it. |
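To illustrate the library-level part of the 0x12DF4 question, here is a sketch of the conversion step alone, in standard Forth. The name 0x-lexeme? is hypothetical. Hooking such a word into the text interpreter depends on the system's recognizer or resolver API and is deliberately not shown.

```forth
\ Sketch only: 0x-lexeme? is a hypothetical name.
\ Convert a lexeme of the form 0x<hex digits> to a single-cell number.
: 0x-lexeme? ( c-addr u -- n true | false )
  dup 3 < if 2drop false exit then                 \ need "0x" plus a digit
  over 2 s" 0x" compare if 2drop false exit then   \ must start with "0x"
  2 /string                                        \ skip the "0x"
  base @ >r hex  0 0 2swap >number  r> base !      \ ( ud c-addr' u' )
  nip if 2drop false exit then                     \ leftover characters
  drop true ;                                      \ keep the low cell
```

The word restores BASE before returning, so it behaves the same whatever base the caller is in. Lexemes that are not of this form simply yield false, which lets the next handler in a chain have a try.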
Real code examples:
These words don't work correctly outside definitions. |
What is wrong with recompiling a Forth system? That's the whole point of meta-compilation. Personally I'd be very wary of a Forth that doesn't come with full source and a meta-compiler.
As I mentioned earlier solving state-smartness issues does not require recognizers.
Nor should they since they define compile-time behavior, and therefore should be compile-only directives. There are various approaches to having a name result in a non-default combination of compile-time and interpret-time behaviors, e.g. Stephen Pelc's NDCS proposal, various dual-xt approaches, Chuck Moore's approach in cmForth. Addressing state-smartness issues does not require bringing recognizers into the picture, they are superfluous, such needs can and should be addressed at a lower level whether or not the particular Forth implements any optional recognizers extension. |
Nothing wrong. It is just convenient to have some features at the level of libraries.
Yes, you just don't have another variant. So, let us imagine a spectrum. At one end all lexemes are prefixed with prefix words. At the other end there is no need for prefix words at all. How do we choose the right point on this spectrum: which lexemes should have a prefix, and which should not? It seems that your point is: let the implementer define it (i.e. at the Forth system core level), and the user is forced to use prefix words in all cases other than the variants hardcoded in the system. My point is: let the user define it (i.e. at the level of libraries and applications).
Agree. The recognizer mechanism just provides the technical capability to also solve the state-smartness issues at the user level (in a library), even if these issues are not solved by the Forth system itself.
Agree.
Moore's cmForth approach can be implemented via recognizers.
Agree. But in cmForth this issue is solved at a relatively high level. OTOH there is no standard API that addresses state-smartness issues. Probably we should go through the process of designing such an API, similar to what is happening with recognizers. |
I prefer hex numbers in the form $ 12df4.
"Yes, you just don't have another variant."
I don't understand what you mean by that, one can make as many variants as one wants, what's to prevent it?
"Conceptually this approach leads to # prefix for numbers and CALL prefix to call a word."
In Forth the interpreter looks up and executes words for us, there is no need for a CALL prefix.
"But there is no standard directive for compile-only. Only an exception can be thrown."
I assume the new standard will provide a way to designate compile-only definitions as part of whatever solution is adopted by the standard for dealing with non-default combinations of compile-time and interpret-time behaviors. At the moment I use the following approach for designating compile-only definitions:
: foobar .... ; compile-only
If I want a definition to be both immediate and compile-only I use:
: foobar ... ; directive
Where directive is defined as:
: directive ( -- ) immediate compile-only ;
"OTOH there is no standard API that addresses state-smartness issues. Probably we should go through the process of designing such an API, similar to what is happening with recognizers."
The proposals dealing with non-default combinations of compile-time and interpret-time behaviors are a way of addressing the state-smartness issues. |
Recognizers make it easier to make Forth code unrecognizable.
|
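I mean a prefix parsing word ($ in your case). The only available standard variant is a prefix word. And this word will either be state-smart (bad), or will not work in some states (bad). With the recognizer mechanism one more variant is available: numbers in the form $12df4 can be supported along with the usual numbers. In the NDCS (or a similar) approach a prefix word (e.g. $) can be defined in a special way without the state-smartness issues. For my taste, in the case of numbers the form $12df4 is just more convenient than $ 12df4. What is wrong with having numbers in the form $12df4? |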
That presumably won't be the case once the standard finally addresses the non-default combination of compile-time and interpret-time behaviors. In the meantime, I have no state-smart issues with words like
(note:
Cosmetically nothing really, de gustibus non est disputandum. |
Did somebody make a proposal with an explicit specification of this? I'm aware of some articles and papers, but they are not a specification.
And your Forth system chooses one or the other depending on state, does it? How to create an alias
But this approach relies on the quite confusing concept that FIND can return a different xt depending on STATE. I would prefer a stable FIND that returns the same xt regardless of STATE.
It seems if we don't apply
: $ ( <token> -- u ) ($) tt-lit ; immediate
: z" [compile] s" ['] drop tt-xt ; immediate
Can anybody check this idea? |
I have no
That's one way yes.
I have no Keep in mind though, that this is how I addressed the state-smartness issues in my personal Forth, the standard committee will address these issues in their own way.
There's the question of which xt should |
On Sat, 15 Sep 2018, Mitch Bradley wrote:
Recognizers make it easier to make Forth code unrecognizable.
Yeah, as if that's a stretch ;-) ...
-- Robert Sciuk
|
I would prefer that the same code fragment can be used outside definitions, inside definitions, inside postponing fragments, and inside macro fragments without changes. This feature can be achieved via recognizers (or via resolvers in my proposal). The following example will not work in your approach, but can work via resolvers:
m: $ ($) tt-lit ; \ NB: even shorter definition
: compile-stuff ]] dup $ AD <> if . else drop then [[ ;
: test 123 [ compile-stuff ] ;
Conceptually the variable name does not matter. It could even be a DEFER that is switched among [...]
In the example above I meant only one xt per word and state-smartness.
\ a classic state-smart parsing word.
: $ ( "number" -- u | ) ($) state @ if lit, then ; immediate
\ a wrapper (alias)
: h# [compile] $ ; immediate
\ variation in the wrapper definition
: h# [ ' $ compile, ] ; immediate
My hypothesis is that without using the word POSTPONE you can't show any state-smartness issue for these words.
An NDCS variant (following Stephen Pelc's paper "Special Words in Forth"):
: $ ( "number" -- u | ) ($) ; ndcs: ($) lit, ;
\ how to define a wrapper?
\ The following variants will not work
: h# [compile] $ ; immediate
: h# [ ' $ compile, ] ; immediate
: h# postpone $ ; immediate
: h# ['] $ execute ; immediate
How to make a wrapper in this case? |
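For readers who have not met the POSTPONE problem referred to above, here is a small sketch in standard Forth, along the lines of the argument in [ertl98]. The words five, use-five, compile-five and broken are hypothetical names chosen for this illustration only.

```forth
\ Sketch only: hypothetical names.

\ A state-smart word: leaves 5 when interpreting, compiles 5 when compiling.
: five ( -- 5 | )  5 state @ if postpone literal then ; immediate

\ Used directly inside a definition it behaves as expected.
: use-five ( -- 5 )  five ;

\ FIVE is immediate, so POSTPONE compiles a call to it.
: compile-five ( -- )  postpone five ;

\ Here COMPILE-FIVE runs between [ and ], i.e. in interpretation state.
\ FIVE sees STATE = 0, so it leaves 5 on the stack and compiles nothing.
\ The DROP removes that stray 5; the body of BROKEN ends up empty.
: broken ( -- )  [ compile-five drop ] ;
```

The intent of `[ compile-five ]` was to compile the literal 5 into broken, yet nothing is compiled, because five decides what to do by looking at STATE at the moment it runs.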
Abstracting away the differences between interpretation and compilation is not on my radar. It's not something I would ever pursue in my Forths, so I can't really offer any constructive comments in this regard.
Could you describe what the point of this code is and why I would want to structure code this way? It's clear as mud to me.
I don't think you're understanding what is meant by state-smart words in Forth. At any point in time your Forth's outer-interpreter is either in an interpreting state or a compiling state. Obviously the issue is not about eradicating this state in the outer interpreter. The state-smart word issue concerns the problems that can ensue from having a word that, as it executes, alters its behavior based on the then current state of the outer-interpreter. For example
I haven't studied Stephen Pelc's proposal, perhaps he can chime in and answer your questions about it. |
I see. I just meant that a state-smart word can be aware of the current state without
The following variant is state-smart as well, although it does not refer
: fubar ( -- ) ['] translator defer@ ['] compiler = if frobnicate else twiddle then ; immediate |
It is a matter of metaprogramming. For example I have
\ Return control to the calling definition if the top value is not zero,
\ otherwise drop the top value (that is zero).
: ?ET ( 0 -- | x -- x ) \ exit on true returning this true
postpone dup postpone if postpone exit postpone then postpone drop
; immediate
Using
: ?ET ( 0 -- | x -- x )
]] dup if exit then drop [[
; immediate
I suggested a slightly more readable variant:
: ?ET ( 0 -- | x -- x )
postpone{ dup if exit then drop }postpone
; immediate
Using this approach, the fragment
If I need to use numbers or string literals in such fragments, I would prefer to use them "as is" without any special preparation. |
Absolutely, and as you saw |
Unfortunately I'm not really familiar with current standard compliant ways of doing such things, I can only show you how I would do it in my own Forth in the meantime. My first instinct for a reusable low level word such as this would be to implement it as a primitive, in fact in my current Forth I have such a primitive, I named it My second instinct would be to just make a normal word and call it, this definition would work in my current Forth:
If for whatever reason I preferred the code to be inlined into another definition rather than called by another definition, I could do it this way:
I also have a primitive called
[snip...] |
Perhaps your
Quite. Just a side note. I can guess, even in your Forth system the first variant
In-line expansion is a useful capability, but it does not solve the problem of code generation. What if we want to define
: unless postpone{ 0= if }postpone ; immediate
Definitely, it is interesting how some problem can be solved in a certain Forth system. |
My 0ditch is equivalent to ?dup 0;, or ?dup 0= if exit then.
Thanks, I don't intend on running my code on other than my own Forths.
The stack effect diagram is explicit about the return stack effect, it's up to the programmer to be mindful whether that is in fact the stack effect he needs in a particular situation.
I'm not sure what the problem of code generation is in any practical sense. What is the pain caused by this problem?
I already have solutions for my needs, I'm most interested in gauging the "portability penalty", i.e. overhead and implementation complexity, to comply with the standard proposals. |
On 09/21/2018 11:36 AM, Mark W. Humphries wrote:
My 0ditch is equivalent to ?dup 0;, or ?dup 0= if exit then.
IF EXIT THEN is just ?EXIT
Similarly, IF LEAVE THEN is just ?LEAVE
etc.
|
Yes, but usually implemented as a primitive rather than the equivalent high level Forth. |
A problem in the mathematical sense, not the everyday sense. This meaning is connected with conditions and questions, answers and solutions, not with pain. The code generation problem stated above is the question of how to avoid having to transform a source code fragment in order to postpone it. Regarding "standard proposals": there should be a common basis for discussion. |
If your goal is simply to have a version of
|
I use a postfix: 0FFh, 10d, 10101b
I haven't found any STATE issues.
On Sat, Sep 15, 2018 at 4:09 AM ruv wrote:
In reply to @rdrop-exit
I prefer hex numbers in the form $ 12df4.
Yes, you just don't have another variant.
I don't understand what you mean by that, one can make as many variants as
one wants,
what's to prevent it?
I mean the prefix word ($ in your case). The only available standard
variant is a prefix word. And this word either will be state-smart (bad),
or will not work in some states (bad).
With recognizers mechanism one more variant is available: the numbers in
form $12df4 can be supported along with usual numbers. In NDCS (or alike)
approach a prefix word (e.g. $) can be defined in special way without the
state-smartness issues.
For my taste, in the case of numbers the form $12df4 is just more
convenient than $ 12df4.
What is wrong with having numbers in the form $12df4?
|
I don't have state at all. I use prefixes like colorForth: hex numbers are $ff, binary numbers are %101... |
"I use prefix like colorforth, hex are $ff, bin are %101..."
Just noting that every Forth I've ever used, I've modified binary mode to allow for '.' characters in place of zeroes. I find it much easier on the eyes and quick to see what the actual values are. This was especially nice when being used for mono-chromatic and sprites in memory, too.
%.111....
%11111...
%11111...
%.1.1....
%11.11...
So much easier on the eyes... ;-) |
Nice trick! I'll add it to r3:
https://github.com/r3www/r3
|
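The dotted binary notation described above (a '.' standing in for a zero bit) can be sketched in standard Forth as follows. The name %bits is hypothetical, and the sketch assumes one address unit per character and a lexeme that begins with "%".

```forth
\ Sketch only: %bits is a hypothetical name.
\ Convert a lexeme such as %.1.1.... to a number, reading "1" as a one bit
\ and any other character ("." or "0") as a zero bit.
: %bits ( c-addr u -- n )
  1 /string                     \ skip the leading "%"
  0 rot rot                     \ ( acc c-addr u )
  over + swap ?do               \ loop over the character addresses
    2* i c@ [char] 1 = 1 and +  \ shift the accumulator, add the new bit
  loop ;
```

For the sprite rows quoted above, `s" %.1.1...." %bits` gives 80 and `s" %11111..." %bits` gives 248.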
I use postfix BASE denomination in my FISH.
101010b
2FDh
27d
I also allow for easier on the eyes formatting of the TIB input by not
computing dots when converting numbers:
10.10.01b
Only a trailing dot creates a double number
10.01.10.
|
Does anybody here support the idea of a common API for the resolver mechanism? Please use the following reactions on this post:
|
@rdrop-exit in #73, on September 3
What is "syntax driven", and why is it bad?
Whatever it is:
If we remove the space between a prefix word and the next word (or replace it by something), we will solve all these problems. This new "word" (as a single lexeme) now performs itself. It is not a real word (in the same way as numbers are not), and so it does not have the mentioned problems.
In place of ' something and ['] something (which break copy-paste of code fragments from outside a definition into it, and vice versa) we can always use 'something (it is a quoting: it prevents something from being executed and returns its xt).
In place of S" abc" we can use "abc" and forget about the S" word altogether.
In place of to a we could use to->a or ->a or to:a.
Is to:a a special syntax? Perhaps yes, but to the same degree as to a is.
Regarding the Forth text interpreter loop (that is referenced by the "Let commands perform themselves" tip): it becomes simpler. It doesn't need to know anything even about words and numbers; it just calls the lexeme translator. And handling of words, numbers, strings, quotings, etc. can be added to the lexeme translator as simply as new words are added to a vocabulary.
So now the discussion is going not about whether it is necessary or not, but about how to better implement it, and what API to choose. Many Forth systems support this feature for more than ten years already, and we need a single unified API now. Can anybody suggest some improvements in this regard?
Here is my two cents: Lexeme resolver mechanism API.
References
[ertl98] M. Anton Ertl, State-smartness Why it is Evil and How to Exorcise it