Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is this alive? #11

Closed
piranna opened this issue Jul 21, 2016 · 39 comments
Closed

Is this alive? #11

piranna opened this issue Jul 21, 2016 · 39 comments

Comments

@piranna
Copy link
Collaborator

piranna commented Jul 21, 2016

It was several months without news on the project, is there someone still working on this? I have retake nsh and would like to complete it with the missing parts on the POSIX standard...

@nfischer
Copy link
Collaborator

I've done some work on my own bash grammar. I'm game to update that to the level this one is at, plus the additional bash-specific features I've already implemented.

@parro-it
Copy link
Collaborator

Hi @piranna. I've started to work on this repo to help @dthree with its cash project.

It seems that cash is also stagnating from some months now. If I remember well, @dthree said he got some personal problem that cause him to reduce its OSS commitment.

The branch I started from scratch has reached the point, according to me, where it would greatly benefits from integration in another project. I did not loose interest in this project, it's still on my "nice thinks I want to complete sometime" list.
I was basically waiting for @dthree to solve his problems and start to integrate this in cash, then start to working on other projects by my own.

So, if you want to pick this for nsh that's awesome, I can help you fix issues you'll find and implement missing features. I'm busy working on libui-node at the moment, but we can try to find other contributors if needed.

@nfisher: I started the new branch to strictly follow the POSIX standard, I felt it would have been more difficult using your code as a starting point.

@dthree
Copy link

dthree commented Jul 21, 2016

Hi guys. Really sorry I haven't been able to put more time into these things and have admittedly been letting them stagnate. My life took some large turns and I've had very, very little time for OSS. I do hope that changes soon, but don't let me hold you guys up.

@parro-it
Copy link
Collaborator

@dthree: no worries, you clearly said that months ago. I hope the large turns are nice ones!

@parro-it
Copy link
Collaborator

On the technical side, as fair as i remember I was working on substitutions when I stp coding months ago.

That part requires to start modeling an API to interact with the shell, because the parser has to use the shell to invoke commands or do arithmetics or query the alias table.

@nfischer
Copy link
Collaborator

That part requires to start modeling an API to interact with the shell, because the parser has to use the shell to invoke commands or do arithmetics or query the alias table.

@parro-it is that necessary for the parsing process? I see how that's of course necessary for execution, but it seems to me like you could write a parser for that without any sort of semantic execution.

@piranna
Copy link
Collaborator Author

piranna commented Jul 21, 2016

nsh has been starved during the last months due to a bug on Node.js, but the other day found a temporal hack that could bypass it and finally was able to move it forward a bit, and at this moment it only lacks to fix variable substitution to fully support all the AST nodes provided by js-shell-parse (although I think i should not parse them and instead js-shell-parse or `bash-parser should give me them already parsed...).

So, if you want to pick this for nsh that's awesome, I can help you fix issues you'll find and implement missing features.

Do you have it published on npm or just only here on github? Is it in a "stabilized" status so I can be able to work with? How complete is it compared to js-shell-parse?

I'm busy working on libui-node at the moment, but we can try to find other contributors if needed.

@creationix was looking something similar to libui-node to use it on nucleus-js, but I forgot the name of the project, thanks :-) I think you should talk with him about how to integrate it...

That part requires to start modeling an API to interact with the shell, because the parser has to use the shell to invoke commands or do arithmetics or query the alias table.

Forget about that, just return an AST similar to the one of js-shell-parse and later the programs could be able to use it without problems. At this moment nsh is just an interpreter by directly walking the AST tree, but by doing an early process I could be able to walk on it just only once generating Javascript code and make it faster because the AST is transparent, so I think we should stick on it.

@parro-it is that necessary for the parsing process? I see how that's of course necessary for execution, but it seems to me like you could write a parser for that without any sort of semantic execution.

I think so, as I've said before bash-parser should be dumb and just generate an AST of what it sees, and left the interpretation of what it see to the program that uses that AST.

@parro-it
Copy link
Collaborator

Do you have it published on npm or just only here on github?

No, just here, but that's easy to fix...

Is it in a "stabilized" status so I can be able to work with?

What exactly do you mean by "stabilized"? I think I could merge the branch into master, publish to npm, but expect bugs, this is never used in production before...

How complete is it compared to js-shell-parse?

I'm honestly not sure... I try to follow POSIX standard as close as possible, the various kinds of substitutions are the only parts remaining to implement as fair as I remember see this list

@parro-it is that necessary for the parsing process? I see how that's of course necessary for execution, but it seems to me like you could write a parser for that without any sort of semantic execution.

The problem is that the result of substitutions could alter the AST the grammar should produce. I have to read the standard again to provide some example...

@creationix was looking something similar to libui-node to use it on nucleus-js, but I forgot the name of the project, thanks :-) I think you should talk with him about how to integrate it...

Nice, I'll comment on that issue...

@nfischer
Copy link
Collaborator

The problem is that the result of substitutions could alter the AST the grammar should produce. I have to read the standard again to provide some example...

@parro-it interesting to know. Let me know when you find the example, I'd be curious to learn 😃

@parro-it
Copy link
Collaborator

The first one I was able to recal is about aliases.

Taken from http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html:

2.3.1 Alias Substitution

After a token has been delimited, but before applying the grammatical rules in Shell Grammar, a resulting word that is identified to be the command name word of a simple command shall be examined to determine whether it is an unquoted, valid alias name. However, reserved words in correct grammatical context shall not be candidates for alias substitution. A valid alias name (see XBD Alias Name) shall be one that has been defined by the alias utility and not subsequently undefined using unalias. Implementations also may provide predefined valid aliases that are in effect when the shell is invoked. To prevent infinite loops in recursive aliasing, if the shell is not currently processing an alias of the same name, the word shall be replaced by the value of the alias; otherwise, it shall not be replaced.

If the value of the alias replacing the word ends in a , the shell shall check the next command word for alias substitution; this process shall continue until a word is found that is not a valid alias or an alias value does not end in a .

When used as specified by this volume of POSIX.1-2008, alias definitions shall not be inherited by separate invocations of the shell or by the utility execution environments invoked by the shell; see Shell Execution Environment.

This basically means we have to:

  1. tokenize the source code
  2. identify tokens eligible as alias name (COMMAND tokens if I remember...)
  3. query the shell to discover if identified tokens are effectively defined as aliases
  4. substitute all the alias names with the corresponding values
  5. for all substitued values, recurse to 1) until we don't find any more aliases
  6. proceed with parsing

For example, if you try this on shell:

a 1 2'

It should not correctly parse because of unclosed single quote.
But, if you define an alias:

alias a="echo '"

then a 1 2' become a valid command with a single argument: "1 2"

There are similar issues also with other kind of substitution.

@nfischer
Copy link
Collaborator

@parro-it ^ did not know that! Just tried it out, and it works like you suggested (at least for zsh). Cool!

So I agree, to correctly parse a shell script, it appears you would actually need to have some semantic information (at least for the aliases).

@piranna
Copy link
Collaborator Author

piranna commented Jul 22, 2016

Is it in a "stabilized" status so I can be able to work with?

What exactly do you mean by "stabilized"? I think I could merge the
branch into master, publish to npm, but expect bugs, this is never used in
production before...

That would be good. Are you passing tests?

How complete is it compared to js-shell-parse?

I'm honestly not sure... I try to follow POSIX standard as close as
possible, the various kinds of substitutions are the only parts remaining
to implement as fair as I remember see this list

You are missing argument, command and process substitution to be on par
with js-shell-parse, take a look at
https://github.com/piranna/nsh/blob/master/lib/ast2js/index.js to see the
different nodes that it has and that I have implemented on nsh. On the
other hand you has for loops that's a nice addition... :-)

@parro-it is that necessary for the parsing process? I see how that's of
course necessary for execution, but it seems to me like you could write a
parser for that without any sort of semantic execution.

The problem is that the result of substitutions could alter the AST the
grammar should produce. I have to read the standard again to provide some
example...

You are missing the point, you only need to parse the script and generate
an AST tree, if you need to alter it in a second part then you are
interpreting it, that's a task of the shell or program using it. What if I
want to do a syntax highlighter? It doesn't know if a command is a binary,
a build-in or an alias, and in fact it doesn't matter. If you want
bash-parse to process them it's fine, but do it in a future iteration by
giving it an optional function given by the shells to process them.

@piranna
Copy link
Collaborator Author

piranna commented Jul 22, 2016

I agree that would be a good thing to have, but left it for a future
iteration, by the moment just return the command name. Later, you could be
able to add an optional callback that if defined by the shell, it would be
called to lookup for the alias expansion and do that extra checks and
processing, but if not defined, just return it as is.
El 22/7/2016 2:30, "Nate Fischer" [email protected] escribió:

@parro-it https://github.com/parro-it ^ did not know that! Just tried
it out, and it works like you suggested (at least for zsh). Cool!

So I agree, to correctly parse a shell script, it appears you would
actually need to have some semantic information (at least for the aliases).


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#11 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAgfvmAHfWoB9IVF5tMGk51Eqhtx3SU2ks5qYA8kgaJpZM4JRyLt
.

@parro-it
Copy link
Collaborator

I agree that would be a good thing to have, but left it for a future
iteration, by the moment just return the command name.

Ok, I get your point now.

That would be good. Are you passing tests?

There are 68 passing tests and 3 skipped.

I'll proceed with following steps, in order:

  • Update to latest ava && xo releases. Fix eventual styling issues
  • Push to master
  • Setup travis CI
  • publish a 0.0.1 version
  • Work on missing substitution features as discussed here
  • Wait for your issues & PR

@piranna
Copy link
Collaborator Author

piranna commented Jul 22, 2016

Cool, keep us informed :-)

@piranna
Copy link
Collaborator Author

piranna commented Jul 22, 2016

Also if possible document a bit the format of the AST nodes, so I don't need to reverse engineer them.

@parro-it
Copy link
Collaborator

  • Update to latest ava && xo releases. Fix eventual styling issues
    DONE
  • Push to master

This will be a force push because I started jison branch from scratch , if anyone had cloned the repo, reclone it!

@parro-it
Copy link
Collaborator

❯ travis enable
Detected repository as vorpaljs/bash-parser, is this correct? |yes| 
not allowed to update service hook for vorpaljs/bash-parser

@dthree can you help me? Maybe you can make me a member of vorpaljs?

@piranna
Copy link
Collaborator Author

piranna commented Jul 22, 2016

You can use easy-coveralls to add tests coverage support, too.

Disclaimer: easy-coveralls is mine ;-)

@parro-it
Copy link
Collaborator

Does it work only with mocha? I'm using AVA

@piranna
Copy link
Collaborator Author

piranna commented Jul 22, 2016

Probably it would be updated to use ava, but I haven't used it so I don't know how it works.

@parro-it
Copy link
Collaborator

Ok, it seems it require to use nyc to generate coverage data, because of its AVA multi-process nature.
It would prety easy to setup with nyc and coveralls. ANyway I'll be happy to switch to easy-coveralls if and when it'll support AVA.

Coverage seems good: 87% but percentage is lowered by grammar.js which has a 74%.
It is the jison grammar, since I did not post it to GH coveralls doesn't show details.

I think I'll try to exclude the file from coverage statistics. It would be good to also get grammar rule coverage statistics, but I guess this is not easy to undertand looking at the built js file.

@parro-it
Copy link
Collaborator

Badges are actually referring to my own fork, waiting for help by dthree to proper setup travis

@parro-it
Copy link
Collaborator

Just published version 0.1.0 on npm.
You can try it:

require('bash-parser')('echo world')

should return this AST:

{
        type: 'list',
        andOrs: [{
            type: 'andOr',
            left: [{
                type: 'simple_command',
                name: 'echo',
                suffix: {
                    type: 'cmd_suffix',
                    list: [
                        'world'
                    ]
                }
            }]
        }]
    }

AST node types takes the same name of POSIX grammar rules that define them.
That help to refer to specific grammar rules, but could probably be improved on usability side...
Refer to this test file for examples of AST nodes, I will provide some docs soon.

@piranna
Copy link
Collaborator Author

piranna commented Jul 22, 2016

ANyway I'll be happy to switch to easy-coveralls if and when it'll support AVA.

It's not on my short-term plans, pull-requests are welcome. I think easy-coveralls is really simple, so maybe you could be able to use it as basis to use it with AVA and later we can add support for it :-)

AST node types takes the same name of POSIX grammar rules that define them.
That help to refer to specific grammar rules, but could probably be improved on usability side...
Refer to this test file for examples of AST nodes, I will provide some docs soon.

Nomenclature is not intuitive (cmd_preffix for the environment variables and cmd_suffix for the arguments? Really?) and differ a lot from the one at js-shell-parse specially by having too much parsed layers, but by following the spec probably it will be better to maintain and expand, so it's good for me. Also, maybe it can help me to factor out the common parts of nsh to an independent project :-) I'll try to publish the 0.5 version this weekend and later move to use bash-parse.

@creationix
Copy link

@piranna were you thinking of glfw?

@parro-it
Copy link
Collaborator

Nomenclature is not intuitive (cmd_preffix for the environment variables and cmd_suffix for the arguments? Really?)

😈 no, it isn't. The whole POSIX standard is a mess!
But the trick to use grammar rule name is really useful while developing, I think we can improve AST structure in future releases. Let's open issues for the worst part (cmd_prefix and cmd_suffix are particularly bad 😉 )

@piranna
Copy link
Collaborator Author

piranna commented Jul 22, 2016

From the tests (don't know about the spec) I would try to improve:

  • cmd_prefix (environment variables)
  • term, list and similar nodes are mostly no-ops, not sure if we should remove them to make AST simpler or left them to be fully spec compliant...

@piranna
Copy link
Collaborator Author

piranna commented Jul 22, 2016

Let's open issues for the worst part (cmd_prefix and cmd_suffix are particularly bad )

If we stick to the spec to have it as reference then we should left them as-is, but at least we should consider the two points I put on the previous message.

@piranna
Copy link
Collaborator Author

piranna commented Jul 22, 2016

@piranna were you thinking of glfw?

No, I was thinking about node-libui.

@parro-it
Copy link
Collaborator

@parro-it
Copy link
Collaborator

term, list and similar nodes are mostly no-ops, not sure if we should remove them to make AST simpler or left them to be fully spec compliant...

Spec doesn't dictate how an AST should be built. The advantage is that you easily know
a list node come from this rule, a cmd_prefix node from this other one and so on...

@piranna
Copy link
Collaborator Author

piranna commented Jul 22, 2016

I have read the grammar spec and now thinks make more sense, since there was several use cases we were not taking in account... Ok, that's fine, but if possible I would use the same nomenclatures and style and case, so instead of andOrs I would use and_or and change all the left and right for their names on the spec, so you can be able to point the AST directly to the spec.

@parro-it
Copy link
Collaborator

I have read the grammar spec and now thinks make more sense, since there was several use cases we were not taking in account... Ok, that's fine, but if possible I would use the same nomenclatures and style and case, so instead of andOrs I would use and_or and change all the left and right for their names on the spec, so you can be able to point the AST directly to the spec.

Sure. I doesn't remember why I don't use an exact match for all nodes...

@piranna
Copy link
Collaborator Author

piranna commented Jul 22, 2016

Sure. I doesn't remember why I don't use an exact match for all nodes...

Probably due to the use of the standard style.

@parro-it
Copy link
Collaborator

Probably due to the use of the standard style.

I use xo, but yes, probably because it doesn't like underscores...

@parro-it
Copy link
Collaborator

@nfisher: I started the new branch to strictly follow the POSIX standard, I felt it would have been more difficult using your code as a starting point.

Sorry @nfisher I confuse you to be the author of js-shell-parse. What is your grammar repo link?

@dthree
Copy link

dthree commented Jul 22, 2016

@dthree can you help me? Maybe you can make me a member of vorpaljs?

Of course!

@nfischer
Copy link
Collaborator

Sorry @nfisher I confuse you to be the author of js-shell-parse. What is your grammar repo link?

@parro-it no worries. The grammar I was writing is also a javascript-based PEG. I wrote it for a transpiler project I started. Here's the source file.

It's not quite as far along as js-shell-parse, so I'd have to do a bit of refactoring first for us to be able to use it. Unfortunately, the Ohm parsing framework (the one I've been using) is still lacking important features to be able to comfortably parse bash, so it's far from complete.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants