Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong generated code for arguments with default value in Python target #4443

Open
renatahodovan opened this issue Oct 17, 2023 · 6 comments
Open

Comments

@renatahodovan
Copy link
Contributor

renatahodovan commented Oct 17, 2023

For this grammar - which is written to produce a Python parser:

grammar Args;

start [arg=10]
    : 'a'
    ;

ANTLR 4.13.1 generates the following Python code:

class ArgsParser ( Parser ):
    ...
    class StartContext(ParserRuleContext):
        __slots__ = 'parser'

        def __init__(self, parser, parent:ParserRuleContext=None, invokingState:int=-1, arg=None):  # Wrong initializer for arg.
            super().__init__(parent, invokingState)
            self.parser = parser
            self.arg = 10  # Superfluous init.
            self.arg = arg

    ...

    def start(self, arg):  # Missing initializer for the default value of arg.
        localctx = ArgsParser.StartContext(self, self._ctx, self.state, arg)
        ...

While this equivalent construct written for Java:

grammar Args;

start [int arg = 10]
    : 'a'
    ;

... results the following:

public class ArgsParser extends Parser {
        ...
	public static class StartContext extends ParserRuleContext {
		public int arg = 10;  // Note that the 'arg' member has its default value set in the StartContext class.
		public StartContext(ParserRuleContext parent, int invokingState) { super(parent, invokingState); }
                 // Note that the 'arg' argument has its default value set in the constructor of StartContext.
		public StartContext(ParserRuleContext parent, int invokingState, int arg = 10) {
			super(parent, invokingState);
			this.arg = arg;
		}
         ...

         // Note that the 'arg' argument has its default value set in the start method.
	public final StartContext start(int arg = 10) throws RecognitionException {
		StartContext _localctx = new StartContext(_ctx, getState(), arg);
		...

I think, that Python target should have something like this:

class ArgsParser ( Parser ):
    ...
    class StartContext(ParserRuleContext):
        __slots__ = 'parser'

        def __init__(self, parser, parent:ParserRuleContext=None, invokingState:int=-1, arg=10):
            super().__init__(parent, invokingState)
            self.parser = parser
            self.arg = arg

    ...

    def start(self, arg=10): 
        localctx = ArgsParser.StartContext(self, self._ctx, self.state, arg)
        ...
@ericvergnaud
Copy link
Contributor

ericvergnaud commented Oct 17, 2023

In the above example, the generated Java code does not compile. Java 8 doesn't not support type inference required for public arg = 10;, and no version of Java supports default parameters.
Your rule should read start [int arg]
AFAIK antlr4 does not support default rule parameter values. It requires a type and a name separated by a space. Both are simply inlined in the generated code. It supports default values for local variables only i.e. locals [int i=10].

@renatahodovan
Copy link
Contributor Author

@ericvergnaud I know that Java won't compile. The only reason I've pasted the Java output to show the way the Java runtime is generated compared to the Python version. The bug report is about to show that the Python code is wrong.

@ericvergnaud
Copy link
Contributor

Sorry, but what is wrong is the grammar i.e. start [arg=10] is simply not supported. Since the input (your grammar) is incorrect, the output (the generated code) cannot be correct. The bug is that the tool doesn't tell you that the syntax is wrong, not that the generated code is wrong. Would you mind changing the issue title accordingly ?

@renatahodovan
Copy link
Contributor Author

@ericvergnaud I still think that there is a misunderstanding here. I want ANTLR to copy-paste my argument with the initialization as is. I don't expect to interpret it or so. And this is the expected behaviour according to the documentation:

The args, locals, and return [...] are generally in the target language but with some constraints. The [...] string is a comma-separated list of declarations either with prefix or postfix type notation or no-type notation. The elements can have initializer such as [int x = 32, float y] but don't go too crazy as we are parsing this generic text manually in [ScopeParser](https://github.com/antlr/antlr4/blob/master/tool/src/org/antlr/v4/parse/ScopeParser.java).

Since Python has a no-type notation and since initializers are supported, I cannot see why start [arg=10] would be wrong.

I've checked another targets (Java and CPP) and they do exactly what is written. (For those targets, the content of the argument was adapted ofc. to match the expectations of the specific target language.)

One more note: the generated Python code of the grammar above is syntactically correct, it runs but behaves differently as the other targets.

@ericvergnaud
Copy link
Contributor

An initializer and a default parameter value are 2 different things.
The doc does say don't go too crazy, it seems you're trying to push the limit just a bit too far.
IIRC the Python generator only supports typed parameters without a default value, and caters for typed parameters required by our CI. That is unlikely to be changed, because multi-target support is paramount.
There is a slowly progressing PR to address multi-target embedded production code, which would be a good starting place to support what you are looking for. See PR #4345

@renatahodovan
Copy link
Contributor Author

@ericvergnaud Okay, I see the reason of the misunderstanding eventually! My initial sloppiness with the Java example made you think that I wanted to have multi-target support. But I didn't even think of such thing! I simply wanted to have an argument with a default value to be generated correctly into my Python language parser. I only intended to use Python runtime. I expected that this argument code will be copied just as is into the generated Python parser to the appropriate position. I saw that it was the intention of ANTLR, but it made some mistakes during the parser generation. (I put some notes into the generated Python language code snippet above.) Initially, I used the same grammar to generate a parser for the Java runtime to emphasise the differences how the various targets handles the same argument. But I see now that it was misleading, hence I've updated the original report to be more precise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants