Refactor parsing pipeline #1200
base: main
This reverts commit da36fcb.
…such that it can be populated at construction time
…line
# Conflicts:
#   core/src/main/scala/com/databricks/labs/remorph/intermediate/expressions.scala
#   core/src/main/scala/com/databricks/labs/remorph/intermediate/plans.scala
#   core/src/main/scala/com/databricks/labs/remorph/intermediate/trees.scala
#   core/src/main/scala/com/databricks/labs/remorph/intermediate/workflows/JobNode.scala
This reverts commit f955172.
I'm not quite sure about this. We now have more code in the individual PlanParsers, and I am not sure that we need to pass the token stream around, as we could just call one or more processors on it after it is created. Sort of like the optimizer applies many rules to the LogicalPlan.
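A minimal sketch of the "processors on the token stream" idea, in the spirit of optimizer rules applied in sequence. Everything here is hypothetical: `Token`, `TokenProcessor`, `TrimComments`, `DropBlankHidden`, and `TokenPipeline` are illustrative stand-ins, not remorph or ANTLR types.

```scala
// Hypothetical stand-in for an ANTLR token; not org.antlr.v4.runtime.Token.
final case class Token(text: String, hidden: Boolean)

// A processor transforms the token sequence, much like a rule transforms a plan.
trait TokenProcessor {
  def apply(tokens: Seq[Token]): Seq[Token]
}

// Example processor: trims surrounding whitespace from hidden-channel tokens.
object TrimComments extends TokenProcessor {
  def apply(tokens: Seq[Token]): Seq[Token] =
    tokens.map(t => if (t.hidden) t.copy(text = t.text.trim) else t)
}

// Example processor: drops hidden-channel tokens that contain only whitespace.
object DropBlankHidden extends TokenProcessor {
  def apply(tokens: Seq[Token]): Seq[Token] =
    tokens.filterNot(t => t.hidden && t.text.trim.isEmpty)
}

object TokenPipeline {
  // Applies the processors in order; each one sees the previous one's output.
  def run(processors: Seq[TokenProcessor], tokens: Seq[Token]): Seq[Token] =
    processors.foldLeft(tokens)((acc, p) => p(acc))
}
```

With this shape, the generic parser would own the sequence of processors, and each dialect would only contribute its own list.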
I don't think that this is the correct approach. The separate steps are now one big step and we have moved code from the generic PlanParser into each of the two implementations. Eventually we will have N implementations and any changes to the parsing sequence would have to be made in every instance.
I think that we need to start again with the comment processing and first write a design/strategy document, stealing from:
https://docs.google.com/document/d/1s3nvTklaFgt4a-u_lUhR5tQO_L-S_Br-YAtbF-XxhSw/template/preview
The existing implementation forbids that since the CommonTokenStream is transient. We need simultaneous access to the CommonTokenStream and the generated LogicalPlan.
I'm not sure this is a problem since ANTLR's parsing sequence cannot be changed.
I'm working on that, experimenting with an approach that leaves Catalyst intact.
I don't think we do - I think we process the token stream as soon as we get it.
Can we close this now? We have no need for this change.
It's marked on hold in the project. A change is needed, we haven't decided which one yet.
OK - but please remember that you think a change is needed; I don't believe anyone else thinks that. This has nothing to do with allowing different transpilers in the project, if that is where you are coming from here?
Refactors the LogicalPlan pipeline such that it's possible to use the original TokenStream to fetch the comment tokens and apply them appropriately to the plan items.
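As a rough illustration of what applying comment tokens to plan items could look like, the sketch below attaches each comment to the first plan item that starts after it. It assumes plan items record the index of their first token; `SourceToken`, `PlanItem`, and `CommentAttacher` are hypothetical names for this example and do not reflect the code in this PR.

```scala
// Hypothetical stand-ins; not the remorph LogicalPlan or ANTLR token types.
final case class SourceToken(index: Int, text: String, isComment: Boolean)
final case class PlanItem(name: String, startTokenIndex: Int, comments: Seq[String] = Seq.empty)

object CommentAttacher {
  // Attaches each comment to the first item starting after it;
  // comments after the last item are attached to the last item.
  def attach(tokens: Seq[SourceToken], items: Seq[PlanItem]): Seq[PlanItem] = {
    val sorted = items.sortBy(_.startTokenIndex)
    val grouped: Map[Int, Seq[String]] = tokens
      .filter(_.isComment)
      .groupBy { c =>
        val idx = sorted.indexWhere(_.startTokenIndex > c.index)
        if (idx >= 0) idx else sorted.length - 1
      }
      .map { case (i, cs) => i -> cs.map(_.text) }
    sorted.zipWithIndex.map { case (item, i) =>
      item.copy(comments = item.comments ++ grouped.getOrElse(i, Seq.empty))
    }
  }
}
```

This only works while the token sequence and the plan items are available together, which is the constraint discussed in the conversation above.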
Progresses #869
Supersedes #1191