Replace cabal project parsing with Parsec #8889
@ulysses4ever as announced, here is an early PR of the changes. Still lots of work, but I have a rough outline of the ticket. I would be glad if you would take a look!
@grayjay I know you’re more of a solver person, but if you happen to have some time for advising and feel comfortable in this part of the code, your contribution would be priceless.
@jgotoh there's a bi-weekly cabal devs meeting, where, I am sure, people would be delighted to hear your experience so far. The closest one is this Thursday (Apr 20th), 1 PM Eastern Time (US). The link to a Jitsi video call is posted before the meeting on #hackage at libera.chat (can be browsed using Element/Matrix or an IRC client). Are you interested?
@ulysses4ever Thank you very much for your invitation! I've already wondered about the cabal devs' main communication channel. I am definitely interested and will be glad to attend :)
@jgotoh cool! let me know if you want me to mail you the jitsi link beforehand.
I'm not familiar with this part of the code, but I tried to answer a couple questions based on my understanding of the parser behavior.
@jgotoh Is there any way I can help you with this?
@andreabedini many thanks for your offer! I will push some changes this week adding Also ongoing small reviews of the code I push are helpful if you spot anything.
Thank you for the invite, but unfortunately that would be 3:00 am here 😄 😞 I might just go through the PR in my own time and leave some comments (if there's anything I can comment on).
parsec = parsecNumJobs

parsecNumJobs :: CabalParsing m => m NumJobs
parsecNumJobs = ncpus <|> numJobs
See Distribution.Setup.Simple.Common.numJobsParser for the original, non-parsec parser.
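To make the shape of `ncpus <|> numJobs` concrete, here is a minimal sketch of the same two-alternative structure written with `ReadP` from base instead of Cabal's `CabalParsing` class. The `Jobs` type, the `$ncpus` spelling of the first alternative, and the names `jobsP` and `parseJobs` are assumptions made for illustration. This is not the code from the PR, and any validation the real parser performs is not modelled.

```haskell
import Control.Applicative ((<|>))
import Data.Char (isDigit)
import Text.ParserCombinators.ReadP (ReadP, eof, munch1, readP_to_S, string)

-- Hypothetical stand-in for the NumJobs value parsed in the PR.
data Jobs = UseNCpus | FixedJobs Int
  deriving (Eq, Show)

-- Two alternatives combined with (<|>): a keyword or a decimal number.
-- The "$ncpus" spelling is an assumption of this sketch.
jobsP :: ReadP Jobs
jobsP = ncpus <|> fixed
  where
    ncpus = UseNCpus <$ string "$ncpus"
    fixed = FixedJobs . read <$> munch1 isDigit

-- Succeeds only when the whole input is consumed by exactly one parse.
parseJobs :: String -> Maybe Jobs
parseJobs s = case readP_to_S (jobsP <* eof) s of
  [(j, "")] -> Just j
  _ -> Nothing
```

In GHCi, `parseJobs "4"` gives `Just (FixedJobs 4)`, `parseJobs "$ncpus"` gives `Just UseNCpus`, and `parseJobs "12x"` gives `Nothing`.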
Expects an "Invalid subsection" warning from the Parsec ProjectConfig parser. Currently the warning is issued twice. The first warning, "Warning: <ROOT>/else.project, else.project: Unrecognized section '_' on line 3", is issued by the legacy parser. The second warning, "Warning: dir-else/else.config:3:5: Invalid subsection "_"", is issued by the Parsec parser. Once we stop executing the legacy parser, we can remove the duplicate warning.
That is good for a development check. Are you able to put the comparison and the call to the legacy function behind a flag in I ran a check of this with a recent test from #10629. The test passes but then I commented out these lines and it fails:
$ git diff
diff --git a/cabal-install/src/Distribution/Client/ProjectConfig.hs b/cabal-install/src/Distribution/Client/ProjectConfig.hs
index 29bcb7605..df111de17 100644
--- a/cabal-install/src/Distribution/Client/ProjectConfig.hs
+++ b/cabal-install/src/Distribution/Client/ProjectConfig.hs
@@ -838,14 +838,14 @@ readProjectFileSkeleton
dir@DistDirLayout{distProjectFile, distDownloadSrcDirectory}
extensionName
extensionDescription = do
- legacyPcs <- readProjectFileSkeletonLegacy verbosity httpTransport dir extensionName extensionDescription
+ -- legacyPcs <- readProjectFileSkeletonLegacy verbosity httpTransport dir extensionName extensionDescription
exists <- liftIO $ doesFileExist extensionFile
if exists
then do
monitorFiles [monitorFileHashed extensionFile]
pcs <- liftIO $ readExtensionFile verbosity extensionFile
monitorFiles $ map monitorFileHashed (projectConfigPathRoot <$> projectSkeletonImports pcs)
- unless (legacyPcs == pcs) (error (show callStack ++ "\nParsec: " ++ show pcs ++ "\nLegacy: " ++ show legacyPcs))
+ -- unless (legacyPcs == pcs) (error (show callStack ++ "\nParsec: " ++ show pcs ++ "\nLegacy: " ++ show legacyPcs))
pure pcs
With #10644, I'm using a
fetch pci

fetch :: FilePath -> IO BS.ByteString
fetch pci = case parseURI pci of
This is missing the trim of #10629.
fetch pci = case parseURI $ trim pci of
Great idea, thanks a lot for your support here! I will try to add the flag to test the implementation without the legacy parser.
The #10644 PR will probably be merged before this one here, right? Thanks for the heads up, I will need to migrate it to the Parsec implementation.
It will if you approve it ;-)
parseElseClauses :: [Field Position] -> IO (ParseResult (Maybe ProjectConfigSkeleton), ParseResult ProjectConfigSkeleton)
parseElseClauses x = case x of
  (Section (Name _pos name) _args xs' : xs) | name == "else" -> do
Can you use the literal "else" in place of name and avoid the guard?
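A minimal sketch of what this suggestion amounts to, using simplified stand-in types rather than the real Distribution.Fields ones. Here `Name` carries a plain `String` and an `Int` position, both assumptions of this sketch. With the real field name type, a string literal in a pattern would probably also need the OverloadedStrings extension.

```haskell
-- Simplified, hypothetical stand-ins for the Name and Field types.
data Name = Name Int String

data Field
  = Section Name [String] [Field]
  | Leaf Name String

-- Guard version, mirroring the shape of the line under review.
startsWithElseGuard :: [Field] -> Bool
startsWithElseGuard fields = case fields of
  (Section (Name _pos name) _args _body : _) | name == "else" -> True
  _ -> False

-- Literal-pattern version: the string literal replaces the bound variable
-- and the guard, and matches exactly the same inputs.
startsWithElseLiteral :: [Field] -> Bool
startsWithElseLiteral fields = case fields of
  (Section (Name _pos "else") _args _body : _) -> True
  _ -> False
```

Both functions agree on every input. The literal pattern simply states the condition where the name is matched instead of in a separate guard.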
Implements #6101, #7748. Finally ready for review!
In this PR I implemented #6101 to replace the legacy cabal.project parser (module Distribution.Client.ProjectConfig.Legacy) with an implementation based on Parsec.
My implementation is based heavily on the existing Parsec parser of .cabal files, see module Distribution.PackageDescription.Parsec.
My goal was to recreate the exact grammar the Legacy Parser parses using the modern Parsec framework with FieldGrammars etc.
This means the Legacy Parser and Parsec Parser should always return the same ProjectConfig value for a file (at least in this PR).
About CI:
All of the Ubuntu validation checks are running. Unfortunately windows-latest ghc-9.10.1 fails, but windows ghc-9.8.2 succeeds.
The PR consists of several main parts:
Main Entrypoint Distribution.Client.ProjectConfig.readProjectFileSkeleton
The main entrypoint into the Parsec parser is the function readProjectFileSkeleton in Distribution.Client.ProjectConfig, which parses a cabal.project file.
Note that the legacy parser can still be executed via readProjectFileSkeletonLegacy in the same file. I want to remove it, but in another ticket, because this PR is huge already.
My current implementation of readProjectFileSkeleton also calls readProjectFileSkeletonLegacy to compare the Parsec value to the legacy value, throwing errors if they are not equal.
Currently I use it to verify that the parsers' grammars match.
I will delete this functionality when the PR is ready to merge, but for now I will leave it in the PR so reviewers can build other projects and verify that the parsers' values do not diverge.
Note that because .project files are parsed by both the Legacy parser and the Parsec parser, all warnings are currently emitted twice - once by each parser.
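The comparison described here could be sketched roughly as follows, with the check placed behind a switch so the legacy parser only runs when requested. The `ParserCheck` type and `readChecked` function are hypothetical names for this illustration. The PR itself calls both parsers unconditionally and reports a mismatch via `error`.

```haskell
import Control.Monad (unless)

-- Hypothetical switch; the PR does not define a flag with this name.
data ParserCheck = ParsecOnly | CompareWithLegacy

-- Run the Parsec action and, if requested, the legacy action as well,
-- failing with both values in the message when they disagree.
readChecked :: (Eq a, Show a) => ParserCheck -> IO a -> IO a -> IO a
readChecked ParsecOnly _runLegacy runParsec = runParsec
readChecked CompareWithLegacy runLegacy runParsec = do
  legacy <- runLegacy
  parsec <- runParsec
  unless (legacy == parsec) $
    ioError . userError $
      "Parsec: " ++ show parsec ++ "\nLegacy: " ++ show legacy
  pure parsec
```

With `ParsecOnly` the legacy action is never executed, so its warnings are not emitted a second time. With `CompareWithLegacy` the behaviour matches the development check described above.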
The function readProjectFileSkeleton currently uses an old variant of readAndParseFile from Cabal/src/Distribution/Simple/PackageDescription.hs (new variant is here) that is not based on SymbolicPaths yet but uses FilePath instead.
If we want to migrate it to use SymbolicPaths I need some help here :) Maybe this should be part of a future ticket too.
readProjectFileSkeleton calls Distribution.Client.ProjectConfig.Parsec.parseProject which leads me to the next main part:
Module Distribution.Client.ProjectConfig.Parsec
This module contains the Parsec parser of ProjectConfigs.
Function parseProject is a copy of function parseProject of the Legacy parser (Distribution.Client.ProjectConfig.Legacy.parseProject) but it now calls the Parsec version of parseProjectSkeleton.
parseProjectSkeleton
parseProjectSkeleton is a port of function parseProjectSkeleton of module Distribution.Client.ProjectConfig.Legacy.
It does not use the deprecated type Field from Distribution.Deprecated.ParseUtils anymore but Distribution.Fields.Field instead.
I tried to change as little as possible here, so we still parse the fields/sections into ProjectConfigs, import and parse other cabal.project files and process the conditional structure.
An interesting bit includes the new processing of imports (Fields with name equaling "import"):
We need to use liftPR to be able to compose ParseResults that involve IO actions (for example downloading imported cabal.project files via HTTP).
Composing two actions involves executing a ParseResult resulting in a PRState and executing another ParseResult passing in the previous PRState.
Unfortunately the PRState that is generated when executing a ParseResult does not contain the file source where the warnings/errors came from.
So we need to print any warnings/errors that came up when parsing an imported file before we return the ParseResult, otherwise we lose the source file of the warnings/errors.
I added the implementation of liftPR for Parsec ParseResults to module Distribution.Fields.ParseResult because I needed its constructor.
Please pay special attention to reviewing this implementation, as it was quite complex to develop.
It works in the tests :)
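The state threading described above can be illustrated with a toy model. The `PR` type below is a hypothetical simplification: its state is only a list of warning strings, whereas the real ParseResult and PRState carry more. The sketch does not model source files or the printing of warnings for imported files. It only shows how the state produced by the first result is handed to the second one, whose construction needs IO.

```haskell
-- Hypothetical, simplified model: the state is just the warnings so far.
newtype PR a = PR {runPR :: [String] -> ([String], Maybe a)}

-- Emit one warning and succeed with a value.
warn :: String -> a -> PR a
warn w x = PR (\ws -> (ws ++ [w], Just x))

-- Run the first result from an empty state, perform the IO step that builds
-- the second result, then make the second result start from the combined state.
liftPR :: (a -> IO (PR b)) -> PR a -> IO (PR b)
liftPR f pr =
  case runPR pr [] of
    (ws, Nothing) -> pure (PR (\ws0 -> (ws0 ++ ws, Nothing)))
    (ws, Just x) -> do
      next <- f x
      pure (PR (\ws0 -> runPR next (ws0 ++ ws)))
```

For example, `liftPR (\n -> pure (warn "second" (n + 1))) (warn "first" 1)` yields a `PR` that, when run with `[]`, returns `(["first","second"], Just 2)`. The warnings of both steps survive, but nothing in this state records which file each one came from, which is the limitation described above.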
Another interesting bit is function fieldsToConfig in parseProjectSkeleton. Here we produce ProjectConfig values by parsing the current fields with a FieldGrammar. Afterwards we parse the sections (such as source-repository-package, ...) with function goSections.
FieldGrammar
The ProjectConfig FieldGrammar is defined in module Distribution.Client.ProjectConfig.FieldGrammar.
It took me a while to reverse engineer all the possible field names and find out their named field equivalents in the ProjectConfig record, but it should be complete now.
Parsing of sections such as source-repository-package is not done in here, see the next point below.
There are some fields such as projectConfigDryRun, projectConfigOnlyDeps that afaik cannot be specified in a cabal.project file and need to be passed in via command line flags.
They are marked by comments in the following form: -- cli flag: projectConfigDryRun.
I also needed to add some Parsec instances that I added to the modules defining the respective type.
For example the Parsec instance of OptimisationLevel is in module Distribution.Simple.Compiler directly below the type definition.
Parsing of Sections
Parsing of sections happens in a StateT monad (SectionParser) modifying the ProjectConfig we got out of the FieldGrammar.
For the SectionParser I again took a lot of inspiration from the parser of .cabal files in module Distribution.PackageDescription.Parsec, see type SectionParser.
Currently I need to implement parsing of repository sections here as I've missed it until now but I hope it is the last type of section that is missing.
Integration Tests in cabal-testsuite/PackageTests/ProjectConfig/Parsec/cabal.test.hs
A suite of integration tests.
Parsing values and comparing them to expectations.
Note that I also run the legacy parser here to make sure my expected values do not differ from the non-Parsec implementation; this will be removed in the future.
Furthermore, I've tested the implementation by successfully building the following Haskell repositories: http2, cabal, tls, hashable, hspec, ghc-lib-parser, texmath, text, lens, megaparsec.
Future PRs
There are also other aspects that I want to address, but I want to do this in future PRs because the current one is too big already:
Cabal-tests/tests/ParserTests.warningTests to test the correct output of warnings/errors of the Parsec parser
Distribution/Client/ProjectConfig/Legacy.hs