Migrate from the .cabal format to a widely supported format #7548
Comments
Note that some people have mentioned
|
However, I expect that there would be a dhall library for generating |
Is this
|
Also note that this already exists as dhall-to-cabal. While its goal was to generate .cabal files, that's not the only solution. A more integrated solution would be a cabal-install that can actually consume these files. I'm not saying this is the solution, just mentioning this as prior art. I'll step out of the conversation for now and let others share their thoughts, but if anyone wants to talk about Dhall in particular here, I do have thoughts |
It is the current .cabal file in a not-home-grown syntax |
When I think about it, I don't think these are mutually exclusive tickets: an exact printer gets us a reasonable source representation. This is good for a few reasons: we can derive translational tools from the same representation - we only need to change the parser and the printer. This frees up efforts to migrate between formats!
Of the three suggestions, TOML is the most attractive. YAML has too much variable syntax, and JSON is aesthetically (and mechanically) displeasing for me to write as a human. TOML's grammar is minimal and admits a small and easy to generate + verify parser and lexer (note:
👍 |
As an experiment, I took a library I wrote a few years ago and manually converted its Cabal file to TOML. I like it! The conversion can be totally systematic. TOML was noticeably nicer to edit: Emacs has a simple built-in TOML mode and I didn't have to worry about indentation/formatting. (I've done Haskell for over a decade now and I'm still not consistent in how I format Cabal files!) Structured commands for navigating and editing the TOML file would be nice; I don't know if something like this already exists, but if it doesn't, adding it to Emacs would be easy. I wouldn't even think of trying something like that for Cabal's custom syntax. I've used YAML a lot more than TOML in the past. Compared to YAML, I found needing to quote all my strings a bit annoying; on the other hand, TOML was much nicer to pick up and doesn't have weird corner cases to worry about. At work I recently ran into some weird YAML files that used anchors in a way that didn't work in Python, which is not something that would happen with TOML. In my dream world we would use an S-expression based syntax (like sexplib) but I know that is not to be :(. I immediately found that multiline strings were useful. Multiline strings and comments seem like the bare minimum for a human-oriented format; YAML and TOML support that, JSON doesn't. It's a bit long, but here's the whole file:
|
Another benefit: the format would be naturally extensible. Cabal could provide a section for plugin/tool/etc config, and tools would have no issues parsing values from there. I'm imagining something like this:
My experience has been that providing "extension points" in formats is always useful. We can't figure out everything people want to do with their libraries ahead of time but we can make the format adaptable. If people need something Cabal doesn't support, they can add it while still keeping a single canonical file for library-specific settings. |
For yaml there's also of course hpack. So anyone who wants to write cabal files in either yaml or dhall is welcome to do so. Note that we don't have exactprinters for either of those formats either, as far as I know. As I recall, due to the semantics of yaml, conditional clauses are rather unpleasant there, among a few other issues (and pretty-printing reorders things in unpleasant ways as well). (And also as emily notes, the yaml grammar is rather complicated as is). Toml does seem promising, but I worry that its support for conditionals or other more complex syntax wouldn't be particularly great either. Translations of some more complex files might be worthwhile, to experiment with this. Btw, note that cabal is already extensible, via "x-" fields. In any case, I think the right next step is to get the cabal grammar pinned down and to have an exactprinter for at least the format we already have and is widespread. |
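A minimal sketch of how a tool could use the existing "x-" extension point mentioned above, assuming the fields have already been parsed into name/value pairs. The `Field` type and `partitionCustom` are hypothetical names for illustration and are not part of the Cabal library; field names are compared case-insensitively here.

```haskell
module Main where

import Data.Char (toLower)
import Data.List (isPrefixOf, partition)

-- Hypothetical representation of an already-parsed field: (name, raw value).
type Field = (String, String)

-- Split fields into (custom "x-" fields, all other fields).
-- The prefix check ignores case.
partitionCustom :: [Field] -> ([Field], [Field])
partitionCustom = partition isCustom
  where
    isCustom (name, _) = "x-" `isPrefixOf` map toLower name

main :: IO ()
main = do
  let fields =
        [ ("name", "example")
        , ("version", "0.1.0.0")
        , ("x-my-tool-config", "strict")
        ]
      (custom, standard) = partitionCustom fields
  putStrLn ("custom:   " ++ show custom)
  putStrLn ("standard: " ++ show standard)
```

A tool that only cares about its own settings would read the first component and ignore the rest.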
I like this and am the maintainer of the translational tool hpack-dhall that can translate:
I am a bit wary about each format being capable of doing a faithful representation. For instance, hpack's conditionals can break dhall's typing. This is the trouble @gbaz just mentioned. |
I want this to work, but unfortunately I see some issues with all the proposed formats so far. I think TOML / YAML / JSON will never work, but Dhall, while it might not be a good fit today, can be made to work.
TOML / YAML / JSON
If Cabal files were merely data they would be fine, but unfortunately they are code, due to the conditionals, and parameters used in those conditions. This is true of Cargo packages too, and the solution there has been to stuff syntax into strings. Firstly, this largely defeats the point as we still need application-specific parsers (and pretty printers!) to handle those strings. But more worryingly, I have reason to believe this has warped the design process of Cargo. See, for example, the back and forth between @djc and me in rust-lang/rfcs#3143, where @djc agrees Cargo has backed itself into a corner, but objects to my further using strings, or trying to encode the information in a more structured but awkward and verbose way. I may have disagreed with @djc on which unpleasant choice to take, but I absolutely do agree that TOML forcing Cargo into this awkward situation is tragic, and no one should have to pick between those unpleasant options in the first place. Cabal converting from the existing design avoids some of the distortion from TOML's perverse incentives, but I have no doubt the language of Cabal files will continue to evolve, and I don't want "TOML goggles" to mess things up going forward.
Dhall
Dhall is an actual programming language, and therefore squarely fixes the above issues. And to be clear I would really like to endorse Dhall as it is the right sort of way to make these things conform to a standard. There are two quibbles with Dhall as it currently exists however, that I think should be addressed first:
So yeah, in conclusion I want Dhall to work, but it's important that we be able to restrict ourselves to a sort of "mini Dhall" so we can do this analysis and we will have to integrate Dhall with Cabal fairly deeply. I'm not sure whether the current Dhall implementation supports such a restricted "mini Dhall", but that can easily be fixed. |
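To make the "code, not data" point concrete, here is a minimal sketch of a package section whose dependencies depend on conditions over flags and the operating system. The `Condition`, `Env` and `Section` types are a hypothetical simplification for illustration, not Cabal's real AST, and treating unknown flags as off is an assumption.

```haskell
module Main where

import Data.Maybe (fromMaybe)

-- Hypothetical, simplified condition language.
data Condition
  = Flag String              -- flag(name)
  | OS String                -- os(name)
  | Not Condition
  | CAnd Condition Condition
  | COr Condition Condition

-- The environment a condition is evaluated against.
data Env = Env
  { envOS    :: String
  , envFlags :: [(String, Bool)]
  }

-- Evaluate a condition; flags missing from the environment count as False.
eval :: Env -> Condition -> Bool
eval env cond = case cond of
  Flag f   -> fromMaybe False (lookup f (envFlags env))
  OS o     -> o == envOS env
  Not c    -> not (eval env c)
  CAnd a b -> eval env a && eval env b
  COr a b  -> eval env a || eval env b

-- A section with unconditional and conditional dependencies.
data Section = Section
  { always  :: [String]
  , guarded :: [(Condition, [String])]
  }

-- Dependencies that apply in a given environment.
resolve :: Env -> Section -> [String]
resolve env s = always s ++ concat [deps | (c, deps) <- guarded s, eval env c]

exampleLibrary :: Section
exampleLibrary = Section
  { always  = ["base"]
  , guarded =
      [ (OS "windows", ["Win32"])
      , (CAnd (Flag "dev") (Not (OS "windows")), ["unix"])
      ]
  }

main :: IO ()
main = do
  print (resolve (Env "linux" [("dev", True)]) exampleLibrary)
  print (resolve (Env "windows" []) exampleLibrary)
```

A pure data format can store the `guarded` list, but something still has to define and implement `eval`, which is where the application-specific sublanguage comes back in.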
Remember that Hackage is an append-only repository. It would be utterly disappointing if a future version of Cabal would be unable to build an old package just because it no longer parses its very own package format. So I don't think it would be wise to abandon a parser of Cabal files even after a very long grace period. And if we are to retain the parser and all its complexity, then what exactly are we to gain? What about other tooling (e. g., Stack)? With regards to editor support, why aim for a generic JSON autocompletion? These days we should not settle for anything less than a domain-specific language server, and custom format is not a hindrance for it. I'm sorry if my tone sounds harsh, but I'm afraid we are chasing an ideal to the detriment of compatibility, as is very customary in the Haskell community. |
Maybe Starlark is a decent option if logic is important to this project? |
Given that the quality standards and popularity standings for configuration languages change every decade, I'd rather focus on a good internal representation, support the old cabal format (and only this format) forever-guaranteed and let contributors add exact-parser-prettyprinters for whatever format works best for them. We also need a story for keeping in sync many files that contain the same information or for translation on the fly (e.g., when showing a .cabal from a Hackage webpage of a package). |
This might be a totally stupid idea, but how about using limited Haskell for configuration? |
@kamoii the main argument against that is that a Haskell program is not guaranteed to terminate |
The argument is also to not invent something new as much as possible. We want to leverage existing tooling, syntax highlighting, etc. A limited Haskell only lets us benefit from a fraction of this |
Two forgotten things in this discussion:
First: JSON / YAML / ... and even Dhall would still need some stringly sublanguages, as @Ericson2314 hints. Consider
build-depends: foo (>=0.4.0.0 && <0.4.1) || (>=0.5 && <0.6)
mixins: foo (Foo.Bar as AnotherFoo.Bar, Foo.Baz as AnotherFoo.Baz)
"build-depends": {
  "foo": {
    "or": [ { "and": [ { ">=": "0.4.0.0" }
                     , { "<": "0.4.1" }
                     ]
            }
          , { "and": [ { ">=": "0.5" }
                     , { "<": "0.6" }
                     ]
            }
          ]
  }
}
(better would be to model version numbers as I don't even try to model mixins. Dhall would look terrible as well (from
There is also EDIT: Also file globs (though I think that was a mistake to add them to If we use stringly sublanguages (like in @TikhonJelvis examples) we will Writing a tool to automatically edit bounds is still difficult
Second: Performance matters. Solver parses plenty of package descriptions Currently
That 1ms per file is a good goal. A solution is that That approach would make sense for revisions too, it might be substantially Another solution is that If we really want to change the format to something "used elsewhere",
:build-depends
{ "foo"
  (|| (&& (>= #(ver 0 4 0 0)) (< #(ver 0 4 1)))
      (&& (>= #(ver 0 5)) (< #(ver 0 6)))
  )
}
:mixins
{ "foo"
  (as [Foo Bar] [AnotherFoo Bar])
; the drawback is that everything is different, if EDN structure is used deeply:
; even the module names, as "Foo.Bar" is an expression in a sublanguage for module names,
; something general EDN tools are not aware of.
... TL;DR, I challenge JSON, ..., Dhall suggestors to model e.g.
in their favourite "syntax" format. Otherwise this discussion is just (IMO simple examples don't tell much, simple stuff is easy). |
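For comparison with the encodings above, a minimal sketch of the same `foo` bound as a plain Haskell value. The `VersionRange` type here is a standalone simplification written for this example (Cabal has its own version-range type, which this sketch does not use); only `Data.Version` from base is assumed.

```haskell
module Main where

import Data.Version (Version, makeVersion, showVersion)

-- Simplified stand-in for a version range.
data VersionRange
  = AtLeast Version                      -- >= v
  | Below Version                        -- <  v
  | Intersect VersionRange VersionRange  -- a && b
  | Union VersionRange VersionRange      -- a || b

-- Does a concrete version fall inside the range?
satisfies :: Version -> VersionRange -> Bool
satisfies v range = case range of
  AtLeast lo    -> v >= lo
  Below hi      -> v < hi
  Intersect a b -> satisfies v a && satisfies v b
  Union a b     -> satisfies v a || satisfies v b

-- foo (>=0.4.0.0 && <0.4.1) || (>=0.5 && <0.6)
fooRange :: VersionRange
fooRange =
  Union
    (Intersect (AtLeast (makeVersion [0, 4, 0, 0])) (Below (makeVersion [0, 4, 1])))
    (Intersect (AtLeast (makeVersion [0, 5])) (Below (makeVersion [0, 6])))

main :: IO ()
main = mapM_ report [[0, 4, 0, 3], [0, 4, 1], [0, 5, 2], [0, 6]]
  where
    report parts =
      let v = makeVersion parts
      in putStrLn (showVersion v ++ ": " ++ show (satisfies v fooRange))
```

Whatever outer syntax is chosen, it has to round-trip to and from a structure of roughly this shape; mixins and module names would need a similar treatment.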
Does "unlimited Haskell" as opposed to limited Haskell qualify as not something new? IMO the argument that Haskell is Turing complete isn't that compelling, as the nix expression language is also. With a cabal file being just some Haskell expression of type
import Cabal
main = buildPackage PackageOptions {...} -- dependencies and build configuration here
we get to use all of the existing Haskell tooling and get around the sublanguage issues by representing everything as normal Haskell values, which if I understand correctly cabal today does anyway. Going further with this train of thought, it seems like any configuration format, JSON, YAML, Dhall, edn, TOML, etc. is basically some level of indirection that gets parsed into a Haskell value at build time, so why not just focus on making a more convenient EDSL for Cabal the library? |
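A minimal, self-contained sketch of what such an EDSL-style description could look like. The `PackageOptions` record, its fields and this `buildPackage` are hypothetical stand-ins invented for illustration; no such API is assumed to exist in Cabal, and `buildPackage` here only reports what it was given instead of building anything.

```haskell
module Main where

import Data.List (intercalate)

-- Hypothetical record; field names are illustrative only.
data PackageOptions = PackageOptions
  { pkgName        :: String
  , pkgVersion     :: [Int]
  , exposedModules :: [String]
  , buildDepends   :: [String]
  }

-- Render version components as a dotted string.
showVer :: [Int] -> String
showVer = intercalate "." . map show

-- Stand-in for the `buildPackage` in the comment above: prints the
-- description rather than invoking a compiler.
buildPackage :: PackageOptions -> IO ()
buildPackage opts = do
  putStrLn ("package: " ++ pkgName opts ++ "-" ++ showVer (pkgVersion opts))
  mapM_ (putStrLn . ("  module:  " ++)) (exposedModules opts)
  mapM_ (putStrLn . ("  depends: " ++)) (buildDepends opts)

main :: IO ()
main = buildPackage PackageOptions
  { pkgName        = "example"
  , pkgVersion     = [0, 1, 0, 0]
  , exposedModules = ["Example", "Example.Internal"]
  , buildDepends   = ["base"]
  }
```

Note that a consumer has to compile and run this program to learn the module list.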
There's a reason we encourage cabal files rather than custom setups -- far easier for external consumption (even with a not fully specified grammar). To get values out of a haskell executable it needs to either emit them (in which case the format it emits in is the actual spec) or you need to build and link into it directly. Either way you're compiling and building a haskell program every time you want to ask "what modules does this package provide." That is not feasible for, e.g., a package store such as hackage. |
The external consumption argument is very compelling. I guess we could have cabal generate a lockfile from a build specification in Haskell and have other tools read that. We already have cabal.project.freeze/stack.yaml.lock so there's precedent, but those files haven't historically been required. |
...but then you get in the same situation as now, it's just that cabal is doing the conversion instead of dhall2cabal/hpack/... |
In this issue there is an interesting discussion about how to handle other configuration formats than the built-in cabal one: #5343 |
I see no problem that a e.g. YAML outer syntax has to be complemented by ad hoc expression syntaxes for certain fields (constraints etc.) that transcend YAML. Having an outer YAML syntax would still allow third-party tools easy access to certain contents of the .cabal file, and nice syntax (that is, the current syntax) for constraints can be parsed from string fields using/adapting the existing cabal parsers. YAML-bombs can be avoided by restricting to a sublanguage of YAML. The syntax examples in #7548 (comment) look like straw men to me. |
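A minimal sketch of the "string field plus small expression parser" approach described above, assuming the outer format has already handed us the constraint as a plain string. It uses `Text.ParserCombinators.ReadP` from base as a toy stand-in for Cabal's existing parsers and only handles `&&`-conjunctions of simple bounds; `Bound` and `parseConstraint` are hypothetical names.

```haskell
module Main where

import Data.Char (isDigit)
import Text.ParserCombinators.ReadP

-- A single bound: comparison operator plus version components.
data Bound = Bound String [Int]
  deriving (Eq, Show)

-- One decimal number such as "12".
number :: ReadP Int
number = read <$> munch1 isDigit

-- A dotted version such as "0.4.1".
version :: ReadP [Int]
version = sepBy1 number (char '.')

-- Comparison operators. ReadP explores alternatives in parallel, so ">" and
-- ">=" do not conflict: the wrong one fails on the next character.
operator :: ReadP String
operator = choice (map string [">=", "<=", "==", ">", "<"])

bound :: ReadP Bound
bound = do
  op <- operator
  skipSpaces
  v <- version
  return (Bound op v)

-- A conjunction such as ">=0.4 && <0.5".
conjunction :: ReadP [Bound]
conjunction = sepBy1 bound (skipSpaces *> string "&&" <* skipSpaces)

-- Parse a whole string field; Nothing if any input is left unparsed.
parseConstraint :: String -> Maybe [Bound]
parseConstraint s =
  case readP_to_S (skipSpaces *> conjunction <* skipSpaces <* eof) s of
    ((r, _) : _) -> Just r
    []           -> Nothing

main :: IO ()
main = do
  print (parseConstraint ">=0.4 && <0.5")
  print (parseConstraint "not a constraint")
```

Third-party tools that do not care about constraints can still read the rest of the file with a generic parser and treat this field as an opaque string.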
I encourage anyone who thinks this is a good idea to think about how much fun |
What's wrong with using |
For the Haskell programmer, there is the obstacle of Anecdotally, I have just written a small tool (https://github.com/andreasabel/cabal-clean) to partially clean artefacts from |
I'd just like to add to point 2 (syntax highlighting): there is a movement towards using |
As a user, after the initialization, updating the modules list is my main interaction with cabal files, and it's a bit annoying as this makes ghcid reload the whole project. It would be nice if a new format or version could improve this. |
I fail to see how or why changing format would improve this. I.e., "fertilizer" in any language smells the same. |
I think this is a very important point and highly underappreciated. I don't know how many are really bugs, but there are currently 384 issues labeled "bug" in cabal alone. A change of this magnitude affects everyone, every developer has to spend time updating things on their side. Progress is great, but stability would be greatly appreciated (it makes a lot of economic sense to aim for more stability than what we have atm). Cabal has introduced many breaking changes in the last few years. I don't have to maintain the parser, of course, but, as a Haskell developer, I personally don't have a serious problem with the traditional cabal format. |
This depends on how we do the migration. A slightly different proposal:
This wouldn't be very disruptive to the ecosystem. Additionally, we could write an automated migration for all of hackage (this might need tweaks about how revisions work). The unfortunate thing will be that both formats will exist for quite some time. That could be confusing, even if the new format is the default and enforced on hackage. |
IMHO, your proposal would only postpone the described migration pain, not avoid it. Also, there's no line of Haskell developers requesting a different .cabal format ("oh if only Cabal would intake YAML instead, how greatly our productivity would increase"). Nobody really needs this... E.g., list one project property that one cannot express in the current .cabal, but would be able to in YAML? Also, the cost of your proposal would be maintaining now two formats for a time long enough to burden the maintainers. Plus, the inevitable cost of migrating other tools in the toolchain. Since it's all volunteer-maintained, the likelihood of some tools migrating quickly, some - slowly, and some - not at all, is fairly high. Just look at how packages (or rather their maintainers) fail to update from, e.g., GHC-8 to GHC-9 (causing failures down or up the chain). In short, I see this as a change that nobody needs, with few if any appreciable benefits, and huge disruptive costs. |
I don't know. Nobody really needed ghcup either. You could install GHC anyway. It was just a stepping stone in usability. That's the same here. Note: I'm not really sure either way and if it's worth it. I'm just trying to challenge arguments. |
I'm all with you on the fact that the change of format is not (yet) well motivated (and weighted against the impact). But let's not throw maintainers under the bus here. It's not the maintainers that fail! It's the compiler that abruptly fails to accept code that was previously accepted perfectly fine. |
Replacing Setup.hs with, say, Ninja, would make me a lot happier than giving me a new surface syntax for Cabal files. I would say the biggest problem with Cabal / cabal-install is that it's just too much code / has accumulated too many features. Insofar as there are opportunity costs to everything, I might rather figure out how we can deprecate and remove a bunch of functionality than spend time on this. It feels like a case of Wadler's law, to be honest. |
I've seen a lot of breaking changes, mainly in the packages that make incompatible changes in their API, without a care in the world about others that might depend on them. Compound this by typically large dependency trees, and you'll understand how a newcomer feels about the Haskell ecosystem... That's been my biggest gripe with the Haskell ecosystem in general - nowhere else have I seen such an amount of instability. I admit that it became better in the last couple of years or so. But there's still a lot of room for progress in this area. |
@mouse07410 yes. And this is a direct result of the compiler breaking existing packages that are perfectly fine. If you do not upgrade to a new compiler version, you can keep your existing packages just fine. Now almost every compiler release requires material changes to the package. Of course the maintainer ends up making most likely only the latest release compatible with the new compiler. Making older releases compatible is a work investment that needs justification. And now you are forced to update to that package, by proxy of the new compiler (you want), not anymore accepting the package (you used). In any case this is not the correct thread to discuss this. And I think we agree there are more pressing topics than a change of the .cabal format. As others have said, it's an open source project and everyone is free to spend time on what they deem interesting. |
Power users are mostly blind to usability issues. This is the problem. We're used to dealing with the warts. I think the current cabal file format is hostile towards new users. The rise of hpack is proof: people don't want to deal with it. But hpack causes more problems. |
If you want usability, use stack. It's one tool to install, does everything for you, including installing any other stuff you need, based specifically on what you put in your stack.yaml. Can also build in a Docker container if you need it, for no extra trouble. If you want separation of concerns, use ghcup, cabal-install, hpack as separate tools, at the cost of usability. |
You're on the cabal issue tracker. Stack is irrelevant to this issue. We're trying to figure out how to improve usability for cabal here. The hpack workflow has already been sufficiently explained to be worse for usability. |
I don't think so, it's been explained as complicating the tool design. But it would be better for usability. Likewise, integrating ghcup's automatic GHC installation would be a big improvement for usability, but a complication from a design perspective. |
Changing the .cabal format is an infrastructure thing that affects both the cabal-install tool and stack equally. |
You're moving goalposts. You told people on the cabal issue tracker that they should use stack if they want usability. At this point I'm not sure if you're trolling. Yes, it will affect stack as well, but stack doesn't do its own parsing of cabal files. The Cabal API used can stay largely the same. Pantry might need some adjustment, but that won't be hard. |
I'm saying, if you care about cabal-install usability, you should copy features from stack, as some people are asking for (#8605). But this is in tension with your desire to keep the design of cabal-install simple. You need to pick one: do you want a single tool that does everything (most usable), or do you want cabal-install to be a tool that just does one thing well (simpler design)? |
It tends to be pretty hard to stay behind in practice, for the reasons you indicated after this.
I think it is. Such local decisions compound and affect the community as a whole. Using this logic, projects would be making decisions about how people are affected by breaking changes considering only their own package. When you consider that similar thinking is being applied to many other parts of the Haskell ecosystem, then it's easier to see that the instability of it compounds and becomes too much. No single project will be able to completely stop it. It takes all projects together to do so. So it's not just that this conversation is important here, it's that it's important every time it comes up in any project that is core to the community.
True, although as a community we are all contributing, and we all rely on cabal & ghc to do it. Changes to those packages affect every one of us. Someone being free to spend their time on whatever they want doesn't mean that the change should be accepted in cabal or ghc. |
@ivanperez-keera yes, that was precisely my point. The failure to keep any semblance of backwards compatibility, and upstream continuously hard breaking the whole ecosystem has massive ripple effects. If maintainers were to spend less time dealing with the churn of adapting to new compiler versions that simply reject their code, you could upgrade the compiler, and still use the old library perfectly fine. If you decide to upgrade to a newer major version of that library, that is your choice, and not one forced onto you by trying to stay somewhat current with the compiler. Right now a new compiler almost always implies a slew of new dependencies. Why? Because the compiler (with each 6mo release) introduces breaking changes that make old code incompatible. It's insane! I think we completely agree on this topic. And while I see the relation to breaking the cabal format, I still would like to not derail this thread too much into that direction. Yes, it's everyone's time and they are free to do what they want with their time, and what hopefully makes them happy. I did not say anything about me supporting such a change in any way. I'm highly skeptical. I will remain open minded, but I would want to see a purportedly better format to be better in every dimension, not just read/write support for it in some other non-haskell language. I don't think cabal is perfect. And I absolutely hate the conditional logic in it. It's almost as bad as CPP from a flattening out standpoint. But I will give the cabal format that it's fairly straightforward. If someone can come up with a better format, that doesn't regress in any significant way, solves lots of warts the cabal format has, I'm happy to lend my support to it; I have not yet seen it though, but that doesn't mean it doesn't exist. |
Hey just wanna bring life to this old issue! I made a little example of how a Cargo style interface could make cabal way easier to use. We've been talking on reddit https://www.reddit.com/r/haskell/comments/14v3wo7/would_anyone_be_interested_in_hoot_a_cabal/ |
I’m just a single person of course, but I’m coming back to Haskell after a few years in Rust and other languages. I strongly disagree with the comments in this issue about this being an unnecessary change. I think I remember one of them being “there isn’t real demand, this is just one person asking for something” I don’t think that’s true. I think many of us who would benefit from cabal working similarly to other modern build systems are just less likely to be core Haskell contributors. I’m trying to get involved and make up for that now, but I suspect there are many others like me who would benefit. |
@seanhess I think nobody is questioning that using a "widely supported format" would be useful, that much is clear. The point is that there are two obstacles:
Any proposal for a new format would have to address those two concerns, and as far as I can see so far none did. I don't want to sound condescending, but this ticket is already really long, so please read the conversation before commenting, or at least the comments I linked above and/or the ones with most reactions.
|
Thanks for sharing! I read a little into that issue but those comments really shed a lot of light. I do still think that the advanced cabal examples (such as https://hackage.haskell.org/package/raaz-0.3.0/raaz.cabal) could be solved with something like Rust's feature flags |
I find myself strongly opposed to the idea of moving to a common format, so here are the counterpoints:
All in all the gains are speculative, overshadowed by the sheer girth of maintenance a change like this would entail. I fully agree with #7548 (comment), better tooling is both easier to define and far more desirable. |
I dunno... Doesn't seem that bad to me ¯\_(ツ)_/¯ (I'm being super tongue-in-cheek here, I just pasted the cabal file to chat-gpt and had it generate equivalents for me in json and yaml. I'm sure maybe there's some difficulty here I'm missing, but the yaml file in specific looks pretty neat in my eyes) |
All those duplicated |
Those files won't work for a huge variety of reasons, basically all of which have been discussed at length in this thread. This is why we need people to design protocols through thought and discussion, rather than asking machines to churn out "neat" looking but wrong slop. |
This issue has stalled for a while. The benefits of switching currently to a different format are severely outweighed by the cost of 1) unavailable features (e.g., syntax highlighting), 2) implementation effort, 3) temporary introduction of complexity in cabal (until all old formats can be removed, which would take years) and other tools that work with cabal files, 4) potential need for additional tooling, 5) the need for new learning by people in the community, 6) adaptation effort in the Haskell ecosystem (packages would have to update), 7) documentation that would have to be put together and maintained for years while the community transitions. Many of these points, and others, were very well captured by #7548 (comment) and #7548 (comment). Cabal has a long history of breaking the interface, and this would simply add to that and create more breakage of packages, in a community that already struggles to put enough energy to keep packages well maintained. There are currently 416 bugs open in Cabal's repo (out of 1521 issues total), many going back for many years. I propose that we simply close this item as "not for now" and focus on stability before moving on to bigger changes that would require a huge investment by the community at large.
General note: I would also like to ask people who propose new changes to take the seat of devil's advocate, and try to think also of good reasons not to do things, as well as the cost that it has for everybody (almost literally: imagine if you had to pay everyone in the community who's going to spend time as a consequence of this change $250/h; how much would that rack up to?), and how it makes the ecosystem better as a whole for everyone (beyond their own specific interest). Many of us are also investing a lot of effort into promoting Haskell and getting it adopted into our companies, which would benefit the community at large. Understanding the impact for the client and the user is important to increasing Haskell's adoption. |
In the wake of the exact-printer initiative, I proposed another approach: why not say goodbye to the .cabal file format and switch to something that is widely supported?
There are a few alternatives; the most important attribute would be that they are widely supported in industry.
Note that all of these are (mostly) isomorphic to JSON (scalars, lists, dicts), which is important for easy translation between them (e.g. for config generation purposes).
What would this give the Haskell ecosystem?
vscode has a way of assigning JSON schema to a file, which gives completion and inline documentation for free everywhere
What would it give to users?
jq, yj), which is important e.g. in a monorepo context
What are others doing?
Most modern package managers that don’t go the full Turing-complete configuration route (e.g. Scala’s sbt, Erlang) usually converge their config on a widely supported syntax.
Examples:
package.json, package-lock.json
stack.yaml, stack.yaml.lock
project.yaml
Cargo.toml
elm-package.json
pom.xml
pyproject.toml, poetry.lock
Counterexamples:
go.mod, though flat shasums and go packages have no configuration file
requirements.txt, though see poetry above
project.clj, clojure is a lisp, and sexps are already a data format
I don’t expect cabal would drop support for the cabal file format very soon, rather it would start out by generating a .cabal file from the .json/.toml/.yaml for consumption by older versions of cabal. Then after a multi-year grace period, the new format would become the standard and projects could drop their autogenerated .cabal files.
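A minimal sketch of the transitional step described above: rendering a .cabal file from a description that has already been decoded from the new format. The `Package` record and `renderCabal` are hypothetical names for illustration, the decoding from JSON/TOML/YAML is assumed to have happened elsewhere, and only a handful of fields are covered.

```haskell
module Main where

import Data.List (intercalate)

-- Hypothetical in-memory description, e.g. decoded from .json/.toml/.yaml.
data Package = Package
  { pkgName    :: String
  , pkgVersion :: String
  , pkgModules :: [String]
  , pkgDepends :: [String]
  }

-- Render a small library-only .cabal file as text.
renderCabal :: Package -> String
renderCabal p = unlines
  [ "cabal-version: 2.4"
  , "name:          " ++ pkgName p
  , "version:       " ++ pkgVersion p
  , ""
  , "library"
  , "  exposed-modules:  " ++ intercalate ", " (pkgModules p)
  , "  build-depends:    " ++ intercalate ", " (pkgDepends p)
  , "  default-language: Haskell2010"
  ]

examplePackage :: Package
examplePackage = Package
  { pkgName    = "example"
  , pkgVersion = "0.1.0.0"
  , pkgModules = ["Example"]
  , pkgDepends = ["base >=4.14 && <5"]
  }

main :: IO ()
main = putStr (renderCabal examplePackage)
```

The generated file would be checked in or produced at build time during the grace period, much as hpack and dhall-to-cabal do today.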