-
-
Notifications
You must be signed in to change notification settings - Fork 160
feat: add incremental mode support for Dialyzer (OTP 26+) #575
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
I'm still working on the tests. |
|
This is a related issue to a test failure christhekeele/erlex#6 |
|
first of all thanks you are working in it. I think it's important to say in documentation/readme or somewhere that it use _build folder for cache. So when developers use common approach with cache they cache deps and _build folder before dialyzer step/job. But common practice is to use cache versioning based on mix lock file or something like that. In this case it would mean that dialyzer cache will be stored only in case deps are changed. So not big benefits for incremental feature. I think this is important to explain that developers have to check properly their cache logic what they have. I would suggest to introduce new cache which will directly cache folder where new file dialyzer files are stored. Because this new cache has to be updated everytime with some logic. For example you don't want to store new dialyzer's files from some branch same way as you store main branch to avoid usage branch's cache in main. But you want to store branch files due to multiple runnings of ci/cd if it's common for your development flow to have multiple runs of ci/cd per branch. So for example in GitHub Action you can use branch name in cache name with fallback from main branch, for runs from main only cache with main cache. Also I would suggest to use ci/cd runs number which should be incremental every run, because thanks to that it should store cache everytime when it runs, not only when key is changed. But this has to be checked because last time (a few months ago) when I checked GH Action Cache app it had bug/removed feature for flagging cache to be stored everytime when it runs, because we can assume that every run (often) should contains some code change. Is it know exact folder path for cached files? |
I was thinking about this. Is you thought to just document this behavior or to add some mechanism to the library to better facilitate this caching approach?
this is on my todo list. |
|
I think adding good documentation is base. Right now I don't see how library can give some tool or approach for it. Because it depends on own CI/CD configurations how people use cache. Something can be handled by library only in case library is able to set how to store these incremental files. Thanks library can partly allow to configure it for developers or use some common cache so developers will not have to configure new cache or something. |
|
@jeremyjh do you have any pointers for getting the tests to not time out in CI? They all work for me locally. |
@Ch4s3 Thanks for working on this! The problem must be in this branch; I reran master - it passed and test times haven't changed. #578 I don't immediately see the issue; there are tests timing out that aren't really doing much. I think I'd start by commenting all the new tests; if it passes then uncomment just the new output test. |
|
from documentation: it means that file location is same but content is different. I tried your PR and when I use BTW: You wrote you fixed |
I think that’s leftover from an incremental step in some commits I squashed. I need to rewrite the PR description. |
Let me double check that I didn't have some environment config for erlang causing an issue here
I can definitely confirm that on my production code base there's a speedup the measurement I included is from that app. I should clarify it is NOT faster to general the PLT and initial run, it is only faster for subsequent runs. |
|
@oliver-kriska once I have tests working, I'm going to test this in CI on my app and gather timing info as well as double checking the path information and I'll update docs here. |
|
I think I found the main problem |
|
I'm still working on this, I just had to take a break do to work/life business. |
|
I think the issue here was trying to test the system halt call on older OTP versions. Since the other code path that calls a halt isn't tested, I'm removing the test that verifies a halt on OTP < 26. Hopefully that's an acceptable tradeoff. |
|
I'm fiddling with how to correctly use this in CI. It seems like just |
|
Yes, with the new incremental mode you don't need a separate PLT step - an incremental run is both generating a new PLT and checking your code - internally it was always the same operation, this just explicitly combines them. |
|
@michalmuskala thanks for weighing in. Am I understanding correctly that incremental mode doesn't build the core language PLT so you need to also have that if you want to avoid getting errors from missing language functions? Or should we be using |
Incremental mode analyses whatever modules you give it, including OTP modules (they don't get any special treatment). That means that you do want to use Sadly, it's not quite as clean and orthogonal as it sounds, because if there is some sort of discrepancy in your For the mental model for how to use |
README.md
Outdated
|
|
||
| ### Incremental Mode | ||
|
|
||
| Dialyxir supports Dialyzer's incremental analysis mode (available in OTP 26+). When enabled, Dialyzer will reuse previous analysis results and only analyze changed modules, significantly speeding up subsequent runs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strictly, in incremental mode, Dialyzer analyses the modules that were changed since the PLT was last used (if there is one), plus the set of modules which Dialyzer thinks might depend on it. This perhaps sounds like a trivial difference, but if you have a core module that is used by a lot of your codebase, changing that will cause all the modules that depend on it to need to be re-analysed too (since you might have fixed/broken them with your change!).
In my experience, this leads to people asking: "I changed this one file but Dialyzer analyses hundreds of files - wasn't incrementality supposed to fix this?", with the answer being that incrementally tries to reduce re-analysing irrelevant files, but files can be relevant because they're in a dependency chain with something that has changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This perhaps sounds like a trivial difference
No, this makes perfect sense. I was struggling with how to explain this well, perhaps even to myself.
but files can be relevant because they're in a dependency chain with something that has changed
Right, the classic problem that changing a config file causes a massive recompilation.
README.md
Outdated
|
|
||
| **Analyzing specific applications:** | ||
|
|
||
| When using incremental mode, you can specify which applications to analyze using the `apps` and `warning_apps` options. This allows you to analyze entire applications, which can be more efficient for large codebases. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This allows you to analyze entire applications, which can be more efficient for large codebases.
Strictly, this is true, but can lead to misleading results if you include modules to be analysed with the modules they (transtively) use function or type definitions from. My suggestion would be to analyse all your code in a repo, including any dependencies it has, in one go. Incrementality itself will do the heavy lifting to soundly work out what code isn't affected by a change, avoiding the risk of the mistake above. Of course, at a certain scale you may really have no choice (e.g. if your codebase is so large the Dialyzer analysis for the entire codebase becomes too big to fit into RAM).
README.md
Outdated
| - If `warning_apps` **is** specified, only those applications will have warnings reported. | ||
| - Applications in `apps` but not in `warning_apps` are still analyzed (to provide context for the analysis), but warnings will not be reported for them. | ||
|
|
||
| This is useful when you want to include dependencies in the analysis (so Dialyzer can find discrepancies in how you use them), but only see warnings for your own code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my mind, this is the primary motivation for apps and warning_apps, rather than trying to only analyse a subset of a codebase (since that's so error prone).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these are great comments, I think I can do a lot better with the readme now. thanks!
|
Hi there 👋 I made Dialyzer's incremental mode. I tried to add a bit of context where I could on this PR. I hope that helps. |
This is what I'm figured out after some trial and error and the latest changes are an attempt to make passing those through as straightforward as possible. Thanks for further clarification.
Makes sense.
Awesome, this is such helpful context! I think I'm pretty close on this, I have the branch working against a production repo and giving me meaningful output. Thanks again, and also for your work on incremental mode. |
|
@TD5 could you weigh in again? I'm now just passing --apps and --warning-apps but not the files in incremental mode and I'm not seeing errors for code with trivial dialyzer errors picked up by classic mode. Based on this test setup from OTP, and the original commit message I'm assuming --apps and --warning-apps have to overlap. I'm not sure how I missed this before. |
That sounds plausible. Honestly, I wrote the code long enough ago that I can't remember the motivation there. |
I know the feeling! Thanks for the input. |
lib/dialyxir/project.ex
Outdated
| cond do | ||
| is_list(config[:warning_apps]) -> config[:warning_apps] | ||
| resolved == nil -> [] | ||
| true -> resolved |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not crazy about this and may need to change it.
lib/mix/tasks/dialyzer.ex
Outdated
| valid_apps | ||
| end | ||
|
|
||
| defp app_exists?(app) do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this feels a bit defensive but I tripped myself up on this a few times so I added a check.
|
@jeremyjh this should basically work now and I like the api, with the caveat that core_apps feels a bit hacky. I'm putting it down until after US Thanksgiving, but it should be in a reviewable state even though I plan to refactor a bit next week. |
lib/dialyxir/project.ex
Outdated
| end | ||
|
|
||
| # Returns core apps configured under :core_apps. | ||
| # This is a Dialyxir-only concept used for incremental app mode. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Ch4s3 I haven't really had a chance to dig into the details here. Will the new core_apps key only be used for incremental mode or would it apply to normal builds as well? If we want a separate apps list for incremental maybe we should namespace the options so thats more clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's the idea, but I really don't love it. I can definitely namespace it, but I'm not 100% sure yet that its the right options api for this problem. I just couldn't find a reliable way to programatically get these.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need a different app list for incremental mode? Could we not use whatever the current options API resolves ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we could use :apps but I wanted to support apps: :transitive and couln't figure out a good way to automatically pick up things like :erts, and :elixir but I'm afraid it's too clever by half. I'm exploring other options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
without :elixir in apps you end up with missing function warnings for thing like String.t().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some new changes shaping up, and I'll publish a test app with this running in CI perhaps later this week.
|
@jeremyjh I have a coworker sanity checking the code, and I'm doing a little doc cleanup and refactoring, but we've confirmed this works locally on our macs, in ci for our large app, and in my test repo on GHA. |
I think this may be best to handle in the readme rather than in code. |
…r, docs, and CI caching - Rationale: Dialyzer gained incremental analysis in OTP ([commit 963c7d5](erlang/otp@963c7d5)), but Dialyxir lacked first-class support. Adding it speeds subsequent runs by reusing incremental PLTs and enables app-mode analysis for umbrellas without full rebuilds. - Usage: new `--incremental` flag/config integrates Dialyzer’s incremental pipeline; `apps` lists expand :transitive (deps + project) but still require explicit OTP apps; `warning_apps` stay project-only, are merged into `apps`, and non-project entries are filtered with warnings to match PR jeremyjh#575 guidance. - Introduce Dialyxir.AppSelection to centralize CLI/config resolution, normalize atoms/strings, expand flags, filter warning_apps, and build dialyzer args without duplicated task logic. - Refactor Dialyxir.Project app/warning_app resolution (shared helpers, transitive expansion extraction) and slim Mix.Tasks.Dialyzer arg assembly; remove duplicate normalization code. - Add targeted tests for the resolver plus existing task/project suites to guard app/warning_app semantics and incremental flows. - Refresh README and CI guides (GitHub Actions, GitLab, CircleCI) to explain incremental usage, how to cache `priv/plts` (optionally per-branch), and note macOS ulimit tips to avoid EMFILE during MD5 hashing.
|
@jeremyjh I think this is reviewable now. Below is a sketch of how to read through the changes or as best I could reason about how to approach it. 1) Start with the intent and surface changes
2) Core entry points (behavioral changes)
3) Centralized app/warning_app resolution
4) Incremental PLT handling
5) Docs and CI caching
6) Tests (behavioural coverage)
7) Fixtures and samples
8) Thing to look out for
|
README.md
Outdated
|
|
||
| Both `apps` and `warning_apps` accept: | ||
| - An explicit list of apps: `[:app1, :app2, ...]` | ||
| - The `:transitive` flag – automatically includes all dependencies + project apps |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both :transitive and :project modes are deprecated in favor of :app_tree (the default, equivalent to mix app.tree) and :apps_direct (direct app dependencies). In most cases we only want to include only apps that would be deployed in the application, and not other project dependencies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure how I missed this, I'll get it addressed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in most cases we want to add some_otp_stuff ++ [:app_tree] to :apps
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my one concern is that :apps_direct includes deps and for the way :warning_apps is used in incremental mode you need just a list of your own apps and not the deps.
I really need to either have another flag here or to just require users to declare their app/apps. It's mostly only an issue for umbrella apps though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a new key, :apps_project that uses Mix.Project.apps_paths() (if defined) to get top level project applications. It only works in :warning_apps and is documented as such.
| When using incremental mode, you can tell Dialyzer which applications to analyse | ||
| using the `apps` and `warning_apps` options: | ||
|
|
||
| - `apps` – all applications that Dialyzer should know about and analyse upstream of your own code |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this key is only meaningful for incremental mode we need to indicate that in the option name, or nest it e.g. incremental: [apps: [...]].
Where possible - where the semantics are the same - we should use the same config names as https://github.com/jeremyjh/dialyxir?tab=readme-ov-file#dependencies - and let them optionally nested to override incremental mode. This will make adoption easier for people.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'll try nesting it. I looked at using plt_add_deps and plt_add_app but the semantics were a bit different and it made the existing code really hard to follow.
… app_tree/apps_direct Restructure incremental mode configuration to nest `apps` and `warning_apps` under `incremental` keyword list, and transition from deprecated `:transitive`/`:project` flags to `:app_tree`/`:apps_direct` flags. Configuration Structure: - Changed from flat `incremental: true, apps: [...], warning_apps: [...]` to nested `incremental: [enabled: true, apps: [...], warning_apps: [...]]` - Removed backward compatibility: no support for boolean `incremental: true` or top-level `apps`/`warning_apps` keys (WIP branch) Flag Transition: - Replaced `:transitive` with `:app_tree` (transitive dependencies + project apps) - Replaced `:project` with `:apps_direct` (direct dependencies + project apps) Dependency Resolution Refactoring: - Refactored `dep_apps()` and `direct_dep_apps()` to use same traversal mechanism as `include_deps` for consistency - Reuses existing code: `reduce_umbrella_children`, `load_external_deps`, `traverse_deps_for_apps`, `load_app`, `app_dep_specs` - `:app_tree` and `:apps_direct` now automatically include OTP apps declared as dependencies (like `:elixir`, `:logger`, `:crypto`, `:public_key`) - Core OTP apps (`:erts`, `:kernel`, `:stdlib`) must still be explicitly listed Updated files: - `lib/dialyxir/project.ex`: Config reading, dependency resolution refactoring - `lib/dialyxir/app_selection.ex`: Flag handling updates - `lib/mix/tasks/dialyzer.ex`: Documentation updates - `README.md`: Configuration examples - All test fixtures and tests: Updated to new structure and flags
…al mode - Add :apps_project flag that resolves to project apps (only works in warning_apps) - Prevent :app_tree and :apps_direct from being used in warning_apps (show warning and return empty list) - Require apps to be specified in incremental mode (cannot be nil) - Update documentation to reflect these changes
|
@jeremyjh there are some sort of tricky tradeoffs to consider here. The existing code for resolving dependencies is built around the way classic PLTs work and assumes that things like I can solve that by resolving deps in a less efficient way in a new code path, or by requiring the end user to declare them explicitly in UpdateI think this is ready for re-review. My compromise here is that users have to declare core language dependencies like |
document required OTP apps for :app_tree/:apps_direct configs add :elixir/:logger to the umbrella fixture’s :apps list
|
@Ch4s3 is there an issue with always adding |
@jeremyjh if the hard coded value is good with you I can absolutely add it to app_tree. I was just trying to avoid hard coding if I could. |
It is already hard coded, those are the default apps that go into the Erlang/Elixir PLTs and are included in the project one. |
|
@jeremyjh I assumed since it was deprecated that I should do something different. I'll add it into my code path. If you're you ok using the new apps_project flag for the warning apps then it's read for review. |
PR description
Summary
This PR introduces support for Dialyzer’s incremental mode (OTP 26+) in Dialyxir, including the ability to run incremental analysis based on passed in OTP applications.
The goal is to make Dialyxir work naturally with Dialyzer’s newer incremental workflow while remaining backwards compatible with existing setups.
What is this for
Dialyzer’s incremental mode is designed around:
--apps/--warning_appsDialyxir historically drives Dialyzer using classic PLTs and a whole app analysis. Incremental mode can significantly improve analysis of smaller changes on large code bases especially in CI environments.
What this PR adds
1. Incremental mode integration
Dialyxir can now pass
incremental: truethrough to Dialyzer, enabling OTP 26+ incremental analysis.2. Application-based analysis support
New support for Dialyzer’s:
--apps--warning_appsvia Dialyxir configuration.
3. Higher-level configuration options
New symbolic configuration values:
Supported behaviors:
apps: :project→ current project / umbrella appsapps: :transitive→core_apps ++ deps ++ project_appswarning_apps: :project→ only project appswarning_apps: :all→ all resolved appsThis allows large projects to opt into app-based incremental mode without manually maintaining long, fragile app lists.
Implementation overview
Adds app-resolution logic in
Dialyxir.Projectto compute:Introduces resolution helpers:
resolve_apps/1resolve_warning_apps/2resolve_app_args/2Updates the Mix task to:
:fileswhen running app-mode to avoid mixed invocation types.Backwards compatibility
This change is opt-in:
How to use
Example umbrella setup:
Run:
This enables Dialyzer’s native incremental mode
Preliminary results
Large codebase initial run
Large app 2nd run with error fix
You can see that the second run with a
+1/-0diff ran in 23.13s. This application has around 700k lines of elixir code.Demonstration App
github
ci run
Refs: https://www.erlang.org/doc/apps/dialyzer/dialyzer.html#incremental-mode
Closes: #498