Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redo dependency management #715

Open
d-xo opened this issue Jul 29, 2021 · 17 comments
Open

Redo dependency management #715

d-xo opened this issue Jul 29, 2021 · 17 comments

Comments

@d-xo
Copy link
Contributor

d-xo commented Jul 29, 2021

So I just had to update every single submodule in dappsys again since there was a one char change in ds-test and I just can't go on like this 😅. Just wanted to get some ideas down for now as to the best way forward.

As I see it the biggest issue with out current dependency management setup is the way that dapp remappings forces a consistent version for each package in the dependency tree, resulting in a huge maintaince overhead every time a change to a low level dependency is made.

There are however many nice features about the current approach:

  • federated: A package can be stored on any git host
  • immutable: The hash of the package content is commited to in the git log
  • auditable: Any version change is reflected in the commit. Dependency source code is available in the same repo as the top level project, and can be grepped through with ease.

How can we keep these nice features, while removing the pain associated with dapp remappings? I can see a few options:

Remove dapp remappings

The remappings make imports a little prettier, but we could get by just fine by using relative or absolute imports. This would keep all of the nice features, be almost no effort, and allow many package versions to exist in the same package tree.

The downsides are that this is a breaking change, and that imports wouldn't look so clean.

Expand dapp remappings

We could modify the import syntax supported by dapp remappings to allow users to specify a specific version, e.g. import "ds-test/test.sol@<VERSION> where VERSION is a reference to a git object (e.g. commit hash, branch or tag ref). This would allow uniquely specifying a specific package version in dependency tree. There is probably quite a lot of work involved to figure out the specifics here.

This has the advantage of being backwards compatible and keeping clean imports, but it is quite a lot more work, and makes the dapp import syntax even more non standard, meaning users of other frameworks will have an even harder time consuming packages that have been created with dapp.

Full nixifification

We could start to more heavily leverage nix for dependency management. This would keep most of the advantages above (although grepping through dependency source would probably get harder), while also potentially allowing for nice features like parallel builds and multi compiler support. It's not super clear if we would be able to use nix to avoid the remappings issue. This would also be a huge breaking change and would tie us forever to nix as a runtime dependency, which may not be desirable.

@MrChico
Copy link
Member

MrChico commented Jul 29, 2021

Another advantage of the nixification approach may be deduplication. Another problem with our current approach that you can have a million folders in the lib dir

@transmissions11
Copy link
Contributor

definitely remove remappings, 100%

@lucas-manuel
Copy link

lucas-manuel commented Jul 30, 2021

We are currently dealing with this in our maple-core repo. We've converted all of our contracts into submodules so that we can easily version contracts and test different permutations of contract versions in the context of the entire protocol in the main repo.

We have been thinking that we can take a similar approach to how NPM works in terms of package management for this.

Proposed Solution

This is still a high level idea, but figured I would post it in here to get a discussion going.

We could flatten all dependencies and compile using different dependency versions with the following approach:

  1. Do a git submodule status --recursive to get all dependencies and corresponding commit hashes
 0a5da56b0d65960e6a994d2ec8245e6edd38c248 lib/ds-test (heads/master)
 1c9242648ebf000d26f9078a5d7995e2cb888db8 lib/funds-distribution-token (heads/ch3055-break-out-funds-distribution-token-directory)
 459a1c3b0c31f0e1672af7d099b874d150b9c85b lib/funds-distribution-token/lib/math (remotes/origin/ch3087-break-out-math-directory-in-maple-core-to)
 0a5da56b0d65960e6a994d2ec8245e6edd38c248 lib/funds-distribution-token/lib/math/lib/ds-test (heads/master)
 6be5ffe54fc8982b7c1d0f45d4cb7bbd313107cd lib/funds-distribution-token/lib/openzeppelin-contracts (v3.3.0)
 459a1c3b0c31f0e1672af7d099b874d150b9c85b lib/math (heads/ch3087-break-out-math-directory-in-maple-core-to)
-0a5da56b0d65960e6a994d2ec8245e6edd38c248 lib/math/lib/ds-test
 1ada3b633e5bfd9d4ffe0207d64773a11f5a7c40 lib/openzeppelin-contracts (v3.2.0)
 507669cbe61d5537d11d1a0ddd292f78a8eb2a3d lib/subfactory (heads/develop)
  1. Copy entire directory structure into a tmp folder
  2. Recursively go through contracts and find and replace all instances of imports with relevant packageName_hash
  3. Copy all packages to root lib
  4. Remove duplicate packages
./temp
  modules
    ds-test_0a5da56b0d65960e6a994d2ec8245e6edd38c248
    funds-distribution-token_1c9242648ebf000d26f9078a5d7995e2cb888db8
    math_459a1c3b0c31f0e1672af7d099b874d150b9c85b
    openzeppelin_6be5ffe54fc8982b7c1d0f45d4cb7bbd313107cd
    subfactory_507669cbe61d5537d11d1a0ddd292f78a8eb2a3d
    openzeppelin-contracts_1ada3b633e5bfd9d4ffe0207d64773a11f5a7c40
  1. Compile with remappings
  2. Copy dapp.sol.json to root directory
  3. Delete tmp

@d-xo
Copy link
Contributor Author

d-xo commented Jul 31, 2021

@lucas-manuel I think that the above procedure may break bytecode verification since edits to the source files are reflected in the metadata hash appended to the resulting btyecode.

It's also not quite clear to me what the purpose of copying the subfolders into the temporary directory is? Could you expand a bit on which problem this addresses?

@d-xo
Copy link
Contributor Author

d-xo commented Jul 31, 2021

I think we could modify dapp rempapings to allow multiple verisons of a given package by using the context feature of the remappings format (docs) to create a unique set of mappings for each package that would point into the lib directory for that package.

As an example, given the following project structure:

├── lib
│   ├── ds-test
│   │   └── src
│   │       └── test.sol
│   └── bar
│       ├── lib
│       │   └── ds-test
│       │       └── src
│       │           └── test.sol
│       └── src
│           ├── bar.sol
│           └── bar.t.sol
└── src
    ├── foo.sol
    └── foo.t.sol

We would generate a remappings like this:

src/:bar/=lib/bar/src/
src/:ds-test/=lib/ds-test/src/
lib/bar/src/:ds-test/=lib/bar/lib/ds-test/src/

This would mean that import "ds-test/test.sol" would remap to import "lib/ds-test/src/test.sol" in src/*.sol and import "lib/bar/lib/ds-test/src/test.sol" in lib/bar/src/*.sol. Semantically this is equivalent to remapping import "ds-test/test.sol" to import "../lib/ds-test/src/test.sol" (which is not recognised as a valid remapping by solc).

This is obviously a semantic change to the current behaviour, but preserves backwards compatibility (since current projects must either have a consistent set of versions throughout the dependency tree or use a custom DAPP_REMAPPINGS), while allowing for multiple versions of the same package in the dependency tree.

It doesn't solve the issue of duplication noted by @MrChico and @lucas-manuel, but that feels slightly orthogonal and can be solved either by piggybacking on an exisitng package manger (e.g. nix) or by writing our own custom logic (e.g. something along the lines of the suggestion from lucas above).

As a final note, remapping context is very well supported across solc versions (at least as far back as 0.4.0).

@d-xo
Copy link
Contributor Author

d-xo commented Aug 2, 2021

I implemented the above remappings scheme in: #719

@d-xo
Copy link
Contributor Author

d-xo commented Aug 2, 2021

I think the issue of on disk submodule duplication can be addressed by modifying dapp update so that it clones dependencies to a centralized location and then uses the --reference flag to point git towards this location when calling git submodule update.

This will set the central cache up as an alternate, meaning that git will not need to download or even copy any new files when initializing the submodule, and allowing us to keep a single copy of the git repo for each dependency used in any dapp project on the same machine.

@deluca-mike
Copy link

deluca-mike commented Aug 2, 2021

@lucas-manuel I think that the above procedure may break bytecode verification since edits to the source files are reflected in the metadata hash appended to the resulting btyecode.

This seems like it's really a fault of putting the hash of human-readable relative source in the bytecode (instead of flattened absolute source code, or better yet, no hash in the bytecode at all), which seems misguided and unnecessary coupling. But that's a Solidity compiler issue (and for anyone opting to enables the ipfs/swam hash functionality).

It's also not quite clear to me what the purpose of copying the subfolders into the temporary directory is? Could you expand a bit on which problem this addresses?

It's to address the requirement to have just once instance of each source/module/dependency, and differentiate between commits/versions/releases of each, and modify the "temp" source imports to point to those specific modules and versions, without touching the user's actual source. If this can be achieved in a different way, without a temp folder, then obviously that's better.

Consider this:

Relative source:

├─ src
│   └─ foo.sol
|        import { Baz } from "baz/Baz.sol";
|        import { Bid } from "bid/Bid.sol";
|        contract Foo is Baz, Bid;
└- lib
    ├─ baz (v1.0.0)
    │   ├─ src
    │   |   └─ baz.sol
    |   |        import { Bar } from "bar/Bar.sol";
    |   |        import { Bid } from "bid/Bid.sol";
    |   |        contract Baz is Bar, Bid;
    |   └─ lib
    │       ├─ bar (v1.0.0)
    │       |   └─ src
    │       |       └─ bar.sol
    │       └─ bid (v1.0.0)
    │           ├─ src
    │           |   └─ bid.sol
    |           |        import { Bar } from "bar/Bar.sol";
    |           |        contract Bid is Bar;
    │           └─ lib
    │               └─ bar (v1.0.0)
    │                   └─ src
    │                       └─ bar.sol
    └─ bid (v2.0.0)
        ├─ src
        |   └─ bid.sol
        |        import { Bar } from "bar/Bar.sol";
        |        contract Bid is Bar;
        └─ lib
            └─ bar (v2.0.0)
               └─ src
                   └─ bar.sol

During compile:

├─ src
│   └─ foo.sol
|        import { Baz } from "baz-v1-0-0/Baz.sol";
|        import { Bid } from "bid-v2-0-0/Bid.sol";
|        contract Foo is Baz, Bid;
└- lib
    ├─ bar-v1-0-0
    |   └─ src
    │       └─ bar.sol
    ├─ bar-v2-0-0
    |   └─ src
    │       └─ bar.sol
    ├─ bid-v1-0-0
    |   └─ src
    │       └─ bid.sol
    |            import { Bar } from "bar-v1-0-0/Bar.sol";
    |            contract Bid is Bar;
    ├─ bid-v2-0-0
    |   └─ src
    │       └─ bid.sol
    |            import { Bar } from "bar-v2-0-0/Bar.sol";
    |            contract Bid is Bar;
    └─ baz-v1-0-0
        └─ src
            └─ baz.sol
                 import { Bar } from "bar-v1-0-0/Bar.sol";
                 import { Bid } from "bid-v1-0-0/Bid.sol";
                 contract Baz is Bar, Bid;

This also ensures that there are not 2 instances of the same source/contract, which causes function/interface collision issues in Solidity.

The idea is to bring all module to some flat root-level module directory, tagged/named by their commit hash or version, and ensure that all contracts still import from the specific files they were previously, when there was a hierarchy.

Consider how easy it is to use npm packages, and not having to worry that one of your dependencies is using a different version of a dependency you are already explicitly using.

@d-xo
Copy link
Contributor Author

d-xo commented Aug 2, 2021

This seems like it's really a fault of putting the hash of human-readable relative source in the bytecode (instead of flattened absolute source code, or better yet, no hash in the bytecode at all), which seems misguided and unnecessary coupling. But that's a Solidity compiler issue (and for anyone opting to enables the ipfs/swam hash functionality).

After thinking about this more, I think my initial point was incorrect. Changing the metadata hash doesn't seem like a big deal, we would either be uploading to etherscan (who ignore the metadata hash afaik), or we would be uploading to a service that accepts the compiler metadata json (e.g. sourcify), which should be enough to exactly reproduce the bytecode (including metadata hash), worst case the imports in the verified contract source would not match those in the source code on the repo.

This also ensures that there are not 2 instances of the same source/contract, which causes function/interface collision issues in Solidity.

This is very interesting, could you expand on this point? I did not realise that this could be a problem.

@deluca-mike
Copy link

deluca-mike commented Aug 2, 2021

This is very interesting, could you expand on this point? I did not realise that this could be a problem.

Sure. If B has oz-contracts as a lib such that B can be ERC20, and then A has B and oz-contracts as a lib, such that A can be B and Ownable, now while A should have the internal oz Context function _msgSender(), there is a collision because _msgSender() is defined both in the oz-contracts imported as a lib by A, and the separate yet identical oz-contracts code that is imported as a lib by the lib B. If all dependencies were deduped and at root level, then it would be trivial to know that both those _msgSender() functions are actually the same, and thus the compiler wouldn't throw and error.

Later, I can perhaps link a repo where this happens.

I won't pretend to be an expert on dependency management or how to handle the issue of a diamond pattern in dependencies or inheritance, but it seems to me like dependency management is a solved problem done excellently by others (like to nodes ecosystem) and that the solidity community should adopt those similar practices. Sure, we can all use npm or yarn to install our submodules, if they are npm packages, but if you want to stay pure (dapp-style packages that use git submodules) then we need to re-implement this solution that seems to have been working for years for others.

@d-xo
Copy link
Contributor Author

d-xo commented Aug 3, 2021

ah, very interesting! If I understand correctly we would still have the same issue if A and B imported slightly different versions of oz?

We can certainly apply the deduplication step within dapp remappings and force all imports of the same package version to remap to the same path on disk, which should help reduce the impact, but this rather seems to be a fundamental issue with allowing multiple versions of the same package into the dependency tree?

As I understand it, most package managers work around this issue by using version bounds and some constraint based version selection algorithm to choose a single package version that satisfies the bounds, which I feel is not appropriate for smart contract development (where I think the exact version of every dependency should be known ahead of time).

@deluca-mike
Copy link

deluca-mike commented Aug 3, 2021

If I understand correctly we would still have the same issue if A and B imported slightly different versions of oz?

Correct. But you can only do so much to help the developer.

this rather seems to be a fundamental issue with allowing multiple versions of the same package into the dependency tree?

Well, its fine to use multiple versions of oz-contracts if you want to use mix and match outside of inheritance. Such as A interacting with B and C, where B uses [email protected] and C uses [email protected]. You can technically already do this:

[submodule "lib/openzeppelin-contracts"]
	path = lib/openzeppelin-contracts
	url = https://github.com/OpenZeppelin/openzeppelin-contracts.git
[submodule "lib/openzeppelin-contracts-2"]
	path = lib/openzeppelin-contracts-2
	url = https://github.com/OpenZeppelin/openzeppelin-contracts.git

And just checkout specific version for each, and then import from lib/openzeppelin-contracts or lib/openzeppelin-contracts-2 in source where necessary.

And that's another thing: you shouldn't need to specify lib or modules, just like in js you don't need to specify nodule_modules when importing in js. It's implied that any import that is not a local file or path is a modules in the common module directory. So the remappings should handle replacing oz-contracts with lib/openzeppelin-contracts or {root_dir}/lib/openzeppelin-contracts;

Further, similar to how package.json defines the entry point, I feel something similar should exist here, instead of forcing (or expecting) contracts to exist in contracts or src.

As I understand it, most package managers work around this issue by using version bounds and some constraint based version selection algorithm to choose a single package version that satisfies the bounds

Very likely. I am not really familiar with their inner workings. But I do know I can specifically install certain releases, branches, and even commits, and that they will be tagged as such. Not sure how that plays out if a dependency I also require itself requires a non-compatible version of what I expecily already installed.

which I feel is not appropriate for smart contract development

Actually, I think the problem is easier with smart contracts, since the scale is smaller and we're able to be much more strict in the rules. Outside of js-based smart contract packages, people already have almost nothing close in terms of package/module management, so really, even a subset of npm's features would be a huge win.

@d-xo
Copy link
Contributor Author

d-xo commented Aug 4, 2021

@deluca-mike I have added a deduplication step the to the remappings generation procedure in #719. I think that with this addtion the new package local remapping are semantically exactly the same as the tmp dir / source code rewritting scheme propossed above.

And that's another thing: you shouldn't need to specify lib or modules, just like in js you don't need to specify nodule_modules when importing in js. It's implied that any import that is not a local file or path is a modules in the common module directory. So the remappings should handle replacing oz-contracts with lib/openzeppelin-contracts or {root_dir}/lib/openzeppelin-contracts;

I'm not sure I understand what you mean here, I believe that this is already the case?

Further, similar to how package.json defines the entry point, I feel something similar should exist here, instead of forcing (or expecting) contracts to exist in contracts or src.

This is quite an interesting idea. We could have some kind of structured metadata file that would let a package locally define their DAPP_SRC or DAPP_LIB directories. I think we would still need the global DAPP_SRC and DAPP_LIB vars to allow setting a default for legacy or non dapp projects that do not provide this metadata themselves. I have also bee thinking of allowing multiple expected src and lib directories (e.g. allowing DAPP_SRC=src,contracts) which would make it easier to mix dapp and npm style projects without having to manually edit remappings.

@MrChico
Copy link
Member

MrChico commented Aug 4, 2021

We could possibly reuse .dapprc files locally as a package.json situation. Or just plain json files. Or revive the old Dappfile concept. In practice I feel like 95% pr the time src or contracts is what people use

@deluca-mike
Copy link

I'm not sure I understand what you mean here, I believe that this is already the case?

It's possible this is already the case and I had forgotten that I could do:

import { DSTest } from "ds-test/src/test.sol"

rather than

import { DSTest } from "lib/ds-test/src/test.sol"

@d-xo
Copy link
Contributor Author

d-xo commented Aug 4, 2021

@deluca-mike you can already do

import { DSTest } from "ds-test/test.sol"

Which will be remapped to whichever on disk path dapp remappings selects for that import. #719 is a change that modfies the path selected by dapp remappings to allow multiple versions of a package to exist within a project dependency tree.

The import

import { DSTest } from "lib/ds-test/src/test.sol"

is an absolute import and will always attempt to import from lib/ds-test/src/test.sol relative to the project root (note that if your package is consumed by another, this will point to the root of the parent package when building the parent).

@d-xo
Copy link
Contributor Author

d-xo commented Aug 4, 2021

We could possibly reuse .dapprc files locally as a package.json situation. Or just plain json files. Or revive the old Dappfile concept.

I guess we could souce each packages local .dapprc as we construct the mappings to try and make use of any DAPP_SRC that has been set there, but it feels a little sketchy somehow.

I think I would maybe like to add a dapp.yaml that has an optional src and lib field, and if the file exists and src or lib are set, we use those for the source and lib dirs, otherwise we default to looking in the directories set in DAPP_SRC and DAPP_LIB?

In practice I feel like 95% pr the time src or contracts is what people use

Definitely agree with this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants