Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace resolved field by hash #64

Open
wants to merge 4 commits into
base: master
Choose a base branch
from
Open
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
35 changes: 35 additions & 0 deletions text/0000-lock-without-registry.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
- Start Date: 2017-05-09
- RFC PR:
- Yarn Issue:

# Summary

The lockfile yarn.lock should not include the base registry (`https://registry.npmjs.org`).

# Motivation

In yarn.lock, the `resolved` field includes registry such as `https://registry.npmjs.org`.
In China, most developers will set it to `https://registry.npm.taobao.org` for speed; but it seems slow for travis-ci and circleci.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we also add as a motivation that the current approach leads to developers leaking their internal artifact repository sites to the public internet via yarn.lock if they have their company's artifact repository configured in a .npmrc or .yarnrc file.

# Detailed design

Replace the `resolved` by a `hash` field.
Copy link
Member

@bestander bestander May 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think hash would not be enough.

resolved "https://registry.yarnpkg.com/abbrev/-/abbrev-1.1.0.tgz#d0554c2256636e2f56e7c2e5ad183f859428d81f"

should be converted into

  resolved "/abbrev/-/abbrev-1.1.0.tgz#d0554c2256636e2f56e7c2e5ad183f859428d81f".

Where https://registry.yarnpkg.com will be substituted by default and can be overridden with a setting in .yarnrc

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually it would require everyone to update their lockfiles to have the partial tarballs URLs, that may limit the usefulness of this feature.
How about having a config that would swap registries, e.g.

registry-replace "https://registry.yarnpkg.com=https://registry.npm.taobao.com;https://registry....=...."

It would allow people from remote areas seamlessly switch registries for existing projects

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bestander as noted in my motivation comment, retaining the registry URL within in a yarn.lock file could leak company repositories, so entirely removing them from a yarn.lock file would be nice.

Actually it would require everyone to update their lockfiles to have the partial tarballs URLs, that may limit the usefulness of this feature.

I can see that it would require intervention by each project to upgrade their yarn.lock files to follow the new approach.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you still release your yarn.lock to open source then you want the https://registry.yarnpkg.com domain to be present.
registry-replace "https://registry.yarnpkg.com...." setting could be used to replace the URLs at fetch time and your project could be configured not to leak artifactory domain

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why insn't the hash enough @bestander? Isn't the url completely inferred from repository+name+version?
From package.json, yarn can already find what is the correct url (based on the registry in the configuration), so it should be the same in this case?

Here the hash would be present to ensure the artefact is the right one only, not for localisation at all.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tarball URL is returned by npm backend https://github.com/yarnpkg/yarn/blob/master/src/resolvers/registries/npm-resolver.js#L189, Yarn is not constructing it.

I don't think URL reconstruction is justified

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a way, the whole point of this RFC to discuss do that: constructing the URL on the fly :)

When you say it is the npm backend constructing the URL, do you mean that IF we wanted to reconstruct it, we would need to store in the lockfile that it is a npm dependency so that we know we need the npm backend to reconstruct the full URL?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True :)

I mean that npm is responsible for sending us the URL, for compatibility reasons we probably don't want to move this logic to Yarn.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well you don't need to move it in yarn, you simply need to ask npm again every time :)

I understand this is a tricky place to make changes, but personally, I feel like yarn is broken by design by the choice of storing urls directly in the lock file (and I'm curious to see how npm is going to tackle this with their new coming version).
As it currently is, it cannot scale, you can't use proxy caches easily, you potentially expose private information publicly, and so on.

The alternative of using a registry-rewrite is interesting too though, so maybe it will be simpler (in terms of retro-compatibility) to use that and keep the registry URL in the lockfile.
Having the URL in the lockfile also has the merit to ease the use of multiple repositories (not proxies of existing ones) side to side (even though I'm not sure how you can add a dependency with a different registry easily currently with yarn).
In that case the repository base-url (e.g., https://registry.npmjs.org) would act as a kind of identifier of a given repository, and users could configure mirrors (e.g., https://registry.npm.taobao.org) for these repositories (i.e, the rewrite you talked about) in their configuration.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean that npm is responsible for sending us the URL, for compatibility reasons we probably don't want to move this logic to Yarn.

Okay, so now I see what you mean, and I agree that entirely removing the resolution URL from yarn.lock would mean that yarn would need to generate its own URLs internally, which has all the downsides of additional code for a spec yarn doesn't control.

If the URL must follow a specific spec, could the logic used to generate the value of dist.tarball be refactored out of npm and yarn, and shared?

The alternative of using a registry-rewrite is interesting too though, so maybe it will be simpler

Another thought that came to mind is what happens where there are multiple registries that need to be overwritten? I've been assuming a developer only needed to override a single registry, but what if a yarn.lock has references to multiple artifact registries? A developer would need to grep the yarn.lock file, enumerate all registries used, and write registry-rewrite configs for each.

The `url` in `resolved` is unnecessary; keeping `hash` is enough.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we get an example of what the resolved field would look like following this standard? (Kind of like your examples in yarnpkg/yarn#3330)


# How We Teach This

Just set the registry before `yarn install` if you do not want to use `https://registry.npmjs.org`.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reference to the npm registry needs to be replaced with a reference to the Yarn registry - https://registry.yarnpkg.com

Or use `yarn install --registry=https://registry.npm.taobao.org`.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll probably need to emphasize on the documentation website that install will use the .yarnrc configured registry, the command line configured registry, or will fallback to the default.

# Drawbacks
More effort is needed in order to support users who really need the whole `resolved` field in their project.

# Alternatives

Don't change the lockfile, but change the real registry by

`yarn install --registry=https://registry.npm.taobao.org`

# Unresolved questions

No questions
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an open question, I would like to ask how will this be rolled out to all Yarn using projects? Will Yarn replace the entire yarn.lock file? Will Yarn only use the new format for changed resolutions in the yarn.lock file?