Subresource loading with Web Bundles #590
Comments
We had given feedback on this proposal here: WICG/webpackage#648 (comment), in case it's of interest |
I was going to flag annevk/orb#32 here (using "no-cors" for subresource-from-bundle fetches), but I stumbled across https://bugs.chromium.org/p/chromium/issues/detail?id=1316660 so I take it that is already being tackled? Also filed WICG/webpackage#735 as I don't think registered protocol handlers ought to be involved. When I last discussed this kind of feature in detail, the overarching concern I had is that you end up having to reinvent the network protocol. For an initial visit by the end user you will have improved transfer due to compression being possible across responses without leaking information, but now you've lost caching and partial invalidation. So at some point you start discussing some kind of update protocol for the bundle, at which point things really start to get complex and largely duplicative. |
Thanks for the reply and for filing an issue. Responses inline and there. This issue is related to "Signed Exchange", which is not related to "Subresource Loading with Web Bundles". They are different features. The latter doesn't use signed exchanges at all. Unfortunately, both "Signed Exchange" and "Subresource Loading with Web Bundles" share the same repository for historical reasons. I guess that is the root cause of the confusion, which we want to resolve somehow eventually... :( Regarding "subresource loading with WebBundles", the request's mode is "cors" by default. https://github.com/WICG/webpackage/blob/main/explainers/subresource-loading.md#requests-mode-and-credentials-mode |
Okay, so even if a request for a subresource was made using "no-cors", it's okay because the bundle itself was requested using "cors". And ORB isn't impacted because the lookup inside the bundle happens way before you're about to hit the network and a bundle cannot be used in a response to a normal subresource request, only in response to a request whose destination is "webbundle". Upon scanning the explainer (which is great by the way, thanks for writing it up!) again I found https://docs.google.com/document/d/11t4Ix2bvF1_ZCV9HKfafGfWu82zbOD7aUhZ_FyDAgmA/edit about subsequent requests and that does indeed suggest that's both a direction this might be going in and that it's a hard unsolved problem. That makes me rather hesitant to endorse bundling as a solution. |
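To make the point above concrete, here is a minimal sketch of the lookup order being described: a subresource request is first checked against the bundles declared on the page, and only the bundle itself goes to the network, with mode "cors" and destination "webbundle". The `BundleRule` class, its `source`/`resources`/`scopes` fields, and `plan_fetch` are hypothetical names for illustration only, not the spec's algorithm or any browser's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BundleRule:
    """Hypothetical model of one bundle declared by a page."""
    source: str            # URL of the bundle itself
    resources: tuple = ()  # exact subresource URLs the bundle claims
    scopes: tuple = ()     # URL prefixes the bundle claims


def claims(rule, url):
    """True if the bundle declares that it contains this URL."""
    return url in rule.resources or any(url.startswith(p) for p in rule.scopes)


def plan_fetch(url, mode, destination, rules):
    """Decide what actually goes to the network for a subresource request."""
    for rule in rules:
        if claims(rule, url):
            # The lookup happens before the network: the bundle is fetched
            # with mode "cors" and destination "webbundle", regardless of the
            # mode of the subresource request it ends up serving.
            return {"fetch": rule.source, "mode": "cors",
                    "destination": "webbundle", "serves": url}
    # No bundle claims the URL: an ordinary fetch, unchanged.
    return {"fetch": url, "mode": mode, "destination": destination, "serves": url}


RULES = [BundleRule(source="https://example.com/app.wbn",
                    resources=("https://example.com/a.js",),
                    scopes=("https://example.com/img/",))]

in_bundle = plan_fetch("https://example.com/img/x.png", "no-cors", "image", RULES)
outside = plan_fetch("https://example.com/other.css", "no-cors", "style", RULES)
```

In this sketch a "no-cors" image request that falls inside a declared scope never reaches the network as "no-cors"; a request outside every bundle is passed through untouched.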
Thanks! Responses inline below:
Yes, that's right. We've added "webbundle" destination exactly for that reason.
Thanks. That is an area which we'd like to explore together in v2. In v1, we support efficiently fetching multiple resources in a single request with Web Bundles. It addresses some use cases (e.g. #624, which doesn't need any caching support). That's already beneficial. However, we are aware that this doesn't address the loading problem for large JS apps, which rely on user-land bundling solutions (e.g. webpack) today and still find it difficult to take advantage of the browser's cache. Thus, on top of what v1 has achieved, we need some kind of protocol that avoids transferring a resource the browser already has cached. There are several proposals (listed in here too):
The common thread in these proposals is that large JS apps need a new primitive which enables:
We agree that this is a hard, unresolved problem, but it is worth exploring to figure out the smallest web platform primitive that user-land solutions can't achieve efficiently. We'd like to avoid duplication here. I hope this is a good summary. Sooner or later, we'd like to write an Explainer with more details for v2. I hope we can work together there! |
I'm not convinced that more efficient ads are sufficient justification for adding quite a bit of complexity to the web platform. It would also make them rather easy to block, which I suspect runs counter to your goals. And without the "v2" part it's hard to judge what the complexity of this feature might end up being, whether that justifies its cost, and whether it is indeed preferable to further investment in network protocol solutions. |
There is ongoing conversation on the TAG design review: w3ctag/design-reviews#616 |
I don't think it's contrary to the goals of this feature. Here is some positive feedback from Google Ads shared as part of the current Blink Intent to Ship thread: Google Ads (use case) (origin trial participant)
The points raised in the issue sound compelling to me. |
Confirming that "rather easy to block" is not a problem from an ads perspective. We're not trying to circumvent ad blockers here. |
Re this concern,
Our intent with v2 isn’t to reinvent the network protocol. Instead, we plan to use immutable subresource URLs to deal with updates. The browser would simply share a list of the subresources it has, identified by their versioned/hashed URLs, and the server would provide the ones that are not already cached in the browser. We don’t have any intention of dealing with lower-level HTTP cache mechanisms, such as cache-control or if-modified-since, and there is no need for them. These mechanisms would only be used on the bundle itself, not the subresources within. Example from one of the proposals under consideration: The first visit (cold cache):
The second visit (warm cache)
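The cold-cache and warm-cache flow described above can be sketched roughly as follows. This is only an illustration of the set-difference idea, under the stated assumption that subresource URLs are immutable (versioned/hashed); how the browser would actually transmit its list is not specified here, and `select_missing`, the URLs, and the bundle contents are all hypothetical.

```python
def select_missing(bundle_contents, client_has):
    """Return only the subresources the client does not already hold.

    bundle_contents: dict mapping immutable (versioned/hashed) URL -> body.
    client_has: iterable of immutable URLs the browser reports as cached.
    Because URLs are immutable, a URL match means the cached copy is current,
    so no per-resource revalidation is modelled.
    """
    have = set(client_has)
    return {url: body for url, body in bundle_contents.items() if url not in have}


# Hypothetical contents of the first deployment of a bundle.
v1 = {"/a.1111.js": b"a-v1", "/b.2222.js": b"b-v1"}

# First visit (cold cache): the browser reports nothing, so it gets everything.
cold = select_missing(v1, [])

# The site then ships a new b; a is unchanged, so its URL is unchanged too.
v2 = {"/a.1111.js": b"a-v1", "/b.3333.js": b"b-v2"}

# Second visit (warm cache): the browser reports what it kept from visit one.
warm = select_missing(v2, cold.keys())
```

On the warm visit only the resource whose hashed URL changed is sent; the stale `/b.2222.js` entry in the browser's list is simply ignored by the server.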
|
I don't know if you followed the story of draft-ietf-httpbis-cache-digest or not, but this is exactly the sort of thing that we spent a good amount of time on at that time. It's possible that some of the reasons we abandoned that line of investigation can be worked around by carefully constraining the problem, but this looks a lot like that framing.
I don't know how to reconcile this with an "[...] intent with v2 isn’t to reinvent the network protocol." This is exactly what is being proposed. The same outcome can be attained by sending two requests for the separate resources, modulo some minor gains in byte efficiency. Ultimately, the problem you are attempting to solve here is fundamentally hard. My understanding of bundling has always been that it offers advantages to the extent that it allows us to populate more points in the space of trade-offs between atomic and monolithic resources. However, I don't see how this particular design would ultimately be better. The introduction of resource maps might be a net gain, but I am not seeing the advantages from the bundling aspect of the design. |
I noticed that neither the explainer nor the spec mentions anything about what implications this would have on speculative HTML parsing. It would be good to have a paragraph explaining the implications. Notably, under speculative fetch in the HTML spec, there's a list of elements that may affect subsequent speculative fetches. Also, at present, the speculation-sensitive information travels in HTML attributes. It seems worthwhile to at least mention the novelty of the text content of an element becoming speculation-sensitive. |
Hey @martinthomson! I kinda followed the story :) FWIW, what killed Cache Digests is not any particular issue, but lack of implementer interest.
I don't think it creates an entirely new protocol, but provides a way for HTTP to request multiple resources from a single origin in a single bundle, and then have those resources be compressed as a single entity.
The way I see it:
I tried in the past to tackle some of the compression benefits as part of Compression Dictionaries but that effort was shot down as being overly broad and hence too dangerous from a security perspective. WebBundles seem like a reasonable way to move the bundling responsibility to the origin as part of its build process (and hence, have them only bundle together non-credentialed resources), avoiding the risks of cross-resource compression revealing secrets from credentialed resources. |
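As a rough illustration of the cross-resource compression benefit mentioned above, the sketch below compares compressing several similar resources one by one against compressing them as a single entity. It uses Python's standard `zlib` (DEFLATE) purely for demonstration; the resource names and contents are made up, and real gains depend on the codec and on how much the bundled resources actually share.

```python
import zlib

# Hypothetical helper code that several resources happen to share.
shared = "\n".join(
    f"export function helper{i}(x) {{ return x + {i}; }}" for i in range(50)
).encode()

# Three made-up, non-credentialed resources built from the shared code.
resources = {
    "a.js": shared + b"\nexport const a = 1;\n",
    "b.js": shared + b"\nexport const b = 2;\n",
    "c.js": shared + b"\nexport const c = 3;\n",
}


def separate_size(res):
    """Total bytes when every resource is compressed on its own."""
    return sum(len(zlib.compress(body)) for body in res.values())


def bundled_size(res):
    """Bytes when the resources are concatenated and compressed as one entity."""
    return len(zlib.compress(b"".join(res.values())))


separate = separate_size(resources)
bundled = bundled_size(resources)
```

The bundled form is smaller because the compressor can reference content from an earlier resource while encoding a later one. That same cross-referencing is what makes mixing credentialed and non-credentialed content in one compression context risky, which is why the comment above stresses bundling only non-credentialed resources.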
Re:
A good point! Thanks. I've filed an issue WICG/webpackage#747. |
After discussing this at some length, we realized that there are a number of moving parts that are hard to disentangle.

Bundles

The idea of bundling resources and delivering them as a single unit is hard to object to. You can get crude versions of that in a number of ways (HTML inlining, JS bundlers, and On its own, divorced from some of the other features that build on this, it isn't necessarily a compelling feature, but we can pretty easily convince ourselves that it isn't bad for the web in any way, at least in the abstract. We haven't really spent a lot of time looking into the details of the design of the format yet, because a lot of our attention has been drawn to other more challenging aspects of the proposal (or suite of proposals, you might say). For instance, the use of magic numbers elicited a lengthy conversation about their value relative to media types for use on the web (CORB seems to have tipped the balance toward media types). I think that I understand why magic numbers have been proposed in this specific case and it might be justified, but we haven't really sat down to understand that aspect in any detail. No doubt there are other aspects of the format that would elicit similar discussions.

Resource Identification

I still reserve concerns about how bundle components are identified, or whether they need to be. We've had a number of conversations about this, but haven't really resolved anything (at least from my recollection) other than that the problem is hard. That one resource can now speak for another subverts the URL resolution process as the primary means of establishing authority. The use of scoping only partially mitigates that concern. The use of UUIDs and the definition of a new URI scheme (or is it a resource specifier; see below) introduces another concept into the mix. UUIDs are useful for managing collision risk, but they don't provide any uniqueness guarantee if you allow for adversarial content being loaded. Their use in CSP seems inadvisable in that light, particularly since the list of bundle registrations is mutable, which might allow an attacker to supplant an allowed uuid-in-package resource.

New Indirections

The notion of resource maps adds a new layer of indirection to the platform. JS has formalized this in its narrow domain with its language around specifiers and URLs in a way that makes a fair bit of sense. Permitting the use of arbitrary specifiers that are mapped before being treated as URLs is a powerful tool in its own right and one that requires careful consideration of its own. Here we need to consider the implications on the various security functions we have built, like CSP.

Compression

Specification-wise, this might be free if it uses content-codings. This probably doesn't need too much discussion, except to note the effect on performance of different strategies, particularly when it comes to the subset piece.

Bundle Subsets/Selective Fetch

This stuff is the subject of the recent discussion here. This stuff is highly speculative and I don't think we are able to take a position on the design being proposed just yet. I stand by my statement that this is a new protocol, even if it falls short of a total reimagining of the protocol stack. Its dependence on Vary/Variants and an understanding of bundle content for good performance gives it something of a difficult deployment challenge to overcome.

Combining These

Obviously, you don't realize many benefits until you put a few of these pieces together, but you don't need to solve everything before you get some useful features. |
Request for Mozilla Position on an Emerging Web Specification
Other information
See chrome status page: https://chromestatus.com/feature/5710618575241216
Chrome is doing the origin trial (1, 2) for this feature. I'm filing this request because it might be a good time to ask.
Thanks!