-
Notifications
You must be signed in to change notification settings - Fork 71
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC 98: Remote channels for cross-browsing-group communication #98
Conversation
edc3703
to
ff05826
Compare
ff05826
to
65f40ea
Compare
65f40ea
to
1f8064c
Compare
Gentle ping here. This has been open for a week, so per the process I could go ahead and merge it, but I'm pretty sure you don't want me to do that ;) |
rfcs/remote_channel.md
Outdated
field provides the line/column numbers of the original exception, | ||
where available. | ||
|
||
In addition there is a `RemoteWindow.executeScriptNoResult(fn, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on my experience on bfcache tests, I feel executeScriptNoResult()
might be hard to use, because:
- Mixing
await executeScriptNoResult()
andawait executeScript()
might complicates the timeline. Havingawait executeScript()
only would be clear, because it serializes the script evaluation on caller and remote context. - The choice between
await executeScriptNoResult()
andawait executeScript()
might depend on internals ofexecuteScript()
. When I was migrating my bfcache tests from COOP-based framework to test_driver-based framework, I had to make changes aroundexecuteScriptNoResult()
-equivalent that were hard to explain without knowing the internal structure of the framework, so I switched to theexecuteScript()
-only interface.
So I'd prefer removing executeScriptNoResult()
.
Still there would remain complexities around navigation: instead of executeScriptNoResult()
, I created promises / async callbacks triggered by executeScript()
but not blocking async executeScript()
in bfcache tests. But by removing executeScriptNoResult()
I expect we can make the complexities only navigation-specific, not affecting RemoteWindow
interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So (for the context of others) the technical concern here is that for bfcache tests in particular you need all sockets to be closed before initiating the navigation. The problem is that if you send a message, execute some script that closes all sockets, and then send the result, you end up re-opening a socket for the result. So there are broadly two choices:
- Don't try to send a result at all.
postMessage
works like this, andexecuteScriptNoResult
does the same: we simply don't provide a channel to send the result on so there's no concern about sockets being reopened. - Send a result, but ensure we navigate after the result has been sent, so nothing will implictly reopen a socket. Practically this means something like
setTimeout(async url => {await closeAllSockets(); startNavigation(url)}, 0)
. The problem with that is that we are currently linting forsetTimeout
so the infra discourages even usage that's required like this.
I don't quite get the argument about serialization; in the case that you're putting the actual action of the remote script in a timeout the test window can guarantee the script has been sent, but the actual action of the script hasn't happened at the time the value is returned, so it doesn't seem like that method offers particuarly stronger guarantees around ordering. I suppose one advantage is that any exception that happens when evalutaing the script is reported through the return value, rather than just happening in the remote window.
So, I think I'm mostly OK with dropping executeScriptNoResult
because, as you say, it's exposing an implementation detail and it seems cleaner to start with a smaller API. My concern is that for some kinds of tests (notably bfcache), understanding that implementation detail is very important, and there's an argument that making it an explicit part of the API will help people write correct tests. So I'd like to hear other opinions here.
rfcs/remote_channel.md
Outdated
at the time of navigation, otherwise the page will be excluded from | ||
bfcache. In the current prototype this is handled as follows: | ||
|
||
* A `pause` method on `SendChannel`. This causes a server-initiated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As commented at web-platform-tests/wpt#29803 (comment), I'm not sure how pause
works and whether pause
is needed (the draft impl doesn't use pause
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the problem pause is intended to solve is where you have something like:
await remote.executeScript(() => startNavigation());
await remote.executeScript(() => ensureNavigationHappened());
With a poll()
based interface this can work as written; the remote simply doesn't call poll()
after startNavigation
and before the navigation is complete. But with websockets, we don't have any synchronisation between the navigation and sending the ensureNavigationHappened
message. If the test sends the second message before the socket is closed on the remote, it can end up being lost. So pause solves this by effectively providing a synchronisation point around the socket being closed:
await remote.executeScript(() => startNavigation());
remote.pause();
// We now know the remote now won't listen for messages until it reconnects to the socket after navigation
await remote.executeScript(() => ensureNavigationHappened());
From the point of view of the remote, closeAllSockets()
and pause()
do race, but the effect of pause()
is a server-initiated disconnect of the read channel for the remote, and the effect of closeAllSockets
is a client-initiated disconnect of all channel-related sockets, so the ordering isn't very important.
We can avoid requiring pause by not sending requests to a remote after initiating a navigation, before getting a confirmation that the navigation is actually complete. In practice this means that the target page has to be a different RemoteWindow
(i.e. have a different uuid) and has to process a message i.e. something like:
await remote.executeScript(() => startNavigation());
await navigationTargetRemote.executeScript(() => goBack());
await remote.executeScript(() => ensureNavigationHappened());
That's what the bfcache tests are doing now, and why pause
isn't needed there, but either way the ergonomics are a bit tricky. I think we could drop it from the initial featureset, but these races around navigation are tricky to understand and debug.
rfcs/remote_channel.md
Outdated
via the main test window, make it hard to build an ergonomic | ||
cross-context messaging API. | ||
|
||
### Proposal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to clarify (somewhere in the RFC) what are the dependencies of this RFC in terms of implementation.
IIUC this RFC can be implementable by pure JavaScript outside WPT infra + existing server-side stash, which is good because we can run the tests by running wpt server and bare browsers, without further depending on WPT test infra.
This RFC mentions integration with WebDriver BiDi, test_driver and testharness, but these are out of scope of this RFC itself.
Is my understanding correct?
Why the changes to resources/testharness.js
and tools/wptrunner/wptrunner
are needed in web-platform-tests/wpt#29803 ?
(Other changes, in JS/HTML files, tools/wptserve/wptserve/stash.py
, and websockets/handlers/msg_channel_wsh.py
look like "implementable by pure JavaScript outside WPT infra + existing server-side stash")
they are unable to communicate. For example windows opened with the | ||
`noopener` attribute will not have a handle to their parent, nor will | ||
the parent have a handle to the child. Similarly, cross-origin | ||
navigations with the `cross-origin-opener-policy` header appropriately |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this RFC going to replace the framework in the COOP/COEP tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't expect the RFC to require moving the COOP/COEP tests. However I expect the RFC to provide all the features needed for those tests, and would anticipate the actual migration happening as a followup once the implementation has landed.
rfcs/remote_channel.md
Outdated
synchronize the navigation starting (which will close the socket) with | ||
writing the response. | ||
|
||
TODO: the naming here isn't great. In particular a `RemoteWindow` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't RemoteContext
work (as proposed in #91)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, because RemoteContext
is already used by testharness.js to mean something slightly different:
/*
* A RemoteContext listens for test events from a remote test context, such
* as another window or a worker. These events are then used to construct
* and maintain RemoteTest objects that mirror the tests running in the
* remote context.
*/
In the long term it would make sense to unify these features, but in the short term a naming conflict will cause breakage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe RemoteGlobalScope
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would RemoteGlobal
work? RemoteGlobalScope
is quite a mouthful.
rfcs/remote_channel.md
Outdated
the transition may even be seamless. | ||
|
||
testdriver integration is possible. For example we could add | ||
`RemoteContext.testdriver.click` to execute a click in the remote |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would RemoteContext.testdriver.click
be a pure JavaScript wrapper like RemoteContext.executeScript(() => test_driver.click())
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That doesn't work as written, because all testdriver commands have to go via the test window since that's the one that webdriver is using. So it would desugar into testdriver.click(RemoteContext.uuid)
, and testdriver would learn to look up contexts from a uuid
parameter in the URL, as well as from the internal identifier it uses today.
without providing the API surface that only makes sense in a test | ||
window. | ||
|
||
It may be possible in the future to replace the backend with a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For further integration with WebDriver, testdriver etc., discussion about benefits and risks would be helpful (not necessarily now though, given that this is "Possible Future Additions").
One one hand ("Risk" side), I basically prefer not integrating the framework too much to WebDriver/test_driver etc., to make the framework work with minimum dependencies and work with bare browsers, and not to make debugging harder. We might want to JSON.stringify()
instead of WebDriver BiDi format if we want to reduce complexity around serialization, given that the current tests considered so far don't need WebDriver BiDi capability beyond JSON.stringify()
.
So I'm interested in hearing more about the benefits (from the point of view of test writers and WPT infra maintainers) when we consider the further integration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, future additions here would be subject to the same RFC process. So the idea was just to sketch out some possibilites in case people felt there were specific parts that should be given additional consideration in the initial design, or would be high value to work on sooner.
In terms of the serialization format, we are already using extra complexity beyond JSON-stringify when serializing a SendChannel
for message responses. We obviously could special case that, but in general I think using plain JSON as a serialization format for data where we might want to support non-JSON types is a bad idea (e.g. current WebDriver mostly uses plain JSON but also has super-hacky support for serializing Element objects which doesn't generalise to other types and is generally a mess). Aiming for a featureset closer to structured clone seems more future proof, but structured clone doesn't actually define a serialization. So, given that WebDriver BiDi has very similar requirements, and actually defines the wire format, it seems sensible to use that as a reference point. The fact that, going forward, we could use WebDriver BiDi instead of the stash as the backend in the wptrunner implementation is good, but it's not the primary motivation here.
rfcs/remote_channel.md
Outdated
|
||
TODO: the naming here isn't great. In particular a `RemoteWindow` | ||
could actually be some other kind of global like a worker, and | ||
`start_window()` is a pretty nondescript method name. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a similar thing to start_window()
in my bfcache tests and I was also not sure about the better naming, better semantics, etc. (and thus I just made it const executor = new Executor(uuid);
and avoided explicit naming).
So I'm also curious how this should be named.
Commented somehow verbosely to move discussion forward. As for The RFC is more about replacing the existing cc/ @ArthurSonzogni and @foolip. |
@hiroshige-g Thanks for the detailed comments, it's much appreciated! I've responded to the technical concerns, hopefully in a way that also provides context for other reviewers. I'll update the RFC for the more editorial issues. Please let me know if I miss[ed] anything. |
Co-authored-by: Ms2ger <[email protected]>
* Remove executeScriptNoResult * Rename `start_window()` to `start_window_channel()` and add `window_channel()` without the auto-connect behaviour. * Change the (de)serialization model to automatically create local objects (like structuredClone) rather than requiring a `toLocal()` call. * Updates from PR feedback.
85856e3
to
a5d3c2b
Compare
Ping @web-platform-tests/wpt-core-team for review on this. |
[PR 29803](https://github.com/web-platform-tests/wpt/pull/29803) | ||
contains a prototype implementation of this. | ||
|
||
<!-- LocalWords: UUID WebDriver wptrunner testharness APIs UI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's LocalWords?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh sorry this is emacs embedding the extra dictionary entries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want to keep them, then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well it's convenient for me when editing, but I could remove them once the overall RFC is approved, right before merging.
Co-authored-by: Philip Jägenstedt <[email protected]>
Require a message type. Also support connect and close messages. Provide the message type in the callback.
Ping, again. This has now gone another week since the last round of updates without further review. |
send messages to the remote. Alternatively the `RemoteWindow` may be | ||
created first and its `uuid` property used when constructing the URL. | ||
This API is provided by a `RemoteGlobal` object. The `RemoteGlobal` | ||
object doesn't handle creating the browsing context (or other global |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good clarification!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some missing punctuation.
Rendered
Implementation