Multi output policies #502

H-Park · 2019-10-09T03:23:02Z

Example: in pysc2, some actions require positional arguments, such as move commands.

This might be too environment-specific, but is there any interest in supporting this?
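To make the request concrete, here is a minimal sketch of what a "multi-output" action could look like: an action type plus an optional positional argument. The action names, `NEEDS_POSITION`, `SCREEN_SIZE` and `sample_action` are all hypothetical and are not the pysc2 or stable-baselines API; uniform sampling stands in for a second policy head.

```python
import random

# Hypothetical action set: names are illustrative, not the pysc2 API.
ACTION_TYPES = ["no_op", "move", "attack"]
# Action types that need an (x, y) screen position as an extra argument.
NEEDS_POSITION = {"move", "attack"}
SCREEN_SIZE = 64  # assumed square screen resolution


def sample_action(type_probs, rng):
    """Sample an action type, then a position only if that type needs one."""
    action_type = rng.choices(ACTION_TYPES, weights=type_probs, k=1)[0]
    position = None
    if action_type in NEEDS_POSITION:
        # A real policy would use a second output head here; uniform
        # sampling is only a stand-in for it.
        position = (rng.randrange(SCREEN_SIZE), rng.randrange(SCREEN_SIZE))
    return action_type, position


rng = random.Random(0)
action = sample_action([0.2, 0.5, 0.3], rng)
```

The point is that the policy output is no longer a single array of fixed meaning: the second component only exists for some action types.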

araffin · 2019-10-09T08:49:13Z

Hello,

Could you be more precise?
You mean having different action spaces at the same time?

Miffyli · 2019-10-09T08:52:52Z

@araffin
I understood it that way, yes, i.e. support for spaces.Tuple/spaces.Dict but for actions rather than observations, as discussed in #133.

The major modification is the same for both: currently everything is more or less bundled up into arrays, which makes the overall processing quite a bit tidier. Supporting tuple/dict spaces would require storing variable-length arrays/tuples somewhere along the line, as well as creating a dynamic number of placeholders and so on.

However, if the next backend is going to be a non-graph one (i.e. TF2 eager mode and/or PyTorch), this should be easier to do with those frameworks. I hate to move more and more stuff to the "next-backend" project, but this sounds like a much more reasonable enhancement to make alongside the new backend, rather than implementing it in the current graph version and then translating it to the new backend.
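As a rough illustration of the "bundled up into arrays" point, fixed-size action components can be packed into one flat array and split again afterwards. This is only a sketch under assumptions: `ACTION_LAYOUT`, `flatten_action` and `unflatten_action` are hypothetical names, not part of stable-baselines or gym.

```python
# Hypothetical layout of a composite action: (name, size) per component.
ACTION_LAYOUT = [("action_type", 1), ("position", 2)]


def flatten_action(action):
    """Pack a dict action into one flat list, following ACTION_LAYOUT order."""
    flat = []
    for name, size in ACTION_LAYOUT:
        values = list(action[name])
        if len(values) != size:
            raise ValueError(f"{name} expects {size} values, got {len(values)}")
        flat.extend(values)
    return flat


def unflatten_action(flat):
    """Split a flat list back into the named components."""
    action, start = {}, 0
    for name, size in ACTION_LAYOUT:
        action[name] = flat[start:start + size]
        start += size
    return action


flat = flatten_action({"action_type": [1], "position": [12, 40]})
restored = unflatten_action(flat)
```

This only works while every component has a fixed size. Variable-length components are exactly where the array-based approach stops being tidy.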

H-Park · 2019-10-09T14:47:45Z

Is there a timeline in place for this transition?

Miffyli · 2019-10-09T14:50:39Z

No specific timeline other than "one day™". The first step would be to support TF2.0 via the backwards-compatible API in 1.14 (see #366), but that would be mostly renaming function calls.

araffin added the "enhancement" (New feature or request) label Oct 9, 2019

Miffyli mentioned this issue Nov 23, 2019

V3.0 implementation design #576

Closed

H-Park closed this as completed May 25, 2021
