Nice to see there's already an implementation of this!
I just stumbled across TensorFlow's `stop_gradient` function. Among the examples of where it might be needed, the docs mention "The EM algorithm where the M-step should not involve backpropagation through the output of the E-step."
Does this also apply when using the EM algorithm for routing? I don't think I read anything about this in the paper, but then again the paper is very sparse on details about backpropagation...
Not calculating the gradients for the E-step might considerably speed up training, I believe.
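For what it's worth, here is a minimal pure-Python sketch of the idea (a toy 1-D, two-component Gaussian mixture, *not* the capsule-routing code from this repo): the E-step responsibilities are treated as constants by the M-step, which is exactly what wrapping the E-step output in `tf.stop_gradient` would enforce in a differentiable implementation.

```python
# Toy illustration only: EM for a 1-D mixture of two unit-variance
# Gaussians with equal priors. In an autodiff framework, the E-step
# output would be wrapped in tf.stop_gradient(...) so the M-step does
# not backpropagate through it.
import math
import random

def e_step(data, means):
    """Responsibility of component 0 for each data point.
    In TensorFlow one would return tf.stop_gradient(resp) here."""
    resp = []
    for x in data:
        p0 = math.exp(-0.5 * (x - means[0]) ** 2)
        p1 = math.exp(-0.5 * (x - means[1]) ** 2)
        resp.append(p0 / (p0 + p1))
    return resp

def m_step(data, resp):
    """Update the means, treating the responsibilities as constants."""
    w0 = sum(resp)
    w1 = len(data) - w0
    m0 = sum(r * x for r, x in zip(resp, data)) / w0
    m1 = sum((1 - r) * x for r, x in zip(resp, data)) / w1
    return (m0, m1)

random.seed(0)
data = [random.gauss(-3.0, 1.0) for _ in range(200)] + \
       [random.gauss(3.0, 1.0) for _ in range(200)]
means = (-1.0, 1.0)
for _ in range(20):
    means = m_step(data, e_step(data, means))
print(means)  # converges to roughly (-3, 3)
```

The point is just structural: the gradient of the M-step update with respect to the model parameters never flows through `e_step`, so in a TensorFlow graph those E-step ops would need no gradient computation at all.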
Thoughts?