Simplify Embedding #2084
Conversation
(m::Embedding)(x::AbstractVector{Bool}) = m.weight * x  # usually OneHotVector
(m::Embedding)(x::AbstractMatrix{Bool}) = m.weight * x  # usually OneHotMatrix
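For context, a quick sketch of what these methods do with a one-hot input; the sizes here are illustrative, not from the PR:

using Flux

m = Flux.Embedding(3 => 4)   # vocabulary of 3, embedding dimension 4
x = Flux.onehot(2, 1:3)      # OneHotVector, an AbstractVector{Bool}

# Multiplying by a one-hot vector selects the matching column of the weight matrix:
m(x) == m.weight * x == m.weight[:, 2]   # true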
These could instead call Flux.onecold. The result will differ on e.g. [true, true, false]; not sure we care too much either way?
For performance in the one-hot case? If it's onecold-compatible, then folks should use OneHotArray for performance. At least with *, we do the mathematically expected operation.
For OneHotArray these should be identical, right? Result and performance.

For a one-hot BitArray, the results will agree. I would guess that onecold is faster, but haven't checked.

For a generic BitArray, I'm not sure which is mathematically expected, really. I think you're saying that * is.
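That guess could be checked with something along these lines (an untested sketch using BenchmarkTools; sizes are arbitrary):

using Flux, BenchmarkTools

m = Flux.Embedding(1000 => 128)
w, x = m.weight, Flux.onehot(7, 1:1000)

@btime $w * $x                    # the * path kept by this PR
@btime $w[:, Flux.onecold($x)]    # the onecold alternative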
Yes, what you wrote is what I meant re: performance. I was adding that in the one-hot bit array case, we can direct people to OneHotArray if their concern is performance.

Yeah, whenever I've come across this type of operation in papers, I see it written as *. There's an implicit assumption that x is one-hot, so maybe onecold could be better here if it were made to error for [true, true, false], etc. But I think silently choosing the first "hot" index is wrong.
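Concretely, the divergence on a multi-hot input would look like this (illustrative sketch):

using Flux

m = Flux.Embedding(3 => 4)
x = [true, true, false]          # not one-hot: two hot entries

m.weight * x                     # sums columns 1 and 2, mixing two embedding vectors
m.weight[:, Flux.onecold(x)]     # column 1 only: onecold silently takes the first hot index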
Yes. Mixing two embedding vectors seems less wrong. But probably nobody ever hits this & it's just a way to decouple from OneHotArray types. I don't think we should document that boolean indexing is an option.
So I think we are happy with the current implementation in the PR?
Yes, I think so.

I see we had a very similar discussion in #1656 (comment), BTW, I forgot... but same conclusion.
…t without 5 named variables, and show that the point of onehot is variables which aren't 1:n already. Also show result of higher-rank input.
Looks ready to me. Is there more you wanted to do here?
The "more" is #2088, really. Will merge when green. |
Embedding has some special code for OneHotMatrix which (1) will break with the latest changes, and (2) doesn't allow higher-rank arrays the way that "index" input does. So this PR simplifies the code and adds a reshape.
I did this after forgetting that #1656 exists, so there is some overlap. This PR does not attempt to fix outputsize; some other changes there have already happened elsewhere.
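A minimal sketch of what the simplified forward pass might look like, assuming the PR's actual code differs in details (the gather call mirrors Flux's existing use of NNlib):

(m::Embedding)(x::AbstractVector{Bool}) = m.weight * x   # usually OneHotVector
(m::Embedding)(x::AbstractMatrix{Bool}) = m.weight * x   # usually OneHotMatrix

(m::Embedding)(x::Integer) = m.weight[:, x]
(m::Embedding)(x::AbstractVector{<:Integer}) = NNlib.gather(m.weight, x)

# Higher-rank index arrays: embed the flattened indices, then restore the shape,
# so that e.g. m(rand(1:3, 4, 5)) has size (out, 4, 5).
(m::Embedding)(x::AbstractArray{<:Integer}) = reshape(m(vec(x)), :, size(x)...)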