Replies: 5 comments
-
Hi @jacobgorm,

```python
_dims = tuple(range(per_sample_probs.dim() - 2))
avg_prob = per_sample_probs.mean(dim=_dims)
```

This snippet should do the trick, even if it is not beautiful :)
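For concreteness, a minimal check of the shapes involved, assuming `per_sample_probs` has layout (batch, seq, num_codebooks, codebook_size); the exact layout depends on your code:

```python
import torch

# Assumed layout: (batch, seq, num_codebooks, codebook_size).
per_sample_probs = torch.softmax(torch.randn(8, 64, 2, 512), dim=-1)

_dims = tuple(range(per_sample_probs.dim() - 2))  # here: (0, 1)
avg_prob = per_sample_probs.mean(dim=_dims)       # (num_codebooks, codebook_size)
assert avg_prob.shape == (2, 512)
```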
-
Thanks a lot! However, I am also having trouble converting the distance calculation, among other things. Is it equivalent to this formula, used in older versions of this repo?

[formula image]
-
If you are talking about this line for the computation of the loss: regardless, I think the formula you wrote looks good, except that it should be the transpose of the codebook here. Play with toy examples of different shapes, and you will find out whether it works!
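To illustrate why the transpose shows up, here is a sketch (not the repo's exact code), assuming `x` of shape (..., d) and a {-1, +1} codebook of shape (K, d). Since every codebook entry has the same norm, minimizing the squared distance is the same as maximizing a dot product:

```python
import torch

d = 4
K = 2 ** d
# All 2^d sign patterns as a (K, d) codebook with entries in {-1, +1}.
codebook = torch.tensor(
    [[1.0 if (i >> j) & 1 else -1.0 for j in range(d)] for i in range(K)]
)
x = torch.randn(3, d)

# ||x - c||^2 = ||x||^2 + ||c||^2 - 2 x.c, and ||c||^2 = d for every entry,
# so minimizing the distance means maximizing the dot product with codebook.T.
logits = 2 * x @ codebook.t()     # (3, K); usable as logits for the entropy loss
indices = logits.argmax(dim=-1)   # index of the nearest codebook entry
```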
-
Here is how far I got:

[code snippet]

Notice that I had to use F.relu() to prevent occasional negative entropy-loss values; I am not sure whether that is a sign that something is not right.
-
This looks good! A negative loss is not an issue; it is expected for the loss to sometimes be negative, since it is the difference between two terms.
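A toy sketch of that difference (hypothetical shapes; `probs` stands in for the per-sample code probabilities):

```python
import torch

def entropy(p, eps=1e-9):
    # Shannon entropy along the last dimension.
    return -(p * (p + eps).log()).sum(dim=-1)

# Hypothetical per-sample code probabilities, shape (N, K).
probs = torch.softmax(torch.randn(1024, 16), dim=-1)

per_sample_entropy = entropy(probs).mean()   # minimized: confident assignments
avg_entropy = entropy(probs.mean(dim=0))     # maximized: diverse code usage

# A difference of two non-negative terms, so it can legitimately go negative;
# clamping it with F.relu() would silently zero the gradient in that regime.
loss = per_sample_entropy - avg_entropy
```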
-
Hi,

I am trying to use LFQ in an existing codebase that has traditionally not used einops for anything. I am using the lucidrains LFQ code as a starting point, but I am starting from scratch because I find it has too many knobs, which makes the code hard to reason about. I also frankly don't understand einops well enough to get the entropy loss working in my implementation, and I don't like adding external dependencies beyond PyTorch, or using copy-pasted code that I don't completely understand and won't be able to read half a year from now. So I was wondering if anyone could help me write a super-simple LFQ without any einops.
I've noticed that PyTorch already has the Categorical distribution (https://pytorch.org/docs/stable/distributions.html), which appears to let me calculate the entropy, and the mean should be easy. So I am just looking for code that gets a typical image tensor (b, c, h, w) into the right shape for reproducing the entropy loss as per the MAGVIT-2 paper.
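Roughly, this is the kind of thing I am imagining (a sketch with made-up shapes, not verified against the paper; `logits` stands in for whatever produces the per-code scores):

```python
import torch
from torch.distributions import Categorical

# Made-up example: per-pixel code logits in image layout (b, c, h, w),
# where c is the codebook size.
b, c, h, w = 2, 16, 8, 8
logits = torch.randn(b, c, h, w)

# Put the codebook dimension last and flatten everything else, so each row
# is one categorical distribution over the codebook.
flat = logits.permute(0, 2, 3, 1).reshape(-1, c)      # (b*h*w, c)

dist = Categorical(logits=flat)
per_sample_entropy = dist.entropy().mean()            # mean H(q(z|x))

avg_probs = dist.probs.mean(dim=0)                    # average code usage, (c,)
avg_entropy = Categorical(probs=avg_probs).entropy()  # H of the average usage

entropy_loss = per_sample_entropy - avg_entropy
```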
Here is the code that I currently have:

[code snippet]

Any help would be much appreciated.