
Problem with makemore_part2_mlp.ipynb #50

Open
dewijones92 opened this issue May 7, 2024 · 1 comment · May be fixed by #51

Comments

@dewijones92

```python
loss = -prob[torch.arange(32), Y].log().mean()
loss
```

```
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[20], line 1
----> 1 loss = -prob[torch.arange(32), Y].log().mean()
      2 loss

IndexError: shape mismatch: indexing tensors could not be broadcast together with shapes [32], [228146]
```
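A minimal sketch reproducing the error outside the notebook (shapes and random data chosen purely for illustration; `ix` here stands in for the notebook's minibatch indices):

```python
import torch

torch.manual_seed(0)
num_samples, num_classes, batch_size = 228146, 27, 32

# Full label tensor for the whole dataset, as in the notebook.
Y = torch.randint(0, num_classes, (num_samples,))

# Probabilities for one minibatch only: shape (batch_size, num_classes).
prob = torch.softmax(torch.randn(batch_size, num_classes), dim=1)

# Indexing with torch.arange(32) (shape [32]) alongside the full Y
# (shape [228146]) cannot broadcast, which raises the IndexError above.
try:
    bad_loss = -prob[torch.arange(32), Y].log().mean()
except IndexError as e:
    print(type(e).__name__)  # IndexError

# Indexing with the minibatch labels Y[ix] aligns the two index tensors.
ix = torch.randint(0, num_samples, (batch_size,))
loss = -prob[torch.arange(prob.shape[0]), Y[ix]].log().mean()
print(loss.shape)  # torch.Size([]) — a scalar loss
```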
@dewijones92 (Author)

btw, I have a fix
I will make a PR now

thanks so much for this AWESOME video series @karpathy :)

dewijones92 added a commit to dewijones92/nn-zero-to-hero that referenced this issue May 7, 2024
The loss calculation in the code was causing a shape mismatch error due to inconsistent tensor shapes. The error occurred because the entire `Y` tensor was being used to index the `prob` tensor, which had a different shape.

The original line of code:

`loss = -prob[torch.arange(32), Y].log().mean()`

was causing the issue because:

1. `torch.arange(32)` creates a tensor of indices from 0 to 31, assuming a fixed batch size of 32. However, the actual batch size might differ.
2. `Y` refers to the entire label tensor, which has a shape of (num_samples,), where num_samples is the total number of samples in the dataset.

Using the entire `Y` tensor to index `prob` resulted in a shape mismatch because `prob` has a shape of (batch_size, num_classes), where batch_size is the number of samples in the current minibatch and num_classes is the number of possible output classes.

To fix this issue, the line was modified to:

`loss = -prob[torch.arange(prob.shape[0]), Y[ix]].log().mean()`

The changes made:

1. `torch.arange(prob.shape[0])` creates a tensor of indices from 0 to batch_size-1, dynamically adapting to the actual batch size of `prob`.
2. `Y[ix]` retrieves the labels corresponding to the current minibatch indices `ix`, ensuring that the labels align correctly with the predicted probabilities in `prob`.

By using `Y[ix]` instead of `Y`, the shapes of the indexing tensors match during the loss calculation, resolving the shape mismatch error. The model can now be trained and evaluated correctly on the given dataset.

These changes were necessary to ensure the correct calculation of the loss for each minibatch, enabling the model to learn from the appropriate labels and improve its performance.

Fixes karpathy#50
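As a sanity check on the corrected indexing (a sketch, not part of the commit; data is random and for illustration only), the manual pick-out-the-label-probability loss matches PyTorch's built-in `F.cross_entropy` applied to the logits:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(42)
batch_size, num_classes = 32, 27
logits = torch.randn(batch_size, num_classes)
labels = torch.randint(0, num_classes, (batch_size,))

# Manual negative log-likelihood, as in the fixed notebook line
# (labels here plays the role of Y[ix]).
prob = torch.softmax(logits, dim=1)
manual = -prob[torch.arange(prob.shape[0]), labels].log().mean()

# Built-in cross-entropy computed directly from the logits.
builtin = F.cross_entropy(logits, labels)

print(torch.allclose(manual, builtin, atol=1e-6))  # True
```

In practice `F.cross_entropy` is also the numerically safer choice, since it works in log space and avoids the explicit `softmax` followed by `log`.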
@dewijones92 dewijones92 linked a pull request May 7, 2024 that will close this issue