
Book 2 (load/save) #259

Merged: 3 commits into main from book_2, Aug 1, 2023

Conversation

@Narsil (Collaborator) commented Jul 27, 2023

Depends on #258. Drafted until that is merged.

@Narsil marked this pull request as draft July 27, 2023 13:40
use candle::{DType, Device, Result, Tensor};

struct Model {
    first: Tensor,
}
Collaborator:
Maybe it makes more sense to use candle_nn::Linear here?
Or maybe we really want to go back to basics, in which case we should mention that Linear exists, but say that for the sake of this tutorial we won't use it and will work with explicit tensors instead.

Collaborator (author):

Great idea. I was wondering how to make this example more complex without going into too much detail, to keep it hello-world-like.

I will keep this example simple, then expand by building your own Linear, and finally refer to candle_nn for layers in general.

For Conv1d and Linear, for instance. Wdyt?
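A hand-rolled Linear, as suggested above, also makes the later "no magic" point: a linear layer is just a matrix multiply plus a bias. Here is a dependency-free sketch in plain Rust; the struct, field names, and shapes are illustrative assumptions, not the book's candle code (which would operate on Tensor with something like matmul plus a broadcast add).

```rust
/// Hypothetical minimal linear layer over plain slices, for illustration only.
struct Linear {
    weight: Vec<Vec<f32>>, // shape: [out_features][in_features]
    bias: Vec<f32>,        // shape: [out_features]
}

impl Linear {
    /// Forward pass for a single input vector x of length in_features:
    /// y[o] = sum_i weight[o][i] * x[i] + bias[o]
    fn forward(&self, x: &[f32]) -> Vec<f32> {
        self.weight
            .iter()
            .zip(&self.bias)
            .map(|(row, b)| row.iter().zip(x).map(|(w, xi)| w * xi).sum::<f32>() + b)
            .collect()
    }
}

fn main() {
    // 2 inputs -> 3 outputs, with simple weights for easy checking.
    let layer = Linear {
        weight: vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![1.0, 1.0]],
        bias: vec![0.5, 0.5, 0.5],
    };
    let y = layer.forward(&[2.0, 3.0]);
    println!("{:?}", y); // [2.5, 3.5, 5.5]
}
```

In the book itself, the same idea would be expressed with candle Tensors, with candle_nn::Linear only introduced afterwards.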

Collaborator (author):

BTW, this comment is more about #258. (I converted this PR to draft because it is more about the save/load surface, to make the simple cheatsheet more complete on that front.)

Collaborator:

Sounds good. You could even leave out backprop/gradient descent for a tensor 101, though maybe that's too simplistic. But it's certainly good to only introduce candle_nn later in the process (so that users can understand that there is no magic under the hood there).

Collaborator (author):

Yup, I want to keep the training loop in a different place.

IMO, after "hello world" plus "there's no magic", you have several options:

  • Run a real model (90% of people)
  • Train a model (10% of people)

It's an old stat; we're probably closer to 99% vs 1% now that AI is much more mainstream (but there are still a lot of fine-tuners out there, mostly using scripts already written by other people).

@Narsil changed the title from "Adding new surface for savetensors (global load, global save)." to "Adding new surface for safetensors (global load, global save)." Jul 27, 2023
@Narsil marked this pull request as ready for review August 1, 2023 13:00
@Narsil changed the title from "Adding new surface for safetensors (global load, global save)." to "Book 2 (load/save)" Aug 1, 2023
@Narsil merged commit babee9f into main Aug 1, 2023
10 checks passed
@Narsil deleted the book_2 branch August 1, 2023 15:27