
Speed up show for gpu models #2066

Open · wants to merge 1 commit into master
Conversation

mcabbott
Member

The present show methods have about a 20s startup delay when the model is on the GPU. This comes from checks like any(isnan, x), which print friendly warnings. Perhaps we should remove them to save startup time?
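The kind of check in question looks roughly like this (a sketch, not the exact Flux source; `_parameter_note` is a hypothetical name):

```julia
# Sketch of the kind of check in Flux's show code (names are illustrative).
# Scanning every parameter array with `any` is what triggers GPU kernel
# compilation the first time a model is displayed:
function _parameter_note(x::AbstractArray)
    if any(isnan, x)        # compiles a GPU kernel when x is a CuArray
        return "(some NaN)"
    elseif any(isinf, x)
        return "(some Inf)"
    else
        return ""
    end
end
```

On the CPU this is cheap; on the GPU each of these reductions has to compile a kernel on first use, which is where the ~20s goes.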

This PR replaces them with an "(on GPU)" annotation, like so:

julia> m
Chain(
  Dense(2 => 3, σ),                     # 9 parameters
  Dense(3 => 2),                        # 8 parameters  (some NaN)
  NNlib.softmax,
)                   # Total: 4 arrays, 17 parameters, 324 bytes.

julia> m |> gpu
Chain(
  Dense(2 => 3, σ),                     # 9 parameters (on GPU)
  Dense(3 => 2),                        # 8 parameters (on GPU)
  NNlib.softmax,
)                   # Total: 4 arrays, 17 parameters, 576 bytes.

Maybe that ends up quite noisy for bigger models. For this reason I didn't make it some bright colour.

@codecov-commenter

Codecov Report

Base: 86.26% // Head: 86.20% // Decreases project coverage by -0.06% ⚠️

Coverage data is based on head (d39838b) compared to base (dffaef0).
Patch coverage: 66.66% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2066      +/-   ##
==========================================
- Coverage   86.26%   86.20%   -0.06%     
==========================================
  Files          18       18              
  Lines        1361     1363       +2     
==========================================
+ Hits         1174     1175       +1     
- Misses        187      188       +1     
Impacted Files         Coverage Δ
src/layers/show.jl     70.23% <66.66%> (-0.50%) ⬇️


@darsnack
Member

I think it might be confusing for users. Someone used to show on the CPU might interpret the absence of NaN messages as meaning there are no NaNs.

Is it possible to make this method more "official" with knobs to turn on/off the details? Then the default show calls this with the details turned off.
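One way such a knob could look, sketched here with hypothetical names (`show_model` and its `details` keyword are not existing Flux API):

```julia
# Hypothetical sketch of the suggestion: an explicit entry point with a
# switch for the expensive checks, which the default `show` would call
# with the checks turned off.
function show_model(io::IO, m; details::Bool=false)
    for layer in m
        # The NaN/Inf scan over parameter arrays only runs on request,
        # so plain display never pays the GPU compilation cost:
        note = details ? _nan_note(layer) : ""
        println(io, "  ", layer, "  ", note)
    end
end

_nan_note(layer) =
    any(p -> any(isnan, p), Flux.params(layer)) ? "(some NaN)" : ""
```

With that split, users who want the diagnostics could call `show_model(stdout, m; details=true)` explicitly.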

@mcabbott
Member Author

Yes this is my concern too.

I wonder a bit how useful these things really are; how often do people actually get NaN? Maybe typically through the loss becoming infinite, etc.?

Can we pre-compile GPU stuff? It just seems a little crazy to wait 20s for the printing code to warm up...

@ToucheSir
Member

> Can we pre-compile GPU stuff?

We could definitely try using SnoopPrecompile. Based on JuliaLang/julia#46296 and JuliaLang/julia#46373 though, that may have to wait until the next 1.7 and 1.8 point releases.
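A SnoopPrecompile workload for the show path would look roughly like this (a sketch; whether GPU kernels can actually be cached this way is exactly the open question in the linked julia issues):

```julia
# Sketch of a SnoopPrecompile workload run at Flux precompile time.
# This only warms up the CPU show path; the GPU kernels compiled by
# any(isnan, x) on CuArrays are the part that may not be cacheable yet.
using SnoopPrecompile
using Flux

@precompile_all_calls begin
    m = Chain(Dense(2 => 3, relu), Dense(3 => 2))
    sprint(show, MIME"text/plain"(), m)   # exercise the show methods
end
```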
