CUDA scalar operations error on GPU in show functions #3770

Open
ali-ramadhan opened this issue Sep 9, 2024 · 1 comment
Labels
GPU 👾 Where Oceananigans gets its powers from user interface/experience 💻

Comments

@ali-ramadhan
Member

I feel like this impacts usability, especially interactive use, but it's not a bug in the sense that your simulation scripts are fine. Is it worth trying to fix this?

Maybe not. After all, executing model.velocities.u.data in the REPL with a GPU model produces a similar error to the one below.

One nuclear option is to allow scalar operations in show methods, but in this particular example the value is a SubArray{OffsetVector{CuArray}}, which is not a type we own, so we can't add a show method for it without type piracy.
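
As a user-side stopgap (not a fix inside Oceananigans), the scalar-indexing opt-in can be applied around a single display call. This is a minimal sketch, assuming a CUDA-capable GPU and assuming that `CUDA.@allowscalar` covers the element reads made by Base's array printing. It uses the same grid as the MWE; the variable name `z` is illustrative.

```julia
using Oceananigans
using CUDA

grid = RectilinearGrid(GPU(), size=(12, 12, 12), x=(0, 1), y=(0, 1), z=k->k)
z = znodes(grid, Center(), Center(), Center())

# Permit slow element-by-element reads from the GPU for this one call only.
CUDA.@allowscalar display(z)
```

Each element is fetched from the device individually, so this is reasonable for a 12-element vector but would be slow for large arrays.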


MWE:

using Oceananigans

grid = RectilinearGrid(GPU(), size=(12, 12, 12), x=(0, 1), y=(0, 1), z=k->k)

znodes(grid, Center(), Center(), Center())

produces this error when the REPL tries to show the result:

12-element view(OffsetArray(::CuArray{Float64, 1, CUDA.DeviceMemory}, -2:15), 1:12) with eltype Float64:
Error showing value of type SubArray{Float64, 1, OffsetArrays.OffsetVector{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}}, Tuple{UnitRange{Int64}}, true}:
ERROR: Scalar indexing is disallowed.
Invocation of getindex resulted in scalar indexing of a GPU array.
This is typically caused by calling an iterating implementation of a method.
Such implementations *do not* execute on the GPU, but very slowly on the CPU,
and therefore should be avoided.

If you want to allow scalar iteration, use `allowscalar` or `@allowscalar`
to enable scalar iteration globally or for the operations in question.
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] errorscalar(op::String)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:155
  [3] _assertscalar(op::String, behavior::GPUArraysCore.ScalarIndexing)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:128
  [4] assertscalar(op::String)
    @ GPUArraysCore ~/.julia/packages/GPUArraysCore/GMsgk/src/GPUArraysCore.jl:116
  [5] getindex
    @ ~/.julia/packages/GPUArrays/qt4ax/src/host/indexing.jl:50 [inlined]
  [6] getindex
    @ ~/.julia/packages/OffsetArrays/hwmnB/src/OffsetArrays.jl:438 [inlined]
  [7] isassigned(A::OffsetArrays.OffsetVector{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}}, i::Int64)
    @ Base ./multidimensional.jl:1587
  [8] isassigned
    @ ./subarray.jl:386 [inlined]
  [9] isassigned(::SubArray{Float64, 1, OffsetArrays.OffsetVector{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}}, Tuple{UnitRange{Int64}}, true}, ::Int64, ::Int64)
    @ Base ./multidimensional.jl:1582
 [10] alignment(io::IOContext{Base.TTY}, X::AbstractVecOrMat, rows::Vector{Int64}, cols::Vector{Int64}, cols_if_complete::Int64, cols_otherwise::Int64, sep::Int64, ncols::Int64)
    @ Base ./arrayshow.jl:68
 [11] _print_matrix(io::IOContext{Base.TTY}, X::AbstractVecOrMat, pre::String, sep::String, post::String, hdots::String, vdots::String, ddots::String, hmod::Int64, vmod::Int64, rowsA::UnitRange{Int64}, colsA::UnitRange{Int64})
    @ Base ./arrayshow.jl:207
 [12] print_matrix(io::IOContext{Base.TTY}, X::SubArray{Float64, 1, OffsetArrays.OffsetVector{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}}, Tuple{UnitRange{Int64}}, true}, pre::String, sep::String, post::String, hdots::String, vdots::String, ddots::String, hmod::Int64, vmod::Int64)
    @ Base ./arrayshow.jl:171
 [13] print_matrix
    @ ./arrayshow.jl:171 [inlined]
 [14] print_array
    @ ./arrayshow.jl:358 [inlined]
 [15] show(io::IOContext{Base.TTY}, ::MIME{Symbol("text/plain")}, X::SubArray{Float64, 1, OffsetArrays.OffsetVector{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}}, Tuple{UnitRange{Int64}}, true})
    @ Base ./arrayshow.jl:399
 [16] (::REPL.var"#55#56"{REPL.REPLDisplay{REPL.LineEditREPL}, MIME{Symbol("text/plain")}, Base.RefValue{Any}})(io::Any)
    @ REPL ~/.julia/juliaup/julia-1.10.5+0.x64.linux.gnu/share/julia/stdlib/v1.10/REPL/src/REPL.jl:273
 [17] with_repl_linfo(f::Any, repl::REPL.LineEditREPL)
    @ REPL ~/.julia/juliaup/julia-1.10.5+0.x64.linux.gnu/share/julia/stdlib/v1.10/REPL/src/REPL.jl:569
 [18] display(d::REPL.REPLDisplay, mime::MIME{Symbol("text/plain")}, x::Any)
    @ REPL ~/.julia/juliaup/julia-1.10.5+0.x64.linux.gnu/share/julia/stdlib/v1.10/REPL/src/REPL.jl:259
 [19] display
    @ ~/.julia/juliaup/julia-1.10.5+0.x64.linux.gnu/share/julia/stdlib/v1.10/REPL/src/REPL.jl:278 [inlined]
 [20] display(x::Any)
    @ Base.Multimedia ./multimedia.jl:340
 [21] #invokelatest#2
    @ ./essentials.jl:892 [inlined]
 [22] invokelatest
    @ ./essentials.jl:889 [inlined]
 [23] print_response(errio::IO, response::Any, show_value::Bool, have_color::Bool, specialdisplay::Union{Nothing, AbstractDisplay})
    @ REPL ~/.julia/juliaup/julia-1.10.5+0.x64.linux.gnu/share/julia/stdlib/v1.10/REPL/src/REPL.jl:315
 [24] (::REPL.var"#57#58"{REPL.LineEditREPL, Pair{Any, Bool}, Bool, Bool})(io::Any)
    @ REPL ~/.julia/juliaup/julia-1.10.5+0.x64.linux.gnu/share/julia/stdlib/v1.10/REPL/src/REPL.jl:284
 [25] with_repl_linfo(f::Any, repl::REPL.LineEditREPL)
    @ REPL ~/.julia/juliaup/julia-1.10.5+0.x64.linux.gnu/share/julia/stdlib/v1.10/REPL/src/REPL.jl:569
 [26] print_response(repl::REPL.AbstractREPL, response::Any, show_value::Bool, have_color::Bool)
    @ REPL ~/.julia/juliaup/julia-1.10.5+0.x64.linux.gnu/share/julia/stdlib/v1.10/REPL/src/REPL.jl:282
 [27] (::REPL.var"#do_respond#80"{Bool, Bool, REPL.var"#93#103"{REPL.LineEditREPL, REPL.REPLHistoryProvider}, REPL.LineEditREPL, REPL.LineEdit.Prompt})(s::REPL.LineEdit.MIState, buf::Any, ok::Bool)
    @ REPL ~/.julia/juliaup/julia-1.10.5+0.x64.linux.gnu/share/julia/stdlib/v1.10/REPL/src/REPL.jl:911
 [28] #invokelatest#2
    @ ./essentials.jl:892 [inlined]
 [29] invokelatest
    @ ./essentials.jl:889 [inlined]
 [30] run_interface(terminal::REPL.Terminals.TextTerminal, m::REPL.LineEdit.ModalInterface, s::REPL.LineEdit.MIState)
    @ REPL.LineEdit ~/.julia/juliaup/julia-1.10.5+0.x64.linux.gnu/share/julia/stdlib/v1.10/REPL/src/LineEdit.jl:2656
 [31] run_frontend(repl::REPL.LineEditREPL, backend::REPL.REPLBackendRef)
    @ REPL ~/.julia/juliaup/julia-1.10.5+0.x64.linux.gnu/share/julia/stdlib/v1.10/REPL/src/REPL.jl:1312
 [32] (::REPL.var"#62#68"{REPL.LineEditREPL, REPL.REPLBackendRef})()
    @ REPL ~/.julia/juliaup/julia-1.10.5+0.x64.linux.gnu/share/julia/stdlib/v1.10/REPL/src/REPL.jl:386

This doesn't happen for regularly spaced dimensions:

julia> xnodes(grid, Center(), Center(), Center())
12-element view(OffsetArray(::StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, -2:15), 1:12) with eltype Float64:
 0.041666666666666664
 0.125
 0.20833333333333334
 0.2916666666666667
 0.375
 0.4583333333333333
 0.5416666666666666
 0.625
 0.7083333333333334
 0.7916666666666666
 0.875
 0.9583333333333334

because the nodes are then backed by a range (a StepRangeLen), so there is no data on the GPU to show:

julia> xnodes(grid, Center(), Center(), Center()) |> typeof
SubArray{Float64, 1, OffsetArrays.OffsetVector{Float64, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}}, Tuple{UnitRange{Int64}}, true}

julia> znodes(grid, Center(), Center(), Center()) |> typeof
SubArray{Float64, 1, OffsetArrays.OffsetVector{Float64, CuArray{Float64, 1, CUDA.DeviceMemory}}, Tuple{UnitRange{Int64}}, true}
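
The difference between the two types is the innermost storage behind the wrappers. A small CPU-only sketch of how one might check that, assuming OffsetArrays is installed; `innermost` is a hypothetical helper, not part of Oceananigans or CUDA, and a plain Vector stands in for the CuArray here.

```julia
using OffsetArrays

# Peel off wrapper layers until `parent` returns the array itself.
innermost(A::AbstractArray) = parent(A) === A ? A : innermost(parent(A))

# Same wrapper structure as the node arrays above: a view into an OffsetArray.
x = view(OffsetArray(range(0.0, 1.0, length=18), -2:15), 1:12)   # backed by a range
z = view(OffsetArray(collect(1.0:18.0), -2:15), 1:12)            # backed by stored data

typeof(innermost(x))   # a StepRangeLen: values are computed on the host
typeof(innermost(z))   # a Vector here; on a GPU grid this would be the CuArray
```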
@ali-ramadhan ali-ramadhan added GPU 👾 Where Oceananigans gets its powers from user interface/experience 💻 labels Sep 9, 2024
@glwagner
Member

Technically we cannot fix this problem here as long as the nodes are a view into a CuArray, since we don't own any of those types. We might have to submit a PR to CUDA for this? I believe CUDA supports a CuArray wrapped in one level of nesting, but not two.

Another solution is to re-form the nodes so that they are a plain view. This should work; it's merely laziness / convenience that we make a view into an OffsetArray rather than extracting a view directly from the parent (which would require computing interior indices).
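
A minimal CPU-only sketch of the plain-view idea, with an ordinary Vector standing in for the CuArray. The halo size `H = 3` and interior length `N = 12` are read off the `-2:15` and `1:12` ranges in the error message; all names are illustrative and this is not how Oceananigans currently builds the nodes.

```julia
using OffsetArrays

H, N = 3, 12                                   # halo size and interior length (assumed)
parent_data = collect(1.0:(N + 2H))            # stand-in for the CuArray of nodes
offset = OffsetArray(parent_data, -H+1:N+H)    # indices run over -2:15

# Current approach: a view into the OffsetArray (two wrapper layers).
nested = view(offset, 1:N)

# Plain view: translate interior indices 1:N into parent indices first.
interior = (1:N) .- first(axes(offset, 1)) .+ 1   # 4:15 for H = 3
plain = view(parent(offset), interior)

nested == plain   # same values, one wrapper layer fewer
```

With a CuArray as the parent, this would leave at most one wrapper layer around the GPU data, which is the case the comment above expects CUDA to handle.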

Another solution would be to use Field for the nodes, or even an abstract operation that computes them on the fly. This would allow the nodes to participate in abstract operations. Now that we have a Makie recipe too, one argument for not doing this (inconvenient plotting) may be eliminated. There might be other inconveniences though.
