
WIP: redesign tensormap structure #140

Open · wants to merge 20 commits into master

Conversation

@Jutho (Owner) commented Jul 29, 2024

Plan:

  • New fusion tree iterator that accepts sector iterators for the outgoing sectors and uses a new ordering, which sorts fusion trees such that the leftmost values change the quickest in the list: outgoing[1], outgoing[2], vertex[1], inner[1], outgoing[3], vertex[2], inner[2], …
  • Encode all structure information at the level of the HomSpace and experiment with different strategies for caching this information
  • Store all the tensor data in one sequential DenseVector (a rough sketch of this layout follows below)
  • (optional): Experiment whether it is beneficial to use a Memory object in Julia 1.11 and beyond.
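
A rough sketch of what the second and third points could amount to; the field names and type parameters here are assumptions for illustration, not the final design:

# Purely illustrative: the HomSpace carries all structural information, and the
# numerical data is one contiguous vector; the (fusion tree pair) subblocks become
# views into `data` at offsets computed (and possibly cached) from the HomSpace.
struct TensorMap{T,S,A<:DenseVector{T}}
    data::A              # one sequential data vector
    space::HomSpace{S}   # encodes all structure information
end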

codecov bot commented Jul 30, 2024

Codecov Report

Attention: Patch coverage is 85.89909% with 109 lines in your changes missing coverage. Please review.

Project coverage is 69.96%. Comparing base (419f88c) to head (ce25bf2).
Report is 2 commits behind head on master.

Files with missing lines            Patch %   Lines
src/tensors/tensor.jl               77.39%    26 Missing ⚠️
src/tensors/truncation.jl           63.63%    24 Missing ⚠️
src/tensors/abstracttensor.jl       62.50%    15 Missing ⚠️
src/spaces/homspace.jl              86.25%    11 Missing ⚠️
src/tensors/linalg.jl               90.62%     9 Missing ⚠️
src/fusiontrees/iterator.jl         94.44%     8 Missing ⚠️
src/tensors/tensoroperations.jl     82.14%     5 Missing ⚠️
src/fusiontrees/manipulations.jl    80.95%     4 Missing ⚠️
src/tensors/vectorinterface.jl      81.81%     4 Missing ⚠️
src/fusiontrees/fusiontrees.jl      92.30%     1 Missing ⚠️
... and 2 more
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #140      +/-   ##
==========================================
- Coverage   79.72%   69.96%   -9.76%     
==========================================
  Files          42       40       -2     
  Lines        4961     4905      -56     
==========================================
- Hits         3955     3432     -523     
- Misses       1006     1473     +467     


@Jutho (Owner, Author) commented Sep 11, 2024

I've been continuing with this PR and will soon push an update. One set of questions that comes up is what we do with indexing, in particular with respect to TrivialTensorMap.

  • Defining TrivialTensorMap as a type alias is already less convenient, since the sectortype is no longer part of the parameter list of TensorMap. I am hesitant to make unions such as S<:Union{ComplexSpace,CartesianSpace,GeneralSpace} since that is not extensible, so I would just get rid of the type alias.
  • Indexing into TrivialTensorMap can use no arguments, t[] (returning the data as an N = N1+N2-dimensional tensor); t[i1,...,iN] (returning a scalar, equivalent to t[][i1,...,iN]); or t[(nothing,nothing)], where nothing is a substitute for the trivial fusion tree (and thus t[(nothing,nothing)] == t[]).

The question is whether we want to keep all of those. Without a type alias, they would start with sectortype(t) == Trivial || throw(error...).
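
As a minimal sketch of what the alias-free, guarded variant could look like (the method bodies and the use of SectorMismatch are assumptions, not the final design):

# Illustrative only: no TrivialTensorMap alias; every method guards on the sectortype.
function Base.getindex(t::TensorMap)
    sectortype(t) == Trivial || throw(SectorMismatch("`t[]` requires sectortype(t) == Trivial"))
    # reshape the flat data vector into an N = N₁ + N₂ dimensional array
    return reshape(t.data, (dims(codomain(t))..., dims(domain(t))...))
end

# scalar indexing, equivalent to t[][i₁, ..., iN]
Base.getindex(t::TensorMap, I::Vararg{Int}) = t[][I...]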

@Jutho (Owner, Author) commented Sep 11, 2024

Another question is what the interface needs to be to get the data vector of a TensorMap; do we just use t.data, or do we want an explicit interface for that? Before, the data was the list of blocks, and so blocks(t) and block(t,c) were the public interface to access the raw data. However, simple vector space methods, in particular everything from VectorInterface.jl, can now be rewritten to act directly on the data vector without needing any of the structure info. So acting on the data field is quite common.
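
To illustrate that last point, a minimal sketch (assuming a flat data::Vector field) of how such methods could act on the raw data directly:

using VectorInterface, TensorKit

# Illustrative only: these need no block structure at all.
function VectorInterface.scale!(t::TensorMap, α::Number)
    scale!(t.data, α)                   # in-place rescaling of the flat data vector
    return t
end

function VectorInterface.add!(ty::TensorMap, tx::TensorMap, α::Number, β::Number)
    space(ty) == space(tx) || throw(SpaceMismatch("spaces of both tensors must match"))
    add!(ty.data, tx.data, α, β)        # ty.data ← β * ty.data + α * tx.data
    return ty
end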

@lkdvos (Collaborator) commented Sep 11, 2024

I think my first reaction is to indeed just get rid of TrivialTensorMap, as having to special-case this was already not so convenient. I really don't like the getindex(::TrivialTensorMap, ::Nothing, ::Nothing) method, and I think that unifying the non-symmetric and symmetric tensors might also promote writing code that is not specific to the no-symmetry case.

This being said, I would definitely like to keep the indexing behavior for non-symmetric tensors, both with scalar getindex and setindex! methods, and even to consider allowing slices/views. It is still not too straightforward to fill up a symmetric tensor, so I don't think we want to get rid of this.
I have to say that I would even be in favor of having a getindex for symmetric tensors, which would act in a way similar to CUDA arrays, where you can do it but you have to either be in the REPL or explicitly declare that you want to do scalar indexing into a symmetric tensor, with warnings etc. for the performance pitfalls. For Abelian symmetries we could even consider allowing setindex!, with the same caveats?
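
Something in the spirit of CUDA.jl's scalar-indexing guard could look roughly like this (all names are hypothetical and purely illustrative):

# Illustrative sketch of an opt-in guard for scalar indexing into symmetric tensors.
const ALLOW_SCALAR_INDEXING = Ref(false)
allowscalar(flag::Bool=true) = (ALLOW_SCALAR_INDEXING[] = flag; nothing)

function assertscalar(op::String)
    if !(ALLOW_SCALAR_INDEXING[] || isinteractive())
        error("$op of a symmetric tensor is disallowed for performance reasons; " *
              "call allowscalar(true) to opt in explicitly")
    end
    return nothing
end

# scalar getindex/setindex! for symmetric tensors would then start with, e.g.,
# assertscalar("scalar getindex")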

I don't have a strong opinion on having an interface for accessing the data. From what I understand, the fields of a type are never really considered public, and for TensorMap this has also never really been the case: the exact way the data is stored is kept hidden as much as possible. Given that it is mostly VectorInterface that can use direct access to the field to speed up the implementation, I'm okay with that being "don't try this at home" direct field access 😁. I also don't think there is a lot of benefit to having this as part of the interface, and would really consider it an implementation detail.

@Jutho (Owner, Author) commented Sep 11, 2024

I am all in favor of getting rid of the (nothing, nothing) indexing.

For the nonsymmetric / trivially symmetric tensors, the identity

t[scalarindices...] == t[][scalarindices...]

implies that maybe the left one is not strictly necessary, as it is still accessible (in an even more flexible way that also allows slicing etc.) as long as t[] still works. However, maybe we rather want to deprecate t[] instead.

The reason that I would consider providing an interface to at least get the data is that unfortunately there are many external libraries that only work on AbstractArray or AbstractVector instances, e.g. DifferentialEquations.jl, several nonlinear optimisation libraries, … . At least with the new data structure, it will become easier to use these in combination with TensorMap instances, as it will be a matter of doing

some_external_function(f, t.data)

where f is a function that encodes an operation on tensors, but now needs to be written as

function f(data)
    # reconstruct the tensor from the flat data vector
    t = TensorMap(data, space)
    # act on t (operation_on is a placeholder for the actual tensor-level operation)
    tnew = operation_on(t)
    # return data in case the output is also a tensor (e.g. ODE right-hand side)
    return tnew.data
end
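
Purely as a usage sketch of that pattern with DifferentialEquations.jl (rhs_on_tensors is a hypothetical tensor-level right-hand side, and space and t0 are assumed given):

using OrdinaryDiffEq

# out-of-place ODE right-hand side acting on the flat data vector
function rhs(data, p, time)
    t = TensorMap(data, space)      # reconstruct the tensor
    tnew = rhs_on_tensors(t)        # hypothetical tensor-level right-hand side
    return tnew.data                # hand the flat data back to the solver
end

prob = ODEProblem(rhs, t0.data, (0.0, 1.0))
sol = solve(prob, Tsit5())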

We could even go one step further and make AbstractTensorMap{T} <: AbstractVector{T} with length(t) = dim(t). In that case, getindex and setindex! with a single index need to be supported and act directly on t.data.
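
A minimal sketch of what that would entail (again illustrative, not a proposal of the final interface):

# Illustrative: treating a TensorMap as a flat AbstractVector over its data.
Base.size(t::TensorMap) = (dim(t),)
Base.length(t::TensorMap) = dim(t)
Base.getindex(t::TensorMap, i::Int) = t.data[i]
Base.setindex!(t::TensorMap, v, i::Int) = (t.data[i] = v; t)
Base.IndexStyle(::Type{<:TensorMap}) = IndexLinear()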

So I think this requires some careful deliberation weighing the pros and cons of different approaches.

@lkdvos (Collaborator) commented Sep 11, 2024

I think for the question of t[][inds...] vs t[inds...], my preference goes to the latter. The main reasons are that it is a little more intuitive, it is a correct overload of Base.getindex, and it is more natural for people that are used to Array objects. Additionally, t[] necessarily has to work with views, which can easily become confusing, and while diagonalt[inds] = val could in principle be made to work, this becomes a lot more difficult with diagonalt[][inds] = val.
I have no real problem with having to copy over some of the AbstractArray implementations to make these convenience methods work.

I am not a big fan of AbstractTensorMap <: AbstractVector in the way you describe, because really that only applies to TensorMap. For AdjointTensorMap, this now has a very weird interplay with what getindex should do, as the data is non-trivially permuted. For DiagonalTensorMap you in principle no longer have dim(diagt) = length(diagt.data), and for BlockTensorMap it becomes a big hassle to make the indexing work as wanted, because it would now be an AbstractArray, but not in the intuitive way. I think it would create more problems than it would solve.

As a sidenote, we already have some functionality for the type of functions you describe through FiniteDifferences.to_vec:

t0 = input_tensor
v0, from_vec = to_vec(t0)
function my_wrapper_fun(v::Vector)
    t = from_vec(v)
    # [...] act on t to obtain `result`
    return to_vec(result)[1]
end

I also just realised that this allows a little more fine-grained control over the fact that our data does not have the inner product you would expect whenever non-abelian symmetries are involved, i.e. inner(x.data, y.data) != inner(x, y) in general. This is a really subtle problem that makes me even more hesitant to make the data part of the interface 🙃

@Jutho (Owner, Author) commented Sep 11, 2024

These are all good points; the AbstractTensorMap <: AbstractVector suggestion was just something I wanted to throw out there, not something I would myself be in big favor of. But I do think that it is important for further progress that we can easily interface with libraries that only accept array-like data, and I do think we should have a public interface for that. I am not too concerned about AdjointTensorMap and BlockTensorMap; in the cases where you want to use e.g. ODE solvers or optimisation methods, the tensors that you will act on will be normal TensorMaps, I assume. I don't think that situation is dramatically different from that of some of the AbstractArray wrappers or other custom types.

Regarding the inner product: in principle we could consider absorbing a factor of sqrt(dim(c)) in the data, and then taking it out where needed. Not sure how much overhead that would create and whether it would be compensated by its benefits.
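
To make the mismatch concrete, a small sketch using the existing block interface (the rescaling itself is only a hypothetical option):

using LinearAlgebra

# The proper inner product weights each block by the quantum dimension dim(c),
# which a naive dot product over the flat data vector does not do whenever
# non-abelian symmetries are involved:
true_inner(x, y) = sum(dim(c) * dot(block(x, c), block(y, c)) for c in blocksectors(x))

# Absorbing a factor sqrt(dim(c)) into every stored block would make
# dot(x.data, y.data) coincide with true_inner(x, y), at the cost of having to
# take the factor out again wherever the bare blocks are needed.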

d₂ = dim(dom, s₂)
r₂ = (inner_offset₂ + 1):(inner_offset₂ + d₂)
inner_offset₂ = last(r₂)
# TODO: # now we run the code below for every f₂; should we do this separately

Collaborator:

This is definitely an optimization I did in TensorTrack, it might be worth it to separate that out (although if it is cached it should also not matter too much)

Comment on lines 130 to 131
size(datac) == size(b) || throw(DimensionMismatch("wrong size of block for sector $c"))
copy!(b, datac)

Collaborator:

If we instead have setindex!(t::AbstractTensorMap, v, c::Sector) we could make the size check and haskey check automatic

Owner Author:

Ok, that's part of the indexing discussion then. So you would prefer to also do block access via getindex and setindex!? I was a bit hesitant about channeling all these different functionalities through the same interface.

Collaborator:

I think I would indeed prefer this, mostly for the setindex! part: I like that we can avoid handing the user a view to then use in-place, and the ability to re-use some of the bounds-checking functionality is also convenient. Additionally, I feel like it really is quite intuitive and a proper use of getindex/setindex!, so I cannot really think of a reason not to do this, as we also do it for the fusion trees.
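
A minimal sketch of what such a setindex!-based block assignment could look like (signature as discussed above; the checks and details are assumptions):

# Illustrative: assign the data of block c, with automatic sector and size checks.
function Base.setindex!(t::AbstractTensorMap, v::AbstractMatrix, c::Sector)
    c ∈ blocksectors(t) || throw(SectorMismatch("sector $c not present in $(space(t))"))
    b = block(t, c)     # view into the existing block for sector c
    size(v) == size(b) || throw(DimensionMismatch("wrong size of block for sector $c"))
    copy!(b, v)
    return t
end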

Comment on lines 606 to 607
T = promote_type(scalartype(TT₁), scalartype(TT₂))
A = promote_type(storagetype(TT₁), storagetype(TT₂))

Collaborator:

I am not sure if this can ever pose problems, but we could ensure compatibility between T and A by:

Suggested change:
- T = promote_type(scalartype(TT₁), scalartype(TT₂))
- A = promote_type(storagetype(TT₁), storagetype(TT₂))
+ A = promote_type(storagetype(TT₁), storagetype(TT₂))
+ T = scalartype(A)

Owner Author:

Neither of the two is probably a good idea:

julia> promote_type(Vector{Float64}, Vector{ComplexF32})
Vector (alias for Array{T, 1} where T)

Collaborator:

https://discourse.julialang.org/t/type-promototion-of-vectors/30227/6
https://github.com/JuliaLang/julia/blob/e52a46c5c192bfe16853ae0b63ac33b280fba063/base/range.jl#L1282

It seems like this is not really something that the Base promotion system is designed to resolve. From what I can find, the current behaviour is to only promote if at least one of the eltypes would not change, which is probably not what we want.

I would maybe suggest inserting our own promote_storagetype, which defaults to Vector{promote_scalartype} and we can then further specialise however we want? It seems like this is something that would be resolved with the new Memory types too, as we can then just hardcode that and only have to deal with the different addrspace implementations for GPU/CPU...
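
For concreteness, the suggested fallback could look something like this (the name promote_storagetype comes from the comment above; everything else is an assumption):

# Illustrative fallback: promote only the element types and keep Vector as the storage.
promote_storagetype(::Type{A₁}, ::Type{A₂}) where {A₁<:Vector,A₂<:Vector} =
    Vector{promote_type(eltype(A₁), eltype(A₂))}

# e.g. promote_storagetype(Vector{Float64}, Vector{ComplexF32}) == Vector{ComplexF64}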

Owner Author:

I see no logic in the response of StefanKarpinski to Matt's question. Maybe we can just use VectorInterface.promote_add(A1, A2)?

Collaborator:

That indeed seems to work, I think that should be fine. It does rely on type inference, but I guess we can see if it causes any issues and still change it if we have to?

@lkdvos (Collaborator) commented Sep 12, 2024

I wouldn't mind absorbing the normalization in the data, but I think in that case we should also consider changing the normalization of the fusion trees, etc., to the isotopic normalization (I think that would be equivalent?), which is definitely not something I look forward to having to do. On the other hand, it does seem like the isotopic normalization convention appears somewhat more often in our use cases (string-nets etc.), so maybe it might be worth it...

@Jutho (Owner, Author) commented Sep 12, 2024

Another question that comes up as I continue with this PR concerns the linear algebra factorisations and things like applying functions to 'square' TensorMaps. The current implementation works as follows:

  • apply operation to individual blocks
  • collect the output blocks (possibly of different output arguments, e.g. in the case of factorisations)
  • construct output tensormaps, thereby directly using the blocks as the actual data

With the new approach, you cannot construct the tensormap from the blocks in a data-sharing way, so you allocate new data and copy the blocks into it. This causes (a potentially great deal of) extra allocations.

Maybe this can be resolved by writing mutating versions of these operations in MatrixAlgebra that accept the output blocks as arguments? I don't know if there are better suggestions?

@lkdvos (Collaborator) commented Sep 13, 2024

For the MatrixAlgebra, indeed, I cannot really think of a different solution. However, considering that LinearAlgebra does not expose this for these functions, it might be that the overhead of the extra copy is really just negligible for those operations? In any case, I think the code looks a bit nicer too if we don't have to first create all the block data and only then construct the output tensor, which also separates out the allocation and implementation cleanly.

for c in blocksectors(A)
    eig!((block(V, c), block(D, c)), block(A, c)) 
end
