Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Port] Have added Wilson's algorithm and loop erased random walks #24

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

etiennedeg
Copy link
Member

this is a port of #1576

@codecov
Copy link

codecov bot commented Nov 7, 2021

Codecov Report

Merging #24 (e3b67ed) into master (6ab2160) will decrease coverage by 0.01%.
The diff coverage is 97.91%.

@@            Coverage Diff             @@
##           master      #24      +/-   ##
==========================================
- Coverage   99.45%   99.44%   -0.02%     
==========================================
  Files         106      107       +1     
  Lines        5554     5602      +48     
==========================================
+ Hits         5524     5571      +47     
- Misses         30       31       +1     

@jpfairbanks
Copy link
Member

@nassarhuda, are you familiar with Wilson's algorithm? Could you take a look here?

function wilson_rst end
@traitfn function wilson_rst(g::AG::(!IsDirected),
distmx::AbstractMatrix{T}=weights(g);
seed::Int=-1,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need a seed, when we can pass in a random generator?

Comment on lines +16 to +18
tree = wilson_rst(g)
probability_of_tree = prod([weights(g)[src(e), dst(e)] for e in tree])
probability_of_tree /= det(laplacian_matrix(g)[2:nv(g), 2:nv(g)]))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use a julia or jldoctest block here?

Comment on lines +31 to +32
start1 = rand(rng, 1:nv(g))
start2 = rand(rng, 1:nv(g))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
start1 = rand(rng, 1:nv(g))
start2 = rand(rng, 1:nv(g))
start1 = rand(rng, vertices(g))
start2 = rand(rng, vertices(g))

might make more sense, although at least for SimpleGraph, this will not be of type Int but of type eltype(g)

Comment on lines +141 to +142
seed::Int=-1,
rng::AbstractRNG=GLOBAL_RNG
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seem here, do we need seed?

f::Union{Set{V},Vector{V}}=Set{Integer}(),
seed::Int=-1,
rng::AbstractRNG=GLOBAL_RNG
)::Vector{Int} where {T <: Real, U, AG <: AbstractGraph{U}, V <: Integer}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remember ::SomeType after a function does actually force convert to be used on the result of that function.

@simonschoelly
Copy link
Member

@etiennedeg @gjh6 I think this is something useful to have. Given that this PR has been lying dormant for quite a while, there might be some things that you would do differently? Then maybe it is better, if you would refactor this before me or someone else continues with reviewing.

rng = getRNG(seed)
end

visited = Vector{Integer}(undef, 1)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vector should be typed with eltype(g) (passed by dynamic dispatch)

i += 1
end

length(f) == 0 || visited_view[cur_pos] in f || throw(ErrorException("termiating set was not reached"))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
length(f) == 0 || visited_view[cur_pos] in f || throw(ErrorException("termiating set was not reached"))
length(f) == 0 || visited_view[cur_pos] in f || throw(ErrorException("terminating set was not reached"))

break
end
nbrs = neighbors(g, cur)
length(nbrs) == 0 && break
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do this check only for the first vertex? Graph is undirected so if we move to an adjacent neighbor, this vertex must have at least one neighbor ?

end
nbrs = neighbors(g, cur)
length(nbrs) == 0 && break
wght = [distmx[cur, n] for n in nbrs]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
wght = [distmx[cur, n] for n in nbrs]
wght = @view distmx[cur, nbrs]

Maybe this is better?

Comment on lines +164 to +166
if v in visited_view
cur_pos = indexin(v, visited_view)[1]
visited_view = view(visited, 1:cur_pos)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if v in visited_view
cur_pos = indexin(v, visited_view)[1]
visited_view = view(visited, 1:cur_pos)
new_cur_pos = findfirst(v, visited_view)
if !isnothing(new_cur_pos)
cur_pos = new_cur_pos
visited_view = view(visited, 1:cur_pos)
end

end

visited = Vector{Integer}(undef, 1)
visited_view = view(visited, 1:1)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it necessary to carry a view? The only place where it is used is line 164-165 (and only in one line with my proposed change). The indexing of visited_view is the same as visited because it always start at the first index of visited, so in line 152, 156 and 178, we can replace visited_view by visited.

end

visited_vertices = Set(walk)
unvisited_vertices = setdiff(Set([i for i = 1:nv(g)]), visited_vertices)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is more efficient to just flag visited vertices (with a linear pass over walk) than to do complex set operations (which I think are not even linear...). We just maintain a BitVector is_visitedof size nv(g).
In loop_erased_randomwalk, e in f becomes is_visited[e], which is also more efficient.


walk = loop_erased_randomwalk(g, start1, distmx=distmx, f=[start2], rng=rng)

tree = SimpleGraph(nv(g))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tree is only used to together with add_edge!, which is a costly operation. It is more efficient to directly store the unique edges.

end

visited_vertices = Set(walk)
unvisited_vertices = setdiff(Set([i for i = 1:nv(g)]), visited_vertices)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
unvisited_vertices = setdiff(Set([i for i = 1:nv(g)]), visited_vertices)
unvisited_vertices = setdiff(Set(vertices(g)), visited_vertices)

@gdalle gdalle added the enhancement New feature or request label Jun 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants