faster erdos_renyi #212

abraunst · 2023-01-07T14:53:53Z

To compare I've used

for N in (500, 5000), is_directed in (true, false), p in (10/N, 0.5, 1-10/N)
    @btime erdos_renyi($N, $p; is_directed=$is_directed)
end

That gives

634.683 μs (2601 allocations: 387.77 KiB)
  15.538 ms (5023 allocations: 8.35 MiB)
  20.433 ms (5023 allocations: 9.27 MiB)
  407.949 μs (1285 allocations: 187.91 KiB)
  9.689 ms (2517 allocations: 4.18 MiB)
  14.804 ms (2517 allocations: 4.64 MiB)
  7.411 ms (26507 allocations: 4.08 MiB)
  1.679 s (70029 allocations: 622.30 MiB)
  2.491 s (80029 allocations: 1.35 GiB)
  4.633 ms (13244 allocations: 2.04 MiB)
  1.066 s (35020 allocations: 311.18 MiB)
  1.666 s (40020 allocations: 691.04 MiB)

which seems uniformly better than master:

1.230 ms (2637 allocations: 356.81 KiB)
  68.167 ms (5003 allocations: 7.38 MiB)
  47.454 ms (7675 allocations: 7.74 MiB)
  583.637 μs (1307 allocations: 174.48 KiB)
  30.522 ms (2502 allocations: 3.69 MiB)
  22.270 ms (3823 allocations: 3.86 MiB)
  15.997 ms (26913 allocations: 3.66 MiB)
  42.191 s (70005 allocations: 526.66 MiB)
  10.789 s (107180 allocations: 1.17 GiB)
  7.240 ms (13430 allocations: 1.82 MiB)
  18.297 s (35003 allocations: 263.33 MiB)
  4.420 s (53510 allocations: 597.51 MiB)

codecov · 2023-01-07T15:10:05Z

Codecov Report

Merging #212 (1cace9f) into master (f6f8db6) will increase coverage by 0.14%.
The diff coverage is 100.00%.

❗ Current head 1cace9f differs from pull request most recent head 5e38794. Consider uploading reports for the commit 5e38794 to get more accurate results

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #212      +/-   ##
==========================================
+ Coverage   97.28%   97.43%   +0.14%     
==========================================
  Files         115      113       -2     
  Lines        6789     6543     -246     
==========================================
- Hits         6605     6375     -230     
+ Misses        184      168      -16

abraunst · 2023-01-07T15:38:59Z

It's a miracle that tests have passed because some of them depend on the specific random instance (that surely changed)...

abraunst · 2023-01-16T11:52:39Z

Is this what you proposed in #150 (comment) for n,p graphs, @simonschoelly ?

abraunst · 2023-05-10T07:10:20Z

Can I do something to make this more easily reviewable? The code essentially

(undirected case) samples a $p$-sparse binary vector of size $\binom{N}{2}$ and then maps it into the upper triangle of a $N\times N$ matrix
(directed case) samples a $p$-sparse binary vector of size $N(N-1)$ and then maps it into the non-diagonal elements of a $N\times N$ matrix

and then constructs the corresponding graph. The sampling of the binary vector is done by randsubseq which is by the way the function used for the same purpose by the SparseArrays standard library.

EDIT: and the speedup is somewhere between 2x and 18x in these tests...

gdalle · 2024-01-29T12:08:28Z

Sorry for the latency, and thanks for the contribution. @etiennedeg should we keep this one or #150?

@abraunst what kind of tests can guarantee correctness despite the randomness?

etiennedeg · 2024-01-29T13:29:50Z

More likely this one. I can try to review it this week (probably not before Wednesday, but I should be able to find some time)

abraunst · 2024-01-30T09:20:00Z

Sorry for the latency, and thanks for the contribution. @etiennedeg should we keep this one or #150?

@abraunst what kind of tests can guarantee correctness despite the randomness?

Maybe something like checking for average degree with a reasonable tolerance?

…do Garcia

etiennedeg · 2024-01-31T09:39:27Z

When I benchmark against the implementation proposed by @simonschoelly in #150, I get this (only for undirected graph because simon's code seems broken for directed graphs):

julia> for N in (500, 5000), is_directed in (false, ), p in (10/N, 0.5, 1-10/N)
           @btime erdos_renyi_ss($N, $p; is_directed=$is_directed)
       end
  164.086 μs (504 allocations: 77.45 KiB)
  2.410 ms (504 allocations: 1.02 MiB)
  2.538 ms (504 allocations: 1.95 MiB)
  1.927 ms (5007 allocations: 792.03 KiB)
  351.821 ms (10007 allocations: 95.79 MiB)
  435.251 ms (10007 allocations: 190.78 MiB)

julia> for N in (500, 5000), is_directed in (false, ), p in (10/N, 0.5, 1-10/N)
           @btime erdos_renyi_pr($N, $p; is_directed=$is_directed)
       end
  450.744 μs (1296 allocations: 191.47 KiB)
  10.534 ms (3017 allocations: 4.76 MiB)
  17.142 ms (3017 allocations: 5.69 MiB)
  5.269 ms (13225 allocations: 2.03 MiB)
  1.107 s (42520 allocations: 361.60 MiB)
  2.319 s (47520 allocations: 788.97 MiB)

I appreciate the conciseness of your code but simon's seems 4 times faster with 4 times less allocations.

Maybe something like checking for average degree with a reasonable tolerance?

That equivalent to check the number of edges, so it seems rather weak. Maybe something like degree distribution ?

abraunst · 2024-01-31T09:39:27Z

I've rebased

abraunst · 2024-01-31T09:41:37Z

When I benchmark against the implementation proposed by @simonschoelly in #150, I get this (only for undirected graph because simon's code seems broken for directed graphs):

julia> for N in (500, 5000), is_directed in (false, ), p in (10/N, 0.5, 1-10/N)
           @btime erdos_renyi_ss($N, $p; is_directed=$is_directed)
       end
  164.086 μs (504 allocations: 77.45 KiB)
  2.410 ms (504 allocations: 1.02 MiB)
  2.538 ms (504 allocations: 1.95 MiB)
  1.927 ms (5007 allocations: 792.03 KiB)
  351.821 ms (10007 allocations: 95.79 MiB)
  435.251 ms (10007 allocations: 190.78 MiB)

julia> for N in (500, 5000), is_directed in (false, ), p in (10/N, 0.5, 1-10/N)
           @btime erdos_renyi_pr($N, $p; is_directed=$is_directed)
       end
  450.744 μs (1296 allocations: 191.47 KiB)
  10.534 ms (3017 allocations: 4.76 MiB)
  17.142 ms (3017 allocations: 5.69 MiB)
  5.269 ms (13225 allocations: 2.03 MiB)
  1.107 s (42520 allocations: 361.60 MiB)
  2.319 s (47520 allocations: 788.97 MiB)

I appreciate the conciseness of your code but simon's seems 4 times faster with 4 times less allocations.

Maybe something like checking for average degree with a reasonable tolerance?

That equivalent to check the number of edges, so it seems rather weak. Maybe something like degree distribution ?

That looks rather unexpected to me. I'll have a better look at that.

EDIT: the problem most likely is that SimpleGraphFromIterator is slow because it cannot preallocate memory for the adjacency lists and just push!es, requiring three or four allocations per vertex instead of one.

gdalle · 2024-01-31T10:30:31Z

EDIT: the problem most likely is that SimpleGraphFromIterator is slow because it cannot preallocate memory for the adjacency lists and just push!es, requiring three or four allocations per vertex instead of one.

Could we sizehint! it if the iterator has a length?

abraunst · 2024-01-31T10:38:23Z

EDIT: the problem most likely is that SimpleGraphFromIterator is slow because it cannot preallocate memory for the adjacency lists and just push!es, requiring three or four allocations per vertex instead of one.

Could we sizehint! it if the iterator has a length?

seems difficult because these are per-vertex lists, whereas the iterator in question is on the full list of edges...

etiennedeg · 2024-01-31T10:50:15Z

When removing the SimpleGraphFromIterator call (and replacing it by an empty graph), I get

  1.550 ms (4314 allocations: 174.41 KiB)
  25.122 ms (64000 allocations: 2.94 MiB)
  46.153 ms (124133 allocations: 5.70 MiB)
  16.399 ms (73717 allocations: 2.20 MiB)
  3.212 s (38215347 allocations: 774.12 MiB)

abraunst · 2024-01-31T12:20:05Z

When removing the SimpleGraphFromIterator call (and replacing it by an empty graph), I get

  1.550 ms (4314 allocations: 174.41 KiB)
  25.122 ms (64000 allocations: 2.94 MiB)
  46.153 ms (124133 allocations: 5.70 MiB)
  16.399 ms (73717 allocations: 2.20 MiB)
  3.212 s (38215347 allocations: 774.12 MiB)

Are you sure? This seems too strange, The number of allocations should be about N. Cannot reproduce locally.

etiennedeg · 2024-01-31T13:50:46Z

Weird, done it again, I got:

  107.043 μs (505 allocations: 57.03 KiB)
  2.767 ms (505 allocations: 532.59 KiB)
  2.753 ms (505 allocations: 1004.34 KiB)
  1.091 ms (5006 allocations: 553.33 KiB)
  300.378 ms (5006 allocations: 48.11 MiB)
  321.689 ms (5006 allocations: 95.64 MiB)

I don't know how I got the previous results

etiennedeg · 2024-01-31T14:12:15Z

The biggest advantage of Simon's is that the neighbors are already sorted, so it is much less costly to build the adjacency lists. No matter what, with this approach, we are going to pay a cost to retrieve the adjacency lists.

abraunst · 2024-01-31T14:37:57Z

The biggest advantage of Simon's is that the neighbors are already sorted, so it is much less costly to build the adjacency lists. No matter what, with this approach, we are going to pay a cost to retrieve the adjacency lists.

The order is the same (Simon's code actually copies randsubseq which gives a sorted result). A modified SimpleGraphFromIterator already improves things significanly:

63.205 μs (505 allocations: 95.14 KiB)
  1.880 ms (505 allocations: 1.50 MiB)
  2.563 ms (505 allocations: 2.89 MiB)
  668.423 μs (5007 allocations: 955.06 KiB)
  236.858 ms (10007 allocations: 143.50 MiB)
  307.496 ms (10007 allocations: 286.03 MiB)

against Simon's

39.659 μs (504 allocations: 77.61 KiB)
  849.891 μs (504 allocations: 1.02 MiB)
  721.155 μs (504 allocations: 1.95 MiB)
  432.917 μs (5007 allocations: 791.56 KiB)
  120.317 ms (10007 allocations: 95.76 MiB)
  113.731 ms (10007 allocations: 190.78 MiB)

function SimpleGraphFromIterator(edges, nv)
    deg = zeros(Int, nv)
    ne = 0
    @inbounds for e in edges
        u, v = e.src, e.dst
        @assert (u,v) ⊂ eachindex(deg)
        deg[u] += 1
        deg[v] += 1
        ne += 1
    end

    fadjlist = zeros.(Int, deg)
    deg .= 0

    @inbounds for e in edges
        u, v = e.src, e.dst
        fadjlist[u][deg[u]+=1] = v
        fadjlist[v][deg[v]+=1] = u
    end
    return SimpleGraph(ne, fadjlist)
end
```

etiennedeg · 2024-01-31T14:47:06Z

Simon's, but replacing the call to _sample_with_replacement! by randsubseq!, so does not improve.

  217.053 μs (504 allocations: 77.73 KiB)
  3.721 ms (504 allocations: 1.02 MiB)
  4.865 ms (535 allocations: 2.00 MiB)
  2.517 ms (5023 allocations: 833.01 KiB)
  493.546 ms (10052 allocations: 95.83 MiB)
  642.595 ms (10055 allocations: 190.89 MiB)

abraunst · 2024-01-31T14:55:14Z

Simon's, but replacing the call to _sample_with_replacement! by randsubseq!, so does not improve.

  217.053 μs (504 allocations: 77.73 KiB)
  3.721 ms (504 allocations: 1.02 MiB)
  4.865 ms (535 allocations: 2.00 MiB)
  2.517 ms (5023 allocations: 833.01 KiB)
  493.546 ms (10052 allocations: 95.83 MiB)
  642.595 ms (10055 allocations: 190.89 MiB)

amazing... these are almost the same code!

EDIT: the cause is probably slowness in push! which supposedly will become fast in 1.11

etiennedeg · 2024-01-31T15:49:14Z

There is a gotcha in your implementation, the adjacency lists are not sorted... (This is a requirement for SimpleGraph). So you indeed have to pay more to sort the adjacency lists.

EDIT: If we change trianglemap to not use modulo offset by 1, then the edges are returned in an order similar to simon's.

abraunst · 2024-01-31T17:16:57Z

There is a gotcha in your implementation, the adjacency lists are not sorted... (This is a requirement for SimpleGraph). So you indeed have to pay more to sort the adjacency lists.

EDIT: If we change trianglemap to not use modulo offset by 1, then the edges are returned in an order similar to simon's.

oh I see. Indeed, this can be easily fixed. Thanks for all the feedback! I will try to come up with something cleaner in the next few days, but otherwise we can use some versione of Simon's code. 😄

etiennedeg · 2024-01-31T17:24:46Z

Well, in fact it was not only the mod1 that caused vertices to not be sorted. Using your old definition of trianglemap (which follows simon's ordering), I was able to cook something, but it is not quite as fast.

abraunst mentioned this pull request Jan 7, 2023

faster implementation of erdos_renyi #150

Open

abraunst force-pushed the faster-gnp branch from 1cace9f to fc90363 Compare May 14, 2023 09:08

gdalle added the enhancement New feature or request label Jun 16, 2023

abraunst added 4 commits January 31, 2024 09:48

faster erdos_renyi

8658e94

faster trianglemap, idea by Federico Florio, Gabriele Gatti and Ricar…

566d86c

…do Garcia

cosmetics

d1cdaee

cosmetics

5e38794

abraunst force-pushed the faster-gnp branch from fc90363 to 5e38794 Compare January 31, 2024 09:39

gdalle mentioned this pull request Mar 5, 2024

Add Bernoulli random graph #200

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

faster erdos_renyi #212

faster erdos_renyi #212

abraunst commented Jan 7, 2023 •

edited

Loading

codecov bot commented Jan 7, 2023 •

edited

Loading

abraunst commented Jan 7, 2023 •

edited

Loading

abraunst commented Jan 16, 2023

abraunst commented May 10, 2023 •

edited

Loading

gdalle commented Jan 29, 2024 •

edited

Loading

etiennedeg commented Jan 29, 2024

abraunst commented Jan 30, 2024

etiennedeg commented Jan 31, 2024

abraunst commented Jan 31, 2024

abraunst commented Jan 31, 2024 •

edited

Loading

gdalle commented Jan 31, 2024

abraunst commented Jan 31, 2024

etiennedeg commented Jan 31, 2024

abraunst commented Jan 31, 2024 •

edited

Loading

etiennedeg commented Jan 31, 2024

etiennedeg commented Jan 31, 2024

abraunst commented Jan 31, 2024

etiennedeg commented Jan 31, 2024 •

edited

Loading

abraunst commented Jan 31, 2024 •

edited

Loading

etiennedeg commented Jan 31, 2024 •

edited

Loading

abraunst commented Jan 31, 2024

etiennedeg commented Jan 31, 2024

faster erdos_renyi #212

Are you sure you want to change the base?

faster erdos_renyi #212

Conversation

abraunst commented Jan 7, 2023 • edited Loading

codecov bot commented Jan 7, 2023 • edited Loading

Codecov Report

abraunst commented Jan 7, 2023 • edited Loading

abraunst commented Jan 16, 2023

abraunst commented May 10, 2023 • edited Loading

gdalle commented Jan 29, 2024 • edited Loading

etiennedeg commented Jan 29, 2024

abraunst commented Jan 30, 2024

etiennedeg commented Jan 31, 2024

abraunst commented Jan 31, 2024

abraunst commented Jan 31, 2024 • edited Loading

gdalle commented Jan 31, 2024

abraunst commented Jan 31, 2024

etiennedeg commented Jan 31, 2024

abraunst commented Jan 31, 2024 • edited Loading

etiennedeg commented Jan 31, 2024

etiennedeg commented Jan 31, 2024

abraunst commented Jan 31, 2024

etiennedeg commented Jan 31, 2024 • edited Loading

abraunst commented Jan 31, 2024 • edited Loading

etiennedeg commented Jan 31, 2024 • edited Loading

abraunst commented Jan 31, 2024

etiennedeg commented Jan 31, 2024

abraunst commented Jan 7, 2023 •

edited

Loading

codecov bot commented Jan 7, 2023 •

edited

Loading

abraunst commented Jan 7, 2023 •

edited

Loading

abraunst commented May 10, 2023 •

edited

Loading

gdalle commented Jan 29, 2024 •

edited

Loading

abraunst commented Jan 31, 2024 •

edited

Loading

abraunst commented Jan 31, 2024 •

edited

Loading

etiennedeg commented Jan 31, 2024 •

edited

Loading

abraunst commented Jan 31, 2024 •

edited

Loading

etiennedeg commented Jan 31, 2024 •

edited

Loading