remove REL TABLE GROUP and make REL TABLE subsume it #4398

Open: SterlingT3485 wants to merge 15 commits into master from sterling_rel_table

SterlingT3485 (Collaborator) commented Oct 21, 2024

The high-level idea is to remove the notion of REL TABLE GROUP and have REL TABLE subsume its functionality, which is defining a relationship between different pairs of node tables.

DDL

Allow multiple FROM TO pairs in CREATE REL TABLE

CREATE REL TABLE R (FROM N1 TO N1, FROM N1 TO N2, date STRING)
  • Internally, we create a rel table group if we find multiple FROM TO pairs
  • R is the rel table group
  • R_N1_N1 and R_N1_N2 are its internal rel tables
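
A minimal sketch of the expansion described above, written in Python rather than the engine's own code. The `<rel>_<from>_<to>` naming is inferred from the example names R_N1_N1 and R_N1_N2; the function name `expand_rel_table` and the dictionary layout are hypothetical and only illustrate the idea.

```python
from typing import Dict, List, Tuple


def expand_rel_table(name: str, pairs: List[Tuple[str, str]]) -> Dict[str, object]:
    """Build a hypothetical catalog entry for CREATE REL TABLE <name> (...)."""
    if not pairs:
        raise ValueError("at least one FROM ... TO ... pair is required")
    if len(pairs) == 1:
        # A single pair stays a plain rel table.
        src, dst = pairs[0]
        return {"kind": "rel_table", "name": name, "from": src, "to": dst}
    # Multiple pairs: one group entry plus one internal rel table per pair.
    # The child naming scheme is an assumption taken from the example above.
    children = [
        {"kind": "rel_table", "name": f"{name}_{src}_{dst}", "from": src, "to": dst}
        for src, dst in pairs
    ]
    return {"kind": "rel_table_group", "name": name, "children": children}


entry = expand_rel_table("R", [("N1", "N1"), ("N1", "N2")])
print([child["name"] for child in entry["children"]])
```

With the two pairs from the example, the sketch yields a group entry R with children R_N1_N1 and R_N1_N2; with a single pair it yields a plain rel table named R.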

All DDL statements should work on both rel tables and rel table groups. Check:

  • Drop table
  • Rename table
  • Alter table drop/rename/add column
    Rel table group should just

Dropping an internal table that belongs to a rel table group should be protected against, i.e. rejected when it is targeted directly.
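
One way to picture the drop rules is a toy catalog, sketched here in Python. Everything in it is hypothetical: the `ToyCatalog` class, its method names, and the assumption that dropping a group also removes its internal tables are illustrations, not the actual catalog implementation.

```python
from typing import Dict, List, Optional, Set


class ToyCatalog:
    """Hypothetical in-memory catalog used only to illustrate the drop rules."""

    def __init__(self) -> None:
        self.groups: Dict[str, List[str]] = {}  # group name -> internal tables
        self.tables: Set[str] = set()           # standalone and internal rel tables

    def add_rel_table(self, name: str) -> None:
        self.tables.add(name)

    def add_group(self, name: str, children: List[str]) -> None:
        self.groups[name] = list(children)
        self.tables.update(children)

    def owner_of(self, table: str) -> Optional[str]:
        for group, children in self.groups.items():
            if table in children:
                return group
        return None

    def drop(self, name: str) -> None:
        if name in self.groups:
            # Assumption: dropping the group drops all of its internal tables.
            for child in self.groups.pop(name):
                self.tables.discard(child)
            return
        owner = self.owner_of(name)
        if owner is not None:
            # Internal tables cannot be dropped on their own.
            raise ValueError(f"{name} belongs to rel table group {owner}; drop {owner} instead")
        if name not in self.tables:
            raise KeyError(name)
        self.tables.remove(name)


catalog = ToyCatalog()
catalog.add_group("R", ["R_N1_N1", "R_N1_N2"])
catalog.drop("R")  # allowed: removes the group and its internal tables
```

In this sketch `catalog.drop("R_N1_N1")` raises while the group still exists, whereas `catalog.drop("R")` succeeds.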

DML

Bind rel table group as multi-labeled relationship.

SET & DELETE should work directly.

CREATE is allowed only if both src and destination node labels are given, e.g.

CREATE REL TABLE R (FROM N1 TO N1, FROM N1 TO N2)
CREATE (a) // Error
CREATE (a:N1)-[:R]->(b:N1)
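
The label check can be sketched as a small lookup, shown here in Python. The function name `resolve_create_target`, the `GROUP_R` mapping, and the error messages are hypothetical; the only rule taken from the description is that CREATE on a rel table group needs both node labels.

```python
from typing import Dict, Optional, Tuple

# Hypothetical mapping for R: (source label, destination label) -> internal table.
GROUP_R: Dict[Tuple[str, str], str] = {
    ("N1", "N1"): "R_N1_N1",
    ("N1", "N2"): "R_N1_N2",
}


def resolve_create_target(
    group: Dict[Tuple[str, str], str],
    src_label: Optional[str],
    dst_label: Optional[str],
) -> str:
    """Pick the internal rel table a CREATE on a rel table group should write to."""
    if src_label is None or dst_label is None:
        # Without both labels the target table is ambiguous, so reject.
        raise ValueError("CREATE on a rel table group needs both src and dst node labels")
    try:
        return group[(src_label, dst_label)]
    except KeyError:
        raise ValueError(f"no FROM {src_label} TO {dst_label} pair in this rel table group") from None


print(resolve_create_target(GROUP_R, "N1", "N1"))
```

Calling it with a missing label, for example `resolve_create_target(GROUP_R, None, "N1")`, raises instead of guessing a table.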

Query

Bind rel table group as multi-labeled relationship. All existing cases should pass.

MATCH (a)-[:R]->(b)

MATCH (a)-[:R_N1_N1|:R_N1_N2]->(b)
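
A rough sketch of what binding a group as a multi-labeled relationship could look like, in Python. The `REL_GROUPS` mapping and the `bind_rel_labels` function are hypothetical names chosen for illustration; the real binder is not shown in this PR description.

```python
from typing import Dict, List

# Hypothetical catalog view: rel table group name -> internal rel tables.
REL_GROUPS: Dict[str, List[str]] = {"R": ["R_N1_N1", "R_N1_N2"]}


def bind_rel_labels(labels: List[str], groups: Dict[str, List[str]]) -> List[str]:
    """Expand group names into their internal tables, keeping order and dropping duplicates."""
    bound: List[str] = []
    for label in labels:
        # A label that is not a group is treated as a plain rel table name.
        for table in groups.get(label, [label]):
            if table not in bound:
                bound.append(table)
    return bound


# Both patterns from the description bind to the same set of tables.
print(bind_rel_labels(["R"], REL_GROUPS))
print(bind_rel_labels(["R_N1_N1", "R_N1_N2"], REL_GROUPS))
```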

Copy

First, we should provide backward compatibility, meaning we still expose the internal rel tables of a rel table group for COPY, e.g.

Copy R FROM '' (from=N1, to=N1);

Second, we should ask for two extra columns alongside from and to that indicate the node labels, e.g.

from, from_label, to, to_label
1, N1, 2, N1
1, N1, 2, N2
...

so that we can directly copy with

COPY R FROM
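
The second COPY form can be sketched as routing each input row to an internal table by its label columns. The Python below is only an illustration: `route_rows` is a hypothetical helper, the `<rel>_<from_label>_<to_label>` naming is inferred from the example table names, and the sample data reuses the rows shown above.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List, Set, Tuple

SAMPLE = "from, from_label, to, to_label\n1, N1, 2, N1\n1, N1, 2, N2\n"


def route_rows(
    rel: str, csv_text: str, pairs: Set[Tuple[str, str]]
) -> Dict[str, List[Tuple[str, str]]]:
    """Group (from, to) keys by the internal rel table their labels select."""
    routed: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    # skipinitialspace drops the blank after each comma in the sample layout.
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    for row in reader:
        pair = (row["from_label"], row["to_label"])
        if pair not in pairs:
            raise ValueError(f"{rel} has no FROM {pair[0]} TO {pair[1]} pair")
        # Assumed naming scheme for internal tables.
        routed[f"{rel}_{pair[0]}_{pair[1]}"].append((row["from"], row["to"]))
    return dict(routed)


print(route_rows("R", SAMPLE, {("N1", "N1"), ("N1", "N2")}))
```

Rows whose label pair is not declared on the rel table are rejected in this sketch; whether the real COPY should fail or skip such rows is left open by the description.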

Workflows

  • Change the grammar so that a statement with multiple FROM TO pairs binds to a rel table group entry.
  • Merge the CREATE REL TABLE GROUP grammar into CREATE REL TABLE, then remove the CREATE REL TABLE GROUP grammar.
  • Check that all DDLs work with rel table groups. Finish the #DDL section.
  • Finish the #DML section. The only change should be checking the special case of whether src & dst node labels are given for a rel table group relationship.
  • Check that all queries work as expected.
  • Provide backward compatibility for COPY.
  • Support COPY directly into a rel table group with from_label & to_label columns.
  • Test export/import of rel table groups between different versions (backward compatibility).
  • show_tables() should not print rel

@SterlingT3485 SterlingT3485 force-pushed the sterling_rel_table branch 2 times, most recently from 834b66f to 91a1b58 Compare October 21, 2024 19:45
@SterlingT3485 SterlingT3485 marked this pull request as draft October 21, 2024 19:47
andyfengHKU (Contributor) left a comment

Cool I think we are on the same page. Let's discuss one more time when we wanna support the second type of Copy for rel group.

Review thread on scripts/antlr4/generate_grammar.cmake (outdated, resolved)

Benchmark Result

Master commit hash: b2005748bc73dd9323b86a5a1b88d64294797f83
Branch commit hash: ee07aee66198e9904875d8b172d4b9e6d5aa980a

Query Group | Query Name | Mean Time - Commit (ms) | Mean Time - Master (ms) | Diff
--- | --- | --- | --- | ---
aggregation | q24 | 658.02 | 635.23 | 22.79 (3.59%)
aggregation | q28 | 11740.99 | 11743.78 | -2.79 (-0.02%)
filter | q14 | 141.00 | 122.51 | 18.49 (15.10%)
filter | q15 | 145.62 | 118.52 | 27.10 (22.87%)
filter | q16 | 321.52 | 296.68 | 24.84 (8.37%)
filter | q17 | 464.06 | 438.50 | 25.56 (5.83%)
filter | q18 | 1955.56 | 1966.88 | -11.32 (-0.58%)
filter | zonemap-node | 102.64 | 77.87 | 24.77 (31.81%)
filter | zonemap-node-lhs-cast | 102.89 | 78.68 | 24.22 (30.78%)
filter | zonemap-rel | 5487.64 | 5616.89 | -129.26 (-2.30%)
fixed_size_expr_evaluator | q07 | 561.03 | 533.77 | 27.26 (5.11%)
fixed_size_expr_evaluator | q08 | 772.68 | 749.43 | 23.25 (3.10%)
fixed_size_expr_evaluator | q09 | 772.54 | 745.99 | 26.55 (3.56%)
fixed_size_expr_evaluator | q10 | 254.85 | 233.54 | 21.31 (9.13%)
fixed_size_expr_evaluator | q11 | 249.33 | 224.18 | 25.15 (11.22%)
fixed_size_expr_evaluator | q12 | 248.45 | 224.48 | 23.98 (10.68%)
fixed_size_expr_evaluator | q13 | 1492.46 | 1463.62 | 28.84 (1.97%)
fixed_size_seq_scan | q23 | 136.62 | 110.08 | 26.54 (24.11%)
join | q29 | 620.36 | 612.03 | 8.33 (1.36%)
join | q30 | 1439.45 | 1437.68 | 1.77 (0.12%)
join | q31 | 8.98 | 12.15 | -3.16 (-26.03%)
ldbc_snb_ic | q35 | 465.41 | 485.76 | -20.35 (-4.19%)
ldbc_snb_ic | q36 | 28.43 | 25.12 | 3.31 (13.18%)
ldbc_snb_is | q32 | 80.31 | 84.88 | -4.57 (-5.39%)
ldbc_snb_is | q33 | 14.11 | 13.79 | 0.32 (2.33%)
ldbc_snb_is | q34 | 158.83 | 135.49 | 23.34 (17.22%)
multi-rel | multi-rel-large-scan | 1786.77 | 1752.55 | 34.22 (1.95%)
multi-rel | multi-rel-lookup | 32.62 | 39.76 | -7.14 (-17.96%)
multi-rel | multi-rel-small-scan | 96.68 | 90.25 | 6.43 (7.12%)
order_by | q25 | 152.27 | 133.61 | 18.66 (13.96%)
order_by | q26 | 476.51 | 441.47 | 35.04 (7.94%)
order_by | q27 | 1466.47 | 1456.93 | 9.54 (0.65%)
scan_after_filter | q01 | 186.77 | 162.13 | 24.64 (15.20%)
scan_after_filter | q02 | 171.39 | 149.72 | 21.68 (14.48%)
shortest_path_ldbc100 | q37 | 3598.00 | 3506.85 | 91.15 (2.60%)
shortest_path_ldbc100 | q38 | 60.27 | 48.11 | 12.16 (25.28%)
shortest_path_ldbc100 | q39 | 53.20 | 60.73 | -7.54 (-12.41%)
shortest_path_ldbc100 | q40 | 67.08 | 65.67 | 1.41 (2.15%)
var_size_expr_evaluator | q03 | 2083.46 | 2051.70 | 31.77 (1.55%)
var_size_expr_evaluator | q04 | 2247.64 | 2296.80 | -49.16 (-2.14%)
var_size_expr_evaluator | q05 | 2672.00 | 2656.51 | 15.49 (0.58%)
var_size_expr_evaluator | q06 | 1347.35 | 1336.68 | 10.67 (0.80%)
var_size_seq_scan | q19 | 1473.35 | 1462.70 | 10.65 (0.73%)
var_size_seq_scan | q20 | 2576.55 | 2699.74 | -123.19 (-4.56%)
var_size_seq_scan | q21 | 2297.58 | 2307.55 | -9.97 (-0.43%)
var_size_seq_scan | q22 | 133.18 | 127.60 | 5.58 (4.37%)


Benchmark Result

Master commit hash: f9155a74c59d51ba9e99a7f311607f0e5797429d
Branch commit hash: 357c4753e0ce7e32cb6ec688f9322c22cafa4cbf

Query Group | Query Name | Mean Time - Commit (ms) | Mean Time - Master (ms) | Diff
--- | --- | --- | --- | ---
aggregation | q24 | 640.23 | 635.60 | 4.63 (0.73%)
aggregation | q28 | 11858.01 | 11636.12 | 221.89 (1.91%)
filter | q14 | 125.61 | 119.00 | 6.61 (5.56%)
filter | q15 | 123.58 | 119.50 | 4.08 (3.42%)
filter | q16 | 299.08 | 296.97 | 2.11 (0.71%)
filter | q17 | 444.36 | 438.50 | 5.86 (1.34%)
filter | q18 | 1979.60 | 1929.52 | 50.08 (2.60%)
filter | zonemap-node | 86.33 | 77.90 | 8.43 (10.82%)
filter | zonemap-node-lhs-cast | 88.35 | 78.18 | 10.17 (13.01%)
filter | zonemap-rel | 5471.40 | 5421.92 | 49.48 (0.91%)
fixed_size_expr_evaluator | q07 | 547.20 | 536.39 | 10.81 (2.02%)
fixed_size_expr_evaluator | q08 | 758.57 | 753.42 | 5.15 (0.68%)
fixed_size_expr_evaluator | q09 | 757.28 | 750.19 | 7.09 (0.94%)
fixed_size_expr_evaluator | q10 | 242.36 | 231.24 | 11.12 (4.81%)
fixed_size_expr_evaluator | q11 | 237.07 | 224.96 | 12.10 (5.38%)
fixed_size_expr_evaluator | q12 | 237.72 | 227.12 | 10.60 (4.67%)
fixed_size_expr_evaluator | q13 | 1471.71 | 1471.20 | 0.51 (0.03%)
fixed_size_seq_scan | q23 | 120.44 | 111.63 | 8.82 (7.90%)
join | q29 | 637.96 | 641.57 | -3.61 (-0.56%)
join | q30 | 1404.12 | 1371.96 | 32.16 (2.34%)
join | q31 | 11.57 | 10.15 | 1.42 (13.95%)
ldbc_snb_ic | q35 | 413.80 | 402.24 | 11.55 (2.87%)
ldbc_snb_ic | q36 | 31.82 | 31.71 | 0.11 (0.35%)
ldbc_snb_is | q32 | 79.17 | 84.73 | -5.56 (-6.57%)
ldbc_snb_is | q33 | 16.55 | 16.92 | -0.37 (-2.19%)
ldbc_snb_is | q34 | 140.94 | 131.01 | 9.93 (7.58%)
multi-rel | multi-rel-large-scan | 1667.11 | 1581.83 | 85.28 (5.39%)
multi-rel | multi-rel-lookup | 33.59 | 20.03 | 13.56 (67.72%)
multi-rel | multi-rel-small-scan | 92.51 | 79.41 | 13.10 (16.50%)
order_by | q25 | 128.96 | 122.49 | 6.48 (5.29%)
order_by | q26 | 460.39 | 447.82 | 12.57 (2.81%)
order_by | q27 | 1488.71 | 1432.18 | 56.53 (3.95%)
scan_after_filter | q01 | 172.50 | 159.10 | 13.41 (8.43%)
scan_after_filter | q02 | 155.79 | 150.43 | 5.36 (3.56%)
shortest_path_ldbc100 | q37 | 3468.60 | 3429.94 | 38.67 (1.13%)
shortest_path_ldbc100 | q38 | 64.57 | 62.93 | 1.64 (2.61%)
shortest_path_ldbc100 | q39 | 55.97 | 52.71 | 3.25 (6.17%)
shortest_path_ldbc100 | q40 | 70.67 | 78.79 | -8.12 (-10.30%)
var_size_expr_evaluator | q03 | 2098.05 | 2047.48 | 50.57 (2.47%)
var_size_expr_evaluator | q04 | 2279.62 | 2219.59 | 60.04 (2.70%)
var_size_expr_evaluator | q05 | 2642.87 | 2532.77 | 110.10 (4.35%)
var_size_expr_evaluator | q06 | 1345.41 | 1313.71 | 31.70 (2.41%)
var_size_seq_scan | q19 | 1498.70 | 1439.68 | 59.02 (4.10%)
var_size_seq_scan | q20 | 2545.53 | 2527.72 | 17.81 (0.70%)
var_size_seq_scan | q21 | 2305.10 | 2271.55 | 33.55 (1.48%)
var_size_seq_scan | q22 | 128.88 | 123.07 | 5.81 (4.72%)


Benchmark Result

Master commit hash: f9155a74c59d51ba9e99a7f311607f0e5797429d
Branch commit hash: 566661b93cfa3dde4644debdc24b99e9ef79fd2f

Query Group | Query Name | Mean Time - Commit (ms) | Mean Time - Master (ms) | Diff
--- | --- | --- | --- | ---
aggregation | q24 | 643.20 | 635.60 | 7.60 (1.20%)
aggregation | q28 | 11637.51 | 11636.12 | 1.39 (0.01%)
filter | q14 | 129.75 | 119.00 | 10.75 (9.03%)
filter | q15 | 143.86 | 119.50 | 24.36 (20.38%)
filter | q16 | 302.85 | 296.97 | 5.88 (1.98%)
filter | q17 | 452.01 | 438.50 | 13.51 (3.08%)
filter | q18 | 1985.75 | 1929.52 | 56.24 (2.91%)
filter | zonemap-node | 88.00 | 77.90 | 10.10 (12.97%)
filter | zonemap-node-lhs-cast | 88.56 | 78.18 | 10.38 (13.28%)
filter | zonemap-rel | 5526.77 | 5421.92 | 104.84 (1.93%)
fixed_size_expr_evaluator | q07 | 557.73 | 536.39 | 21.35 (3.98%)
fixed_size_expr_evaluator | q08 | 774.81 | 753.42 | 21.39 (2.84%)
fixed_size_expr_evaluator | q09 | 770.91 | 750.19 | 20.72 (2.76%)
fixed_size_expr_evaluator | q10 | 245.70 | 231.24 | 14.46 (6.25%)
fixed_size_expr_evaluator | q11 | 240.71 | 224.96 | 15.75 (7.00%)
fixed_size_expr_evaluator | q12 | 240.12 | 227.12 | 13.00 (5.73%)
fixed_size_expr_evaluator | q13 | 1493.31 | 1471.20 | 22.11 (1.50%)
fixed_size_seq_scan | q23 | 118.39 | 111.63 | 6.76 (6.06%)
join | q29 | 620.09 | 641.57 | -21.48 (-3.35%)
join | q30 | 1453.12 | 1371.96 | 81.16 (5.92%)
join | q31 | 7.68 | 10.15 | -2.47 (-24.34%)
ldbc_snb_ic | q35 | 415.01 | 402.24 | 12.77 (3.17%)
ldbc_snb_ic | q36 | 32.07 | 31.71 | 0.35 (1.12%)
ldbc_snb_is | q32 | 85.21 | 84.73 | 0.48 (0.57%)
ldbc_snb_is | q33 | 16.49 | 16.92 | -0.44 (-2.57%)
ldbc_snb_is | q34 | 140.60 | 131.01 | 9.60 (7.32%)
multi-rel | multi-rel-large-scan | 1756.55 | 1581.83 | 174.72 (11.05%)
multi-rel | multi-rel-lookup | 30.93 | 20.03 | 10.90 (54.45%)
multi-rel | multi-rel-small-scan | 58.82 | 79.41 | -20.59 (-25.92%)
order_by | q25 | 134.78 | 122.49 | 12.29 (10.03%)
order_by | q26 | 459.06 | 447.82 | 11.24 (2.51%)
order_by | q27 | 1489.80 | 1432.18 | 57.62 (4.02%)
scan_after_filter | q01 | 173.96 | 159.10 | 14.86 (9.34%)
scan_after_filter | q02 | 157.42 | 150.43 | 7.00 (4.65%)
shortest_path_ldbc100 | q37 | 3375.70 | 3429.94 | -54.23 (-1.58%)
shortest_path_ldbc100 | q38 | 52.26 | 62.93 | -10.67 (-16.95%)
shortest_path_ldbc100 | q39 | 53.12 | 52.71 | 0.41 (0.77%)
shortest_path_ldbc100 | q40 | 64.91 | 78.79 | -13.87 (-17.61%)
var_size_expr_evaluator | q03 | 2089.38 | 2047.48 | 41.90 (2.05%)
var_size_expr_evaluator | q04 | 2286.86 | 2219.59 | 67.28 (3.03%)
var_size_expr_evaluator | q05 | 2637.20 | 2532.77 | 104.43 (4.12%)
var_size_expr_evaluator | q06 | 1345.89 | 1313.71 | 32.18 (2.45%)
var_size_seq_scan | q19 | 1486.38 | 1439.68 | 46.70 (3.24%)
var_size_seq_scan | q20 | 2522.32 | 2527.72 | -5.40 (-0.21%)
var_size_seq_scan | q21 | 2303.83 | 2271.55 | 32.28 (1.42%)
var_size_seq_scan | q22 | 130.57 | 123.07 | 7.50 (6.09%)

@SterlingT3485 SterlingT3485 marked this pull request as ready for review October 24, 2024 14:24