This directory contains a benchmarking tool for GraphScope GIE and other specified systems. It acts as multiple concurrent clients, sending queries either through the engine's exposed endpoint or directly to the database, depending on each system's querying method, and it reports performance metrics such as latency, throughput, and query results. The benchmark program sends mixed queries to the server by reading query templates from queries and filling in their parameters with values from substitution_parameters; a sketch of this mechanism follows the directory layout below. The program iterates over all enabled queries and their parameters in a round-robin fashion.
- bin
- bench.sh // script for running benchmark for queries
- collect.sh // script for collecting benchmark results
- config
- interactive-benchmark.properties // configurations for running benchmark
- data
- substitution_parameters // query parameter files used to fill the query templates
- expected_results // expected query results for the running queries
- queries // query templates including LDBC queries, LSQB queries, JOB queries, customized queries, etc.
- dbs // Other graph systems for comparison. Currently, KuzuDB is supported.
- example // an example comparing GraphScope GIE and KuzuDB
- src // source code of benchmark program
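To make the template-and-parameter mechanism concrete, here is a minimal sketch; the placeholder syntax and parameter-file layout are illustrative assumptions, so consult the files shipped under queries and data/substitution_parameters for the exact conventions.

// queries/ldbc_query_1.cypher (hypothetical template; $personId and $firstName are placeholders)
MATCH (p:PERSON {id: $personId})-[:KNOWS*1..3]-(friend:PERSON {firstName: $firstName})
RETURN friend.id, friend.lastName
LIMIT 20;

// data/substitution_parameters/ldbc_query_1.param (hypothetical layout: a header line, then one parameter set per line)
personId|firstName
933|Karl
1129|John

On each round, the benchmark takes the next parameter line, substitutes its values into the template, and sends the resulting query to the system under test.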
Note: queries prefixed with ldbc_query implement the LDBC official interactive complex reads, queries prefixed with bi_query implement the LDBC official business intelligence workload, queries prefixed with lsqb_query implement LDBC's labelled subgraph query benchmark (LSQB), and queries prefixed with job implement the JOB benchmark. Gremlin queries must have the suffix .gremlin and Cypher queries the suffix .cypher; see the example file names below. The corresponding parameters (scale factor 1) for the LDBC queries are generated by the LDBC official tools.
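Following these conventions, files under queries would be named along these lines (the concrete names here are illustrative):

ldbc_query_1.gremlin   // LDBC interactive complex read 1, in Gremlin
bi_query_3.cypher      // LDBC business intelligence query 3, in Cypher
lsqb_query_2.cypher    // LSQB query 2
job_5c.cypher          // JOB query 5c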
Build the benchmark program using Maven:
mvn clean package
All the binaries and queries will be packed into target/gaia-benchmark-0.0.1-SNAPSHOT-dist.tar.gz, and you can deploy the package to any machine that can connect to the endpoint (which should be provided in interactive-benchmark.properties).
cd target
tar -xvf gaia-benchmark-0.0.1-SNAPSHOT-dist.tar.gz
cd gaia-benchmark-0.0.1-SNAPSHOT
./bin/bench.sh # run the benchmark program. You can also modify running configurations in config/interactive-benchmark.properties
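As a rough orientation, the configuration file may look like the following sketch; every key name here is an assumption for illustration, so rely on the config/interactive-benchmark.properties shipped in the package for the authoritative keys and defaults.

# Hypothetical keys -- check the shipped config/interactive-benchmark.properties for the real ones.
endpoint=127.0.0.1:8182        # endpoint of the system under test
thread.count=4                 # number of concurrent client threads
warmup.count=10                # warm-up queries sent before timing starts
operation.count=100            # timed queries sent per thread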
With the example configuration file example/job_benchmark.properties, which compares GraphScope GIE and KuzuDB on the JOB benchmark, the results are as follows:
Start to benchmark system: GIE
QueryName[13a], Parameter[{}], ResultCount[1], ExecuteTimeMS[3638].
QueryName[32a], Parameter[{}], ResultCount[1], ExecuteTimeMS[266].
QueryName[9a], Parameter[{}], ResultCount[1], ExecuteTimeMS[3669].
QueryName[5c], Parameter[{}], ResultCount[1], ExecuteTimeMS[8603].
QueryName[3a], Parameter[{}], ResultCount[1], ExecuteTimeMS[613].
...
System: GIE; query count: 35; execute time(ms): xxx qps: xxx
Start to benchmark system: KuzuDb
QueryName[13a], Parameter[{}], ResultCount[1], ExecuteTimeMS[7068].
QueryName[32a], Parameter[{}], ResultCount[1], ExecuteTimeMS[253].
QueryName[9a], Parameter[{}], ResultCount[1], ExecuteTimeMS[5122].
QueryName[5c], Parameter[{}], ResultCount[1], ExecuteTimeMS[13623].
QueryName[3a], Parameter[{}], ResultCount[1], ExecuteTimeMS[4676].
...
System: KuzuDB; query count: 35; execute time(ms): xxx qps: xxx
./bin/collect.sh # run the result collection program to collect the results and generate a performance comparison table
Based on the benchmark results, the collected data and the final performance comparison table (all latencies in milliseconds) are as follows:
QueryName | GIE Avg | GIE P50 | GIE P90 | GIE P95 | GIE P99 | GIE Count | KuzuDb Avg | KuzuDb P50 | KuzuDb P90 | KuzuDb P95 | KuzuDb P99 | KuzuDb Count |
---|---|---|---|---|---|---|---|---|---|---|---|---|
3a | 613.00 | 613 | 613 | 613 | 613 | 1 | 4676.00 | 4676 | 4676 | 4676 | 4676 | 1 |
5c | 8603.00 | 8603 | 8603 | 8603 | 8603 | 1 | 13623.00 | 13623 | 13623 | 13623 | 13623 | 1 |
9a | 3669.00 | 3669 | 3669 | 3669 | 3669 | 1 | 5122.00 | 5122 | 5122 | 5122 | 5122 | 1 |
13a | 3638.00 | 3638 | 3638 | 3638 | 3638 | 1 | 7068.00 | 7068 | 7068 | 7068 | 7068 | 1 |
32a | 266.00 | 266 | 266 | 266 | 266 | 1 | 253.00 | 253 | 253 | 253 | 253 | 1 |
Users can add their own benchmark queries to queries, together with the corresponding substitution parameters in substitution_parameters. Note that the file names of user-defined query templates must start with the prefix custom_query or custom_constant_query; the difference is that custom_constant_query templates take no parameters. A sketch of both kinds follows.
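As a hypothetical illustration (the placeholder syntax and file layout are assumptions; mirror the conventions of the existing files in queries and data/substitution_parameters):

// queries/custom_query_1.cypher (parameterized; reads values from its substitution parameter file)
MATCH (n:PERSON {id: $personId}) RETURN n.firstName, n.lastName;

// data/substitution_parameters/custom_query_1.param (hypothetical layout)
personId
933
1129

// queries/custom_constant_query_1.cypher (constant; no parameter file needed)
MATCH (n:PERSON) RETURN count(n);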