MATCH Query Always Using Sequential Scan, Ignoring Index Scan #2137
Comments
You have to create the index on the expression that AGE uses to access the property.
If you want to utilize a GIN index, you need to use the filter like
Also, querying undirected paths can really slow down performance, so consider using directed paths in the MATCH clause wherever possible. I hope this helps.
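As a rough illustration of the two index styles described in this comment (a sketch, not the maintainer's original snippet): `graph_name`, `NodeType1`, `attribute1`, `value1`, and the index names are placeholders. It assumes that AGE resolves `n.attribute1` in a WHERE clause through `agtype_access_operator`, and that a property map inside the MATCH pattern is evaluated as a containment check on `properties`. Both assumptions are worth confirming against the EXPLAIN output of your own AGE version before relying on them.

```sql
-- Assumes the AGE extension is installed; all object names are hypothetical.
LOAD 'age';
SET search_path = ag_catalog, "$user", public;

-- B-tree index on the expression AGE is assumed to generate for
-- `WHERE n.attribute1 = ...`. Copy the exact expression from your own
-- EXPLAIN output if it differs.
CREATE INDEX idx_nodetype1_attribute1
    ON graph_name."NodeType1" USING btree (
        agtype_access_operator(VARIADIC ARRAY[properties, '"attribute1"'::agtype])
    );

-- GIN index over the whole properties column.
CREATE INDEX idx_nodetype1_properties
    ON graph_name."NodeType1" USING gin (properties);

-- Filter form intended to match the B-tree expression index.
EXPLAIN SELECT * FROM cypher('graph_name', $$
    MATCH (n:NodeType1) WHERE n.attribute1 = 'value1' RETURN n
$$) AS (n agtype);

-- Property-map form intended to match the GIN index.
EXPLAIN SELECT * FROM cypher('graph_name', $$
    MATCH (n:NodeType1 {attribute1: 'value1'}) RETURN n
$$) AS (n agtype);
```

Comparing the two EXPLAIN outputs should show which filter form the planner can pair with which index.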
@MuhammadTahaNaveed Thanks for the reply. I believe having this documented would help the community. Also, is there documentation for
Hey @MuhammadTahaNaveed, thank you for the quick response. We were able to successfully create an index and get an index scan for one of the label properties, for example "NodeType1".name in this case. A few questions we still have:
It's a bit complex, but we were able to get an index scan for NodeType1 attribute1 and attribute2. However, even after creating indexes on the other NodeType attributes that are also queried, those nodes are still read with a sequential scan. Is there something we are missing? Do you have any suggestions as to which fields can be indexed here? I apologise if this is a very straightforward question, but any input will be really helpful! Please let me know if you need any additional information. Thanks!
@pritish-moharir You need to create an index on the id column of the node label, and indexes on the start_id and end_id columns of the edge label. PR #2117 does that by default, by the way.
Also, I would like to reiterate that querying undirected paths can really slow down performance, so consider using directed paths in MATCH clauses wherever possible.
@neerajx86 The agtype_ functions are primarily intended for internal use. You can find a list of available functions in the documentation. The documentation is a bit outdated but covers most of the available functions.
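A minimal sketch of the indexes suggested above, followed by a directed pattern to check them against. The graph name `graph_name`, the labels `NodeType1`, `NodeType2`, and `EDGE_TYPE1`, and the index names are hypothetical. The sketch assumes that each label is stored as a table in a schema named after the graph.

```sql
-- Hypothetical graph, label, and index names; adjust to your own graph.
LOAD 'age';
SET search_path = ag_catalog, "$user", public;

-- Index the id column of a node label.
CREATE INDEX idx_nodetype1_id
    ON graph_name."NodeType1" USING btree (id);

-- Index both endpoints of an edge label.
CREATE INDEX idx_edgetype1_start_id
    ON graph_name."EDGE_TYPE1" USING btree (start_id);
CREATE INDEX idx_edgetype1_end_id
    ON graph_name."EDGE_TYPE1" USING btree (end_id);

-- Directed pattern: the arrow fixes which endpoint is the start of the edge.
EXPLAIN SELECT * FROM cypher('graph_name', $$
    MATCH (a:NodeType1)-[e:EDGE_TYPE1]->(b:NodeType2)
    RETURN a, e, b
$$) AS (a agtype, e agtype, b agtype);
```

The same pair of edge indexes would be repeated for every edge label that appears in the query.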
Hey @MuhammadTahaNaveed, thanks again for the reply! We were able to create these indexes, and the EXPLAIN output shows them being used. Also, thanks for the suggestion about directions, but our data is such that the results and the time taken are the same with and without direction. One more doubt we had: even though the EXPLAIN output shows the indexes being used, the actual query takes a long time to complete. But for the same query, if we specify the max hop to be 1, i.e.,
Thanks again for all your help!
I have created the btree index and it works.
Thanks!
When you did the LIMIT test, did you turn off sequential scans? If you didn't, the reason might be this: https://stackoverflow.com/questions/8566931/index-not-used-when-limit-is-used-in-postgres
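One way to run the LIMIT test with sequential scans discouraged is sketched below. The graph name, label, property, and value are placeholders, not taken from the original query. Note that `enable_seqscan = off` does not forbid sequential scans; it only makes the planner treat them as very expensive, so a plan that still shows one usually means no index matches the filter expression.

```sql
-- Hypothetical graph, label, and property names.
LOAD 'age';
SET search_path = ag_catalog, "$user", public;

-- Discourage sequential scans for this session only.
SET enable_seqscan = off;

-- Inspect the plan of the LIMIT query while the setting is active.
EXPLAIN ANALYZE SELECT * FROM cypher('graph_name', $$
    MATCH (n:NodeType1)
    WHERE n.attribute1 = 'value1'
    RETURN n
    LIMIT 10
$$) AS (n agtype);

-- Restore the default planner behaviour afterwards.
RESET enable_seqscan;
```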
What do you prefer: a GIN index or a btree index on the node label?
Hi everyone,
I'm encountering an issue with Apache AGE where a complex MATCH query always defaults to using a sequential scan, even though indexes exist on the queried columns. Disabling sequential scans via
SET enable_seqscan=off
has no effect, and the query plan (the EXPLAIN ANALYZE output) continues to show a sequential scan.
Query Example:
Here's a simplified version of the query we are using:
Data Setup:
We have populated the graph with data using queries like the following:
Problem:
The query plan indicates that a sequential scan is being used on NodeType1 and other nodes, despite indexes being present on attribute1 and attribute2. For performance, we expect the query to utilize the indexes for an index scan.
Observed Behavior:
The query consistently uses sequential scans.
Setting enable_seqscan = off doesn't change the behavior.
Expected Behavior:
The query should leverage the indexes on NodeType1.attribute1 and NodeType1.attribute2 to perform an index scan.
Environment Details:
We are running a containerised Apache AGE Docker image on Kubernetes (k8s).
Apache AGE version: release_PG16_1.5.0
PostgreSQL version: 16
K8S Version: v1.29.6
What We've Tried:
Verified that the relevant indexes exist.
Created indexes on individual properties, e.g., attribute1 and attribute2.
CREATE INDEX idx_attribute1 ON graph_table USING btree ((properties->>'attribute1'));
CREATE INDEX idx_attribute2 ON graph_table USING btree ((properties->>'attribute2'));
Created indexes on the entire properties column for broader coverage.
CREATE INDEX idx_properties ON graph_table USING gin (properties);
Set enable_seqscan = off.
Rebuilt the indexes and reanalyzed the table using ANALYZE.
Simplified the query to test individual segments but observed the same issue.
The MERGE queries are working as expected with indexes, as we see a significant difference in write latencies with and without indexes.
We have ingested a good amount of data (around 25k rows) so as to warrant an index scan.
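The verification and rebuild steps in the list above could be carried out roughly as follows. This is a sketch under assumptions: `graph_name` and `NodeType1` are placeholders, and it assumes the label's table lives in a schema named after the graph, which may differ from the `graph_table` name used in the statements above.

```sql
-- Hypothetical schema (graph) and table (label) names.

-- List the indexes that actually exist on the label's table, with their
-- full definitions, so the indexed expressions can be compared with the
-- filter expressions shown in the EXPLAIN output.
SELECT indexname, indexdef
FROM pg_indexes
WHERE schemaname = 'graph_name'
  AND tablename = 'NodeType1';

-- Rebuild every index on the table.
REINDEX TABLE graph_name."NodeType1";

-- Refresh planner statistics for the table.
ANALYZE graph_name."NodeType1";
```

If the expression in `indexdef` is not the same expression that appears in the plan's filter, the planner cannot use that index for the query, regardless of the `enable_seqscan` setting.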
Questions:
Why does the MATCH query ignore available indexes and use sequential scans?
Are there any specific configurations or query optimizations required to enable index scans in Apache AGE for graph queries?
Could this be a limitation or a bug in Apache AGE?
Any help or insights from the community would be greatly appreciated!
If additional information, logs, or examples are needed, please let me know.
Thank you!