index on path, workspace_name is not used in Client::getNode #151
Comments
We're on |
interesting find, thanks for reporting. i lack the know-how on mysql details here. is the forward slash problem a bug or some sort of feature? could we fix that with a schema change maybe? the escaping thing feels hacky to me unless this is not persisted by mysql... what exactly is stored in the db this way? and do the = queries still work with the \ or do you also add \ to the query? the "like" part is an optimization to support fetchDepth - one quick fix could be to avoid the like altogether if fetchDepth is 0. does the depth check happen before the like or after? that might be more efficient too. |
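To make the quick fix above concrete, here is a minimal, language-neutral sketch in Python of skipping the LIKE when fetchDepth is 0. It is not the actual Client::getNode code. Only the table `phpcr_nodes`, the columns `path` and `workspace_name`, and the `:pathd` parameter come from this thread; the function name `build_get_node_query`, the other parameter names and the `depth` column are assumptions for illustration.

```python
def build_get_node_query(path, workspace_name, fetch_depth=0):
    """Return (sql, params) for loading a node and, optionally, its descendants.

    Hypothetical helper: the depth column and all names here are assumptions,
    not the real jackalope-doctrine-dbal implementation.
    """
    sql = "SELECT * FROM phpcr_nodes WHERE workspace_name = :workspace AND "
    params = {"workspace": workspace_name, "path": path}

    if fetch_depth <= 0:
        # Only the node itself is wanted: plain equality, no LIKE and no OR.
        sql += "path = :path"
        return sql, params

    # Descendants are wanted too: match the node or anything below it,
    # limited to fetch_depth levels under the node.
    # Note: % and _ inside the path are not escaped in this sketch.
    base_depth = 0 if path == "/" else path.count("/")
    sql += "(path = :path OR path LIKE :pathd) AND depth <= :max_depth"
    params["pathd"] = path.rstrip("/") + "/%"
    params["max_depth"] = base_depth + fetch_depth
    return sql, params
```

With `fetch_depth=0` the resulting query is a pure equality match on `path` and `workspace_name`, which is the case where the (path, workspace_name) index should be easiest for the optimizer to use.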
@dbu This is the query I was playing with: SELECT * FROM phpcr_nodes EXPLAIN SELECT * FROM phpcr_nodes EXPLAIN SELECT * FROM phpcr_nodes Explain shows it won't use an index unless I add those backslashes to :pathd ('\/routes'). I was thinking, how much effort would it be if paths were stored as URI, something like thanks, |
the drawback of having the URI is that it adds some overhead and we would need to update all existing database entries. it feels inelegant, but if we can't do anything else it could be a solution. i wonder if it's mysql specific or could have to do with accidentally triggering some regular expression logic. also unsure whether it might just be a bug affecting mysql up to some version... i asked around on twitter now, maybe we get some more input. |
if every field starts with / mysql will probably determine that the indexability sucks and use full table scan instead. |
@beberlei indeed every value in the path column will start with /. but why can that not be indexed? and could the same be indexed if we use a phpcr:// prefix? we're not talking about full text search indexing or anything, just an index that would help with the equality and like queries. |
tried to get some input with a tweet. |
@ACSI-IT can you try using check if this is faster than without (where it's not using the index) to see if mysql is indeed wrong for using a table scan? |
@ACSI-IT i assume that a performance difference is only visible if there are a lot of nodes. with fewer than a few hundred nodes, connection time and php bootstrap probably outweigh the cost of mysql not using the index. |
What type of index is used on the path column? Maybe adding a length can help? |
try to remove |
Forcing the usage of an index is really not a good idea as it can slow down the query. I'm not a jackalope developer and I've never used it, thus I don't know the exact schema or the content. The thing is, when the index over How much data do you have in your database? I don't think the performance increases that much when you change the data value of the path column that way. What cardinality and how much data do you have? |
Ah maybe it's the OR. We remove that in #159 when it's not needed. Is that better? |
@dbu, there's no benefit with omitting that I can't see anything that might break the usage of that index in this query, except the cardinality issue. However, that issue should resolve itself as the amount of data (and therefore probably the cardinality) increases. @ACSI-IT, how much data is in this table? With which data did you test that scenario? |
ping @ACSI-IT |
@marcj forcing the index may still make sense if we know that the MySQL query optimiser is wrong. then again of course this may change over time, meaning we may need to make this configurable somehow and we could then automatically set the option when generating the DIC. |
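As a rough illustration of making the index hint configurable, the sketch below builds the FROM clause with or without MySQL's `FORCE INDEX (...)` hint. It is written in Python for brevity and is not jackalope code: the function name `from_clause`, the option `force_index` and the index name `idx_path_workspace` are all hypothetical, since the thread does not name the index.

```python
import re


def from_clause(table, force_index=None):
    """Build a FROM clause, optionally with a MySQL FORCE INDEX hint.

    Hypothetical helper: force_index would come from configuration and
    stays None unless the optimizer is known to pick a table scan.
    """
    names = [table] if force_index is None else [table, force_index]
    for name in names:
        # Identifiers cannot be bound as parameters, so validate them strictly.
        if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", name):
            raise ValueError(f"invalid identifier: {name!r}")

    if force_index is None:
        return f"FROM {table}"
    # FORCE INDEX is MySQL-specific; other platforms would need the plain form.
    return f"FROM {table} FORCE INDEX ({force_index})"
```

Leaving the option unset by default keeps the query portable across database platforms and lets the hint be dropped again if a later MySQL version chooses the index on its own.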
@dantleech another one about performance |
Jackalope\Transport\DoctrineDBAL\Client::getNode is doing a LIKE query on phpcr_nodes.path, but EXPLAIN shows it is not using the index on (path, workspace_name).
I traced this to the forward slashes in :pathd; adding (escaping) a '\' before these forward slashes makes MySQL use its index, significantly improving performance on path queries.
see:
https://github.com/ACSI-IT/jackalope-doctrine-dbal/commit/95f73f01ab0cdc94121a94c4b9425621a5479ade
Not sure if this is the cleanest solution, could you provide some feedback?
Let me know if you want me to create a PR for this commit.
Axel