-
-
Notifications
You must be signed in to change notification settings - Fork 86
Index refactoring replication #2828
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Co-Authored-By: Claude <[email protected]>
Co-Authored-By: Claude <[email protected]>
Co-Authored-By: Claude <[email protected]>
Summary of ChangesHello @lvca, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request undertakes a substantial refactoring of the index subsystem, primarily to enhance the management and replication of index metadata, with a strong focus on LSM Vector Indexes. By introducing a dedicated Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
🧪 CI InsightsHere's what we observed from your CI run for e663a5a. 🟢 All jobs passed!But CI Insights is watching 👀 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a significant and well-executed refactoring of the index management system, particularly for vector indexes. The introduction of the IndexMetadata class hierarchy and the redesigned IndexBuilder pattern greatly improve the code's structure, type safety, and maintainability. The changes to LSMVectorIndex make it more robust, especially in its compaction and page management logic. I've found a couple of minor issues to address, but overall, this is a high-quality contribution.
engine/src/main/java/com/arcadedb/engine/PaginatedComponent.java
Outdated
Show resolved
Hide resolved
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR refactors the index metadata handling architecture by introducing dedicated IndexMetadata and LSMVectorIndexMetadata classes to centralize metadata management. The refactoring replaces the previous LSMVectorIndexBuilder class with TypeLSMVectorIndexBuilder and BucketLSMVectorIndexBuilder, and updates all index implementations to use the new metadata pattern. Additionally, new replication tests are added for LSM vector indexes.
Key Changes:
- Introduced
IndexMetadataandLSMVectorIndexMetadataclasses for centralized metadata management - Refactored
TypeIndexBuilderandBucketIndexBuilderto use the builder pattern with proper type transitions for vector indexes - Replaced
LSMVectorIndexBuilderwithTypeLSMVectorIndexBuilderandBucketLSMVectorIndexBuilder - Updated
IndexInternalinterface with newgetMetadata()andsetMetadata()methods - Added comprehensive replication tests for vector index creation and compaction
Reviewed changes
Copilot reviewed 30 out of 30 changed files in this pull request and generated 13 comments.
Show a summary per file
| File | Description |
|---|---|
engine/src/main/java/com/arcadedb/schema/IndexMetadata.java |
New base class for index metadata containing type name, property names, and bucket ID |
engine/src/main/java/com/arcadedb/schema/LSMVectorIndexMetadata.java |
New class extending IndexMetadata with vector-specific fields (dimensions, similarity, etc.) |
engine/src/main/java/com/arcadedb/schema/TypeLSMVectorIndexBuilder.java |
New builder for type-level LSM vector indexes replacing old LSMVectorIndexBuilder |
engine/src/main/java/com/arcadedb/schema/BucketLSMVectorIndexBuilder.java |
New builder for bucket-level LSM vector indexes |
engine/src/main/java/com/arcadedb/schema/TypeIndexBuilder.java |
Updated to use IndexMetadata and return appropriate builder types |
engine/src/main/java/com/arcadedb/index/vector/LSMVectorIndex.java |
Refactored to use LSMVectorIndexMetadata instead of individual fields |
engine/src/main/java/com/arcadedb/index/IndexInternal.java |
Updated interface with new metadata methods |
server/src/test/java/com/arcadedb/server/ha/IndexCompactionReplicationIT.java |
Added new tests for vector index replication and compaction |
engine/src/main/java/com/arcadedb/engine/PaginatedComponent.java |
Moved removeTempSuffix method to base class |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| /* | ||
| * Copyright © 2021-present Arcade Data Ltd ([email protected]) | ||
| * | ||
| * Licensed under the Apache License, Version 2.0 (the "License"); | ||
| * you may not use this file except in compliance with the License. | ||
| * You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, software | ||
| * distributed under the License is distributed on an "AS IS" BASIS, | ||
| * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| * See the License for the specific language governing permissions and | ||
| * limitations under the License. | ||
| */ |
Copilot
AI
Nov 25, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copyright header should be placed at the beginning of the file, before the package statement. The standard Java convention is: copyright header → package → imports → class definition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copilot open a new pull request to apply changes based on this feedback
| if (metadata.has("similarity")) | ||
| this.similarityFunction = VectorSimilarityFunction.valueOf(metadata.getString("similarity")); |
Copilot
AI
Nov 25, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fromJSON method doesn't handle the case where metadata.getString("similarity") returns an invalid enum value. The valueOf() call will throw IllegalArgumentException if the string doesn't match any VectorSimilarityFunction enum constant. Consider adding proper error handling or validation.
| if (metadata.has("similarity")) | |
| this.similarityFunction = VectorSimilarityFunction.valueOf(metadata.getString("similarity")); | |
| if (metadata.has("similarity")) { | |
| String similarityStr = metadata.getString("similarity"); | |
| try { | |
| this.similarityFunction = VectorSimilarityFunction.valueOf(similarityStr); | |
| } catch (IllegalArgumentException e) { | |
| // Invalid enum value, fallback to default and log warning | |
| System.err.println("Warning: Invalid similarity function '" + similarityStr + "'. Using default: " + VectorSimilarityFunction.COSINE); | |
| this.similarityFunction = VectorSimilarityFunction.COSINE; | |
| } | |
| } |
engine/src/main/java/com/arcadedb/index/vector/LSMVectorIndex.java
Outdated
Show resolved
Hide resolved
| /* | ||
| * Copyright © 2021-present Arcade Data Ltd ([email protected]) | ||
| * | ||
| * Licensed under the Apache License, Version 2.0 (the "License"); | ||
| * you may not use this file except in compliance with the License. | ||
| * You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, software | ||
| * distributed under the License is distributed on an "AS IS" BASIS, | ||
| * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| * See the License for the specific language governing permissions and | ||
| * limitations under the License. |
Copilot
AI
Nov 25, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copyright header should be placed at the beginning of the file, before the package statement. The standard Java convention is: copyright header → package → imports → class definition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copilot open a new pull request to apply changes based on this feedback
engine/src/main/java/com/arcadedb/schema/BucketLSMVectorIndexBuilder.java
Show resolved
Hide resolved
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
…a.java Co-authored-by: Copilot <[email protected]>
Coverage summary from CodacySee diff coverage on Codacy
Coverage variation details
Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: Diff coverage details
Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: See your quality gate settings Change summary preferences |
…lder.java Co-authored-by: Copilot <[email protected]>
…java Co-authored-by: Copilot <[email protected]>
…java Co-authored-by: Copilot <[email protected]>
…uilder.java Co-authored-by: Copilot <[email protected]>
….java" This reverts commit 076345d.
What does this PR do?
A brief description of the change being made with this pull request.
Motivation
What inspired you to submit this pull request?
Related issues
A list of issues either fixed, containing architectural discussions, otherwise relevant
for this Pull Request.
Additional Notes
Anything else we should know when reviewing?
Checklist
mvn clean packagecommand