-
Notifications
You must be signed in to change notification settings - Fork 1
feat: improve numeric ID handling in SQL functions #19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This commit enhances the numeric ID handling in SQL functions to better support both plain numeric IDs and prefixed IDs (like "user:1"). The changes include: - Refactored the numeric value extraction logic to use a more robust cond statement - Added support for decimal numbers in plain numeric IDs - Improved regex patterns to handle prefixed IDs more reliably - Added comprehensive test coverage for the new functionality The changes ensure that numeric operations work correctly with both plain numeric IDs and prefixed IDs, making the SQL functions more flexible and reliable. Signed-off-by: Avelino <[email protected]>
📝 WalkthroughWalkthroughThe changes update numeric value extraction, document retrieval, and query processing in the SQL execution and storage modules. The numeric extraction logic now uses a Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant SQLHandler as SQL Query Handler
participant Storage
Client->>SQLHandler: Submit SQL Query
Note right of SQLHandler: Parse query type (SELECT, UPDATE, DELETE)
alt SELECT Query
SQLHandler->>Storage: Retrieve document by ID
Storage-->>SQLHandler: Return document with _table info
SQLHandler->>Client: Return filtered columns (omit :_table)
else UPDATE Query
SQLHandler->>SQLHandler: Validate table name & WHERE conditions
SQLHandler->>Storage: Update document
Storage-->>SQLHandler: Acknowledge update
SQLHandler->>Client: Send update confirmation
else DELETE Query
SQLHandler->>SQLHandler: Verify valid table and WHERE clause format
SQLHandler->>Storage: Delete document
Storage-->>SQLHandler: Acknowledge deletion
SQLHandler->>Client: Send deletion confirmation
end
Poem
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (6)
src/chrondb/tools/migrator.clj (3)
39-41: Simplify the migration process implementation.The body of the
doseqloop has been effectively disabled with anilvalue, but unused bindings and structure remain. Since the migration logic is now obsolete, consider one of these options:- (doseq [doc table-docs] - ; Body removed as migration logic is obsolete and caused errors - nil) + ; Migration logic removed as it's obsolete and caused errors + nilOr to address the clj-kondo warnings:
- (let [all-docs (storage/get-documents-by-prefix storage "") - ; Filters documents that seem to belong to this table based on fields - table-docs (filter (fn [doc] - (and (not (str/includes? (:id doc) (str table-name ":"))) - (every? #(contains? doc %) table-fields))) - all-docs) - migrated-count (atom 0)] - - (log/log-info (str "Found " (count table-docs) " documents to migrate")) - - (doseq [doc table-docs] - ; Body removed as migration logic is obsolete and caused errors - nil) + (let [migrated-count (atom 0)] + (log/log-info "Migration functionality has been deprecated")🧰 Tools
🪛 GitHub Actions: Lint project with clj-kondo
[warning] 39-39: unused binding doc
4-5: Remove unused namespace import.The
chrondb.index.protocolnamespace is imported but never used, as flagged by clj-kondo.(ns chrondb.tools.migrator (:require [chrondb.storage.protocol :as storage] [chrondb.storage.git :as git] - [chrondb.index.lucene :as lucene] - [chrondb.index.protocol :as index] + [chrondb.index.lucene :as lucene] [chrondb.util.logging :as log] [clojure.string :as str]))🧰 Tools
🪛 GitHub Actions: Lint project with clj-kondo
[warning] 5-5: namespace chrondb.index.protocol is required but never used
26-26: Remove unused binding.The
indexparameter is defined but never used in themigrate-idsfunction, as flagged by clj-kondo.-[storage index table-name table-fields] +[storage _index table-name table-fields]Or remove it entirely if the parameter is no longer needed:
-[storage index table-name table-fields] +[storage table-name table-fields]🧰 Tools
🪛 GitHub Actions: Lint project with clj-kondo
[warning] 26-26: unused binding index
src/chrondb/api/sql/execution/functions_test.clj (2)
6-13: Expand test coverage for additional ID cases.It’s good to have a varied data set. Consider adding coverage for decimal or prefixed IDs in future tests to ensure more robust numeric extraction testing scenarios.
15-22: Test checks numeric extraction accurately but could include edge cases.The test successfully confirms the sum of plain IDs. Consider adding negative or floating IDs (e.g.,
"5.5") to validate decimal parsing.src/chrondb/api/sql/execution/query.clj (1)
343-347: Consider capturing more columns in the fallback.Currently, if more than two values are provided in the fallback, they’re not stored. If multi-column inserts might occur, consider an extended approach.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
src/chrondb/api/sql/execution/functions.clj(1 hunks)src/chrondb/api/sql/execution/functions_test.clj(1 hunks)src/chrondb/api/sql/execution/query.clj(7 hunks)src/chrondb/api/sql/sql_query_test.clj(1 hunks)src/chrondb/storage/git.clj(2 hunks)src/chrondb/storage/memory.clj(1 hunks)src/chrondb/tools/migrator.clj(2 hunks)test/chrondb/api/sql/sql_query_test.clj(1 hunks)
🧰 Additional context used
🪛 GitHub Actions: Lint project with clj-kondo
src/chrondb/tools/migrator.clj
[warning] 5-5: namespace chrondb.index.protocol is required but never used
[warning] 26-26: unused binding index
[warning] 39-39: unused binding doc
src/chrondb/api/sql/sql_query_test.clj
[warning] 14-14: Unresolved namespace storage-protocol. Are you missing a require?
⏰ Context from checks skipped due to timeout of 90000ms (2)
- GitHub Check: test
- GitHub Check: test
🔇 Additional comments (16)
src/chrondb/tools/migrator.clj (1)
9-15: Legacy migration code has been properly deprecated.The commented-out
ensure-table-prefixfunction is correctly identified as no longer needed. This aligns with the PR objective to improve numeric ID handling without requiring table prefixes.test/chrondb/api/sql/sql_query_test.clj (1)
27-36: Improved document structure with explicit table identification.The refactoring properly extracts the table name from the ID and adds it to the document structure as
:_table. This change aligns with the PR objective to improve ID handling and creates a more explicit relationship between documents and their tables.src/chrondb/api/sql/sql_query_test.clj (2)
1-4: Well-structured helper function.The
get-writer-outputfunction provides a clean way to extract output from the writer map, which will be helpful for testing SQL query outputs.
5-14: Improved test data preparation with plain numeric IDs.This implementation correctly aligns with the PR objective by using plain numeric IDs and explicitly setting the
:_tablefield. The approach is cleaner than the previous implementation which used prefixed IDs.🧰 Tools
🪛 GitHub Actions: Lint project with clj-kondo
[warning] 14-14: Unresolved namespace storage-protocol. Are you missing a require?
src/chrondb/storage/git.clj (1)
351-351: Enhanced document retrieval by table.The document retrieval logic has been improved to use the explicit
:_tablefield instead of checking ID prefixes. This change:
- Makes the table relationship more explicit
- Removes the coupling between ID format and table identification
- Supports the PR's goal of handling both plain and prefixed numeric IDs
The implementation is more robust and aligns with modern document database patterns of explicit field-based filtering rather than string prefix matching.
Also applies to: 371-371
src/chrondb/storage/memory.clj (1)
38-40: Simplify retrieval logic by checking:_tableinstead of relying on prefixed IDs.No immediate concerns with this approach. It improves code clarity and consistency with the rest of the system where
:_tableis used as the canonical identifier.src/chrondb/api/sql/execution/functions_test.clj (1)
1-4: Appropriate test setup and namespace usage.The initial namespace declaration and required libraries are well-structured, and
approximately=suits floating-point comparisons. Good practice.src/chrondb/api/sql/execution/functions.clj (1)
45-55:Details
❓ Verification inconclusive
Validate large or invalid numeric strings to avoid
NumberFormatException.The
condlogic is clean. However, usingDouble/parseDoublemay lead to unexpected results for very large integers or special values likeInfinity. If these edge cases are likely, consider a safer numeric parsing or additional validation.
🏁 Script executed:
#!/bin/bash # Check for usage of parseDouble across the code to see if there's extended handling or warnings for large values or Infinity rg 'parseDouble' -A 5Length of output: 1507
Action Required: Validate and Sanitize Numeric Input Edge Cases
The logic using
Double/parseDoublein the conditional block is clear and benefits from the surrounding try-catch (returningnilon exceptions). However, as noted, there remains a potential issue: very large numeric strings might not throw an exception but could yield imprecise values or special results (e.g.,Infinity). If such extreme inputs are possible in practice, please consider explicitly validating the parsed result (for instance, checking if it is finite and within an acceptable range) or adopting a more robust numeric parsing method.
- File:
src/chrondb/api/sql/execution/functions.clj(Lines 45-55)- Suggestion: After parsing, verify that the resulting value is neither
Infinitynor imprecise for the expected numeric bounds.src/chrondb/api/sql/execution/query.clj (8)
270-290: Exclude:_tablefrom returned columns for clarity.Filtering out
_tableis consistent with user-facing data. The logic for mapping columns is straightforward, and the fallback plan in thecatchblock is well-handled.
323-323: Retain consistent naming fordoc-id.Assigning
doc-iddirectly toraw-idpreserves clarity. This matches the updated prefix-free logic.
333-334: Correctly handle the case when the first column isid.Conditionally skipping
(rest clean-columns)and(rest values)ensures the proper pairing of column-value pairs.
341-341: Same doc ID assignment pattern as above.Maintains consistency with the approach in the earlier block.
370-376: Logging improvements and quote-stripping for updates.Logging the table name is good for clarity. Quote-stripping is consistent with insert logic. Potentially confirm that special characters in IDs or fields are acceptable.
382-385: Comprehensive check for valid fields in UPDATE query.Good job ensuring the table name, WHERE conditions, and update values are all present before proceeding.
427-472: Restrictive delete logic ensures user provides ID and correct table.This chunk enforces table name and
'WHERE id = ...'usage. If multi-condition deletes are needed, that could be an enhancement. Otherwise, this is a clear and protective approach.
15-16:Details
✅ Verification successful
Verify that quote-stripping for the ID doesn't cause conflicts.
Stripping
'and"from the ID is logical if IDs are always quoted in the query syntax. If IDs might legitimately contain quotes or special characters, confirm that stripping them won’t break legitimate cases.
🏁 Script executed:
#!/bin/bash # Search for usage of handle-select or get-documents-by-id to see all call sites and confirm consistent quoting. rg 'handle-select|get-documents-by-id' -A 5Length of output: 11189
Final Decision: The quote-stripping is safe in the current context.
After reviewing all call sites and test cases (especially the "test-handle-select-by-id" test) from the codebase, it appears that IDs are consistently passed without embedded quotes. The test cases currently supply unquoted IDs (e.g.,
"user:1"), which confirms that the quote-stripping performed inget-documents-by-iddoes not cause conflicts.If, however, future queries begin allowing IDs that legitimately include quotes or other escapeable characters, please review and adjust the handling accordingly.
| (defn prepare-test-data [storage] | ||
| (let [table-name "test"] ; Define table name for _table field | ||
| (doseq [id ["1" "2" "3"]] ; Use plain IDs | ||
| (let [num (Integer/parseInt id) ; Parse the plain ID | ||
| doc {:id id ; Use plain ID | ||
| :_table table-name ; Add _table field | ||
| :nome (str "Item " num) | ||
| :valor (* num 10) | ||
| :ativo (odd? num)}] | ||
| (storage-protocol/save-document storage doc))))) ; Save the updated doc structure No newline at end of file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed issue: Missing namespace import.
The function uses storage-protocol/save-document but the namespace is not imported, as flagged by the pipeline failure.
Add the necessary import at the top of the file:
(ns chrondb.api.sql.sql-query-test
+ (:require [chrondb.storage.protocol :as storage-protocol]))
(defn get-writer-output [writer-map]
(.flush (:writer writer-map))
(str (.getBuffer (:string-writer writer-map))))📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| (defn prepare-test-data [storage] | |
| (let [table-name "test"] ; Define table name for _table field | |
| (doseq [id ["1" "2" "3"]] ; Use plain IDs | |
| (let [num (Integer/parseInt id) ; Parse the plain ID | |
| doc {:id id ; Use plain ID | |
| :_table table-name ; Add _table field | |
| :nome (str "Item " num) | |
| :valor (* num 10) | |
| :ativo (odd? num)}] | |
| (storage-protocol/save-document storage doc))))) ; Save the updated doc structure | |
| (ns chrondb.api.sql.sql-query-test | |
| (:require [chrondb.storage.protocol :as storage-protocol])) | |
| (defn get-writer-output [writer-map] | |
| (.flush (:writer writer-map)) | |
| (str (.getBuffer (:string-writer writer-map)))) | |
| (defn prepare-test-data [storage] | |
| (let [table-name "test"] ; Define table name for _table field | |
| (doseq [id ["1" "2" "3"]] ; Use plain IDs | |
| (let [num (Integer/parseInt id) ; Parse the plain ID | |
| doc {:id id ; Use plain ID | |
| :_table table-name ; Add _table field | |
| :nome (str "Item " num) | |
| :valor (* num 10) | |
| :ativo (odd? num)}] | |
| (storage-protocol/save-document storage doc))))) ; Save the updated doc structure |
🧰 Tools
🪛 GitHub Actions: Lint project with clj-kondo
[warning] 14-14: Unresolved namespace storage-protocol. Are you missing a require?
This commit moves test utility functions from sql_query_test.clj to a new dedicated namespace test_util.clj to improve code organization and reusability. The changes include: - Creating new test_util.clj file with utility functions - Moving get-writer-output and prepare-test-data functions - Updating imports in affected test files - Removing the old sql_query_test.clj file This refactoring makes the test utilities more accessible across different test files and follows better code organization practices. Signed-off-by: Avelino <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/chrondb/api/sql/test_util.clj (1)
11-22: Consider adding error handling for the test data creation function.The function creates test data with plain numeric IDs, which supports the PR objective of improving numeric ID handling in SQL functions. However, it lacks error handling for potential issues.
Consider adding error handling for:
- Integer parsing failures
- Null storage parameter
- Document saving failures
(defn create-test-data-with-plain-ids "Prepares test data with plain IDs and _table field" [storage] + (when-not storage + (throw (IllegalArgumentException. "Storage cannot be nil"))) (let [table-name "test"] (doseq [id ["1" "2" "3"]] - (let [num (Integer/parseInt id) - doc {:id id - :_table table-name - :nome (str "Item " num) - :valor (* num 10) - :ativo (odd? num)}] - (storage/save-document storage doc))))) + (try + (let [num (Integer/parseInt id) + doc {:id id + :_table table-name + :nome (str "Item " num) + :valor (* num 10) + :ativo (odd? num)}] + (storage/save-document storage doc)) + (catch NumberFormatException e + (println (str "Failed to parse ID: " id " - " (.getMessage e)))) + (catch Exception e + (println (str "Failed to save document for ID: " id " - " (.getMessage e))))))))
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)
📒 Files selected for processing (4)
src/chrondb/api/sql/test_util.clj(1 hunks)src/chrondb/tools/migrator.clj(2 hunks)test/chrondb/api/sql/execution/functions_test.clj(3 hunks)test/chrondb/api/sql/sql_query_test.clj(5 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- src/chrondb/tools/migrator.clj
- test/chrondb/api/sql/sql_query_test.clj
🔇 Additional comments (8)
src/chrondb/api/sql/test_util.clj (3)
1-3: Clear namespace definition with appropriate documentation.The namespace is well-defined with a descriptive docstring that clearly explains its purpose for testing SQL functionality.
5-9: The writer output utility function is well implemented.This helper function correctly retrieves the output from a writer map by flushing the writer and converting the buffer to a string, which is useful for testing SQL output.
14-22: Test data creation aligns well with PR objectives.This function appropriately creates test data with plain numeric IDs, which directly supports the PR objective of "improving the handling of numeric IDs within SQL functions". The created test documents contain a mixture of string IDs, numeric values, and boolean fields which will help test various SQL operations.
test/chrondb/api/sql/execution/functions_test.clj (5)
36-36: Good test function name update.The renamed test function
test-execute-aggregate-function-advancedclearly indicates that it tests more advanced scenarios for aggregate functions.
53-53: Using approx-equals improves floating-point comparison accuracy.Replacing the previous equality check with
approx-equalsis the correct approach for floating-point comparisons and prevents test flakiness due to floating-point precision issues.
56-56: Improved precision handling for min function test.Using
approx-equalsfor the min function test ensures accurate comparison of floating-point results.
59-59: Improved precision handling for max function test.Using
approx-equalsfor the max function test ensures accurate comparison of floating-point results.
61-62: Numeric extraction from ID field test validates PR objective.This test specifically validates that the aggregate function can extract and sum numeric values from the
idfield, which directly tests the PR objective of "improving the handling of numeric IDs within SQL functions". The test uses prefixed IDs (e.g., "user:1") which confirms support for the enhanced extraction logic mentioned in the PR objectives.
| (defn- approx-equals | ||
| "Compare two numbers for approximate equality within a small delta" | ||
| [a b] | ||
| (let [delta 0.0001] | ||
| (< (Math/abs (- a b)) delta))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Improve error handling in the approx-equals function.
The new approx-equals function is a good addition for comparing floating-point numbers, but it lacks handling for edge cases like null inputs or non-numeric values.
(defn- approx-equals
"Compare two numbers for approximate equality within a small delta"
[a b]
+ (when (or (nil? a) (nil? b))
+ (throw (IllegalArgumentException. "Cannot compare nil values")))
(let [delta 0.0001]
(< (Math/abs (- a b)) delta)))📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| (defn- approx-equals | |
| "Compare two numbers for approximate equality within a small delta" | |
| [a b] | |
| (let [delta 0.0001] | |
| (< (Math/abs (- a b)) delta))) | |
| (defn- approx-equals | |
| "Compare two numbers for approximate equality within a small delta" | |
| [a b] | |
| (when (or (nil? a) (nil? b)) | |
| (throw (IllegalArgumentException. "Cannot compare nil values"))) | |
| (let [delta 0.0001] | |
| (< (Math/abs (- a b)) delta))) |
This commit enhances the numeric ID handling in SQL functions to better support both plain numeric IDs and prefixed IDs (like "user:1"). The changes include:
The changes ensure that numeric operations work correctly with both plain numeric IDs and prefixed IDs, making the SQL functions more flexible and reliable.
Summary by CodeRabbit
Refactor
Tests
Chores