
Conversation


@avelino avelino commented Apr 11, 2025

This commit introduces several improvements to the SQL implementation:

  1. Protocol Layer:
  • Split protocol messages into separate reader/writer modules
  • Added new protocol core module for better organization
  • Improved message handling and type safety
  2. Connection Handling:
  • Added dedicated connection handler module
  • Improved server implementation
  • Better error handling and logging
  3. Query Execution:
  • Enhanced query parsing and execution
  • Improved error handling in query operations
  • Added support for more SQL operations
  • Better test coverage for query operations
  4. Testing:
  • Added new test files for connection handling
  • Improved existing test coverage
  • Added test utilities and helpers
  5. Code Organization:
  • Better separation of concerns
  • Improved module structure
  • Cleaner code organization

The changes improve maintainability, testability, and reliability of the SQL implementation while maintaining backward compatibility.

Summary by CodeRabbit

  • New Features
    • Enhanced SQL connection handling and PostgreSQL protocol support for improved performance and stability.
  • Documentation
    • Updated and translated documentation and user-facing messages for greater clarity.
  • Bug Fixes
    • Strengthened error detection and handling during query processing to prevent unexpected issues.
  • Refactor
    • Restructured server lifecycle management and communication flows for more reliable operations.
  • Tests
    • Expanded test coverage to ensure robust behavior across SQL server and query execution scenarios.


Signed-off-by: Avelino <31996+avelino@users.noreply.github.com>

coderabbitai bot commented Apr 11, 2025

📝 Walkthrough

This pull request introduces new modules and refactors existing code to enhance SQL client connection management and PostgreSQL protocol handling. It adds functions for processing query and termination messages, restructures the SQL server to use dedicated functions for socket creation and client acceptance via a thread pool, and adds robust error handling. Additionally, it translates documentation and log messages between Portuguese and English, adds new protocol implementations, and extends the testing suite with new unit tests and helpers.

Changes

| File(s) | Change Summary |
| --- | --- |
| `src/chrondb/api/sql/connection/handler.clj`, `src/chrondb/api/sql/connection/server.clj` | Introduced connection handlers with functions like `handle-query-message` and `handle-terminate-message`, and restructured server management with dedicated functions for creating the server socket, accepting clients via a thread pool, and improved shutdown logic. |
| `src/chrondb/api/sql/core.clj` | Updated namespace description and translated docstrings/comments for functions like `try-start-server`, `start-sql-server`, and `stop-sql-server` to Portuguese. |
| `src/chrondb/api/sql/execution/functions.clj`, `src/chrondb/api/sql/execution/query.clj`, `src/chrondb/api/sql/parser/statements.clj` | Translated error messages and comments, added ReadyForQuery signaling after query operations, and introduced a safeguard in `parse-sql-query` to prevent errors with empty tokens. |
| `src/chrondb/api/sql/protocol/core.clj`, `src/chrondb/api/sql/protocol/handlers.clj` | Added a new PostgreSQL protocol implementation with `PostgresProtocol` and `PostgresProtocolImpl`, and translated log messages/comments from Portuguese to English. |
| `src/chrondb/api/sql/protocol/messages.clj`, `src/chrondb/api/sql/protocol/messages/reader.clj`, `src/chrondb/api/sql/protocol/messages/writer.clj` | Refactored message handling by delegating responsibilities to new writer/reader namespaces and updating function signatures for sending/reading PostgreSQL messages. |
| `src/chrondb/api/sql/test_util.clj` | Renamed keys in the test document map from Portuguese (`:nome`, `:valor`, `:ativo`) to English (`:name`, `:value`, `:active`). |
| `test/chrondb/api/sql/connection/server_test.clj`, `test/chrondb/api/sql/execution/operators_test.clj`, `test/chrondb/api/sql/execution/query_test.clj`, `test/chrondb/api/sql/test_helpers.clj` | Added comprehensive tests for server connection management, query execution, and document operators, and introduced helper functions to facilitate string I/O testing; removed redundant test cases. |

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant Server
    participant ThreadPool
    participant ConnHandler
    participant QueryExecutor

    Client->>Server: Initiate connection & send message
    Server->>Server: Create server socket (create-server-socket)
    Server->>ThreadPool: Launch accept-clients thread
    ThreadPool->>Server: Accept client connection
    Server->>ConnHandler: create-connection-handler invoked
    ConnHandler->>ConnHandler: Loop reading incoming messages
    alt Query Message
        ConnHandler->>QueryExecutor: handle-query-message called
        QueryExecutor-->>ConnHandler: Return query result or error
    else Termination Request
        ConnHandler->>ConnHandler: handle-terminate-message called
    else Unsupported Message
        ConnHandler->>ConnHandler: Log and send error response
    end
    ConnHandler->>Client: Send response (e.g., ReadyForQuery)
    Client->>Server: Disconnect
```

Suggested labels

enhancement

Poem

In a burrow of code beneath the moon,
I hop through functions, each a bright tune.
With queries and protocols, we now take flight,
Error-handling and tests shining so bright.
I nibble on changes with a joyful cheer,
Hopping along, our code's magic is clear!
🐰✨


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

🧹 Nitpick comments (7)
src/chrondb/api/sql/protocol/messages/reader.clj (1)

94-101: Ensure full message read in read-query-message.
.read can return fewer bytes than requested. It's safer to read in a loop until content-length bytes are fully read or EOF is reached. Otherwise, partial reads could lead to malformed query data.
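One way to address this (an illustrative sketch, not code from the PR; the `read-fully` name is an assumption) is a loop that keeps calling `.read` until the requested byte count is satisfied or EOF is signalled:

```clojure
;; Hypothetical helper: reads exactly `len` bytes or throws on premature EOF.
(defn read-fully
  "Reads exactly `len` bytes from `in` into a byte array.
   Loops because InputStream.read may return fewer bytes than requested."
  [^java.io.InputStream in len]
  (let [buf (byte-array len)]
    (loop [offset 0]
      (if (< offset len)
        (let [n (.read in buf offset (- len offset))]
          (if (neg? n)
            (throw (java.io.EOFException.
                    (str "Stream closed after " offset " of " len " bytes")))
            (recur (+ offset n))))
        buf))))
```

`read-query-message` could then call this with `content-length` instead of a single `.read`.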

src/chrondb/api/sql/connection/handler.clj (1)

74-95: Watch out for blocking reads on handle-client-connection.
This loop blocks indefinitely waiting for messages. If there's a chance the client may not send more data, it might stall. Consider implementing a read timeout or a keepalive mechanism to gracefully shut down idle connections if desired.

test/chrondb/api/sql/connection/server_test.clj (1)

74-76: Address potential test flakiness caused by Thread/sleep.
Using arbitrary sleeps can introduce intermittent test failures. Consider using more deterministic synchronization methods (e.g., latches, semaphores, or promises) to ensure the thread has started accepting connections without relying on arbitrary delays.
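A sketch of one deterministic alternative, using a `promise` the accept thread delivers once it is about to start accepting (the function name and thread body are illustrative, not the PR's actual code):

```clojure
;; Hypothetical sketch: signal readiness via a promise instead of Thread/sleep.
(defn start-with-ready-signal
  "Starts an accept thread and blocks until it signals readiness."
  []
  (let [ready (promise)
        accept-thread
        (Thread.
         (fn []
           ;; ... create the server socket here ...
           (deliver ready true)   ; signal: about to enter the accept loop
           ;; ... loop over (.accept server-socket) ...
           ))]
    (.start accept-thread)
    ;; Deterministic wait with a timeout guard instead of an arbitrary sleep
    (when-not (deref ready 5000 false)
      (throw (ex-info "accept thread failed to start in time" {})))
    accept-thread))
```

Tests then proceed only after the delivery, with a bounded timeout rather than a guessed delay.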

src/chrondb/api/sql/protocol/messages/writer.clj (4)

17-21: Consider minimizing multiple writes for performance.
Currently, .write is invoked in a loop for each byte. A single .write call passing a byte array may be slightly more efficient.

```diff
 (defn- write-bytes
   [^OutputStream out bytes]
-  (doseq [b bytes]
-    (.write out (int b))))
+  (.write out (byte-array bytes)))
```

23-54: Question the mandatory 5ms sleep after each message.
Imposing a sleep can degrade throughput under load. Additionally, returning nil in catch blocks might mask errors from the caller. Consider either removing or making the sleep configurable, and ensuring error propagation matches your use case.


159-242: Potential buffer overflow in row description.
Using (* 256 (max 1 (count columns))) for the ByteBuffer is a static heuristic. Large column names or many columns can exhaust this allocation. Consider dynamically computing or expanding the buffer.


243-300: Static overhead of 100 bytes may be insufficient for large data rows.
If row data or columns are large, the ByteBuffer capacity might be overwhelmed. Dynamically resizing or computing an exact size is safer.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cb1a0bc and d14c855.

📒 Files selected for processing (16)
  • src/chrondb/api/sql/connection/handler.clj (1 hunks)
  • src/chrondb/api/sql/connection/server.clj (1 hunks)
  • src/chrondb/api/sql/core.clj (2 hunks)
  • src/chrondb/api/sql/execution/functions.clj (2 hunks)
  • src/chrondb/api/sql/execution/query.clj (8 hunks)
  • src/chrondb/api/sql/parser/statements.clj (1 hunks)
  • src/chrondb/api/sql/protocol/core.clj (1 hunks)
  • src/chrondb/api/sql/protocol/handlers.clj (3 hunks)
  • src/chrondb/api/sql/protocol/messages.clj (1 hunks)
  • src/chrondb/api/sql/protocol/messages/reader.clj (1 hunks)
  • src/chrondb/api/sql/protocol/messages/writer.clj (1 hunks)
  • src/chrondb/api/sql/test_util.clj (1 hunks)
  • test/chrondb/api/sql/connection/server_test.clj (1 hunks)
  • test/chrondb/api/sql/execution/operators_test.clj (1 hunks)
  • test/chrondb/api/sql/execution/query_test.clj (3 hunks)
  • test/chrondb/api/sql/test_helpers.clj (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: test
  • GitHub Check: test
🔇 Additional comments (80)
src/chrondb/api/sql/test_util.clj (1)

19-21: Field names have been internationalized from Portuguese to English

The changes translate the Portuguese field names to English:

  • :nome → :name
  • :valor → :value
  • :ativo → :active

This aligns with the broader internationalization effort mentioned in the PR objectives and improves code consistency.

src/chrondb/api/sql/parser/statements.clj (1)

241-241: Added safety check for empty token list

Good defensive programming improvement! This change prevents potential IndexOutOfBoundsException errors when trying to access the first element of an empty sequence of tokens.
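For illustration only (the body below is a hypothetical sketch, not ChronDB's actual parser), the guard amounts to checking for an empty token sequence before dispatching on `(first tokens)`:

```clojure
;; Hypothetical sketch of the empty-token guard; keys and dispatch are assumptions.
(defn parse-sql-query
  "Returns a parse map, or an error map when the token list is empty."
  [tokens]
  (if (empty? tokens)
    {:type :error, :message "Empty query"}          ; guard: nothing to dispatch on
    (case (.toLowerCase ^String (first tokens))
      "select" {:type :select}
      "insert" {:type :insert}
      {:type :unknown})))
```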

src/chrondb/api/sql/execution/functions.clj (2)

23-25: Error messages translated from Portuguese to English

The error message has been internationalized from Portuguese to English, which improves consistency with the rest of the codebase and enhances readability for English-speaking developers.


88-88: Error message translated from Portuguese to English

The aggregation function error message has been internationalized from Portuguese to English, consistent with the other changes in this file.

test/chrondb/api/sql/execution/operators_test.clj (21)

5-9: Well-structured sample data creation
Defining test-docs is clear and provides varied test records to ensure coverage of different conditions (ages, boolean flags, etc.).


12-16: Accurate equality condition testing
These lines appropriately test the scenario of matching a single field exactly. This increases reliability for simple equality checks.


18-22: Thorough greater-than condition check
The logic ensures that documents with age > 30 are retrieved. This helps confirm that numeric comparisons work as intended.


24-28: Less-than condition validation
Testing documents with age < 28 is a good negative-bound verification to ensure correctness of the operator.


30-34: Greater-than-or-equal validation
Verifying both boundary (30) and above (35) ensures inclusive conditions are handled properly.


36-40: Less-than-or-equal validation
Validating the inclusive lower bound ensures correct retrieval when the field value equals the condition threshold.


42-46: Not-equal operator coverage
This test ensures documents excluding "Alice" are returned, confirming the correct behavior of “!=”.


48-52: Boolean condition (true)
Testing for active users is an excellent coverage scenario for boolean filtering.


54-58: Boolean condition (false)
This ensures only the user with :active false is retrieved, verifying an alternate boolean filter.


60-65: Multiple conditions (AND)
Simultaneously enforcing multiple field conditions provides robust coverage for compound filters.


67-70: Invalid field scenario
Returning zero results when a non-existent field is filtered is correct, preventing unexpected matches.


72-75: Empty condition handling
Ensuring no filtering occurs when conditions are empty helps confirm fallback logic is correct.


78-84: Single-column ascending sort verification
Sorting by name gives an alphabetical sequence. This is a helpful clarity check.


86-92: Single-column descending sort verification
Sorting ages descending ensures numeric ordering is correct in reverse.


94-103: Multiple-column sort coverage
Combining multiple sort directives ensures more complex ordering logic is tested thoroughly.


106-109: Apply limit
Limiting the returned records to a smaller subset is well-tested here, ensuring the function works as expected.


111-114: Limit exceeding collection size
Ensuring the function returns all documents when limit > collection size avoids out-of-range errors.


116-118: Limit zero edge case
Returning zero documents confirms behavior for an extreme boundary case.


120-123: Nil limit handling
Skipping the limit operation when it’s nil results in returning all documents, which is correct.


125-134: Group docs by single field
This verifies that documents can be grouped correctly by a single key like :dept, testing basic grouping functionality.


135-142: Group docs by multiple fields
Testing grouping by multiple fields ensures multi-dimensional grouping is handled accurately.

src/chrondb/api/sql/execution/query.clj (16)

353-353: Clearer explanatory comment
Adding a comment clarifying “Identify documents without values for the secondary table” makes the LEFT JOIN logic more transparent.


370-373: Coherent filter pipeline
Separating the filtering steps for right-table vs. left-table conditions improves readability. This approach is sound for LEFT JOIN.


376-379: Conditional filtering for partial matches
Correctly applies conditions to docs lacking right-table fields, ensuring valid rows remain in a LEFT JOIN scenario.


381-382: Final merged results
Combining matched and unmatched results for LEFT JOIN is neatly handled here. This approach prevents data loss from missing secondary keys.


587-588: ReadyForQuery after SELECT
Explicitly sending ReadyForQuery ensures the client can submit subsequent statements, adhering to PostgreSQL protocol.
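For reference, the ReadyForQuery frame in the PostgreSQL wire protocol is a `'Z'` tag byte, an int32 length of 5 (the length counts itself but not the tag), and a single transaction-status byte (`'I'` for idle). A minimal sketch of building that frame (helper name is illustrative):

```clojure
;; Hypothetical helper building the 6-byte ReadyForQuery frame.
(defn ready-for-query-frame
  "Builds a ReadyForQuery message: 'Z', int32 length 5, status byte."
  ^bytes [status]
  (let [buffer (java.nio.ByteBuffer/allocate 6)]
    (.put buffer (byte (int \Z)))
    (.putInt buffer 5)              ; length field includes itself, not the tag
    (.put buffer (byte (int status)))
    (.array buffer)))
```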


660-662: Clean insertion path
Sending ReadyForQuery after a successful INSERT forwards the transaction confirmation to the client and completes the exchange.


674-675: ReadyForQuery in error case
Ensuring the client sees readiness even after an insert error is crucial for maintaining protocol consistency.


709-710: Update command completion
Sending command completion with the updated count is consistent with SQL client expectations. Good job.


715-716: ReadyForQuery after no updates
Even if zero documents are updated, acknowledging readiness is key to keep the session stable.


723-724: Error response handling
Sending command completion and readiness after an update error properly signals session continuity to the client.


755-756: Successful DELETE confirmation
Returning correct command completion for a single deletion is consistent with typical SQL client interactions.


760-761: DELETE error scenario
Providing an error message and readiness ensures the client can proceed even if deletion fails.


779-780: Missing table or ID
Graceful handling for invalid DELETE statements prevents silent failures.


787-788: Catch-all delete error
Capturing exceptions and sending readiness finalizes the response cycle without leaving the client hanging.


812-813: Unknown command fallback
Gracefully handling unrecognized query types ensures session stability even for invalid SQL statements.


819-820: Error handling in handle-query
Sending error response and readiness for unknown exceptions promotes robust fault tolerance.

src/chrondb/api/sql/protocol/core.clj (5)

1-3: Namespace and dependencies
Defining this new namespace and referencing reader/writer modules clarifies protocol message concerns at a glance.


7-13: Protocol definition clarity
Outlining the main operations—startup, read-message, write-query-result—gives a concise contract for PostgreSQL handling.


14-31: Startup handling
Reading the startup message and returning a sequence of writer functions effectively organizes protocol initialization steps.


35-59: write-query-result implementation
Conditional logic for columns, data rows, and command completion ensures well-structured responses. The approach is correct for the PostgreSQL flow.


60-63: Factory method simplicity
create-protocol neatly instantiates the record, making the protocol straightforward to compose in other parts of the system.

src/chrondb/api/sql/protocol/messages/reader.clj (1)

9-12: Handle end-of-stream for read-byte.
Since .read returns -1 at EOF, callers may need to handle this condition. Consider documenting or explicitly handling the EOF to avoid potential issues if the input stream is closed prematurely.

src/chrondb/api/sql/connection/handler.clj (1)

9-35: Clarify partial read behavior and ensure safe query execution.
The code assumes reader/read-query-message will reliably return a valid query or an error; however, partial reads or malformed messages could still pass through and cause unexpected exceptions down the line. Consider validating the data length or ensuring robust handling of incomplete input.

src/chrondb/api/sql/protocol/messages/writer.clj (4)

9-15: Looks good for basic integer-to-byte preparation.
This function correctly extracts bytes from an integer using bit-shifting, returning a four-element vector. No functional issues found.


57-65: LGTM for authentication OK message.
This implementation correctly sends the appropriate authentication code (0 for OK).


114-130: Parameter status handling looks good.
The size is calculated from actual byte lengths, so indefinite expansion risk is lower here. Implementation is straightforward and correct.


149-158: Ready-for-query message sending is straightforward.
No issues found; sending the correct single-byte state is aligned with the spec.

src/chrondb/api/sql/connection/server.clj (3)

8-13: Server socket creation is straightforward.
This method correctly binds to the specified port and address. Ensure that addr is correct for production environments if more than “localhost” is needed.

Do you plan to allow remote connections in the future? If so, you might confirm the correct InetAddress usage.


33-48: Initialization logic is clear; watch for uncaught exceptions in accept loop.
We start the server socket, log the actual port, and spawn a dedicated accept thread. The approach is good, but confirm that any uncaught exceptions in accept-thread won't silently fail.


50-82: Graceful shutdown covers most use cases.
Stopping the server either closes an old style ServerSocket or a new map structure with thread-pool. This is a complete approach. Just ensure tasks that are running in the thread pool handle partial shutdown if needed.

src/chrondb/api/sql/core.clj (4)

2-8: Docstring updates (Portuguese translation) are consistent.
Renaming references in the docstring to Portuguese is clear and consistent with the rest of the codebase. No issues noted.


16-22: Documentation for start-sql-server is clear.
Parameters are declared in Portuguese and remain in sync with usage. The return value is also specified.


30-38: Fallback to port 0 is well-handled.
If the requested port is bound, you automatically try a random port. This promotes usability. Logging covers the fallback scenario.


41-47: stop-sql-server docstring is clear.
Renaming from server-socket to server matches the new structure. Documentation is aligned with the code in server.clj.

src/chrondb/api/sql/protocol/messages.clj (6)

2-3: Good documentation clarifying adapter pattern usage.

The comment clearly explains that these are adapter functions to maintain backward compatibility while refactoring, which is a good practice for managing technical debt.


4-6: Clear and concise imports.

Appropriate imports for the new reader and writer namespaces that will handle the actual implementation details.


9-10: Nice simplification using the adapter pattern.

The write-message function correctly forwards calls to the writer namespace, making the code more maintainable.


12-33: Clean implementation of forwarding functions.

These adapter functions provide a clean transition path while moving implementation details to specialized modules, improving separation of concerns.


36-80: Robust implementation of send-data-row with improved error handling.

This adapter function handles multiple scenarios well:

  1. Supports two arities for different use cases
  2. Converts nil values to empty strings
  3. Handles different data structures (maps, sequences, single values)
  4. Includes detailed logging

83-99: Consistently implemented reading functions.

The reading functions follow the same adapter pattern, forwarding calls to the reader namespace appropriately.

test/chrondb/api/sql/execution/query_test.clj (11)

187-192: Clean simplification of test assertion.

The test has been simplified by removing the explicit expected-doc variable and directly comparing with the built expression, making the test more concise and maintainable.


224-233: Good test for branch-specific update operations.

This test thoroughly verifies that updates work correctly when specifying a branch, which is important for the versioning functionality of ChronDB.


629-638: Validate behavior with non-existent tables.

Good edge case test ensuring that selecting from a non-existent table returns an empty result rather than throwing an exception.


640-651: Verify handling of unsupported SQL statements.

This test properly verifies that unsupported statements (like CREATE TABLE) generate an error response rather than failing silently or crashing.


652-662: Test exception handling for malformed SQL.

Good robustness test ensuring that the query handler gracefully handles exceptions from malformed SQL without propagating errors.


663-672: Handle edge case of empty queries.

Important test ensuring that empty queries don't cause exceptions, which improves resilience against potentially malformed client inputs.


674-687: Validate column projection functionality.

Comprehensive test for column projection ensuring that:

  1. Only requested columns are returned
  2. Non-requested columns are excluded
  3. The correct number of results is returned

688-702: Test aggregate function handling.

This test addresses SQL aggregate functions like COUNT. The comment about the current implementation behavior is informative and helps with future refactoring.


703-708: Full text search condition identification.

Clear test for identifying full-text search conditions with good coverage of different cases.


709-718: Test for combined insert and delete operations.

This test effectively validates the workflow of inserting and then deleting a document, ensuring both operations work correctly together.


719-783: Comprehensive tests for all SQL operations via handle-query.

These tests properly exercise the SQL query handling through the public API, ensuring that SELECT, INSERT, UPDATE, DELETE, and unsupported operations all behave as expected.

The proxy OutputStream implementation is a good approach for capturing and verifying output.

src/chrondb/api/sql/protocol/handlers.clj (4)

31-47: Improved readability with English translations.

Successfully translated Portuguese comments and log messages to English, making the code more accessible to a wider range of developers while maintaining functionality.


65-67: Clearer termination handling.

Improved message clarity for client termination request and added a helpful comment about not sending a response.


68-82: Better unsupported command handling.

Translated error messages and improved comments in the unsupported command handling section.


87-92: Clearer error handling.

Translated error messages to English, making the error handling more accessible to English-speaking developers.

Comment on lines +22 to +34
```clojure
;; Helper functions for string I/O testing
(defn create-string-reader [s]
  (java.io.BufferedReader. (java.io.StringReader. s)))

(defn create-string-writer []
  (let [sw (java.io.StringWriter.)
        bw (java.io.BufferedWriter. sw)]
    {:writer bw
     :string-writer sw}))

(defn get-writer-output [writer-map]
  (.flush (:writer writer-map))
  (str (.getBuffer (:string-writer writer-map))))
```

💡 Verification agent

🧩 Analysis chain

Added string I/O helper functions for testing

These new helper functions enhance the testing capabilities by providing utilities for string I/O testing:

  1. create-string-reader - Creates a BufferedReader from a string
  2. create-string-writer - Creates a writer map with StringWriter and BufferedWriter
  3. get-writer-output - Retrieves the output from a writer map

These will be useful for testing components that read from or write to streams, particularly for the SQL protocol handlers mentioned in the PR objectives.


🏁 Script executed:

```shell
#!/bin/bash
# Verify if there's potential duplication with get-writer-output-util in test_util.clj

echo "Checking for similar function in test_util.clj:"
cat src/chrondb/api/sql/test_util.clj | grep -A 6 "get-writer-output-util"
```

Length of output: 394


Heads up: Code Duplication Detected

The new string I/O helper functions improve our testing capabilities, but please note that the get-writer-output function in test/chrondb/api/sql/test_helpers.clj is identical to get-writer-output-util in src/chrondb/api/sql/test_util.clj. This duplication can lead to maintenance challenges.

  • create-string-reader: Creates a BufferedReader from a string.
  • create-string-writer: Constructs a writer map containing a BufferedWriter and a StringWriter.
  • get-writer-output: Retrieves the writer map output, duplicating get-writer-output-util.

Action: Please consolidate these functions into a single, shared utility to avoid redundancy. Update all references accordingly.

Comment on lines +28 to +36
```clojure
(defn read-null-terminated-string
  "Read a null-terminated string from the input stream"
  [^InputStream in]
  (loop [bytes []]
    (let [b (read-byte in)]
      (if (zero? b)
        (String. (byte-array bytes) StandardCharsets/UTF_8)
        (recur (conj bytes b))))))
```


🛠️ Refactor suggestion

Require EOF handling in read-null-terminated-string.
If .read encounters -1 before reaching a null terminator, the current loop logic may continue indefinitely or accumulate invalid bytes. Consider explicitly checking for -1 to safely handle unexpected stream closures.
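A hedged sketch of such an EOF check (the `-safe` suffix is illustrative; the semantics follow `InputStream.read`, which returns -1 at end of stream):

```clojure
;; Hypothetical EOF-aware variant of the reader above.
(defn read-null-terminated-string-safe
  "Reads bytes until a null terminator; throws EOFException if the
   stream ends first instead of looping on invalid data."
  [^java.io.InputStream in]
  (loop [bytes []]
    (let [b (.read in)]
      (cond
        (neg? b)  (throw (java.io.EOFException.
                          "Stream ended before null terminator"))
        (zero? b) (String. (byte-array bytes)
                           java.nio.charset.StandardCharsets/UTF_8)
        :else     (recur (conj bytes b))))))
```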

Comment on lines +91 to +113
```clojure
(defn send-notice-response
  "Sends a notice response message"
  [^OutputStream out message]
  (let [buffer (ByteBuffer/allocate 1024)
        ;; Severity field
        _ (.put buffer (byte (int \S)))
        _ (.put buffer (.getBytes "NOTICE" StandardCharsets/UTF_8))
        _ (.put buffer (byte 0))

        ;; Message field
        _ (.put buffer (byte (int \M)))
        _ (.put buffer (.getBytes message StandardCharsets/UTF_8))
        _ (.put buffer (byte 0))

        ;; Final null terminator
        _ (.put buffer (byte 0))

        _ (.flip buffer)
        bytes (byte-array (.limit buffer))]

    (.get buffer bytes)
    (write-message out constants/PG_NOTICE_RESPONSE bytes)))
```


🛠️ Refactor suggestion

Repeated fixed buffer for notice response.
Same concern as with send-error-response: ensure 1024 bytes is adequate for all notice messages.

Comment on lines +67 to +89
(defn send-error-response
  "Sends an error response message"
  [^OutputStream out message]
  (let [buffer (ByteBuffer/allocate 1024)
        ;; Severity field: 'S' + "ERROR" + null byte
        _ (.put buffer (byte (int \S)))
        _ (.put buffer (.getBytes "ERROR" StandardCharsets/UTF_8))
        _ (.put buffer (byte 0))

        ;; Message field: 'M' + message + null byte
        _ (.put buffer (byte (int \M)))
        _ (.put buffer (.getBytes message StandardCharsets/UTF_8))
        _ (.put buffer (byte 0))

        ;; Final null terminator
        _ (.put buffer (byte 0))

        ;; Prepare final data
        _ (.flip buffer)
        bytes (byte-array (.limit buffer))]

    (.get buffer bytes)
    (write-message out constants/PG_ERROR_RESPONSE bytes)))

🛠️ Refactor suggestion

Watch out for fixed buffer size in error response.
A 1024-byte buffer may be insufficient if the error message is large. Consider allocating the buffer dynamically or verifying that messages cannot exceed this size.
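A sketch of the dynamic-allocation alternative: compute the exact message size from the field lengths before allocating. `write-message` and `constants/PG_ERROR_RESPONSE` are assumed from the PR; everything else is standard Java interop.

```clojure
(defn send-error-response
  "Sketch: sizes the buffer from the actual field lengths instead of a
   fixed 1024 bytes, so long error messages cannot overflow."
  [^OutputStream out message]
  (let [severity  (.getBytes "ERROR" StandardCharsets/UTF_8)
        msg-bytes (.getBytes ^String message StandardCharsets/UTF_8)
        ;; 'S' + severity + NUL, 'M' + message + NUL, final NUL
        size   (+ 1 (count severity) 1
                  1 (count msg-bytes) 1
                  1)
        buffer (ByteBuffer/allocate size)]
    (.put buffer (byte (int \S)))
    (.put buffer severity)
    (.put buffer (byte 0))
    (.put buffer (byte (int \M)))
    (.put buffer msg-bytes)
    (.put buffer (byte 0))
    (.put buffer (byte 0))
    (.flip buffer)
    (let [bytes (byte-array (.limit buffer))]
      (.get buffer bytes)
      (write-message out constants/PG_ERROR_RESPONSE bytes))))
```

The same sizing applies to `send-notice-response`, which shares the field layout.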

Comment on lines +131 to +147
(defn send-command-complete
  "Sends a command complete message"
  [^OutputStream out command rows]
  (let [command-str (cond
                      (= command "INSERT") "INSERT 0 1"
                      (and (= command "SELECT") (zero? rows)) "SELECT 0"
                      :else (str command " " rows))
        command-bytes (.getBytes command-str StandardCharsets/UTF_8)
        buffer (ByteBuffer/allocate (inc (count command-bytes)))]

    (.put buffer command-bytes)
    (.put buffer (byte 0))

    (.flip buffer)
    (let [bytes (byte-array (.limit buffer))]
      (.get buffer bytes)
      (write-message out constants/PG_COMMAND_COMPLETE bytes))))

🛠️ Refactor suggestion

Inaccurate handling of INSERT commands.
Currently, the code always sends “INSERT 0 1” regardless of the row count. This might misrepresent actual insert results.

 (let [command-str (cond
                     (= command "INSERT") 
-                      "INSERT 0 1"
+                      (str "INSERT 0 " rows)
                     (and (= command "SELECT") (zero? rows)) "SELECT 0"
                     :else (str command " " rows))]
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
(defn send-command-complete
  "Sends a command complete message"
  [^OutputStream out command rows]
  (let [command-str (cond
                      (= command "INSERT") "INSERT 0 1"
                      (and (= command "SELECT") (zero? rows)) "SELECT 0"
                      :else (str command " " rows))
        command-bytes (.getBytes command-str StandardCharsets/UTF_8)
        buffer (ByteBuffer/allocate (inc (count command-bytes)))]
    (.put buffer command-bytes)
    (.put buffer (byte 0))
    (.flip buffer)
    (let [bytes (byte-array (.limit buffer))]
      (.get buffer bytes)
      (write-message out constants/PG_COMMAND_COMPLETE bytes))))
(defn send-command-complete
  "Sends a command complete message"
  [^OutputStream out command rows]
  (let [command-str (cond
                      (= command "INSERT") (str "INSERT 0 " rows)
                      (and (= command "SELECT") (zero? rows)) "SELECT 0"
                      :else (str command " " rows))
        command-bytes (.getBytes command-str StandardCharsets/UTF_8)
        buffer (ByteBuffer/allocate (inc (count command-bytes)))]
    (.put buffer command-bytes)
    (.put buffer (byte 0))
    (.flip buffer)
    (let [bytes (byte-array (.limit buffer))]
      (.get buffer bytes)
      (write-message out constants/PG_COMMAND_COMPLETE bytes))))

Comment on lines +15 to +31
(defn- accept-clients
  "Accepts client connections and processes them using the thread pool"
  [^ServerSocket server-socket thread-pool client-handler]
  (loop []
    (when-not (.isClosed server-socket)
      (try
        (let [socket (.accept server-socket)]
          ;; Submit the client handling task to the thread pool
          (.submit thread-pool ^Runnable #(client-handler socket)))
        (catch java.net.SocketException e
          (if (.isClosed server-socket)
            (log/log-info "Server socket closed.")
            (log/log-error (str "Socket error: " (.getMessage e)))))
        (catch Exception e
          (log/log-error (str "Error accepting client: " (.getMessage e)))))
      ;; Continue accepting connections (outside the try)
      (recur))))

🛠️ Refactor suggestion

Concurrent client acceptance strategy with potential unbounded thread usage.
The code spawns a new thread per connection using newCachedThreadPool, which can grow without limit. This may risk resource exhaustion under heavy load. Consider using a bounded thread pool or a more advanced concurrency approach to avoid potential DDoS scenarios.
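A bounded pool is a drop-in replacement for `newCachedThreadPool` here. The sketch below uses only `java.util.concurrent`; the pool size and queue depth are illustrative, not tuned values.

```clojure
(import '[java.util.concurrent ThreadPoolExecutor TimeUnit
          ArrayBlockingQueue ThreadPoolExecutor$CallerRunsPolicy])

(defn make-bounded-pool
  "Sketch: a pool with a hard cap on threads and a bounded work queue.
   When saturated, CallerRunsPolicy runs the task in the accepting
   thread, which naturally slows down `accept` under load instead of
   spawning threads without limit."
  [core-threads max-threads queue-size]
  (ThreadPoolExecutor.
   core-threads max-threads
   60 TimeUnit/SECONDS               ; idle threads above core die after 60s
   (ArrayBlockingQueue. queue-size)  ; bounded backlog of pending clients
   (ThreadPoolExecutor$CallerRunsPolicy.)))

;; e.g. (make-bounded-pool 8 64 128) passed where the cached pool is built
```

The rejection policy is the key design choice: `CallerRunsPolicy` gives back-pressure, while `AbortPolicy` would drop connections with an exception instead.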

@avelino avelino merged commit 988beb4 into main Apr 11, 2025
9 checks passed
@avelino avelino deleted the avelino/refactor-sql-implementation branch April 11, 2025 01:55