Refactor StreamReader to modularize decoding logic. #22

Merged
merged 9 commits into from
Oct 31, 2024

Conversation

Member

@junhaoliao junhaoliao commented Oct 15, 2024

Description

  1. Extract encoding type verification logic from StreamReader.cpp to a separate file decoding_methods.cpp.
  2. Extract helper create_deserializer_and_data_context as a private method in StreamReader.

Validation performed

  1. Built the JS assets with task and ran the sample code below:
    import ModuleInit from "./cmake-build-debug/ClpFfiJs.js"
    import fs from "node:fs"
    
    const main = async () => {
        const file = fs.readFileSync("./test.clp.zst")
    
        console.time("perf")
        const Module = await ModuleInit()
        try {
            const decoder = new Module.ClpIrStreamReader(new Uint8Array(file))
            const numEvents = decoder.deserializeStream()
            const results = decoder.decodeRange(0, numEvents, false)
            console.log(results)
    
            decoder.filterLogEvents([5])
    
            const filteredLogEventMap = decoder.getFilteredLogEventMap()
            console.log(filteredLogEventMap)
        } catch (e) {
            console.trace("Exception caught:", e)
        }
        console.timeEnd("perf")
    }
    
    void main()
    
  2. Observed no error.

Summary by CodeRabbit

  • New Features

    • Introduced a new method for creating a deserializer and context within the StreamReader class.
    • Added functionality to rewind the reader and verify encoding type in a new file.
  • Improvements

    • Streamlined error handling and improved code organization in the StreamReader class.
    • Enhanced type handling for better clarity and maintainability.
  • Documentation

    • New header file created for decoding methods, outlining the functionality for encoding verification.

coderabbitai bot commented Oct 15, 2024

Walkthrough

The pull request introduces several modifications to the CMakeLists.txt and source files related to the StreamReader class and its functionalities. Key changes include project configuration updates, the addition of new methods for handling encoding types, and the restructuring of source files for better organization. The new decoding_methods.cpp and decoding_methods.hpp files encapsulate encoding verification logic, while existing files are updated to streamline type usage and error handling.

Changes

File: Change Summary

CMakeLists.txt
  • Project name set to clp-ffi-js, version 0.2.0, C++ standard set to C++20.
  • Boost library fetched (version 1.85.0) and various build configurations updated.
  • Main source files updated to include decoding_methods.cpp and StreamReader.cpp.
src/clp_ffi_js/ir/StreamReader.cpp
  • Refactored StreamReader class by moving encoding checks to a new method create_data_context.
  • Constructor updated to utilize the new method, improving readability and error handling.
src/clp_ffi_js/ir/StreamReader.hpp
  • New includes added, constructor signature updated for simplified type usage, and new static method declared.
src/clp_ffi_js/ir/decoding_methods.cpp
  • Introduced rewind_reader_and_validate_encoding_type function to handle encoding verification logic.
src/clp_ffi_js/ir/decoding_methods.hpp
  • New header file created declaring the rewind_reader_and_validate_encoding_type function.

Suggested reviewers

  • kirkrodrigues

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (5)
src/clp_ffi_js/ir/decoding_methods.hpp (1)

6-8: Function declaration is well-structured, but lacks documentation.

The namespace and function declaration are appropriate and use modern C++ features. However, consider adding a brief documentation comment explaining the function's purpose and any side effects.

Consider adding a documentation comment like this:

/**
 * Rewinds the given reader and verifies its encoding type.
 * @param reader The reader to rewind and verify.
 * @throws May throw an exception if the encoding type is invalid or if rewinding fails.
 */
auto rewind_reader_and_verify_encoding_type(clp::ReaderInterface& reader) -> void;
src/clp_ffi_js/ir/decoding_methods.cpp (3)

12-13: LGTM: Function signature and initial positioning are correct.

The function signature is appropriate, taking a reference to clp::ReaderInterface and returning void. Rewinding the reader to the beginning ensures consistent behaviour.

Consider adding a brief comment explaining why the reader is being rewound, for improved code clarity:

 auto rewind_reader_and_verify_encoding_type(clp::ReaderInterface& reader) -> void {
+    // Ensure we're reading from the start of the stream
     reader.seek_from_begin(0);

15-26: LGTM: Encoding type verification is implemented correctly.

The function properly verifies the encoding type and handles errors by throwing appropriate exceptions. The use of SPDLOG_CRITICAL for logging critical errors is a good practice.

Consider using a constant for the success code to improve readability:

+    constexpr auto SUCCESS = clp::ffi::ir_stream::IRErrorCode::IRErrorCode_Success;
     if (auto const err{clp::ffi::ir_stream::get_encoding_type(reader, is_four_bytes_encoding)};
-        clp::ffi::ir_stream::IRErrorCode::IRErrorCode_Success != err)
+        SUCCESS != err)
     {
         SPDLOG_CRITICAL("Failed to decode encoding type, err={}", err);
         throw ClpFfiJsException{

27-36: LGTM: Unsupported encoding handling is implemented correctly.

The function appropriately throws an exception with a clear and informative message when an unsupported encoding is detected.

For consistency with modern C++ practices, consider using logical negation (!) instead of an explicit comparison against false:

-    if (false == is_four_bytes_encoding) {
+    if (!is_four_bytes_encoding) {
         throw ClpFfiJsException{
                 clp::ErrorCode::ErrorCode_Unsupported,
                 __FILENAME__,
                 __LINE__,
                 "IR stream uses unsupported encoding."
         };
     }
 }
 }  // namespace clp_ffi_js::ir
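Taken together, the fragments above describe a three-step flow: rewind, decode the encoding type, reject anything other than four-byte encoding. The following is only a sketch of that flow under stated assumptions, not the code from this PR: MemoryReader is a hypothetical stand-in for clp::ReaderInterface, std::runtime_error stands in for ClpFfiJsException, and the magic-number values are made-up placeholders rather than the real CLP IR values.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical in-memory reader standing in for clp::ReaderInterface.
class MemoryReader {
public:
    explicit MemoryReader(std::vector<std::uint8_t> bytes) : m_bytes{std::move(bytes)} {}

    auto seek_from_begin(std::size_t pos) -> void { m_pos = pos; }

    // Copies `n` bytes into `dst`; returns false if fewer than `n` bytes remain.
    auto try_read(std::uint8_t* dst, std::size_t n) -> bool {
        if (m_pos > m_bytes.size() || m_bytes.size() - m_pos < n) {
            return false;
        }
        std::copy_n(m_bytes.begin() + static_cast<std::ptrdiff_t>(m_pos), n, dst);
        m_pos += n;
        return true;
    }

    [[nodiscard]] auto get_pos() const -> std::size_t { return m_pos; }

private:
    std::vector<std::uint8_t> m_bytes;
    std::size_t m_pos{0};
};

// Placeholder magic numbers for illustration only (NOT the real CLP IR values).
constexpr std::array<std::uint8_t, 2> cFourByteMagic{0xAA, 0x01};
constexpr std::array<std::uint8_t, 2> cEightByteMagic{0xAA, 0x02};

auto rewind_reader_and_validate_encoding_type(MemoryReader& reader) -> void {
    // Step 1: always start from the beginning of the stream.
    reader.seek_from_begin(0);

    // Step 2: decode the encoding type from the leading magic number.
    std::array<std::uint8_t, 2> magic{};
    if (false == reader.try_read(magic.data(), magic.size())) {
        throw std::runtime_error{"Failed to decode encoding type."};
    }
    bool is_four_bytes_encoding{false};
    if (cFourByteMagic == magic) {
        is_four_bytes_encoding = true;
    } else if (cEightByteMagic != magic) {
        throw std::runtime_error{"Failed to decode encoding type."};
    }

    // Step 3: only four-byte encoding is supported.
    if (false == is_four_bytes_encoding) {
        throw std::runtime_error{"IR stream uses unsupported encoding."};
    }
}
```

On success the reader is left positioned just past the magic number, so a caller that needs the stream from the start would have to rewind again.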
src/clp_ffi_js/ir/StreamReader.hpp (1)

110-113: Add documentation for the new static method

The method create_deserializer_and_data_context lacks a documentation comment. Adding a brief description of its purpose, parameters, and return value would enhance code clarity and assist other developers in understanding its functionality.

Consider adding documentation like this:

/**
 * Creates the deserializer and data context for the StreamReader.
 *
 * @param zstd_decompressor A unique pointer to a Zstandard decompressor.
 * @param data_buffer The data buffer containing compressed data.
 * @return A StreamReaderDataContext instance initialized with the given decompressor and data buffer.
 * @throw ClpFfiJsException if an error occurs during initialization.
 */
[[nodiscard]] static auto create_deserializer_and_data_context(
        std::unique_ptr<clp::streaming_compression::zstd::Decompressor>&& zstd_decompressor,
        clp::Array<char>&& data_buffer
    ) -> StreamReaderDataContext<four_byte_encoded_variable_t>;
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between e4fda3e and ab32045.

📒 Files selected for processing (5)
  • CMakeLists.txt (1 hunks)
  • src/clp_ffi_js/ir/StreamReader.cpp (3 hunks)
  • src/clp_ffi_js/ir/StreamReader.hpp (4 hunks)
  • src/clp_ffi_js/ir/decoding_methods.cpp (1 hunks)
  • src/clp_ffi_js/ir/decoding_methods.hpp (1 hunks)
🧰 Additional context used
🔇 Additional comments (9)
src/clp_ffi_js/ir/decoding_methods.hpp (3)

1-2: Include guards are correctly implemented.

The include guards follow best practices and use a unique identifier based on the file path. This prevents multiple inclusions and potential compilation issues.

Also applies to: 10-10


4-4: Include statement is appropriate.

The inclusion of <clp/ReaderInterface.hpp> is necessary for the function declaration and correctly uses angle brackets for a library header.


1-10: Well-structured header file that aligns with modularization objectives.

This new header file successfully modularizes the decoding logic by introducing a separate function for rewinding and verifying encoding types. The file structure is clean, follows modern C++ practices, and improves overall code organization. This change aligns well with the PR's objective of enhancing maintainability.

src/clp_ffi_js/ir/decoding_methods.cpp (1)

1-11: LGTM: Includes and namespace declaration are appropriate.

The necessary headers are included, and the function is correctly placed in the clp_ffi_js::ir namespace, which is consistent with the file's location and purpose.

CMakeLists.txt (1)

115-118: LGTM! Verify the new file exists.

The changes align well with the PR objectives of modularizing the decoding logic. The inclusion of decoding_methods.cpp in the build is consistent with the refactoring described in the PR summary.

Let's verify the existence of the new file:

✅ Verification successful

Verified: New File Exists

The decoding_methods.cpp file has been confirmed to exist in the src/clp_ffi_js/ir/ directory, aligning with the PR objectives.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Verify the existence of the new decoding_methods.cpp file
fd -t f "decoding_methods.cpp" src/clp_ffi_js/ir/

Length of output: 88

src/clp_ffi_js/ir/StreamReader.hpp (1)

116-117: Ensure member variables are properly initialized

The member variables m_encoded_log_events and m_stream_reader_data_context should be initialized to ensure they are in a defined state before use. Although they may be initialized in the constructor, explicitly initializing them can prevent potential issues.

Please confirm that these member variables are appropriately initialized in all constructors.
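As a rough illustration of what "a defined state" means here: class-type members such as std::vector and std::unique_ptr are default-constructed to an empty vector and a null pointer, while scalar members are the ones that need an explicit initializer. The sketch below uses hypothetical stand-in types and member names; the actual member types in StreamReader are not shown in this conversation.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for the data context type.
struct DataContext {
    int placeholder{0};
};

// Hypothetical reader showing which members need explicit initializers.
class Reader {
public:
    Reader() = default;

    [[nodiscard]] auto num_events() const -> std::size_t { return m_encoded_log_events.size(); }

    [[nodiscard]] auto has_context() const -> bool { return nullptr != m_data_context; }

    [[nodiscard]] auto num_filtered() const -> std::size_t { return m_num_filtered; }

private:
    // Class types: default construction already yields a defined state.
    std::vector<int> m_encoded_log_events;         // empty vector
    std::unique_ptr<DataContext> m_data_context;   // nullptr

    // Scalar type: would be indeterminate without the default member initializer.
    std::size_t m_num_filtered{0};
};
```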

src/clp_ffi_js/ir/StreamReader.cpp (3)

29-29: Include 'decoding_methods.hpp' is appropriate

Including <clp_ffi_js/ir/decoding_methods.hpp> is necessary for accessing the decoding methods after refactoring. This improves code organization by separating decoding logic into its own module.


51-54: Ensure 'data_buffer' and 'zstd_decompressor' are not used after moving

After moving zstd_decompressor and data_buffer into create_deserializer_and_data_context, they should not be accessed in the remaining scope of StreamReader::create. Please verify that they are not used afterward to prevent undefined behaviour.
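A minimal sketch of the moved-from state this comment is concerned with, using hypothetical stand-ins (Decompressor, DataContext, and std::vector<char> in place of clp::Array<char>). Two language facts it relies on: a moved-from std::unique_ptr is guaranteed to be null, and a moved-from standard container is valid but in an unspecified state, so its contents should not be read. Strictly, reading a moved-from object is only undefined behaviour when it violates a precondition, such as dereferencing the null pointer.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the Zstandard decompressor.
struct Decompressor {
    int id{0};
};

// Hypothetical stand-in for StreamReaderDataContext.
struct DataContext {
    std::unique_ptr<Decompressor> decompressor;
    std::vector<char> buffer;
};

// Takes ownership of both arguments by moving them into the returned context.
auto create_data_context(std::unique_ptr<Decompressor>&& decompressor, std::vector<char>&& buffer)
        -> DataContext {
    return DataContext{std::move(decompressor), std::move(buffer)};
}

struct MoveResult {
    bool source_pointer_is_null;
    std::size_t context_buffer_size;
};

auto demonstrate_move() -> MoveResult {
    auto decompressor = std::make_unique<Decompressor>();
    std::vector<char> buffer(16, 'x');

    auto context = create_data_context(std::move(decompressor), std::move(buffer));

    // `decompressor` is now guaranteed to be null; dereferencing it would be undefined behaviour.
    // `buffer` is valid but unspecified, so this sketch deliberately does not read it.
    return MoveResult{nullptr == decompressor, context.buffer.size()};
}
```

Note that binding an argument to a && parameter does not move anything by itself; ownership transfers only when the callee actually move-constructs from the parameter, as create_data_context does here.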


212-238: Refactored method 'create_deserializer_and_data_context' enhances modularity

The extraction of create_deserializer_and_data_context improves the modularity and readability of the StreamReader class by encapsulating the deserialization logic. This aligns with best practices for code organization.

haiqi96 previously approved these changes Oct 23, 2024
@haiqi96 haiqi96 left a comment

The change is quite simple and straightforward. Approving this change.

Title looks good to me.

@kirkrodrigues kirkrodrigues changed the title Refactor StreamReader to modularize decoding logic. Refactor StreamReader to modularize decoding logic. Oct 30, 2024
kirkrodrigues previously approved these changes Oct 30, 2024
Member

@kirkrodrigues kirkrodrigues left a comment

Minor nit but otherwise lgtm.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (3)
src/clp_ffi_js/ir/decoding_methods.hpp (1)

7-13: Consider enhancing the function documentation.

The documentation could be more comprehensive by:

  1. Adding a description for the @param tag
  2. Including a @return tag to explicitly document the void return type

Here's the suggested improvement:

 /**
  * Rewinds the reader to the beginning and validates the CLP IR data encoding type.
- * @param reader
+ * @param reader The reader interface to validate and rewind
+ * @return void
  * @throws ClpFfiJsException if the encoding type couldn't be decoded or the encoding type is
  * unsupported.
  */
src/clp_ffi_js/ir/StreamReader.cpp (2)

213-214: Consider adding const for data_buffer parameter.

Since data_buffer is moved into the return value at the end of the method, it could be taken as const clp::Array<char>&& to better express intent and prevent accidental modifications.

 auto StreamReader::create_data_context(
         std::unique_ptr<clp::streaming_compression::zstd::Decompressor>&& zstd_decompressor,
-        clp::Array<char> data_buffer
+        const clp::Array<char>&& data_buffer
 ) -> StreamReaderDataContext<four_byte_encoded_variable_t> {
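One caveat worth checking before applying this suggestion: std::move applied to a const object yields a const rvalue, which cannot bind to a move constructor's non-const && parameter, so overload resolution falls back to the copy constructor. The sketch below shows this with a hypothetical Buffer type that counts copies and moves; whether clp::Array<char> is copyable is not stated in this conversation, and for a move-only type the const version would not compile at all.

```cpp
#include <utility>

// Hypothetical buffer that records how it was constructed.
struct Buffer {
    Buffer() = default;

    Buffer(Buffer const& other) : copies{other.copies + 1}, moves{other.moves} {}

    Buffer(Buffer&& other) noexcept : copies{other.copies}, moves{other.moves + 1} {}

    int copies{0};
    int moves{0};
};

// Non-const rvalue reference: std::move selects the move constructor.
auto take_by_rvalue_ref(Buffer&& buffer) -> Buffer {
    return Buffer{std::move(buffer)};
}

// Const rvalue reference: std::move yields `Buffer const&&`, which only the
// copy constructor can accept, so the buffer is copied instead of moved.
auto take_by_const_rvalue_ref(Buffer const&& buffer) -> Buffer {
    return Buffer{std::move(buffer)};
}
```

Under these assumptions, take_by_rvalue_ref(Buffer{}) returns an object with no recorded copies, while take_by_const_rvalue_ref(Buffer{}) returns one with exactly one recorded copy.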

228-234: Consider using more specific error codes.

The current implementation uses a generic ErrorCode_Failure. Consider using more specific error codes like ErrorCode_DeserializerCreationFailed to provide better error context.

 throw ClpFfiJsException{
-        clp::ErrorCode::ErrorCode_Failure,
+        clp::ErrorCode::ErrorCode_DeserializerCreationFailed,
         __FILENAME__,
         __LINE__,
         "Failed to create deserializer"
 };
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 15d818c and d86dc68.

📒 Files selected for processing (4)
  • src/clp_ffi_js/ir/StreamReader.cpp (3 hunks)
  • src/clp_ffi_js/ir/StreamReader.hpp (4 hunks)
  • src/clp_ffi_js/ir/decoding_methods.cpp (1 hunks)
  • src/clp_ffi_js/ir/decoding_methods.hpp (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/clp_ffi_js/ir/StreamReader.hpp
  • src/clp_ffi_js/ir/decoding_methods.cpp
🔇 Additional comments (4)
src/clp_ffi_js/ir/decoding_methods.hpp (2)

1-5: LGTM! Include guards and dependencies are properly structured.

The include guards follow the correct naming convention, and only the necessary header is included.


14-16: LGTM! Proper closure of namespace and include guard.

The namespace closing comment and include guard termination are well-formatted.

src/clp_ffi_js/ir/StreamReader.cpp (2)

29-29: LGTM: Header inclusion aligns with modularization.

The addition of decoding_methods.hpp properly supports the modularization of decoding logic.


51-53: LGTM: Improved code organization.

The refactoring enhances readability by moving the complex data context creation logic to a dedicated method.

3 participants