Refactor `StreamReader` to modularize decoding logic. #22

junhaoliao · 2024-10-15T01:02:20Z

Description

Extract encoding type verification logic from StreamReader.cpp to a separate file decoding_methods.cpp.
Extract helper create_deserializer_and_data_context as a private method in StreamReader.
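
As a rough illustration of the extracted verification step, the sketch below rewinds a reader and rejects any stream whose leading bytes do not identify a four-byte encoding. `InMemoryReader`, its methods, the placeholder magic bytes, and the use of `std::runtime_error` are all assumptions made for this example; the actual code works on `clp::ReaderInterface`, calls `clp::ffi::ir_stream::get_encoding_type`, and throws `ClpFfiJsException`.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for a seekable reader; NOT the real clp::ReaderInterface.
class InMemoryReader {
public:
    explicit InMemoryReader(std::vector<std::uint8_t> data) : m_data(std::move(data)) {}

    auto seek_from_begin(std::size_t pos) -> void { m_pos = std::min(pos, m_data.size()); }

    [[nodiscard]] auto get_pos() const -> std::size_t { return m_pos; }

    // Reads exactly `num_bytes` into `dst`; returns false if too few bytes remain.
    [[nodiscard]] auto try_read_exact(std::uint8_t* dst, std::size_t num_bytes) -> bool {
        if (m_data.size() - m_pos < num_bytes) {
            return false;
        }
        std::copy_n(m_data.begin() + static_cast<std::ptrdiff_t>(m_pos), num_bytes, dst);
        m_pos += num_bytes;
        return true;
    }

private:
    std::vector<std::uint8_t> m_data;
    std::size_t m_pos{0};
};

// Placeholder magic bytes chosen for illustration; they are not claimed to be CLP's values.
constexpr std::array<std::uint8_t, 4> cFourByteEncodingMagic{0xAA, 0xBB, 0xCC, 0x01};

// Rewinds `reader`, then throws unless the stream starts with the four-byte magic.
auto rewind_reader_and_validate_encoding_type(InMemoryReader& reader) -> void {
    reader.seek_from_begin(0);
    std::array<std::uint8_t, 4> magic{};
    if (false == reader.try_read_exact(magic.data(), magic.size())) {
        throw std::runtime_error{"Failed to decode encoding type."};
    }
    if (cFourByteEncodingMagic != magic) {
        throw std::runtime_error{"IR stream uses unsupported encoding."};
    }
}

// Usage: the reader's starting position does not matter because it is rewound first.
auto demo_encoding_check() -> void {
    InMemoryReader reader{std::vector<std::uint8_t>{0xAA, 0xBB, 0xCC, 0x01, 0x7F}};
    reader.seek_from_begin(5);
    rewind_reader_and_validate_encoding_type(reader);
    assert(4 == reader.get_pos());
}
```

Keeping the check in a free function means the constructor logic in `StreamReader` only has to call it once before building the deserializer.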

Validation performed

Built the JS assets with `task` and ran the sample code below:

import ModuleInit from "./cmake-build-debug/ClpFfiJs.js"
import fs from "node:fs"

const main = async () => {
    const file = fs.readFileSync("./test.clp.zst")

    console.time("perf")
    const Module = await ModuleInit()
    try {
        const decoder = new Module.ClpIrStreamReader(new Uint8Array(file))
        const numEvents = decoder.deserializeStream()
        const results = decoder.decodeRange(0, numEvents, false)
        console.log(results)

        decoder.filterLogEvents([5])

        const filteredLogEventMap = decoder.getFilteredLogEventMap()
        console.log(filteredLogEventMap)
    } catch (e) {
        console.trace("Exception caught:", e)
    }
    console.timeEnd("perf")
}

void main()

Observed no error.

Summary by CodeRabbit

New Features
- Introduced a new method for creating a deserializer and context within the StreamReader class.
- Added functionality to rewind the reader and verify encoding type in a new file.
Improvements
- Streamlined error handling and improved code organization in the StreamReader class.
- Enhanced type handling for better clarity and maintainability.
Documentation
- New header file created for decoding methods, outlining the functionality for encoding verification.

coderabbitai · 2024-10-15T01:02:27Z

Walkthrough

The pull request introduces several modifications to the CMakeLists.txt and source files related to the StreamReader class and its functionalities. Key changes include project configuration updates, the addition of new methods for handling encoding types, and the restructuring of source files for better organization. The new decoding_methods.cpp and decoding_methods.hpp files encapsulate encoding verification logic, while existing files are updated to streamline type usage and error handling.

Changes

| File | Change Summary |
| --- | --- |
| `CMakeLists.txt` | Project name set to `clp-ffi-js`, version `0.2.0`, C++ standard set to C++20. Boost library fetched (version 1.85.0) and various build configurations updated. Main source files updated to include `decoding_methods.cpp` and `StreamReader.cpp`. |
| `src/clp_ffi_js/ir/StreamReader.cpp` | Refactored `StreamReader` class by moving encoding checks to a new method `create_data_context`. Constructor updated to utilize the new method, improving readability and error handling. |
| `src/clp_ffi_js/ir/StreamReader.hpp` | New includes added, constructor signature updated for simplified type usage, and new static method declared. |
| `src/clp_ffi_js/ir/decoding_methods.cpp` | Introduced `rewind_reader_and_validate_encoding_type` function to handle encoding verification logic. |
| `src/clp_ffi_js/ir/decoding_methods.hpp` | New header file created declaring the `rewind_reader_and_validate_encoding_type` function. |

Possibly related PRs

Add log-level filtering methods and associated refactoring: #12: The changes in this PR enhance the StreamReader class, which is also modified in the main PR, particularly in how it handles log events and deserialization. The main PR updates the source files for the executable, including StreamReader.cpp, indicating a direct relationship.
Replace C++ linting configs with those managed by yscope-dev-utils; Forcibly enable CMAKE_EXPORT_COMPILE_COMMANDS. #15: This PR modifies the CMakeLists.txt file to forcibly enable CMAKE_EXPORT_COMPILE_COMMANDS, which is relevant to the changes made in the main PR regarding project configuration and build settings in the same file.

Suggested reviewers

kirkrodrigues

coderabbitai

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (5)

src/clp_ffi_js/ir/decoding_methods.hpp (1)
6-8: Function declaration is well-structured, but lacks documentation.

The namespace and function declaration are appropriate and use modern C++ features. However, consider adding a brief documentation comment explaining the function's purpose and any side effects.

Consider adding a documentation comment like this:
/**
 * Rewinds the given reader and verifies its encoding type.
 * @param reader The reader to rewind and verify.
 * @throws May throw an exception if the encoding type is invalid or if rewinding fails.
 */
auto rewind_reader_and_verify_encoding_type(clp::ReaderInterface& reader) -> void;
src/clp_ffi_js/ir/decoding_methods.cpp (3)
12-13: LGTM: Function signature and initial positioning are correct.

The function signature is appropriate, taking a reference to clp::ReaderInterface and returning void. Rewinding the reader to the beginning ensures consistent behaviour.

Consider adding a brief comment explaining why the reader is being rewound, for improved code clarity:
 auto rewind_reader_and_verify_encoding_type(clp::ReaderInterface& reader) -> void {
+    // Ensure we're reading from the start of the stream
     reader.seek_from_begin(0);
15-26: LGTM: Encoding type verification is implemented correctly.

The function properly verifies the encoding type and handles errors by throwing appropriate exceptions. The use of SPDLOG_CRITICAL for logging critical errors is a good practice.

Consider using a constant for the success code to improve readability:
+    constexpr auto SUCCESS = clp::ffi::ir_stream::IRErrorCode::IRErrorCode_Success;
     if (auto const err{clp::ffi::ir_stream::get_encoding_type(reader, is_four_bytes_encoding)};
-        clp::ffi::ir_stream::IRErrorCode::IRErrorCode_Success != err)
+        SUCCESS != err)
     {
         SPDLOG_CRITICAL("Failed to decode encoding type, err={}", err);
         throw ClpFfiJsException{
27-36: LGTM: Unsupported encoding handling is implemented correctly.

The function appropriately throws an exception with a clear and informative message when an unsupported encoding is detected.

For consistency with modern C++ practices, consider using logical negation (`!`) instead of an explicit comparison against `false`:
-    if (false == is_four_bytes_encoding) {
+    if (!is_four_bytes_encoding) {
         throw ClpFfiJsException{
                 clp::ErrorCode::ErrorCode_Unsupported,
                 __FILENAME__,
                 __LINE__,
                 "IR stream uses unsupported encoding."
         };
     }
 }
 }  // namespace clp_ffi_js::ir
src/clp_ffi_js/ir/StreamReader.hpp (1)
110-113: Add documentation for the new static method

The method create_deserializer_and_data_context lacks a documentation comment. Adding a brief description of its purpose, parameters, and return value would enhance code clarity and assist other developers in understanding its functionality.

Consider adding documentation like this:
/**
 * Creates the deserializer and data context for the StreamReader.
 *
 * @param zstd_decompressor A unique pointer to a Zstandard decompressor.
 * @param data_buffer The data buffer containing compressed data.
 * @return A StreamReaderDataContext instance initialized with the given decompressor and data buffer.
 * @throw ClpFfiJsException if an error occurs during initialization.
 */
[[nodiscard]] static auto create_deserializer_and_data_context(
        std::unique_ptr<clp::streaming_compression::zstd::Decompressor>&& zstd_decompressor,
        clp::Array<char>&& data_buffer
    ) -> StreamReaderDataContext<four_byte_encoded_variable_t>;

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between e4fda3e and ab32045.

📒 Files selected for processing (5)

CMakeLists.txt (1 hunks)
src/clp_ffi_js/ir/StreamReader.cpp (3 hunks)
src/clp_ffi_js/ir/StreamReader.hpp (4 hunks)
src/clp_ffi_js/ir/decoding_methods.cpp (1 hunks)
src/clp_ffi_js/ir/decoding_methods.hpp (1 hunks)

🧰 Additional context used

🔇 Additional comments (9)

src/clp_ffi_js/ir/decoding_methods.hpp (3)

1-2: Include guards are correctly implemented.

The include guards follow best practices and use a unique identifier based on the file path. This prevents multiple inclusions and potential compilation issues.

Also applies to: 10-10

4-4: Include statement is appropriate.

The inclusion of <clp/ReaderInterface.hpp> is necessary for the function declaration and correctly uses angle brackets for a library header.

1-10: Well-structured header file that aligns with modularization objectives.

This new header file successfully modularizes the decoding logic by introducing a separate function for rewinding and verifying encoding types. The file structure is clean, follows modern C++ practices, and improves overall code organization. This change aligns well with the PR's objective of enhancing maintainability.

src/clp_ffi_js/ir/decoding_methods.cpp (1)

1-11: LGTM: Includes and namespace declaration are appropriate.

The necessary headers are included, and the function is correctly placed in the clp_ffi_js::ir namespace, which is consistent with the file's location and purpose.
CMakeLists.txt (1)
115-118: LGTM! Verify the new file exists.

The changes align well with the PR objectives of modularizing the decoding logic. The inclusion of decoding_methods.cpp in the build is consistent with the refactoring described in the PR summary.

Let's verify the existence of the new file:

✅ Verification successful

Verified: New File Exists

The decoding_methods.cpp file has been confirmed to exist in the src/clp_ffi_js/ir/ directory, aligning with the PR objectives.
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Verify the existence of the new decoding_methods.cpp file
fd -t f "decoding_methods.cpp" src/clp_ffi_js/ir/
Length of output: 88
src/clp_ffi_js/ir/StreamReader.hpp (1)

116-117: Ensure member variables are properly initialized

The member variables m_encoded_log_events and m_stream_reader_data_context should be initialized to ensure they are in a defined state before use. Although they may be initialized in the constructor, explicitly initializing them can prevent potential issues.

Please confirm that these member variables are appropriately initialized in all constructors.

src/clp_ffi_js/ir/StreamReader.cpp (3)

29-29: Include 'decoding_methods.hpp' is appropriate

Including <clp_ffi_js/ir/decoding_methods.hpp> is necessary for accessing the decoding methods after refactoring. This improves code organization by separating decoding logic into its own module.

51-54: Ensure 'data_buffer' and 'zstd_decompressor' are not used after moving

After moving zstd_decompressor and data_buffer into create_deserializer_and_data_context, they should not be accessed in the remaining scope of StreamReader::create. Please verify that they are not used afterward to prevent undefined behaviour.
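
To make the concern concrete, here is a minimal sketch of what the caller's variables look like after they have been moved into such a helper. `Decompressor`, `DataContext`, and this `create_data_context` are hypothetical stand-ins written for the example, and `std::vector<char>` is assumed in place of `clp::Array<char>`.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins; NOT the real decompressor or StreamReaderDataContext.
struct Decompressor {};

struct DataContext {
    std::unique_ptr<Decompressor> decompressor;
    std::vector<char> buffer;
};

auto create_data_context(
        std::unique_ptr<Decompressor>&& decompressor,
        std::vector<char>&& buffer
) -> DataContext {
    // Ownership is transferred here, when the members are move-constructed.
    return DataContext{std::move(decompressor), std::move(buffer)};
}

auto demo_moved_from() -> void {
    auto decompressor{std::make_unique<Decompressor>()};
    std::vector<char> buffer(16, 'x');
    auto const context{create_data_context(std::move(decompressor), std::move(buffer))};

    // A moved-from std::unique_ptr is guaranteed to be null, so dereferencing it
    // afterwards would be undefined behaviour.
    assert(nullptr == decompressor);
    assert(nullptr != context.decompressor);
    assert(16 == context.buffer.size());
    // `buffer` is left in a valid but unspecified state; its contents must not be relied on.
}
```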

212-238: Refactored method 'create_deserializer_and_data_context' enhances modularity

The extraction of create_deserializer_and_data_context improves the modularity and readability of the StreamReader class by encapsulating the deserialization logic. This aligns with best practices for code organization.

haiqi96

The change is quite simple and straightforward. Approving this change.

Title looks good to me.

kirkrodrigues

Minor nit, but otherwise lgtm.

coderabbitai

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (3)

src/clp_ffi_js/ir/decoding_methods.hpp (1)
7-13: Consider enhancing the function documentation.

The documentation could be more comprehensive by:

Adding a description for the @param tag

Including a @return tag to explicitly document the void return type

Here's the suggested improvement:
 /**
  * Rewinds the reader to the beginning and validates the CLP IR data encoding type.
- * @param reader
+ * @param reader The reader interface to validate and rewind
+ * @return void
  * @throws ClpFfiJsException if the encoding type couldn't be decoded or the encoding type is
  * unsupported.
  */
src/clp_ffi_js/ir/StreamReader.cpp (2)
213-214: Consider adding const for data_buffer parameter.

Since data_buffer is moved into the return value at the end of the method, it could be taken as const clp::Array<char>&& to better express intent and prevent accidental modifications.
 auto StreamReader::create_data_context(
         std::unique_ptr<clp::streaming_compression::zstd::Decompressor>&& zstd_decompressor,
-        clp::Array<char> data_buffer
+        const clp::Array<char>&& data_buffer
 ) -> StreamReaderDataContext<four_byte_encoded_variable_t> {
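
One caveat worth checking before applying this suggestion: in standard C++, a `const T&&` parameter cannot be moved from, because `std::move` on it yields a `const T&&` that binds only to the copy constructor. The sketch below demonstrates this with `TrackedBuffer`, a hypothetical type invented for the example that counts copies and moves; it makes no claim about how `clp::Array<char>` itself behaves.

```cpp
#include <cassert>
#include <utility>

// Hypothetical buffer type that records how it was constructed; NOT clp::Array<char>.
struct TrackedBuffer {
    TrackedBuffer() = default;

    TrackedBuffer(TrackedBuffer const& other) : copies{other.copies + 1}, moves{other.moves} {}

    TrackedBuffer(TrackedBuffer&& other) noexcept
            : copies{other.copies},
              moves{other.moves + 1} {}

    int copies{0};
    int moves{0};
};

// A non-const rvalue reference: std::move(buf) selects the move constructor.
auto take_by_rvalue_ref(TrackedBuffer&& buf) -> TrackedBuffer {
    return TrackedBuffer(std::move(buf));
}

// A const rvalue reference: std::move(buf) is `TrackedBuffer const&&`, which cannot
// bind to the move constructor, so the copy constructor is selected instead.
auto take_by_const_rvalue_ref(TrackedBuffer const&& buf) -> TrackedBuffer {
    return TrackedBuffer(std::move(buf));
}

auto demo_const_rvalue_ref() -> void {
    auto const moved{take_by_rvalue_ref(TrackedBuffer{})};
    assert(0 == moved.copies && moved.moves >= 1);

    auto const copied{take_by_const_rvalue_ref(TrackedBuffer{})};
    assert(1 == copied.copies);
}
```

For a move-only type, the `const&&` version would fail to compile rather than silently copy, so the qualifier tends to work against the intent of handing the buffer over.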
228-234: Consider using more specific error codes.

The current implementation uses a generic ErrorCode_Failure. Consider using more specific error codes like ErrorCode_DeserializerCreationFailed to provide better error context.
 throw ClpFfiJsException{
-        clp::ErrorCode::ErrorCode_Failure,
+        clp::ErrorCode::ErrorCode_DeserializerCreationFailed,
         __FILENAME__,
         __LINE__,
         "Failed to create deserializer"
 };

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 15d818c and d86dc68.

📒 Files selected for processing (4)

src/clp_ffi_js/ir/StreamReader.cpp (3 hunks)
src/clp_ffi_js/ir/StreamReader.hpp (4 hunks)
src/clp_ffi_js/ir/decoding_methods.cpp (1 hunks)
src/clp_ffi_js/ir/decoding_methods.hpp (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

src/clp_ffi_js/ir/StreamReader.hpp
src/clp_ffi_js/ir/decoding_methods.cpp

🔇 Additional comments (4)

src/clp_ffi_js/ir/decoding_methods.hpp (2)

1-5: LGTM! Include guards and dependencies are properly structured.

The include guards follow the correct naming convention, and only the necessary header is included.

14-16: LGTM! Proper closure of namespace and include guard.

The namespace closing comment and include guard termination are well-formatted.

src/clp_ffi_js/ir/StreamReader.cpp (2)

29-29: LGTM: Header inclusion aligns with modularization.

The addition of decoding_methods.hpp properly supports the modularization of decoding logic.

51-53: LGTM: Improved code organization.

The refactoring enhances readability by moving the complex data context creation logic to a dedicated method.

junhaoliao added 2 commits October 14, 2024 20:52

Refactor StreamReader to modularize decoding logic.

3dae806

Add a comment section for methods in StreamReader.hpp.

ab32045

junhaoliao mentioned this pull request Oct 15, 2024

Refactor StreamReader to modularize decoding logic. #21

Closed

junhaoliao requested a review from davemarco October 15, 2024 01:04

coderabbitai bot reviewed Oct 15, 2024

Add docs for rewind_reader_and_verify_encoding_type.

15d818c

haiqi96 reviewed Oct 22, 2024

haiqi96 previously approved these changes Oct 23, 2024

kirkrodrigues requested changes Oct 23, 2024

Move using type statement into namespace clp_ffi_js::ir.

c3e44c9

junhaoliao dismissed haiqi96’s stale review via c3e44c9 October 24, 2024 01:45

junhaoliao and others added 4 commits October 23, 2024 22:14

Rename create_deserializer_and_data_context -> create_data_context.

c6fc555

Do not enforce data_buffer to be moved in calling `create_data_context`.

d358eb5

Rename rewind_reader_and_verify_encoding_type -> `rewind_reader_and_validate_encoding_type` and update docs.

60d41f8

Docs - Apply suggestions from code review

537c630

Co-authored-by: kirkrodrigues <[email protected]>

junhaoliao requested review from haiqi96 and kirkrodrigues October 24, 2024 02:21

kirkrodrigues changed the title ~~Refactor StreamReader to modularize decoding logic.~~ Refactor StreamReader to modularize decoding logic. Oct 30, 2024

kirkrodrigues previously approved these changes Oct 30, 2024

Reformat code - Apply suggestions from code review

d86dc68

Co-authored-by: kirkrodrigues <[email protected]>

junhaoliao dismissed kirkrodrigues’s stale review via d86dc68 October 31, 2024 08:13

coderabbitai bot reviewed Oct 31, 2024

junhaoliao requested a review from kirkrodrigues October 31, 2024 08:50

kirkrodrigues approved these changes Oct 31, 2024

junhaoliao merged commit 9e82372 into y-scope:main Oct 31, 2024
3 checks passed

junhaoliao deleted the decoding-methods branch October 31, 2024 09:56

coderabbitai bot mentioned this pull request Nov 10, 2024

Update DecodedResultsTsType to use bigint in Emscripten binding (fixes #33). #34

Merged

coderabbitai bot mentioned this pull request Dec 6, 2024

feat: Add support for log-level filtering of structured IR streams. #35

Merged

coderabbitai bot mentioned this pull request Jan 29, 2025

feat: Add support for finding the log event that's closest to a target timestamp. #42

Open
