feat: Support search logs by timestamp for structured and unstructured logs. #42

Henry8192 · 2024-12-19T19:07:21Z

Description

Adds the getLogEventIndexByTimestamp API in clp_ffi_js/ir/StreamReader.cpp for ClpStreamReader. It returns null only if there are no log events. Otherwise, it searches on a best-effort basis: if no log event matches the timestamp exactly, it returns the index of the last log event whose timestamp is smaller than the target. The one exception is when all log event timestamps are larger than the target, in which case it returns the index of the first log event.

In clp_ffi_js/ir/StreamReader.hpp, implements generic_get_log_event_index_by_timestamp(), which holds the core logic (essentially a binary search) for finding log events by timestamp.
Both StructuredIrStreamReader's and UnstructuredIrStreamReader's get_log_event_index_by_timestamp call this function.
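
To make the best-effort rule concrete, here is a minimal sketch in Python rather than the PR's C++. It assumes the timestamps are already sorted in ascending order; the function name and the plain list of integer timestamps are illustrative assumptions, not the clp-ffi-js API.

```python
from bisect import bisect_right
from typing import Optional, Sequence


def get_log_event_idx_by_timestamp(
    timestamps: Sequence[int], target_ts: int
) -> Optional[int]:
    """Best-effort lookup over timestamps sorted in ascending order."""
    if not timestamps:
        # No log events at all: the only case that yields None (null in JS).
        return None

    # Index of the first timestamp strictly greater than the target,
    # analogous to what std::upper_bound finds in C++.
    first_greater_idx = bisect_right(timestamps, target_ts)

    if first_greater_idx == 0:
        # Every timestamp is larger than the target: fall back to the first event.
        return 0

    # Otherwise the previous element is the last one <= target.
    return first_greater_idx - 1
```

For timestamps [100, 200, 200, 300], a target of 250 yields index 2 and a target of 50 yields index 0.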

Validation performed

Use the following JavaScript for testing structured and unstructured logs:

import ModuleInit from "./cmake-build-debug/ClpFfiJs-node.js"
import fs from "node:fs"

const main = async () => {
    const file = fs.readFileSync("./test-1K.clp.zst")

    console.time("perf")
    const Module = await ModuleInit()
    try {
        const decoder = new Module.ClpStreamReader(new Uint8Array(file), {logLevelKey: "$log_level", timestampKey: "$timestamp"})
        console.log("type:", decoder.getIrStreamType() === Module.IrStreamType.STRUCTURED ? "structured" : "unstructured")
        const numEvents = decoder.deserializeStream()
        console.log(numEvents)
        const results = decoder.decodeRange(0, numEvents, false)
        console.log(results)
        const logEventIdx = decoder.getLogEventIndexByTimestamp(1736183181284)
        console.log(logEventIdx)
    } catch (e) {
        console.error("Exception caught:", e.stack)
    }
    console.timeEnd("perf")
}

void main()

Script for generating IRv2 logs:

"""
To install all dependencies:
    pip install clp-ffi-py==0.1.0b1 msgpack nabg zstandard
TODO: set random seed based on `i` so that all logs generated are deterministic.
"""
import random
import time

import msgpack
import nabg
import zstandard as zstd
from clp_ffi_py.ir.native import Serializer

NUM_EVENTS_TO_GENERATE = 1_000
FILE_NAME = "test-1K.clp.zst"

LOG_LEVELS = ["TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL", "SOME_UNKNOWN_LEVEL"]

EMOJI_RANGES = [
    (0x1F600, 0x1F64F),  # Emoticons
    (0x1F300, 0x1F5FF),  # Miscellaneous Symbols and Pictographs
    (0x1F680, 0x1F6FF),  # Transport and Map Symbols
    (0x1F900, 0x1F9FF),  # Supplemental Symbols and Pictographs
    (0x2600, 0x26FF),  # Miscellaneous Symbols
    (0x1F1E6, 0x1F1FF),  # Flags
]
EMOJI_POOL = [chr(i) for r in EMOJI_RANGES for i in range(r[0], r[1] + 1)]

NUM_MILLIS_IN_SECOND = 1000


def generate_random_emoji():
    return random.choice(EMOJI_POOL)


def generate_random_message():
    return (
            generate_random_emoji() +
            nabg.ionize() +
            generate_random_emoji()
    )


cctx = zstd.ZstdCompressor(threads=2)

with open(FILE_NAME, "wb") as f:
    with cctx.stream_writer(f) as w:
        with Serializer(w) as s:
            for i in range(NUM_EVENTS_TO_GENERATE):
                random.seed(i)
                log_event = msgpack.dumps({
                    "$log_level": LOG_LEVELS[i % len(LOG_LEVELS)],
                    "$timestamp": int(time.time() * NUM_MILLIS_IN_SECOND),
                    "message": generate_random_message()
                })
                s.serialize_log_event_from_msgpack_map(log_event)

Summary by CodeRabbit

New Features
- Introduced a capability to locate the nearest log event index based on a specified timestamp across key streaming components.
- Enhanced log event indexing with a new flexible data type that supports both numerical values and the absence of a value, ensuring more robust and efficient data handling.

# Conflicts: # src/clp_ffi_js/ir/StreamReader.cpp # src/clp_ffi_js/ir/StreamReader.hpp

coderabbitai · 2024-12-19T19:07:28Z

Walkthrough

This pull request introduces new methods across several log stream reader classes to retrieve the nearest log event index based on a specified timestamp. The method findNearestLogEventByTimestamp is added to the bindings for StreamReader, StructuredIrStreamReader, and UnstructuredIrStreamReader. A new type alias, NullableLogEventIdx (defined as number | null in JavaScript), is also registered. Additionally, a templated helper function for efficient searching is introduced, while existing functionalities remain unchanged.

Changes

File(s)	Change Summary
`src/clp_ffi_js/ir/StreamReader.cpp`	Added binding for `findNearestLogEventByTimestamp` and registered the new type `NullableLogEventIdx` (`number | null`).
`src/clp_ffi_js/ir/StreamReader.hpp`	Declared the new virtual method `find_nearest_log_event_by_timestamp`, introduced a templated `generic_find_nearest_log_event_by_timestamp`, and defined the type alias `NullableLogEventIdx`.
`src/clp_ffi_js/ir/StructuredIrStreamReader.{cpp, hpp}`	Added the new method `find_nearest_log_event_by_timestamp` in both source and header files; updated include directives to support new type definitions.
`src/clp_ffi_js/ir/UnstructuredIrStreamReader.{cpp, hpp}`	Added the new method `find_nearest_log_event_by_timestamp` that calls `generic_find_nearest_log_event_by_timestamp` using `m_encoded_log_events`.

Sequence Diagram(s)

sequenceDiagram
    participant JS as JavaScript Interface
    participant SR as StreamReader/StructuredIrStreamReader/UnstructuredIrStreamReader
    participant GF as Generic Finder
    JS->>SR: findNearestLogEventByTimestamp(timestamp)
    SR->>GF: generic_find_nearest_log_event_by_timestamp(log_events, timestamp)
    GF-->>SR: NullableLogEventIdx
    SR-->>JS: NullableLogEventIdx

Possibly related PRs

Add support for deserializing and decoding v0.1.0 IR streams, but without log-level parsing and filtering. #30: The changes in the main PR are related to the addition of the findNearestLogEventByTimestamp method in the StreamReader class, which is also implemented in the StructuredIrStreamReader class in the retrieved PR.
feat: Add support for log-level filtering of structured IR streams. #35: The changes in the main PR are related to the modifications in the StreamReader class, specifically the addition of the findNearestLogEventByTimestamp method, which aligns with the new filtering capabilities introduced in the retrieved PR that also involves log event handling.
Split StreamReader into an interface and implementation to prepare for adding another IR stream reader. #26: The changes in the main PR, which add the findNearestLogEventByTimestamp method and the NullableLogEventIdx type to the StreamReader class, are related to the modifications in the retrieved PR that also introduces a get_version method and restructures the StreamReader class, as both involve enhancements to the same class's interface and functionality.

Suggested reviewers

junhaoliao
davemarco
kirkrodrigues

junhaoliao

For future commits, you may follow https://github.com/y-scope/clp-ffi-js?tab=readme-ov-file#linting to run the linter

src/clp_ffi_js/ir/StreamReader.hpp

src/clp_ffi_js/ir/StreamReader.cpp

src/clp_ffi_js/ir/StreamReader.hpp

src/clp_ffi_js/ir/StructuredIrStreamReader.cpp

src/clp_ffi_js/ir/UnstructuredIrStreamReader.cpp

src/clp_ffi_js/ir/StreamReader.hpp

src/clp_ffi_js/ir/UnstructuredIrStreamReader.hpp

src/clp_ffi_js/ir/StructuredIrStreamReader.hpp

junhaoliao · 2024-12-29T04:04:08Z

This PR is blocking y-scope/yscope-log-viewer#152

src/clp_ffi_js/ir/StreamReader.hpp

…x_by_timestamp, use std::ranges::upper_bound instead of std::upper_bound

…ot suitable here

…s null when log events are empty, and return index with "best effort"

coderabbitai

Actionable comments posted: 4

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 32b1220 and 5221588.

📒 Files selected for processing (6)

src/clp_ffi_js/ir/StreamReader.cpp (2 hunks)
src/clp_ffi_js/ir/StreamReader.hpp (6 hunks)
src/clp_ffi_js/ir/StructuredIrStreamReader.cpp (2 hunks)
src/clp_ffi_js/ir/StructuredIrStreamReader.hpp (2 hunks)
src/clp_ffi_js/ir/UnstructuredIrStreamReader.cpp (1 hunks)
src/clp_ffi_js/ir/UnstructuredIrStreamReader.hpp (1 hunks)

🧰 Additional context used

📓 Path-based instructions (6)

src/clp_ffi_js/ir/UnstructuredIrStreamReader.hpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

src/clp_ffi_js/ir/StructuredIrStreamReader.cpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

src/clp_ffi_js/ir/StructuredIrStreamReader.hpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

src/clp_ffi_js/ir/UnstructuredIrStreamReader.cpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

src/clp_ffi_js/ir/StreamReader.cpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

src/clp_ffi_js/ir/StreamReader.hpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

🪛 GitHub Actions: lint

src/clp_ffi_js/ir/StreamReader.hpp

[error] 65-66: Code formatting violation: Static create method declaration is not properly formatted

[error] 126-126: Code formatting violation: DecodedResultsTsType return type declaration is not properly formatted

[error] 138-139: Code formatting violation: get_log_event_index_by_timestamp method declaration is not properly formatted

[error] 201-203: Code formatting violation: requires clause and its parameters are not properly formatted

[error] 301-303: Code formatting violation: requires clause and its parameters are not properly formatted

🔇 Additional comments (10)

src/clp_ffi_js/ir/UnstructuredIrStreamReader.hpp (1)

74-75: LGTM! Method declaration is well-defined.

The method signature is consistent with the base class and follows proper C++ practices with the [[nodiscard]] attribute.

src/clp_ffi_js/ir/StructuredIrStreamReader.hpp (2)

11-11: LGTM! Include directive is appropriately placed.

The addition of <clp/ir/types.hpp> provides the necessary type definitions.

78-79: LGTM! Method declaration is consistent.

The method signature matches the base class and follows the same pattern as UnstructuredIrStreamReader.

src/clp_ffi_js/ir/StructuredIrStreamReader.cpp (2)

14-14: LGTM! Include directive is appropriately placed.

The addition of <clp/ir/types.hpp> provides the necessary type definitions.

151-158: LGTM! Implementation is clean and efficient.

The implementation properly delegates to the generic function, promoting code reuse between different reader types.

src/clp_ffi_js/ir/UnstructuredIrStreamReader.cpp (1)

161-168: LGTM! Implementation is consistent with StructuredIrStreamReader.

The implementation properly uses the generic function, ensuring consistent behaviour across different reader types.

src/clp_ffi_js/ir/StreamReader.cpp (1)

132-132: LGTM! Type registration and function binding are properly implemented.

The new type registration and function binding follow the established patterns in the codebase.

Also applies to: 149-153

src/clp_ffi_js/ir/StreamReader.hpp (3)

14-14: LGTM! Include and type declaration are properly placed.

The additions follow the established patterns in the codebase.

Also applies to: 33-33

128-137: LGTM! Documentation is clear and comprehensive.

The documentation clearly explains the method's purpose, parameters, and return values.

313-331: LGTM! Implementation is efficient and handles edge cases properly.

The implementation:

Correctly handles empty log events

Uses std::upper_bound for efficient binary search

Properly handles the case when all timestamps are larger than the target

src/clp_ffi_js/ir/StreamReader.hpp

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

src/clp_ffi_js/ir/StreamReader.hpp (1)
312-312: Use Yoda condition for null check.

According to the coding guidelines, prefer false == <expression> over !<expression>.

Apply this diff:
-    if (log_events.empty()) {
+    if (true == log_events.empty()) {

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 5221588 and abeb4f8.

📒 Files selected for processing (1)

src/clp_ffi_js/ir/StreamReader.hpp (5 hunks)

🧰 Additional context used

📓 Path-based instructions (1)

src/clp_ffi_js/ir/StreamReader.hpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

⏰ Context from checks skipped due to timeout of 90000ms (2)

GitHub Check: lint (ubuntu-latest)
GitHub Check: lint (macos-latest)

🔇 Additional comments (5)

src/clp_ffi_js/ir/StreamReader.hpp (5)

33-33: LGTM! Type declaration follows established pattern.

The declaration of LogEventIdxTsType aligns with other type declarations in the file and properly integrates with Emscripten's type system.

128-137: LGTM! Well-documented method declaration.

The documentation thoroughly explains all possible return scenarios and the method signature follows C++ best practices with appropriate use of [[nodiscard]].

138-139: Fix method declaration formatting.

The method declaration needs to be reformatted according to the linting rules.

Apply this diff:
-    [[nodiscard]] virtual auto get_log_event_index_by_timestamp(clp::ir::epoch_time_ms_t timestamp
-    ) -> LogEventIdxTsType = 0;
+    [[nodiscard]] virtual auto get_log_event_index_by_timestamp(
+            clp::ir::epoch_time_ms_t timestamp
+    ) -> LogEventIdxTsType = 0;
312-314: LGTM! Efficient implementation with proper edge case handling.

The implementation efficiently uses binary search via std::upper_bound and properly handles all edge cases:

Empty log events return null

No exact match returns the last smaller index

All larger timestamps return the first index

Also applies to: 315-322, 323-329

300-307: Fix requires clause formatting.

The requires clause needs to be reformatted according to the linting rules.

Apply this diff:
-requires requires(
-                 LogEventWithFilterData<LogEvent> const& event,
-                 clp::ir::epoch_time_ms_t timestamp
-         ) {
+    requires requires(LogEventWithFilterData<LogEvent> const& event,
+                     clp::ir::epoch_time_ms_t timestamp) {

src/clp_ffi_js/ir/StreamReader.cpp

src/clp_ffi_js/ir/StreamReader.hpp

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (2)

src/clp_ffi_js/ir/StreamReader.hpp (2)

49-57: Add documentation for the GetLogEventIdxInterface concept.

The concept is well-defined, but it would benefit from documentation explaining its purpose and requirements.

Add a documentation block before the concept:

+/**
+ * Concept defining the requirements for log event types that can be searched by timestamp.
+ * Types satisfying this concept must provide a get_timestamp method that returns a value
+ * convertible to epoch_time_ms_t.
+ */
 template <typename LogEvent>
 concept GetLogEventIdxInterface = requires(

299-325: Consider optimizing and documenting the binary search implementation.

While the implementation is correct, consider these improvements:

Add a comment explaining the binary search strategy
Use std::prev for cleaner iterator manipulation

Apply this diff to improve the implementation:

 template <GetLogEventIdxInterface LogEvent>
 auto StreamReader::generic_get_log_event_idx_by_timestamp(
         LogEvents<LogEvent> const& log_events,
         clp::ir::epoch_time_ms_t timestamp
 ) -> LogEventIdxTsType {
     if (log_events.empty()) {
         return LogEventIdxTsType{emscripten::val::null()};
     }

+    // Use binary search to find the first element greater than the target timestamp
     auto upper{std::upper_bound(
             log_events.begin(),
             log_events.end(),
             timestamp,
             [](clp::ir::epoch_time_ms_t ts, LogEventWithFilterData<LogEvent> const& log_event) {
                 return ts < log_event.get_timestamp();
             }
     )};

     if (upper == log_events.begin()) {
         return LogEventIdxTsType{emscripten::val(0)};
     }

-    auto const upper_index{std::distance(log_events.begin(), upper)};
-    auto const index{upper_index - 1};
+    // Return the index of the previous element (last element <= timestamp)
+    auto const index{std::distance(log_events.begin(), std::prev(upper))};

     return LogEventIdxTsType{emscripten::val(index)};
 }

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 0e3e21b and 412b96e.

📒 Files selected for processing (2)

src/clp_ffi_js/ir/StreamReader.cpp (2 hunks)
src/clp_ffi_js/ir/StreamReader.hpp (6 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

src/clp_ffi_js/ir/StreamReader.cpp

🧰 Additional context used

📓 Path-based instructions (1)

src/clp_ffi_js/ir/StreamReader.hpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

⏰ Context from checks skipped due to timeout of 90000ms (2)

GitHub Check: lint (ubuntu-latest)
GitHub Check: lint (macos-latest)

🔇 Additional comments (2)

src/clp_ffi_js/ir/StreamReader.hpp (2)

14-14: LGTM! Type declarations and includes are well-organized.

The new include and type declaration follow the existing patterns in the codebase.

Also applies to: 33-33

138-149: LGTM! Well-documented virtual method declaration.

The method signature and documentation are clear and comprehensive.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

src/clp_ffi_js/ir/StreamReader.hpp (1)

298-305: Consider optimizing the binary search predicate.

The current implementation creates a lambda for each comparison. Consider using a member function pointer or functor for potentially better performance:

-    auto upper{std::upper_bound(
-            log_events.begin(),
-            log_events.end(),
-            timestamp,
-            [](clp::ir::epoch_time_ms_t ts, LogEventWithFilterData<LogEvent> const& log_event) {
-                return ts < log_event.get_timestamp();
-            }
-    )};
+    struct TimestampCompare {
+        bool operator()(clp::ir::epoch_time_ms_t ts, 
+                       LogEventWithFilterData<LogEvent> const& log_event) const {
+            return ts < log_event.get_timestamp();
+        }
+    };
+    auto upper{std::upper_bound(
+            log_events.begin(),
+            log_events.end(),
+            timestamp,
+            TimestampCompare{}
+    )};

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 412b96e and 54c7df1.

📒 Files selected for processing (1)

src/clp_ffi_js/ir/StreamReader.hpp (5 hunks)

🧰 Additional context used

📓 Path-based instructions (1)

src/clp_ffi_js/ir/StreamReader.hpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

⏰ Context from checks skipped due to timeout of 90000ms (2)

GitHub Check: lint (ubuntu-latest)
GitHub Check: lint (macos-latest)

🔇 Additional comments (3)

src/clp_ffi_js/ir/StreamReader.hpp (3)

33-33: LGTM! Type declaration for LogEventIdxTsType.

The declaration is necessary for exposing the timestamp-based search result type to JavaScript.

128-139: LGTM! Virtual method declaration is well-documented.

The pure virtual method declaration follows the class's convention, and the documentation clearly explains all possible return values and edge cases.

311-314: LGTM! Index calculation is correct and follows best practices.

The implementation properly calculates the index using std::distance and maintains const correctness.

davemarco

Oh, I added a second review and it didn't get posted. Here it is.

src/clp_ffi_js/ir/StreamReader.cpp

src/clp_ffi_js/ir/StreamReader.hpp

src/clp_ffi_js/ir/StructuredIrStreamReader.cpp

src/clp_ffi_js/ir/UnstructuredIrStreamReader.cpp

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (3)

src/clp_ffi_js/ir/StreamReader.hpp (3)
128-139: Consider renaming the parameter for consistency.

The parameter name timestamp could be renamed to target_ts to maintain consistency with the documentation style used throughout the codebase.
-    [[nodiscard]] virtual auto get_log_event_idx_by_timestamp(clp::ir::epoch_time_ms_t timestamp
+    [[nodiscard]] virtual auto get_log_event_idx_by_timestamp(clp::ir::epoch_time_ms_t target_ts
     ) -> LogEventIdxTsType = 0;
190-201: Enhance documentation for the templated implementation.

Consider referencing the virtual method's documentation to avoid duplication and potential inconsistencies.
     /**
      * Templated implementation of `get_log_event_idx_by_timestamp`.
      *
      * @tparam LogEvent
      * @param timestamp
-     * @return the best matched log event index.
+     * @return See `get_log_event_idx_by_timestamp`.
      */
289-314: Improve variable naming for better readability.

The variable names could be more descriptive to better convey their purpose:

first_greater_it could be upper_bound_it

first_greater_idx could be upper_bound_idx

These names would better align with the standard library's terminology.
-    auto first_greater_it{std::upper_bound(
+    auto upper_bound_it{std::upper_bound(
             log_events.begin(),
             log_events.end(),
             timestamp,
             [](clp::ir::epoch_time_ms_t ts, LogEventWithFilterData<LogEvent> const& log_event) {
                 return ts < log_event.get_timestamp();
             }
     )};

-    if (first_greater_it == log_events.begin()) {
+    if (upper_bound_it == log_events.begin()) {
         return LogEventIdxTsType{emscripten::val(0)};
     }

-    auto const first_greater_idx{std::distance(log_events.begin(), first_greater_it)};
+    auto const upper_bound_idx{std::distance(log_events.begin(), upper_bound_it)};

-    return LogEventIdxTsType{emscripten::val(first_greater_idx - 1)};
+    return LogEventIdxTsType{emscripten::val(upper_bound_idx - 1)};

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 54c7df1 and eff1849.

📒 Files selected for processing (1)

src/clp_ffi_js/ir/StreamReader.hpp (5 hunks)

🧰 Additional context used

📓 Path-based instructions (1)

src/clp_ffi_js/ir/StreamReader.hpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

⏰ Context from checks skipped due to timeout of 90000ms (2)

GitHub Check: lint (ubuntu-latest)
GitHub Check: lint (macos-latest)

🔇 Additional comments (1)

src/clp_ffi_js/ir/StreamReader.hpp (1)

33-33: LGTM! Type declaration follows established patterns.

The new type declaration for LogEventIdxTsType follows the existing pattern and naming convention used for other JavaScript interop types.

davemarco

Added comments on last review

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

src/clp_ffi_js/ir/StreamReader.hpp (1)
298-305: Add comments explaining the binary search strategy.

While the implementation using std::upper_bound is efficient, it would benefit from a brief comment explaining why this approach was chosen and how it achieves the desired result.

Add a comment before the upper_bound call:
+    // Use binary search to find the first element greater than the target timestamp.
+    // This allows us to step back one position to find the last element less than or equal to the target.
     auto first_greater_it{std::upper_bound(

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between eff1849 and 00f89d4.

📒 Files selected for processing (3)

src/clp_ffi_js/ir/StreamReader.hpp (5 hunks)
src/clp_ffi_js/ir/StructuredIrStreamReader.cpp (2 hunks)
src/clp_ffi_js/ir/UnstructuredIrStreamReader.cpp (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

src/clp_ffi_js/ir/UnstructuredIrStreamReader.cpp
src/clp_ffi_js/ir/StructuredIrStreamReader.cpp

🧰 Additional context used

📓 Path-based instructions (1)

src/clp_ffi_js/ir/StreamReader.hpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

🔇 Additional comments (2)

src/clp_ffi_js/ir/StreamReader.hpp (2)

33-33: LGTM! Type declaration follows established patterns.

The declaration of LogEventIdxTsType is appropriately placed with other output types and follows the existing pattern for TypeScript type declarations.

128-138: LGTM! Well-documented virtual method with clear behaviour specification.

The method signature and documentation follow best practices. The use of epoch_time_ms_t for timestamp handling is appropriate.

src/clp_ffi_js/ir/StreamReader.hpp

coderabbitai

Actionable comments posted: 1

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 122f1cf and f4a0207.

📒 Files selected for processing (7)

CMakeLists.txt (1 hunks)
src/clp_ffi_js/ir/StreamReader.cpp (2 hunks)
src/clp_ffi_js/ir/StreamReader.hpp (5 hunks)
src/clp_ffi_js/ir/StructuredIrStreamReader.cpp (2 hunks)
src/clp_ffi_js/ir/StructuredIrStreamReader.hpp (2 hunks)
src/clp_ffi_js/ir/UnstructuredIrStreamReader.cpp (1 hunks)
src/clp_ffi_js/ir/UnstructuredIrStreamReader.hpp (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (3)

src/clp_ffi_js/ir/UnstructuredIrStreamReader.cpp
src/clp_ffi_js/ir/StructuredIrStreamReader.cpp
src/clp_ffi_js/ir/StructuredIrStreamReader.hpp

🧰 Additional context used

📓 Path-based instructions (3)

src/clp_ffi_js/ir/UnstructuredIrStreamReader.hpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

src/clp_ffi_js/ir/StreamReader.cpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

src/clp_ffi_js/ir/StreamReader.hpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

⏰ Context from checks skipped due to timeout of 90000ms (2)

GitHub Check: lint (ubuntu-latest)
GitHub Check: lint (macos-latest)

🔇 Additional comments (7)

src/clp_ffi_js/ir/UnstructuredIrStreamReader.hpp (1)

74-75: LGTM!

The method signature is correct and properly marked with [[nodiscard]] and override attributes.

src/clp_ffi_js/ir/StreamReader.cpp (2)

132-132: Consider using std::optional with register_optional().

For better type safety and consistency, consider using std::optional<T> with Embind's register_optional() function instead of custom nullable type definitions.

This has been tracked in issue #51 for implementation in a future PR.

150-153: LGTM!

The function binding is correctly added and properly formatted.

src/clp_ffi_js/ir/StreamReader.hpp (4)

14-14: LGTM!

The include statement is correctly added.

33-33: LGTM!

The type declaration is correctly added.

128-138: LGTM!

The virtual method is well-documented with clear descriptions of parameters and return values.

289-314: LGTM!

The template implementation:

Uses std::upper_bound for efficient searching.

Properly handles edge cases (empty container and all timestamps greater than target).

Uses clear and descriptive variable names.

CMakeLists.txt

kirkrodrigues · 2025-01-29T17:21:38Z

src/clp_ffi_js/ir/StreamReader.hpp

@@ -29,6 +30,7 @@ EMSCRIPTEN_DECLARE_VAL_TYPE(ReaderOptions);
 // JS types used as outputs
 EMSCRIPTEN_DECLARE_VAL_TYPE(DecodedResultsTsType);
 EMSCRIPTEN_DECLARE_VAL_TYPE(FilteredLogEventMapTsType);
+EMSCRIPTEN_DECLARE_VAL_TYPE(LogEventIdxTsType);


Suggested change

-EMSCRIPTEN_DECLARE_VAL_TYPE(LogEventIdxTsType);
+// How about `NullableLogEventIdx`?
+EMSCRIPTEN_DECLARE_VAL_TYPE(LogEventIdxTsType);

@kirkrodrigues Should we also change the name of FilteredLogEventMapTsType later? It is also nullable.

I do prefer that, but what do @davemarco / @junhaoliao think?

Personally, I'd even prefer making the change here, since it's just an internal type name change and won't affect any front-end code.

Oh, isn't FilteredLogEventIdx also defined in the Log Viewer? It's true that the name in clp-ffi-js doesn't influence (from a code execution perspective) the name in the log viewer, but it's probably better for developers if we stay consistent. So if we rename it here, we should probably rename it in the log viewer as well.

I wouldn't change the filteredEventMap. Whether putting "nullable" in the name is better is, I think, a question of style. In general, I think it's better not to put type info in the name, but here I guess it may be an exception, since we're trying to make it clearer that it returns null.

src/clp_ffi_js/ir/StreamReader.hpp

kirkrodrigues · 2025-01-29T17:24:03Z

src/clp_ffi_js/ir/StreamReader.hpp

+        return LogEventIdxTsType{emscripten::val::null()};
+    }
+
+    auto first_greater_it{std::upper_bound(


std::upper_bound assumes the range is sorted, right? What would happen if the log events aren't in ascending timestamp order? It's technically not impossible and we have seen log files where that's true.

@junhaoliao iirc, I was told to assume the log events are in ascending order. If that's not the case, this whole search function won't work.

If that's not the case, we have to sort all of the log events, because that's the only way to fully guarantee that all log events are sorted. And we (might?) also need a map to remember their original locations.

Since the case with out of order timestamps is rare (as far as we know), we could maybe do something like:

When decoding the file, keep track of whether the timestamps of two consecutive log events are out of order and, if so, set a flag and display a warning to the user that the file contains out-of-order timestamps, so timestamp-related searches may not function correctly (since two log events with the same timestamp may exist at completely different positions in the log file).

If that flag is set, we use a brute-force implementation of find_nearest_log_event_by_timestamp. Otherwise, we use the implementation you have here.

"When decoding the file, keep track of whether the timestamps of two consecutive log events are out of order"

"Keep track of" means we're basically iterating over the log events during decoding, right?

If that's the case, can we divide the log events into chunks at these timestamp outliers, perform the search in the proper chunk, and finally iterate over those outliers?

Would that be straightforward to do given that the log events collection changes based on the active filter? If not, then given how rare the case is (as far as we know), I'm not sure the extra code complexity is worth it.

In an offline discussion with @junhaoliao, we agreed that we can document the assumption and neglect this corner case.

By neglecting this corner case, we avoid spending resources on verifying that the log events are in chronological order.
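
For reference, the flag-plus-fallback idea floated in this thread (and set aside in favour of documenting the assumption) could be sketched roughly as follows. This is only an illustration: `TimestampIndex`, its methods, and the plain `std::vector<std::int64_t>` standing in for the PR's `LogEvents<LogEvent>` are all hypothetical, and the brute-force tie-breaking rule is an assumption chosen so that it agrees with the binary search on sorted input.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <optional>
#include <vector>

// Hypothetical helper (not part of clp-ffi-js): stores timestamps as they are decoded and
// remembers whether any two consecutive timestamps were out of order.
class TimestampIndex {
public:
    auto append(std::int64_t ts) -> void {
        if (false == m_timestamps.empty() && ts < m_timestamps.back()) {
            m_has_out_of_order_timestamps = true;
        }
        m_timestamps.push_back(ts);
    }

    [[nodiscard]] auto has_out_of_order_timestamps() const -> bool {
        return m_has_out_of_order_timestamps;
    }

    // Returns the index of the event nearest to `target_ts`, or `std::nullopt` if there are no
    // events.
    [[nodiscard]] auto find_nearest_idx(std::int64_t target_ts) const
            -> std::optional<std::size_t> {
        if (m_timestamps.empty()) {
            return std::nullopt;
        }
        if (m_has_out_of_order_timestamps) {
            return find_nearest_idx_brute_force(target_ts);
        }

        // Sorted case: binary search via `std::upper_bound`, as in the PR.
        auto const first_greater_it{
                std::upper_bound(m_timestamps.begin(), m_timestamps.end(), target_ts)
        };
        if (first_greater_it == m_timestamps.begin()) {
            return std::size_t{0};
        }
        return static_cast<std::size_t>(
                std::distance(m_timestamps.begin(), first_greater_it) - 1
        );
    }

private:
    // Linear scan (assumed semantics): prefer the event with the greatest timestamp that is
    // <= `target_ts` (the last such event on ties); if every timestamp is greater, fall back to
    // the event with the smallest timestamp.
    [[nodiscard]] auto find_nearest_idx_brute_force(std::int64_t target_ts) const -> std::size_t {
        assert(false == m_timestamps.empty());
        std::optional<std::size_t> best_idx;
        std::size_t earliest_idx{0};
        for (std::size_t i{0}; i < m_timestamps.size(); ++i) {
            auto const ts{m_timestamps[i]};
            if (ts < m_timestamps[earliest_idx]) {
                earliest_idx = i;
            }
            if (ts <= target_ts
                && (false == best_idx.has_value() || ts >= m_timestamps[best_idx.value()]))
            {
                best_idx = i;
            }
        }
        return best_idx.value_or(earliest_idx);
    }

    std::vector<std::int64_t> m_timestamps;
    bool m_has_out_of_order_timestamps{false};
};
```

The flag costs one comparison per appended event. The sketch does not address the concern raised above that the collection being searched changes with the active filter.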

src/clp_ffi_js/ir/StreamReader.hpp

kirkrodrigues · 2025-01-29T17:25:40Z

Sorry for the delay in reviewing this. Let's keep further discussions in the PR so that we can all come to a consensus more quickly.

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f4a0207 and 8c8c42e.

📒 Files selected for processing (6)

src/clp_ffi_js/ir/StreamReader.cpp (2 hunks)
src/clp_ffi_js/ir/StreamReader.hpp (5 hunks)
src/clp_ffi_js/ir/StructuredIrStreamReader.cpp (2 hunks)
src/clp_ffi_js/ir/StructuredIrStreamReader.hpp (2 hunks)
src/clp_ffi_js/ir/UnstructuredIrStreamReader.cpp (1 hunks)
src/clp_ffi_js/ir/UnstructuredIrStreamReader.hpp (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (3)

src/clp_ffi_js/ir/UnstructuredIrStreamReader.hpp
src/clp_ffi_js/ir/StructuredIrStreamReader.hpp
src/clp_ffi_js/ir/UnstructuredIrStreamReader.cpp

🧰 Additional context used

📓 Path-based instructions (3)

src/clp_ffi_js/ir/StructuredIrStreamReader.cpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

src/clp_ffi_js/ir/StreamReader.cpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

src/clp_ffi_js/ir/StreamReader.hpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

⏰ Context from checks skipped due to timeout of 90000ms (2)

GitHub Check: lint (ubuntu-latest)
GitHub Check: lint (macos-latest)

🔇 Additional comments (7)

src/clp_ffi_js/ir/StructuredIrStreamReader.cpp (2)

14-14: LGTM!

The include is necessary for the epoch_time_ms_t type used in the new method.

151-155: LGTM!

The implementation correctly delegates to the generic template function, maintaining consistency with other similar methods in the class.

src/clp_ffi_js/ir/StreamReader.cpp (2)

132-132: LGTM!

The type registration follows the established pattern for nullable types.

150-153: LGTM!

The function binding is properly placed and follows the naming convention used in other bindings.

src/clp_ffi_js/ir/StreamReader.hpp (3)

14-14: LGTM!

The include is necessary for the epoch_time_ms_t type used throughout the file.

33-33: LGTM!

The type declaration follows the established pattern for nullable types.

129-145: LGTM!

The method declaration is well documented with clear examples and edge cases.

coderabbitai · 2025-02-01T14:21:57Z

src/clp_ffi_js/ir/StreamReader.hpp

+template <typename LogEvent>
+auto StreamReader::generic_find_nearest_log_event_idx_by_timestamp(
+        LogEvents<LogEvent> const& log_events,
+        clp::ir::epoch_time_ms_t target_ts
+) -> NullableLogEventIdx {
+    if (log_events.empty()) {
+        return NullableLogEventIdx{emscripten::val::null()};
+    }
+
+    // Find the log event whose timestamp is just after `target_ts`
+    auto first_greater_it{std::upper_bound(
+            log_events.begin(),
+            log_events.end(),
+            target_ts,
+            [](clp::ir::epoch_time_ms_t ts, LogEventWithFilterData<LogEvent> const& log_event) {
+                return ts < log_event.get_timestamp();
+            }
+    )};
+
+    if (first_greater_it == log_events.begin()) {
+        return NullableLogEventIdx{emscripten::val(0)};
+    }
+
+    auto const first_greater_idx{std::distance(log_events.begin(), first_greater_it)};
+
+    return NullableLogEventIdx{emscripten::val(first_greater_idx - 1)};
+}


💡 Codebase verification

Timestamp Ordering Not Verified

The function in StreamReader.hpp employs std::upper_bound with a lambda that compares timestamps, which only works correctly when the log events are sorted in ascending order.

The various get_timestamp implementations simply return their internal timestamp (e.g. from m_timestamp in LogEventWithFilterData.hpp), and there is no evidence of any explicit sort operation in the codebase.

Instead, the log events appear to be appended (for example, via emplace_back in StructuredIrUnitHandler.cpp), meaning the natural order is assumed to be chronologically correct rather than being enforced programmatically.
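
To make the dependence on sorted input concrete, here is a minimal, self-contained sketch of the same search over a plain vector of timestamps. The function name `find_nearest_idx`, the `std::vector<std::int64_t>` input, and the `std::optional` return type are illustrative assumptions; the PR itself operates on `LogEvents<LogEvent>` and returns an Embind value.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <optional>
#include <vector>

// Hypothetical stand-in for the PR's generic search. Precondition (not checked here):
// `timestamps` is sorted in ascending order.
[[nodiscard]] inline auto
find_nearest_idx(std::vector<std::int64_t> const& timestamps, std::int64_t target_ts)
        -> std::optional<std::size_t> {
    if (timestamps.empty()) {
        return std::nullopt;
    }

    // First element whose timestamp is strictly greater than `target_ts`.
    auto const first_greater_it{
            std::upper_bound(timestamps.begin(), timestamps.end(), target_ts)
    };

    if (first_greater_it == timestamps.begin()) {
        // Every timestamp is greater than the target, so fall back to the first event.
        return std::size_t{0};
    }

    auto const first_greater_idx{std::distance(timestamps.begin(), first_greater_it)};
    // `first_greater_it` isn't `begin()`, so there is at least one element before it.
    assert(first_greater_idx >= 1);
    return static_cast<std::size_t>(first_greater_idx - 1);
}
```

With timestamps {10, 20, 20, 30}, a target of 20 yields index 2 (the last event at that timestamp), a target of 5 yields index 0, and an empty collection yields `std::nullopt`.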

🔗 Analysis chain

Verify timestamp ordering assumption.

The implementation assumes log events are sorted by timestamp in ascending order. This assumption should be verified as log files might not always maintain this order.

Run this script to check if timestamps are always in ascending order:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if log events are always sorted by timestamp
# Expected: No output means timestamps are in order

ast-grep --pattern $'class $_ { $$$ get_timestamp() { $$$ } $$$ }'
rg -A 5 "get_timestamp"

Length of output: 4283

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8c8c42e and a984d7b.

📒 Files selected for processing (6)

src/clp_ffi_js/ir/StreamReader.cpp (2 hunks)
src/clp_ffi_js/ir/StreamReader.hpp (5 hunks)
src/clp_ffi_js/ir/StructuredIrStreamReader.cpp (2 hunks)
src/clp_ffi_js/ir/StructuredIrStreamReader.hpp (2 hunks)
src/clp_ffi_js/ir/UnstructuredIrStreamReader.cpp (1 hunks)
src/clp_ffi_js/ir/UnstructuredIrStreamReader.hpp (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (4)

src/clp_ffi_js/ir/StructuredIrStreamReader.cpp
src/clp_ffi_js/ir/StructuredIrStreamReader.hpp
src/clp_ffi_js/ir/UnstructuredIrStreamReader.hpp
src/clp_ffi_js/ir/StreamReader.cpp

🧰 Additional context used

📓 Path-based instructions (2)

src/clp_ffi_js/ir/UnstructuredIrStreamReader.cpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

src/clp_ffi_js/ir/StreamReader.hpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

⏰ Context from checks skipped due to timeout of 90000ms (2)

GitHub Check: lint (ubuntu-latest)
GitHub Check: lint (macos-latest)

🔇 Additional comments (4)

src/clp_ffi_js/ir/UnstructuredIrStreamReader.cpp (1)

161-165: LGTM!

The implementation correctly delegates to the generic implementation and follows the single responsibility principle.

src/clp_ffi_js/ir/StreamReader.hpp (3)

33-33: LGTM!

The type name clearly indicates its purpose and nullability, following the pattern used for other type declarations.

129-146: LGTM!

The method declaration is well-documented with clear explanation of its behaviour, correctly marked as virtual and nodiscard, and uses consistent parameter naming.

196-209: LGTM!

The templated method declaration is well-structured with appropriate documentation and parameter types.

coderabbitai · 2025-02-03T06:21:04Z

src/clp_ffi_js/ir/StreamReader.hpp

+template <typename LogEvent>
+auto StreamReader::generic_find_nearest_log_event_by_timestamp(
+        LogEvents<LogEvent> const& log_events,
+        clp::ir::epoch_time_ms_t target_ts
+) -> NullableLogEventIdx {
+    if (log_events.empty()) {
+        return NullableLogEventIdx{emscripten::val::null()};
+    }
+
+    // Find the log event whose timestamp is just after `target_ts`
+    auto first_greater_it{std::upper_bound(
+            log_events.begin(),
+            log_events.end(),
+            target_ts,
+            [](clp::ir::epoch_time_ms_t ts, LogEventWithFilterData<LogEvent> const& log_event) {
+                return ts < log_event.get_timestamp();
+            }
+    )};
+
+    if (first_greater_it == log_events.begin()) {
+        return NullableLogEventIdx{emscripten::val(0)};
+    }
+
+    auto const first_greater_idx{std::distance(log_events.begin(), first_greater_it)};
+
+    return NullableLogEventIdx{emscripten::val(first_greater_idx - 1)};
+}


💡 Codebase verification

Attention: Log events ordering may not be enforced.

Our investigation shows that the implementation of generic_find_nearest_log_event_by_timestamp relies on log events being in ascending timestamp order, yet searches for sorting or ordering logic (e.g. explicit calls to std::sort or other ordering mechanisms) produced no evidence to enforce this order during deserialization or insertion. This could potentially lead to incorrect log event lookups if the input sequence isn’t already sorted.

The log events appear to be added via methods like emplace_back without explicit ordering.

No ordering logic (e.g. sorting function calls) was identified to confirm that the events are sorted as required.
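
If the ordering assumption ever needs to be checked rather than assumed, one option is `std::is_sorted_until`, which reports where the first violation occurs. The sketch below is illustrative only: `find_first_out_of_order_idx` and `assert_chronological` are hypothetical helpers operating on a plain vector of timestamps, not functions in this codebase.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <optional>
#include <vector>

// Hypothetical helper: returns the index of the first timestamp that is smaller than its
// predecessor, or `std::nullopt` if the timestamps are in ascending order.
[[nodiscard]] inline auto
find_first_out_of_order_idx(std::vector<std::int64_t> const& timestamps)
        -> std::optional<std::size_t> {
    auto const sorted_end_it{std::is_sorted_until(timestamps.begin(), timestamps.end())};
    if (timestamps.end() == sorted_end_it) {
        return std::nullopt;
    }
    return static_cast<std::size_t>(std::distance(timestamps.begin(), sorted_end_it));
}

// Hypothetical debug-build guard that a caller could run before relying on a binary search.
inline auto
assert_chronological([[maybe_unused]] std::vector<std::int64_t> const& timestamps) -> void {
    assert(false == find_first_out_of_order_idx(timestamps).has_value());
}
```

The check is linear in the number of events, which is the kind of verification cost the earlier review discussion chose to avoid, so it would more plausibly live in a debug build or a test than on the search path.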

🔗 Analysis chain

Verify timestamp ordering assumption.

The implementation assumes log events are sorted by timestamp in ascending order. This assumption should be verified as log files might not always maintain this order.

Run this script to check if timestamps are always in ascending order:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if timestamps are in ascending order during log event deserialization
# Expected: Find all places where timestamps are added to log events

# Search for timestamp assignments during deserialization
ast-grep --pattern $'deserialize_log_event() { $$$ get_timestamp() $$$ }'

# Search for places where log events are added to collections
rg -A 5 "emplace_back.*LogEvent"

Length of output: 115

Script:

#!/bin/bash
# Investigate ordering logic for log events in the codebase

# 1. Search for std::sort invocations that might enforce timestamp ordering
rg -n "std::sort\s*\(.*get_timestamp" --color=always

# 2. Locate deserialization functions potentially responsible for creating or ordering log events
rg -n "deserialize_log_event\s*\(" --color=always

# 3. Find where log events are inserted into collections (e.g. using emplace_back) to inspect their ordering
rg -n "emplace_back\s*\(.*LogEvent" --color=always

# 4. Additionally, search for any comparison operators or functions comparing timestamps
rg -n "get_timestamp\s*\(" --color=always

Length of output: 1721

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

src/clp_ffi_js/ir/StreamReader.hpp (1)
196-322: Consider improving variable names for better readability.

The implementation is efficient and handles edge cases well. However, consider these variable name improvements:

first_greater_it could be upper_bound_it to match STL terminology

first_greater_idx could be upper_bound_idx for consistency

Apply this diff to improve readability:
-    auto first_greater_it{std::upper_bound(
+    auto upper_bound_it{std::upper_bound(

-    if (first_greater_it == log_events.begin()) {
+    if (upper_bound_it == log_events.begin()) {

-    auto const first_greater_idx{std::distance(log_events.begin(), first_greater_it)};
+    auto const upper_bound_idx{std::distance(log_events.begin(), upper_bound_it)};

-    return NullableLogEventIdx{emscripten::val(first_greater_idx - 1)};
+    return NullableLogEventIdx{emscripten::val(upper_bound_idx - 1)};

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a984d7b and 4788157.

📒 Files selected for processing (1)

src/clp_ffi_js/ir/StreamReader.hpp (5 hunks)

🧰 Additional context used

📓 Path-based instructions (1)

src/clp_ffi_js/ir/StreamReader.hpp (1)

Pattern **/*.{cpp,hpp,java,js,jsx,ts,tsx}: - Prefer false == <expression> rather than !<expression>.

⏰ Context from checks skipped due to timeout of 90000ms (2)

GitHub Check: lint (ubuntu-latest)
GitHub Check: lint (macos-latest)

🔇 Additional comments (2)

src/clp_ffi_js/ir/StreamReader.hpp (2)

33-33: LGTM!

The type declaration is well-placed and follows the established naming pattern for output types.

129-145: LGTM with a verification note.

The method declaration is well-documented and consistent with the class's style. The documentation clearly states the chronological order assumption.

Let's verify if the chronological order assumption is documented elsewhere in the codebase:

✅ Verification successful

I'm going to extend our search beyond markdown files to check for any references to a "chronological order" documentation anywhere in the repository. Please run the script below:

This will help verify whether the assumption is documented elsewhere. Thank you.

LGTM – Chronological order assumption verified.

The search confirms that the chronological ordering is documented within the method’s comment in src/clp_ffi_js/ir/StreamReader.hpp. No other contradictory or additional documentation was found.
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Search for documentation about log event ordering
# Expected: Find documentation about chronological order assumption

rg -i "chronological|timestamp.*order|sorted.*timestamp" --type md
Length of output: 68

Script:
#!/bin/bash
# Search for any reference to "chronological order" in all files across the repository.
rg -i "chronological order" --max-columns 200
Length of output: 180

kirkrodrigues

For the PR title, how about:

feat: Add support for finding the log event that's closest to a target timestamp.

kirkrodrigues · 2025-02-04T18:00:31Z

src/clp_ffi_js/ir/StreamReader.hpp

+     * - the collection of log events is sorted in chronological order, or the search won't work;
+     * - and we insert a marker log event, M, with timestamp `target_ts` into the collection (if log
+     *   events with timestamp `target_ts` already exist in the collection, M should be inserted
+     *   after them).
+     *
+     * L is the event just before M, if M is not the first event in the collection; otherwise L is
+     * the event just after M.


Suggested change

-     * - the collection of log events is sorted in chronological order, or the search won't work;
+     * - the collection of log events is sorted in chronological order;
      * - and we insert a marker log event, M, with timestamp `target_ts` into the collection (if log
      *   events with timestamp `target_ts` already exist in the collection, M should be inserted
      *   after them).
      *
      * L is the event just before M, if M is not the first event in the collection; otherwise L is
      * the event just after M.
+     *
+     * NOTE: If the collection of log events isn't in chronological order, this method has undefined
+     * behaviour.

Henry8192 added 3 commits November 19, 2024 17:47

search timestamp works for unstructured logs

ea23947

Merge branch 'y-scope:main' into search-timestamp

7edaa48

Merge branch 'main' into search-timestamp

7481573

# Conflicts: # src/clp_ffi_js/ir/StreamReader.cpp # src/clp_ffi_js/ir/StreamReader.hpp

Henry8192 added 3 commits December 19, 2024 14:15

fix lint

3ee05a2

implement structured logs search by timestamp

f1a71a1

Merge branch 'main' into search-timestamp

5263385

junhaoliao requested changes Dec 22, 2024

junhaoliao reviewed Dec 22, 2024

src/clp_ffi_js/ir/StreamReader.hpp (resolved)

junhaoliao reviewed Dec 22, 2024

src/clp_ffi_js/ir/UnstructuredIrStreamReader.hpp (outdated, resolved)

junhaoliao reviewed Dec 22, 2024

src/clp_ffi_js/ir/StructuredIrStreamReader.hpp (outdated, resolved)

junhaoliao mentioned this pull request Dec 27, 2024

feat(URL): Add support to query timestamp by URL y-scope/yscope-log-viewer#152

Draft

address partial changes from review

c423ec5

Henry8192 added 2 commits December 29, 2024 19:52

snapshot: get_timestamp seems to be undefined for std::upper_bound

467e998

fix lint

c66cb80

junhaoliao requested changes Dec 30, 2024

src/clp_ffi_js/ir/StreamReader.hpp (outdated, resolved)

Henry8192 added 5 commits January 1, 2025 22:11

pass in log_events instead of iterators to generic_get_log_event_index_by_timestamp, use std::ranges::upper_bound instead of std::upper_bound

89338f7

switch back to std::upper_bound because std::ranges::upper_bound is not suitable here

a99ec2a

fix lint

9ec039f

change generic_get_log_event_index_by_timestamp behavior: only returns null when log events are empty, and return index with "best effort"

4f125b8

edit docstring for get_log_event_index_by_timestamp

5221588

Henry8192 marked this pull request as ready for review January 6, 2025 21:42

coderabbitai bot reviewed Jan 6, 2025

src/clp_ffi_js/ir/StreamReader.hpp (outdated, resolved)

src/clp_ffi_js/ir/StreamReader.hpp (outdated, resolved)

src/clp_ffi_js/ir/StreamReader.hpp (outdated, resolved)

src/clp_ffi_js/ir/StreamReader.hpp (outdated, resolved)

fix lint

abeb4f8

coderabbitai bot reviewed Jan 6, 2025

junhaoliao reviewed Jan 7, 2025

src/clp_ffi_js/ir/StreamReader.cpp (outdated, resolved)

coderabbitai bot mentioned this pull request Jan 7, 2025

refactor: Convert nullable Embind types to use std::optional #51

Open

junhaoliao requested changes Jan 7, 2025

address changes from Marco's review

412b96e

coderabbitai bot reviewed Jan 13, 2025

remove unnecessary require statement for get_log_event_idx_by_timestamp

54c7df1

coderabbitai bot reviewed Jan 13, 2025

Henry8192 requested a review from davemarco January 13, 2025 17:55

davemarco requested changes Jan 13, 2025

address the rest of the comments

eff1849

Henry8192 requested a review from davemarco January 13, 2025 21:02

coderabbitai bot reviewed Jan 13, 2025

davemarco requested changes Jan 13, 2025

address code review changes

00f89d4

Henry8192 requested a review from davemarco January 14, 2025 19:12

coderabbitai bot reviewed Jan 14, 2025

src/clp_ffi_js/ir/StreamReader.hpp (outdated, resolved)

fix lint

122f1cf

junhaoliao requested a review from kirkrodrigues January 16, 2025 16:05

LinZhihao-723 self-requested a review January 21, 2025 19:44

resolve the rest of the conflicts

f4a0207

coderabbitai bot reviewed Jan 29, 2025

CMakeLists.txt (outdated, resolved)

kirkrodrigues reviewed Jan 29, 2025

src/clp_ffi_js/ir/StreamReader.hpp (resolved)

kirkrodrigues requested changes Jan 29, 2025

src/clp_ffi_js/ir/StreamReader.hpp (resolved)

Henry8192 added 2 commits January 30, 2025 22:39

revert the CMakeList Change

46fa81f

address kirk's comments in the code review

8c8c42e

coderabbitai bot reviewed Feb 1, 2025

revert function name to find_nearest_log_event_by_timestamp

a984d7b

coderabbitai bot reviewed Feb 3, 2025

amend find_nearest_log_event_by_timestamp's comments, warning this function ONLY works with timestamps in chronological order.

4788157

coderabbitai bot reviewed Feb 4, 2025

kirkrodrigues requested changes Feb 4, 2025
