Create and use adapter subclass of Relational that forwards to a parent object or to a data frame scan #949

Open
7 of 8 tasks
krlmlr opened this issue Jan 4, 2025 · 3 comments · May be fixed by #960

Comments

@krlmlr
Collaborator

krlmlr commented Jan 4, 2025

Closes tidyverse/duckplyr#442.

The example below highlights the problem.

Implementation guide:

  • Create a subclass of Relational that takes an ALTREP data frame as an SEXP
  • Implement methods for this Relational object to unconditionally forward to the parent relational object stored in the SEXP
  • Create C++ functions rel_project2() and rel_filter2()
  • Use the new Relational subclass in these new C++ functions
  • Ensure the example passes when the ...2() functions are used
  • In the subclass, instead of unconditionally forwarding, now check if the data frame is materialized, and if yes, forward to a new relational object with a data frame scan instead
  • Check that the example still works and actually uses the materialized data frame
  • Ensure that all operators have ...2() versions
drv <- duckdb::duckdb()
con <- DBI::dbConnect(drv)
df1 <- tibble::tibble(a = 1)

"mutate"
#> [1] "mutate"
rel1 <- duckdb:::rel_from_df(con, df1)
"mutate"
#> [1] "mutate"
rel2 <- duckdb:::rel_project(
  rel1,
  list(
    {
      tmp_expr <- duckdb:::expr_reference("a")
      duckdb:::expr_set_alias(tmp_expr, "a")
      tmp_expr
    },
    {
      tmp_expr <- duckdb:::expr_constant(2)
      duckdb:::expr_set_alias(tmp_expr, "b")
      tmp_expr
    }
  )
)
"filter"
#> [1] "filter"
rel3 <- duckdb:::rel_filter(
  rel2,
  list(
    duckdb:::expr_comparison(
      "==",
      list(
        duckdb:::expr_reference("b"),
        duckdb:::expr_constant(2)
      )
    )
  )
)
rel3
#> DuckDB Relation: 
#> ---------------------
#> --- Relation Tree ---
#> ---------------------
#> Filter [(b = 2.0)]
#>   Projection [a as a, 2.0 as b]
#>     r_dataframe_scan(0x125e1b3d0)
#> 
#> ---------------------
#> -- Result Columns  --
#> ---------------------
#> - a (DOUBLE)
#> - b (DOUBLE)

# This materializes the data frame
duckdb:::rel_to_altrep(rel2)
#>   a b
#> 1 1 2

# Expecting this to use a data frame scan only, without a projection
rel2
#> DuckDB Relation: 
#> ---------------------
#> --- Relation Tree ---
#> ---------------------
#> Projection [a as a, 2.0 as b]
#>   r_dataframe_scan(0x125e1b3d0)
#> 
#> ---------------------
#> -- Result Columns  --
#> ---------------------
#> - a (DOUBLE)
#> - b (DOUBLE)

# Expecting this to use filter only, with a data frame scan based on rel2
rel3
#> DuckDB Relation: 
#> ---------------------
#> --- Relation Tree ---
#> ---------------------
#> Filter [(b = 2.0)]
#>   Projection [a as a, 2.0 as b]
#>     r_dataframe_scan(0x125e1b3d0)
#> 
#> ---------------------
#> -- Result Columns  --
#> ---------------------
#> - a (DOUBLE)
#> - b (DOUBLE)

Created on 2025-01-04 with reprex v2.1.1
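
The adapter described in the implementation guide can be sketched in a language-neutral way. The following is a minimal Python illustration, not the C++ code that would live in duckdb: `Relational`, `Operator`, `LazyFrame`, `AltrepAdapter`, and `tree()` are hypothetical names, and `LazyFrame` merely stands in for an ALTREP data frame that remembers its parent relation. The adapter forwards to the parent relation until the frame is materialized, and to a data frame scan afterwards.

```python
from abc import ABC, abstractmethod
from typing import List


class Relational(ABC):
    """Hypothetical stand-in for the Relational interface."""

    @abstractmethod
    def tree(self, indent: int = 0) -> List[str]:
        """Return the relation tree as a list of indented lines."""


class DataFrameScan(Relational):
    def __init__(self, name: str) -> None:
        self.name = name

    def tree(self, indent: int = 0) -> List[str]:
        return [" " * indent + f"r_dataframe_scan({self.name})"]


class Operator(Relational):
    """Single-child operator such as a projection or a filter."""

    def __init__(self, label: str, detail: str, child: Relational) -> None:
        self.label = label
        self.detail = detail
        self.child = child

    def tree(self, indent: int = 0) -> List[str]:
        head = " " * indent + f"{self.label} [{self.detail}]"
        return [head] + self.child.tree(indent + 2)


class LazyFrame:
    """Stand-in for an ALTREP data frame (assumption: it knows its parent)."""

    def __init__(self, name: str, parent: Relational) -> None:
        self.name = name
        self.parent = parent
        self.materialized = False

    def materialize(self) -> None:
        # Real execution is not modeled; only the state change matters here.
        self.materialized = True


class AltrepAdapter(Relational):
    """Forwards to the parent relation, or to a scan once materialized."""

    def __init__(self, frame: LazyFrame) -> None:
        self.frame = frame

    def _target(self) -> Relational:
        if self.frame.materialized:
            return DataFrameScan(self.frame.name)
        return self.frame.parent

    def tree(self, indent: int = 0) -> List[str]:
        return self._target().tree(indent)


scan = DataFrameScan("df1")
df2 = LazyFrame("df2", Operator("Projection", "a as a, 2.0 as b", scan))
rel3 = Operator("Filter", "(b = 2.0)", AltrepAdapter(df2))

before = rel3.tree()
df2.materialize()
after = rel3.tree()
print("\n".join(before))
print("\n".join(after))
```

Because the decision is made on every call rather than at construction time, `rel3` picks up the materialized frame without being rebuilt, which is the behavior the example above expects.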

@krlmlr
Collaborator Author

krlmlr commented Jan 6, 2025

Example with rel_project2() and rel_filter2():

drv <- duckdb::duckdb()
con <- DBI::dbConnect(drv)
df1 <- tibble::tibble(a = 1)

"mutate"
rel2 <- duckdb:::rel_project2(
  df1,
  list(
    {
      tmp_expr <- duckdb:::expr_reference("a")
      duckdb:::expr_set_alias(tmp_expr, "a")
      tmp_expr
    },
    {
      tmp_expr <- duckdb:::expr_constant(2)
      duckdb:::expr_set_alias(tmp_expr, "b")
      tmp_expr
    }
  )
)
df2 <- duckdb:::rel_to_altrep(rel2)
"filter"
rel3 <- duckdb:::rel_filter2(
  df2,
  list(
    duckdb:::expr_comparison(
      "==",
      list(
        duckdb:::expr_reference("b"),
        duckdb:::expr_constant(2)
      )
    )
  )
)
rel3

# This materializes the data frame
duckdb:::rel_to_altrep(rel2)

# Expecting this to use a data frame scan only, without a projection
rel2

# Expecting this to use filter only, with a data frame scan based on rel2
rel3

@krlmlr
Collaborator Author

krlmlr commented Jan 8, 2025

@Antonov548: Revised: the rel_...2() functions now return a data frame:

drv <- duckdb::duckdb()
con <- DBI::dbConnect(drv)
df1 <- tibble::tibble(a = 1)

"mutate"
df2 <- duckdb:::rel_project2(
  df1,
  list(
    {
      tmp_expr <- duckdb:::expr_reference("a")
      duckdb:::expr_set_alias(tmp_expr, "a")
      tmp_expr
    },
    {
      tmp_expr <- duckdb:::expr_constant(2)
      duckdb:::expr_set_alias(tmp_expr, "b")
      tmp_expr
    }
  )
)
"filter"
df3 <- duckdb:::rel_filter2(
  df2,
  list(
    duckdb:::expr_comparison(
      "==",
      list(
        duckdb:::expr_reference("b"),
        duckdb:::expr_constant(2)
      )
    )
  )
)

duckdb:::rel_from_altrep_df(df3)

# This materializes the data frame
df2

# Expecting this to use a data frame scan only, without a projection
duckdb:::rel_from_altrep_df(df2)

# Expecting this to use filter only, with a data frame scan based on df2
duckdb:::rel_from_altrep_df(df3)
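
To make the expectations in the comments above concrete, here is a small Python model of the data-frame-in, data-frame-out shape. It is only a sketch under assumptions: `Frame` is a hypothetical stand-in for a data frame, the `name` argument replaces the pointer address that the real scan prints, and the Python functions merely borrow the names of the R functions without reproducing their signatures or behavior.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Frame:
    """Hypothetical stand-in for a (possibly lazy) data frame."""
    name: str
    op: Optional[str] = None          # operator that produces this frame
    source: Optional["Frame"] = None  # input frame of that operator
    materialized: bool = False


def rel_project2(df: Frame, exprs: str, name: str) -> Frame:
    # Returns a lazy frame that remembers the projection and its input.
    return Frame(name=name, op=f"Projection [{exprs}]", source=df)


def rel_filter2(df: Frame, cond: str, name: str) -> Frame:
    # Returns a lazy frame that remembers the filter and its input.
    return Frame(name=name, op=f"Filter [{cond}]", source=df)


def rel_from_altrep_df(df: Frame) -> List[str]:
    """Relation tree behind df, outermost operator first."""
    if df.source is None or df.materialized:
        # Plain or already materialized frames are scanned directly.
        return [f"r_dataframe_scan({df.name})"]
    return [df.op] + rel_from_altrep_df(df.source)


df1 = Frame("df1")
df2 = rel_project2(df1, "a as a, 2.0 as b", name="df2")
df3 = rel_filter2(df2, "(b = 2.0)", name="df3")

lazy_plan = rel_from_altrep_df(df3)
df2.materialized = True  # models printing df2, which materializes it
plan_df2 = rel_from_altrep_df(df2)
plan_df3 = rel_from_altrep_df(df3)
print(lazy_plan, plan_df2, plan_df3, sep="\n")
```

In this model the tree is rebuilt from the frames on every call, so materializing `df2` shortens the plan for `df3` without touching `df3` itself.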

@krlmlr
Collaborator Author

krlmlr commented Jan 8, 2025

According to ChatGPT, this is probably a variant of the "proxy" pattern. Perhaps a "router", but the typical use cases are a bit different from ours.


The design pattern you are describing is commonly known as the Router Pattern or sometimes a Switch Pattern. It is a variant of the Proxy Pattern, specifically tailored for scenarios where the decision to delegate behavior to one of multiple instances is dynamic and based on some switching logic.

Key Characteristics of This Pattern:

  1. Interface Implementation: The wrapper implements the same interface as the objects it wraps, ensuring compatibility.
  2. Dynamic Routing: The wrapper contains logic to determine which wrapped instance should handle a given request based on some condition or switch.
  3. Encapsulation: The clients interact only with the wrapper, abstracting away the details of the delegation.

Example Use Cases:

  • Load Balancing: Forwarding requests to different instances based on load, availability, or region.
  • Feature Toggles: Switching between two implementations of a feature (e.g., an old version and a new version) based on a feature flag.
  • Logging or Debugging: Forwarding requests to a real implementation or a mock/stub for debugging purposes.
  • Multitenancy: Directing requests to different instances based on the tenant or user making the request.

Code Example:

Here's an example in Python:

from abc import ABC, abstractmethod

# Define the interface
class Service(ABC):
    @abstractmethod
    def perform_action(self, data):
        pass

# Concrete implementations of the interface
class ServiceA(Service):
    def perform_action(self, data):
        print(f"ServiceA handling: {data}")

class ServiceB(Service):
    def perform_action(self, data):
        print(f"ServiceB handling: {data}")

# The Router (switching logic)
class ServiceRouter(Service):
    def __init__(self, service_a: Service, service_b: Service):
        self.service_a = service_a
        self.service_b = service_b

    def perform_action(self, data):
        # Switching logic
        if data.startswith("A"):
            self.service_a.perform_action(data)
        else:
            self.service_b.perform_action(data)

# Usage
service_a = ServiceA()
service_b = ServiceB()
router = ServiceRouter(service_a, service_b)

router.perform_action("A123")  # Routed to ServiceA
router.perform_action("B456")  # Routed to ServiceB

Related Design Patterns:

  • Proxy Pattern: Similar in concept but typically used for adding functionality (e.g., access control, logging) rather than routing.
  • Strategy Pattern: Focuses on dynamically switching between different algorithms or strategies.
  • Chain of Responsibility Pattern: Requests are passed along a chain of handlers, where each handler decides to process or forward the request.

Let me know if you'd like to explore this further!
