@@ -5919,6 +5919,66 @@ tbl_match(self, tbl_compare: 'FrameT | Any', pre: 'Callable | None' = None, thre
59195919 Aside from reporting failure conditions, thresholds can be used to determine the actions to
59205920 take for each level of failure (using the `actions=` parameter).
59215921
5922+ Cross-Backend Validation
5923+ ------------------------
5924+ The `tbl_match()` method supports **automatic backend coercion** when comparing tables from
5925+ different backends (e.g., comparing a Polars DataFrame against a Pandas DataFrame, or
5926+ comparing database tables from DuckDB/SQLite against in-memory DataFrames). When tables with
5927+ different backends are detected, the comparison table is automatically converted to match the
5928+ data table's backend before validation proceeds.
5929+
5930+ **Certified Backend Combinations:**
5931+
5932+ All combinations of the following backends have been tested and certified to work (in both
5933+ directions):
5934+
5935+ - Pandas DataFrame
5936+ - Polars DataFrame
5937+ - DuckDB (native)
5938+ - DuckDB (as Ibis table)
5939+ - SQLite (via Ibis)
5940+
5941+ Note that database backends (DuckDB, SQLite, PostgreSQL, MySQL, Snowflake, BigQuery) are
5942+ automatically materialized during validation:
5943+
5944+ - if comparing **against Polars**: materialized to Polars
5945+ - if comparing **against Pandas**: materialized to Pandas
5946+ - if **both tables are database backends**: both materialized to Polars
5947+
5948+ This is done for performance and so that both tables are compared with consistent types.
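The materialization rules above can be sketched as a small decision function. This is only an illustration of the documented behavior: `pick_target_backend()`, `IN_MEMORY`, and the backend name strings are hypothetical and are not part of the library's API or its internal code.

```python
# Hypothetical sketch of the documented rules; not the library's implementation.
IN_MEMORY = {"pandas", "polars"}


def pick_target_backend(data_backend: str, compare_backend: str) -> str:
    """Return the in-memory backend that both tables would be compared in."""
    if data_backend in IN_MEMORY:
        # The comparison table is converted to match the data table's backend.
        return data_backend
    if compare_backend in IN_MEMORY:
        # A database data table is materialized to the in-memory side.
        return compare_backend
    # Both tables are database backends: both are materialized to Polars.
    return "polars"
```

Under these assumptions, `pick_target_backend("duckdb", "pandas")` gives `"pandas"`, and two database tables always resolve to `"polars"`.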
5949+
5950+ **Data Types That Work Best in Cross-Backend Validation:**
5951+
5952+ - numeric types: integer and float columns (including NaN handling)
5953+ - string types: text columns with consistent encodings
5954+ - boolean types: True/False values
5955+ - null values: `None` and `NaN` are treated as equivalent across backends
5956+ - list columns: nested list structures (with basic types)
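To illustrate what treating `None` and `NaN` as equivalent means at the level of single values, here is a minimal sketch using only the standard library. The helpers `is_missing()` and `values_match()` are hypothetical names for this example; the actual comparison is performed by the library on whole tables and may differ in detail.

```python
import math


def is_missing(value) -> bool:
    """True for None and for float NaN (both count as 'missing' here)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def values_match(left, right) -> bool:
    """Null-aware equality: two missing values match, whatever their kind."""
    if is_missing(left) or is_missing(right):
        return is_missing(left) and is_missing(right)
    return left == right
```

Plain `==` would report `float("nan") == float("nan")` as `False`, which is why a null-aware comparison is needed when one backend stores missing values as `None` and another as `NaN`.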
5957+
5958+ **Known Limitations:**
5959+
5960+ While many data types work well in cross-backend validation, there are some known
5961+ limitations to be aware of:
5962+
5963+ - date/datetime types: Polars and Pandas may represent dates differently. For example,
5964+ values that are `datetime.date` objects in a Pandas table may appear as `pd.Timestamp`
5965+ objects in a table converted from Polars, leading to false mismatches. To work around
5966+ this, ensure both tables use the same date representation before comparison.
5967+ - custom types: User-defined types or complex nested structures may not convert cleanly
5968+ between backends and could cause unexpected comparison failures.
5969+ - categorical types: Categorical/factor columns may have different internal
5970+ representations across backends.
5971+ - timezone-aware datetimes: Timezone handling differs between backends and may cause
5972+ comparison issues.
5973+
5974+ Here are some ideas to overcome such limitations:
5975+
5976+ - for date/datetime columns, consider using `pre=` preprocessing to normalize representations
5977+ before comparison.
5978+ - when working with custom types, manually convert tables to the same backend before using
5979+ `tbl_match()`.
5980+ - use the same datetime precision (e.g., milliseconds or microseconds) in both tables.
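The date mismatch described above can be reproduced with the standard library alone, since `pd.Timestamp` is a subclass of `datetime.datetime`. The sketch below shows the kind of normalization a `pre=` function might apply to each value of a date-only column; `to_plain_date()` is a hypothetical helper, and a `datetime.datetime` stands in for a `pd.Timestamp` at midnight.

```python
import datetime as dt


def to_plain_date(value):
    """Reduce datetime-like values to datetime.date; leave others unchanged."""
    # datetime.datetime subclasses datetime.date, so it must be checked first.
    if isinstance(value, dt.datetime):
        return value.date()
    return value


left = dt.date(2024, 1, 15)
right = dt.datetime(2024, 1, 15)  # stand-in for a pd.Timestamp at midnight

mismatch_before = left != right                # the two representations differ
match_after = to_plain_date(right) == left     # equal once normalized
```

Dropping the time component is only appropriate for columns that are meant to hold dates; for true datetime columns, aligning precision and timezone in both tables is the safer route.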
5981+
59225982 Examples
59235983 --------
59245984 For the examples here, we'll create two simple tables to demonstrate the `tbl_match()`
@@ -5980,8 +6040,8 @@ tbl_match(self, tbl_compare: 'FrameT | Any', pre: 'Callable | None' = None, thre
59806040 validation
59816041 ```
59826042
5983- The validation table shows that the test unit failed because the tables don't match (one
5984- value is different in column `c`).
6043+ The validation table shows that the single test unit failed because the tables don't match
6044+ (one value is different in column `c`).
59856045
59866046
59876047conjointly(self, *exprs: 'Callable', pre: 'Callable | None' = None, thresholds: 'int | float | bool | tuple | dict | Thresholds' = None, actions: 'Actions | None' = None, brief: 'str | bool | None' = None, active: 'bool' = True) -> 'Validate'