
Commit 3a9c5c2

Merge pull request #260 from posit-dev/feat-set-tbl
feat: Add the `set_tbl()` method and add `set_tbl=` to `yaml_interrogate()`
2 parents 74704e9 + 0177f64 commit 3a9c5c2

File tree

8 files changed (+1502, -22 lines)


docs/_quarto.yml

Lines changed: 1 addition & 0 deletions

@@ -194,6 +194,7 @@ quartodoc:
       can split the data based on the validation results (with `get_sundered_data()`).
     contents:
       - name: Validate.interrogate
+      - name: Validate.set_tbl
       - name: Validate.get_tabular_report
       - name: Validate.get_step_report
       - name: Validate.get_json_report

docs/user-guide/yaml-reference.qmd

Lines changed: 36 additions & 1 deletion

@@ -56,6 +56,40 @@ tbl:
     pl.scan_csv("data.csv").filter(pl.col("date") >= "2024-01-01")
 ```
 
+#### Using Templates with `set_tbl=`
+
+For reusable validation templates that will always use a custom data source via the `set_tbl=`
+parameter in `yaml_interrogate()`, the `tbl` field is still required but its value doesn't matter
+since it will be overridden. Recommended approaches:
+
+```yaml
+# Option 1: Use a valid dataset name (gets overridden anyway)
+tbl: small_table  # Will be ignored when `set_tbl=` is used
+
+# Option 2: Use YAML null (clearest semantic intent)
+tbl: null  # Indicates table will be provided via `set_tbl=`
+```
+
+When using `yaml_interrogate()` with `set_tbl=`, the validation template becomes fully reusable:
+
+```python
+# Define reusable template
+template = """
+tbl: null  # Will be overridden
+tbl_name: "Sales Validation"
+steps:
+- col_exists:
+    columns: [customer_id, revenue, region]
+- col_vals_gt:
+    columns: [revenue]
+    value: 0
+"""
+
+# Apply to different datasets
+q1_result = pb.yaml_interrogate(template, set_tbl=q1_data)
+q2_result = pb.yaml_interrogate(template, set_tbl=q2_data)
+```
+
 ### DataFrame Library (`df_library`)
 
 The `df_library` key controls which DataFrame library is used to load data sources. This parameter

@@ -117,7 +151,7 @@ thresholds:
   critical: 0.15  # 15% failure rate triggers critical
 ```
 
-- values: numbers between 0 and 1 (percentages) or integers (row counts)
+- values: numbers between `0` and `1` (percentages) or integers (row counts)
 - levels: `warning`, `error`, `critical`
 
 ### Global Actions

@@ -477,6 +511,7 @@ For Pandas DataFrames (when using `df_library: pandas`):
 ```yaml
 - specially:
     expr: "lambda df: df.assign(is_valid=df['a'] + df['d'] > 0)"
+```
 
 ## Column Selection Patterns
 

docs/user-guide/yaml-validation-workflows.qmd

Lines changed: 130 additions & 0 deletions

@@ -167,6 +167,136 @@ tbl:
   )
 ```
 
+## Reusable Templates with `set_tbl=`
+
+One of the most powerful features of YAML validation workflows is the ability to create reusable
+templates that can be applied to different datasets. Using the `set_tbl=` parameter with
+`yaml_interrogate()`, you can define validation logic once and apply it to multiple data sources.
+
+### Creating Validation Templates
+
+When creating templates for use with `set_tbl=`, the `tbl` field is still required but its value
+will be overridden. The recommended approach is to use `tbl: null`:
+
+```yaml
+tbl: null
+tbl_name: "Sales Data Validation Template"
+label: "Standard validation checks for sales data"
+steps:
+- col_exists:
+    columns: [customer_id, revenue, region, date]
+- col_vals_not_null:
+    columns: [customer_id, revenue]
+- col_vals_gt:
+    columns: [revenue]
+    value: 0
+- col_vals_in_set:
+    columns: [region]
+    set: [North, South, East, West]
+```
+
+### Applying Templates to Multiple Datasets
+
+Here's a practical example showing how to apply the same validation template to multiple quarterly
+datasets, demonstrating the power of reusable YAML configurations:
+
+```{python}
+import pointblank as pb
+import polars as pl
+
+# Define the template once
+sales_template = """
+tbl: null  # Will be overridden
+tbl_name: "Sales Data Validation"
+label: "Standard sales validation checks"
+thresholds:
+  warning: 0.05
+  error: 0.1
+steps:
+- col_exists:
+    columns: [customer_id, revenue, region]
+- col_vals_not_null:
+    columns: [customer_id, revenue]
+- col_vals_gt:
+    columns: [revenue]
+    value: 0
+- col_vals_in_set:
+    columns: [region]
+    set: [North, South, East, West]
+"""
+
+# Create different datasets
+q1_data = pl.DataFrame({
+    "customer_id": [1, 2, 3, 4],
+    "revenue": [100, 200, 150, 300],
+    "region": ["North", "South", "East", "West"]
+})
+
+q2_data = pl.DataFrame({
+    "customer_id": [5, 6, 7, 8],
+    "revenue": [250, 180, 220, 350],
+    "region": ["South", "North", "West", "East"]
+})
+
+# Apply the same template to both datasets
+q1_result = pb.yaml_interrogate(sales_template, set_tbl=q1_data)
+q2_result = pb.yaml_interrogate(sales_template, set_tbl=q2_data)
+
+print(f"Q1 validation: {all(v.all_passed for v in q1_result.validation_info)}")
+print(f"Q2 validation: {all(v.all_passed for v in q2_result.validation_info)}")
+```
+
+### Template Best Practices
+
+1. **Use `tbl: null`**: this clearly indicates the template expects a data source to be provided
+2. **Include comprehensive metadata**: use `tbl_name`, `label`, and `brief` to make results
+   self-documenting
+3. **Set appropriate thresholds**: define warning/error levels that make sense for your use case
+4. **Version control templates**: store templates in your repository alongside your data
+   processing code
+5. **Test with sample data**: validate your templates work with representative datasets
+
+### Common Template Patterns
+
+For API response validation, you can ensure that responses have the expected structure and valid
+status codes:
+
+```yaml
+tbl: null
+tbl_name: "API Response Validation"
+brief: "Standard checks for API response data"
+steps:
+- col_exists:
+    columns: [user_id, status, timestamp]
+- col_vals_in_set:
+    columns: [status]
+    set: [success, error, pending]
+- col_vals_not_null:
+    columns: [user_id, timestamp]
+```
+
+For file upload validation, you can check file sizes and formats to ensure they meet your
+requirements:
+
+```yaml
+tbl: null
+tbl_name: "File Upload Validation"
+steps:
+- col_vals_gt:
+    columns: [file_size]
+    value: 0
+- col_vals_lt:
+    columns: [file_size]
+    value: 10485760  # 10MB limit
+- col_vals_in_set:
+    columns: [file_type]
+    set: [csv, json, xlsx, parquet]
+```
+
+This template approach is particularly valuable in data pipelines, ETL processes, and automated
+testing scenarios where you need to apply consistent validation logic across multiple similar
+datasets.
+
 ## Validation Steps
 
 YAML supports all of Pointblank's validation methods. Here are some common patterns:

pointblank/data/api-docs.txt

Lines changed: 49 additions & 4 deletions

@@ -9798,7 +9798,7 @@ validation workflows. The `yaml_interrogate()` function can be used to run a val
 YAML strings or files. The `validate_yaml()` function checks if the YAML configuration
 passes its own validity checks.
 
-yaml_interrogate(yaml: 'Union[str, Path]') -> 'Validate'
+yaml_interrogate(yaml: 'Union[str, Path]', set_tbl: 'Union[FrameT, Any, None]' = None) -> 'Validate'
 Execute a YAML-based validation workflow.
 
 This is the main entry point for YAML-based validation workflows. It takes YAML configuration

@@ -9813,13 +9813,20 @@ Execute a YAML-based validation workflow.
 yaml
     YAML configuration as string or file path. Can be: (1) a YAML string containing the
     validation configuration, or (2) a Path object or string path to a YAML file.
+set_tbl
+    An optional table to override the table specified in the YAML configuration. This allows you
+    to apply a YAML-defined validation workflow to a different table than what's specified in
+    the configuration. If provided, this table will replace the table defined in the YAML's
+    `tbl` field before executing the validation workflow. This can be any supported table type
+    including DataFrame objects, Ibis table objects, CSV file paths, Parquet file paths, GitHub
+    URLs, or database connection strings.
 
 Returns
 -------
 Validate
-    An instance of the `Validate` class that has been configured based on the YAML input.
-    This object contains the results of the validation steps defined in the YAML configuration.
-    It includes metadata like table name, label, language, and thresholds if specified.
+    An instance of the `Validate` class that has been configured based on the YAML input. This
+    object contains the results of the validation steps defined in the YAML configuration. It
+    includes metadata like table name, label, language, and thresholds if specified.
 
 Raises
 ------

@@ -9918,6 +9925,44 @@ Execute a YAML-based validation workflow.
 This approach is particularly useful for storing validation configurations as part of your data
 pipeline or version control system, allowing you to maintain validation rules alongside your
 code.
+
+### Using `set_tbl=` to Override the Table
+
+The `set_tbl=` parameter allows you to override the table specified in the YAML configuration.
+This is useful when you have a template validation workflow but want to apply it to different
+tables:
+
+```python
+import polars as pl
+
+# Create a test table with similar structure to small_table
+test_table = pl.DataFrame({
+    "date": ["2023-01-01", "2023-01-02", "2023-01-03"],
+    "a": [1, 2, 3],
+    "b": ["1-abc-123", "2-def-456", "3-ghi-789"],
+    "d": [150, 200, 250]
+})
+
+# Use the same YAML config but apply it to our test table
+yaml_config = '''
+tbl: small_table  # This will be overridden
+tbl_name: Test Table  # This name will be used
+steps:
+- col_exists:
+    columns: [date, a, b, d]
+- col_vals_gt:
+    columns: [d]
+    value: 100
+'''
+
+# Execute with table override
+result = pb.yaml_interrogate(yaml_config, set_tbl=test_table)
+print(f"Validation applied to: {result.tbl_name}")
+result
+```
+
+This feature makes YAML configurations more reusable and flexible, allowing you to define
+validation logic once and apply it to multiple similar tables.
 
 
 validate_yaml(yaml: 'Union[str, Path]') -> 'None'

0 commit comments
