Catalog / Schema quoting issue #297

Open
1 of 3 tasks
samuhepp opened this issue Apr 29, 2024 · 0 comments
Labels
bug Something isn't working triage

Describe the bug

Hello!

While experimenting with this package, I found what looks like an issue with the way it constructs relation names, specifically with the Databricks adapter.

Given the following sample YAML:

version: 2

sources:
  - name: my-source
    catalog: my-catalog
    schema: my-schema
    tables:
      - name: table1
        external:
          ...
        columns:
          ...

Within Databricks, any name that contains characters other than alphanumerics and underscores has to be quoted with backticks (`). This doesn't happen when I run dbt run-operation stage_external_sources. The error:

09:10:11  Encountered an error while running operation: Runtime Error
  Runtime Error
    
    [INVALID_IDENTIFIER] The identifier dataplatform-sandbox is invalid. Please, consider quoting it with back-quotes as `my-catalog`. SQLSTATE: 42602 (line 3, pos 57)
    
    == SQL ==
    /* {"app": "dbt", "dbt_version": "1.8.0b2", "dbt_databricks_version": "1.8.0b2", "databricks_sql_connector_version": "3.1.2", "profile_name": "dev", "target_name": "dev", "connection_name": "macro_stage_external_sources"} */
    
                     create schema if not exists my-catalog.my-schema
    -----------------------------------------------^^^

(It's complaining about the hyphen.)
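
To illustrate the quoting rule described above, here is a minimal Python sketch. It is not the package's or the adapter's actual code: the helper names `needs_quoting`, `quote_identifier`, and `qualified_schema` are hypothetical, and the rule (quote anything that is not purely letters, digits, and underscores) is taken from the description in this issue rather than from the Databricks grammar.

```python
import re

# Assumption: an identifier is safe unquoted only if it consists of
# letters, digits, and underscores (the rule as described in this issue).
_SAFE_IDENTIFIER = re.compile(r"[A-Za-z0-9_]+")


def needs_quoting(identifier: str) -> bool:
    """Return True when the identifier contains characters outside [A-Za-z0-9_]."""
    return _SAFE_IDENTIFIER.fullmatch(identifier) is None


def quote_identifier(identifier: str) -> str:
    """Wrap the identifier in backticks, doubling any embedded backtick."""
    return "`" + identifier.replace("`", "``") + "`"


def qualified_schema(catalog: str, schema: str) -> str:
    """Build catalog.schema, quoting only the parts that need it."""
    parts = [quote_identifier(p) if needs_quoting(p) else p for p in (catalog, schema)]
    return ".".join(parts)


# Hypothetical rendering of the statement that failed above.
print("create schema if not exists " + qualified_schema("my-catalog", "my-schema"))
```

With the names from the sample YAML, this prints a statement in which both hyphenated parts are wrapped in backticks, which is the form the error message asks for.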

I've attempted to quote this myself via the following config:

version: 2

sources:
  - name: my-source
    catalog: "`my-catalog`"
    schema: "`my-schema`"
    tables:
      - name: table1
        external:
          ...
        columns:
          ...

This seems to get past the first error, but it fails later because the identifiers are then quoted twice:

    [PARSE_SYNTAX_ERROR] Syntax error at or near 'my'. SQLSTATE: 42601 (line 5, pos 31)
    
    == SQL ==
    /* {"app": "dbt", "dbt_version": "1.8.0b2", "dbt_databricks_version": "1.8.0b2", "databricks_sql_connector_version": "3.1.2", "profile_name": "dev", "target_name": "dev", "connection_name": "macro_stage_external_sources"} */
    
                     
        
            drop table if exists ``my-catalog``.``my-schema``.`raw_adis`
    -------------------------------^^^

(Note the doubled backticks.)
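
The double quoting can be reproduced with a small Python sketch. This is an assumption about the failure mode, not the package's code: `naive_quote` and `quote_once` are hypothetical helpers showing that wrapping an already backticked value again yields the doubled backticks seen above, while checking for an existing surrounding pair first makes the quoting idempotent.

```python
def naive_quote(identifier: str) -> str:
    """Always wrap in backticks, regardless of any existing quoting."""
    return f"`{identifier}`"


def quote_once(identifier: str) -> str:
    """Wrap in backticks unless the value is already wrapped in one pair."""
    already_quoted = (
        len(identifier) >= 2
        and identifier.startswith("`")
        and identifier.endswith("`")
    )
    return identifier if already_quoted else f"`{identifier}`"


user_value = "`my-catalog`"  # the value as written in the YAML workaround

print(naive_quote(user_value))   # doubled backticks, as in the failing SQL
print(quote_once(user_value))    # left unchanged
print(quote_once("my-catalog"))  # quoted exactly once
```

Either the package quoting consistently in every statement or an idempotent helper along these lines would avoid both errors; which approach fits the package is for the maintainers to decide.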

Steps to reproduce

Expected results

The catalog and schema names in the generated create schema if not exists statement should be quoted.

Actual results

Fails when running dbt run-operation stage_external_sources

Screenshots and log output

System information

The contents of your packages.yml file:

Which database are you using dbt with?

  • redshift
  • snowflake
  • other (specify: databricks)

The output of dbt --version:

Core:
  - installed: 1.8.0-b2
  - latest:    1.7.13   - Ahead of latest version!

Plugins:
  - databricks: 1.8.0b2 - Ahead of latest version!
  - spark:      1.8.0b2 - Ahead of latest version!

The operating system you're using:

The output of python --version:

Python 3.10.12

Additional context
