
persistent issue with the fivetran_connector_schema_config issue #310

Open
jakerohald opened this issue May 15, 2024 · 23 comments
Assignees
Labels
bug Something isn't working question Further information is requested

Comments

@jakerohald

I'm currently addressing the challenges posed by the fivetran_terraform_config_schema updates from the 1.1.18 release. Our company manages roughly 700 connections, including numerous database shards, which makes the current fix considerably impractical for our scale.

The recent discussions around the new release noted that the fix was successfully applied to just six connectors. However, with many connections, each hosting several schemas (typically around five), and each schema featuring a substantial number of nested column fields (between 10 and 20), this approach is not viable in its current form.

@jakerohald jakerohald added the bug Something isn't working label May 15, 2024
@beevital
Collaborator

@jakerohald have you tried the schemas_json field with the file-based approach?
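For reference, the file-based approach mentioned here could look like the following (a minimal sketch; the resource name, connector ID, and file path are placeholders, not taken from this thread):

```hcl
# Sketch of the schemas_json + file() approach. The connector ID and
# file path below are placeholders.
resource "fivetran_connector_schema_config" "example" {
  connector_id           = "my_connector_id"
  schema_change_handling = "BLOCK_ALL"

  # Keeping the large schema definition in a separate JSON file keeps
  # the .tf files manageable for connectors with many tables/columns.
  schemas_json = file("${path.module}/schemas/github.json")
}
```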

@beevital
Collaborator

Or are you talking about the overall complexity of the settings format?
It's tied to the current API contract, so we have a lot of restrictions.
What exact problems are you facing now? Could you please share an example of an impractical case?

@beevital beevital added question Further information is requested and removed bug Something isn't working labels May 15, 2024
@beevital beevital self-assigned this May 15, 2024
@beevital
Collaborator

We are currently thinking about enhancements to the existing approach. Do you have a proposal for how it would be better to define schema settings on large connectors (with several huge schemas)?

@jakerohald
Author

Hey @beevital - I have tried the schemas_config.

Happy to share what I mean by 'impractical case'.

When I say impractical, I mean the following: we have multiple DBs, each of which is sharded, so we split the connector. Taking one example: one of our SQL Server databases has 8 shards, each containing 6 schemas, which contain 8, 6, 1, 1, 1 and 12 tables respectively, and each table contains between 6 and 20 fields. This doesn't seem to scale (reflected by the fact that the fivetran_connector_schema_config state refresh takes an impractical amount of time for a single schema).

Thanks.

@beevital
Collaborator

beevital commented May 16, 2024

fivetran_connector_schema_config state refresh takes an impractical amount of time for a single schema

Could you share some logs so I can investigate this on the API side?

It should not reload the schema on each refresh, and it should work pretty fast after the first apply.

@beevital
Collaborator

Specifically, I'm interested in the connection_id (connector_id).

@jakerohald
Author

The issue mentioned above has been resolved. The problem was caused by the schema configuration in the state file, which had been deployed using the old schema method. To fix it, the old schema configuration had to be removed and then reapplied, as running a 'refresh' with the old method would cause it to hang. It is worth noting in the documentation that if resources were deployed using the old method, they should be destroyed first before reapplying with the new method.

However, we have discovered another issue that is worth mentioning here. Although the example uses the GitHub connector, it applies to other connectors we tested as well.

For instance, we write the output to a JSON file for the GitHub connector as follows (note: some fields have been removed for simplicity in this demonstration):

{
  "github": {
    "enabled": true,
    "tables": {
      "deployment": {
        "columns": {
          "createdAt": {
            "enabled": true
          },
          "description": {
            "enabled": true
          }
        },
        "enabled": true,
        "history_mode": false
      }
    }
  }
}

We then deploy our resource using the 'BLOCK_ALL' schema change handling and it produces the following apply plan:

 ~ id                     = "ihe_theorists" -> (known after apply)
  ~ schemas_json           = jsonencode(
      ~ {
          ~ github = {
              ~ enabled = "true" -> true (note: we never deployed using "true" as a string - indicating a potential parsing issue?)
              ~ tables  = {
                  ~ deployment = {
                      + columns      = {
                          + createdAt   = {
                              + enabled = true
                            }
                          + description = {
                              + enabled = true
                            }
                        }
                      ~ enabled      = "true" -> true
                      + history_mode = false
                    }
                }
            }
        }
    )
    # (2 unchanged attributes hidden)

    # (1 unchanged block hidden)
}

--
When I actually apply the change, the error we get is:

│ When applying changes to fivetran_connector_schema_config.contract_schema["github"], provider "provider[\"registry.terraform.io/fivetran/fivetran\"]"
│ produced an unexpected new value: .schemas_json: was cty.StringVal("{\n "github": {\n "enabled": true,\n "tables": {\n
│ "deployment": {\n "enabled": true,\n "history_mode": false,\n "columns": {\n "createdAt": {\n "enabled":
│ true\n },\n "description": {\n "enabled": true\n }\n }\n }\n }\n }\n}"), but now
│ cty.StringVal("{"github":{"enabled":"true","tables":{"deployment":{"enabled":"true"}}}}")

We are running the following terraform versions:

Terraform v1.8.2
on darwin_arm64
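One defensive pattern for the "true" -> true churn shown in the plan above (an assumption on my part, not an official fix) is to round-trip the file through jsondecode/jsonencode so Terraform always writes canonically typed JSON:

```hcl
# Hypothetical workaround: decode and re-encode the JSON file so that
# booleans reach the state as true/false rather than "true"/"false".
locals {
  github_schema = jsondecode(file("${path.module}/schemas/github.json"))
}

resource "fivetran_connector_schema_config" "contract_schema" {
  connector_id           = "my_connector_id"
  schema_change_handling = "BLOCK_ALL"
  schemas_json           = jsonencode(local.github_schema)
}
```

Note this only normalises the encoding on the Terraform side; it would not stop the API response from dropping parts of the config.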

@beevital
Collaborator

Yeah, understood - the github connector returns columns as well, but the original config doesn't contain them.
Have you tried the schemas field, which is map-based? It should not produce the same issues as schemas_json, but it is a bit slower.

I'll try to figure out how we could manage schemas_json better.
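A sketch of the map-based schemas field suggested here, mirroring the github example from earlier in the thread (the attribute shape is assumed from the provider docs; verify it against your provider version):

```hcl
resource "fivetran_connector_schema_config" "example" {
  connector_id           = "my_connector_id"
  schema_change_handling = "BLOCK_ALL"

  # Map-based alternative to schemas_json; slower, but the provider can
  # diff it field by field instead of comparing raw JSON strings.
  schemas = {
    "github" = {
      enabled = true
      tables = {
        "deployment" = {
          enabled = true
          columns = {
            "createdAt"   = { enabled = true }
            "description" = { enabled = true }
          }
        }
      }
    }
  }
}
```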

@beevital beevital added the bug Something isn't working label May 21, 2024
@jakerohald
Author

jakerohald commented May 21, 2024

@beevital what do you mean by github connector returns columns as well, but the original config doesn't contain it - how does that relate to the new issue outlined above?

@beevital
Collaborator

@jakerohald it's just my thinking about what is happening; I know how it works on the API side and inside the provider.
The API returns a response that contains additional elements, so there's a difference between the expected value and what is actually returned (one contains columns as well).

Also - I'm wondering how you passed "history_mode": false?
That field will simply be ignored by the API. You need to use "sync_mode": "HISTORY" instead.
Please refer to the API docs (the schemas_json field expects JSON compatible with the API payload).
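Assuming sync_mode is a table-level property as described in the payload docs, the earlier JSON would become something like this (a sketch, not verified against the API):

```json
{
  "github": {
    "enabled": true,
    "tables": {
      "deployment": {
        "enabled": true,
        "sync_mode": "HISTORY",
        "columns": {
          "createdAt": { "enabled": true },
          "description": { "enabled": true }
        }
      }
    }
  }
}
```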

@beevital
Collaborator

See https://fivetran.com/docs/rest-api/connectors#payloadparameters_6

@jakerohald
Author

As you said, this won't affect the deployment, but we went ahead and fixed it up.

We have narrowed down the source of the issue when we deploy the resource:

{
  "github_XXX": {
    "enabled": true,
    "sync_mode": "HISTORY",
    "tables": {
      "deployment": {
        "enabled": true,
        "columns": {
          "createdAt": {
            "enabled": true/false  # this line breaks it. Leaving this line blank lets the resource deploy successfully - although leaving it blank does nothing for the column itself (no default behaviour)
          }
        }
      },
      "team": {
        "enabled": true
      }
    }
  }
}

Happy to hear your subsequent thoughts.

@beevital
Collaborator

As I said, it's a bug: I have to update the way we compare the value of schemas_json stored in state with the one we receive in the API response after applying.

@jakerohald
Author

Hi @beevital, any update on the timeline for a possible fix? Thanks very much :)

@beevital
Collaborator

@jakerohald I'm not sure whether the issues you're facing were actually fixed in v1.2.0 - could you please try it and report back on the provider behaviour?

@jakerohald
Author

Sure. @beevital it works with the old schema config block, but I get the same error as above for schemas and schemas_json.

@beevital
Collaborator

Okay, I will take a deeper look into it.

@jakerohald
Author

Hi @beevital, just wanted to check whether any thought has been given to this. I just tested with BLOCK_ALL on provider version 1.2.6 and no fix is in - same problem.

Was wondering if this can be looked at again as it would help unblock all resources created by BLOCK_ALL.

@beevital
Collaborator

Nope, unfortunately I had no chance to dedicate any time to it =(
We will try to prioritise this issue and investigate it soon.

@jakerohald
Author

Thanks very much!

@jakerohald
Author

@fivetran-jovanmanojlovic can I clarify if the intended fix for this was in version 1.2.8?

If so, I have tested it and it still produces some problems on apply; happy to share logs if that's the case.

@jakerohald
Author

@beevital is there any update re the comment above? Thanks!

@peyyero

peyyero commented Sep 12, 2024

Hi @fivetran-jovanmanojlovic,

I noticed that this issue has been assigned to you and am inquiring about the progress made in resolving it. My colleague Jake has been in contact with @beevital about this, but we would like clarification on its priority.

This specific matter has been causing significant challenges for us, and I am hopeful that it can be prioritised accordingly.

I am available to offer further information if needed.

Thank you.
