-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix queryId calculation for queries with grouping clause #665
Conversation
Problem description: Following warning messages appear "WARNING: unrecognized node type: 925" "WARNING: unrecognized node type: 924" if 'pg_stat_statements' extension is enabled and query, containing GROUP BY clause or GROUPING function, is executed. Cause: In JumbleExpr, called from JumbleQuery, there is no handling for node tags T_GroupingClause and T_GroupingFunc. Fix: Added handling logic for missed node tags. For T_GroupingFunc we append to the jumble the tag itself and list of arguments. Field 'ngrpcols' is not appended to the jumble, because it can be deduced from the list of arguments in grouping clause. Plus added handling for T_Integer tag, because it is required to parse list of arguments. For T_GroupingClause we append to the jumble the tag itself, grouping type and list of groupsets. Field 'location' is not appended to the jumble, because it is the textual location from parser and is not semantically significant.
Allure report https://allure-ee.adsw.io/launch/60788 |
Failed job Resource group isolation tests on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/909714 |
Failed job Resource group isolation tests on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/909715 |
If enabled, pg_stat_statements may cause warnings in case jumbling logic doesn't handle some part of a query. This simple test will check that no warnings are generated on different queries. Currently added 1 query with GROUP clause and GROUPING function, as it recently caused such issues. This test later may be improved by adding more test queries.
Allure report https://allure-ee.adsw.io/launch/60873 |
Failed job Resource group isolation tests on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/915047 |
Failed job Resource group isolation tests on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/915048 |
Allure report https://allure-ee.adsw.io/launch/60887 |
Failed job Resource group isolation tests on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/915839 |
Failed job Behave tests on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/917776 |
Failed job Behave tests on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/920047 |
Modified the pg_stat_statements test to enable the pg_stat_statements extension automatically at the beginning of the test, and disable the extension once the test is complete. Also added minor refactoring of recently added code in JumbleExpr - moved all new code to the end of the switch case in order to simplify the patch.
Allure report https://allure-ee.adsw.io/launch/61084 |
Failed job Resource group isolation tests on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/928426 |
Failed job Resource group isolation tests on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/928427 |
Failed job Regression tests with ORCA on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/928417 |
Failed job Regression tests with ORCA on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/928525 |
Can you provide a test not only to check for the absence of warnings, but also that the query text in pg_stat_statements differs with and without a patch (if possible). To check the correctness of the logic implementation. |
it seems to me that the problem is more that queryId is calculated incorrectly (for example, the |
With this update the test for pg_stat_statements verifies that queryId for queries with semantically different GROUP BY options are different (thus these queries reside in seperate entries of pg_stat_statements statistics).
Allure report https://allure-ee.adsw.io/launch/61416 |
Failed job Resource group isolation tests on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/949680 |
Failed job Resource group isolation tests on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/949681 |
Allure report https://allure-ee.adsw.io/launch/62408 |
Failed job Resource group isolation tests on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/992650 |
Failed job Resource group isolation tests on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/992651 |
Failed job Behave tests on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/992648 |
Allure report https://allure-ee.adsw.io/launch/62478 |
Failed job Resource group isolation tests on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/996082 |
Failed job Resource group isolation tests on ppc64le: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/996083 |
Failed job Regression tests with ORCA on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/996076 |
Failed job Behave tests on x86_64: https://gitlab.adsw.io/arenadata/github_mirroring/gpdb/-/jobs/996080 |
Problem description: if 'pg_stat_statements' extension is enabled and a query containing GROUP BY clause with ROLLUP, CUBE or GROUPING SETS or with GROUPING function is executed, queryId calculation doesn't take into account these grouping clause parameters and GROUPING functions. So, semantically different queries are treated by pg_stat_statements as equivalent queries. Plus warning messages appear during jumbling. Also, jumbling logic doesn't handle properly queries with group_id() function call and calls of functions with 'anytable' parameter. First type of queries causes warnings. Second type of query causes error message "unrecognized RTE kind: 7". Expected correct behavior: queries with different grouping clauses or different usage of the GROUPING function are treated by pg_stat_statements as different queries, and no warning messages appear during the query execution. Cause: In JumbleExpr there is no handling for node tags T_GroupingClause, T_GroupingFunc, T_GroupId and T_TableValueExpr. Jumble hashing is used as queryId. Error message "unrecognized RTE kind: 7" caused by not handled range table type RTE_TABLEFUNCTION in JumbleRangeTable. Fix: Handling logic for missed node tags was added. According to comments in queryjumble.c, main guideline how to handle query tree is: "Rule of thumb for what to include is that we should ignore anything not semantically significant (such as alias names) as well as anything that can be deduced from child nodes (else we'd just be double- hashing that piece of information)." For T_GroupingFunc we append to the jumble the list of arguments. Field 'ngrpcols' is not appended to the jumble, because it can be deduced from the list of groupsets in grouping clause. 'ngrpcols' is the number of unique grouping attributes in grouping clause. Equivalent queries must have the same groupsets in grouping clause. Thus they will have the same 'ngrpcols'. So, adding 'ngrpcols' to jumble is redundant. Plus handling for the T_Integer tag was added, because it is required to parse the list of arguments. List of arguments (field 'args' of structure GroupingFunc) is a list of T_Integer. The T_Integer element is the index of GROUPING function parameter inside the array of unique grouping attributes from grouping clause. We need to jumble value of this T_Integer, because changing parameter in GROUPING function changes this index (and definitely changes semantic of the query). For T_GroupingClause we append to the jumble the grouping type and list of groupsets. Field 'location' is not appended to the jumble because it is the textual location from parser and is not semantically significant. For T_GroupId - this tag was added to switch case inside JumbleExpr to suppress warning. The node tag for it was already handled. Struct GroupId doesn't have any additional fields, that could be added to jumble. For RTE_TABLEFUNCTION - 'functions' field of RangeTblEntry was added to the jumble. RangeTblEntry in case of RTE_TABLEFUNCTION uses 'functions' and 'subquery' fields (other significant fields are null). But 'subquery' is duplicated in TableValueExpr below, so do not jumble it. For T_TableValueExpr - 'subquery' field of TableValueExpr (the subquery that is inside "TABLE()" statement) was added to the jumble. Changes from original commit: 1. Cases for T_GroupId and T_GroupingFunc are already present, no fix needed. 2. Case for T_Integer is not necessary since GPDB 7 uses T_IntList, removed. 3. T_GroupingClause is now named T_GroupingSet. Its case was not correct in GPDB 7, changed to match implementation from GPDB 6. 4. Test renamed to gp_pg_stat_statements, since there were already tests present in GPDB 7. 5. Currently the tests for pg_stat_statements do not set up shared_preload_libraries and they don't disable optimizer. Changed the new test to match existing ones, added a comment about optimizer to Makefile. 6. Fixed test output for GPDB 7. Notably, ROLLUP now outputs 1 row even if the table has no rows in it. (cherry picked from commit fa44e50)
Problem description: if 'pg_stat_statements' extension is enabled and a query containing GROUP BY clause with ROLLUP, CUBE or GROUPING SETS or with GROUPING function is executed, queryId calculation doesn't take into account these grouping clause parameters and GROUPING functions. So, semantically different queries are treated by pg_stat_statements as equivalent queries. Plus warning messages appear during jumbling. Also, jumbling logic doesn't handle properly queries with group_id() function call and calls of functions with 'anytable' parameter. First type of queries causes warnings. Second type of query causes error message "unrecognized RTE kind: 7". Expected correct behavior: queries with different grouping clauses or different usage of the GROUPING function are treated by pg_stat_statements as different queries, and no warning messages appear during the query execution. Cause: In JumbleExpr there is no handling for node tags T_GroupingClause, T_GroupingFunc, T_GroupId and T_TableValueExpr. Jumble hashing is used as queryId. Error message "unrecognized RTE kind: 7" caused by not handled range table type RTE_TABLEFUNCTION in JumbleRangeTable. Fix: Handling logic for missed node tags was added. According to comments in queryjumble.c, main guideline how to handle query tree is: "Rule of thumb for what to include is that we should ignore anything not semantically significant (such as alias names) as well as anything that can be deduced from child nodes (else we'd just be double- hashing that piece of information)." For T_GroupingFunc we append to the jumble the list of arguments. Field 'ngrpcols' is not appended to the jumble, because it can be deduced from the list of groupsets in grouping clause. 'ngrpcols' is the number of unique grouping attributes in grouping clause. Equivalent queries must have the same groupsets in grouping clause. Thus they will have the same 'ngrpcols'. So, adding 'ngrpcols' to jumble is redundant. Plus handling for the T_Integer tag was added, because it is required to parse the list of arguments. List of arguments (field 'args' of structure GroupingFunc) is a list of T_Integer. The T_Integer element is the index of GROUPING function parameter inside the array of unique grouping attributes from grouping clause. We need to jumble value of this T_Integer, because changing parameter in GROUPING function changes this index (and definitely changes semantic of the query). For T_GroupingClause we append to the jumble the grouping type and list of groupsets. Field 'location' is not appended to the jumble because it is the textual location from parser and is not semantically significant. For T_GroupId - this tag was added to switch case inside JumbleExpr to suppress warning. The node tag for it was already handled. Struct GroupId doesn't have any additional fields, that could be added to jumble. For RTE_TABLEFUNCTION - 'functions' field of RangeTblEntry was added to the jumble. RangeTblEntry in case of RTE_TABLEFUNCTION uses 'functions' and 'subquery' fields (other significant fields are null). But 'subquery' is duplicated in TableValueExpr below, so do not jumble it. For T_TableValueExpr - 'subquery' field of TableValueExpr (the subquery that is inside "TABLE()" statement) was added to the jumble. Changes from original commit: 1. Cases for T_GroupId and T_GroupingFunc are already present, no fix needed. 2. Case for T_Integer is not necessary since GPDB 7 uses T_IntList, removed. 3. T_GroupingClause is now named T_GroupingSet. Its case was not correct in GPDB 7, changed to match implementation from GPDB 6. 4. Test renamed to jumble, since there were already tests present in GPDB 7. 5. The tests in pg_stat_statements were disabled because they do not set up shared_preload_libraries and they don't disable optimizer. Added files enable_pg_stat_statements and disable_pg_stat_statements to adapt pg_stat_statements tests to current testing utilities. This allowed to remove some initialization and cleanup from the new test. 6. Fixed test output for GPDB 7. Notably, ROLLUP now outputs 1 row even if the table has no rows in it. (cherry picked from commit fa44e50)
Problem description: if 'pg_stat_statements' extension is enabled and a query containing GROUP BY clause with ROLLUP, CUBE or GROUPING SETS or with GROUPING function is executed, queryId calculation doesn't take into account these grouping clause parameters and GROUPING functions. So, semantically different queries are treated by pg_stat_statements as equivalent queries. Plus warning messages appear during jumbling. Also, jumbling logic doesn't handle properly queries with group_id() function call and calls of functions with 'anytable' parameter. First type of queries causes warnings. Second type of query causes error message "unrecognized RTE kind: 7". Expected correct behavior: queries with different grouping clauses or different usage of the GROUPING function are treated by pg_stat_statements as different queries, and no warning messages appear during the query execution. Cause: In JumbleExpr there is no handling for node tags T_GroupingClause, T_GroupingFunc, T_GroupId and T_TableValueExpr. Jumble hashing is used as queryId. Error message "unrecognized RTE kind: 7" caused by not handled range table type RTE_TABLEFUNCTION in JumbleRangeTable. Fix: Handling logic for missed node tags was added. According to comments in queryjumble.c, main guideline how to handle query tree is: "Rule of thumb for what to include is that we should ignore anything not semantically significant (such as alias names) as well as anything that can be deduced from child nodes (else we'd just be double- hashing that piece of information)." For T_GroupingFunc we append to the jumble the list of arguments. Field 'ngrpcols' is not appended to the jumble, because it can be deduced from the list of groupsets in grouping clause. 'ngrpcols' is the number of unique grouping attributes in grouping clause. Equivalent queries must have the same groupsets in grouping clause. Thus they will have the same 'ngrpcols'. So, adding 'ngrpcols' to jumble is redundant. Plus handling for the T_Integer tag was added, because it is required to parse the list of arguments. List of arguments (field 'args' of structure GroupingFunc) is a list of T_Integer. The T_Integer element is the index of GROUPING function parameter inside the array of unique grouping attributes from grouping clause. We need to jumble value of this T_Integer, because changing parameter in GROUPING function changes this index (and definitely changes semantic of the query). For T_GroupingClause we append to the jumble the grouping type and list of groupsets. Field 'location' is not appended to the jumble because it is the textual location from parser and is not semantically significant. For T_GroupId - this tag was added to switch case inside JumbleExpr to suppress warning. The node tag for it was already handled. Struct GroupId doesn't have any additional fields, that could be added to jumble. For RTE_TABLEFUNCTION - 'functions' field of RangeTblEntry was added to the jumble. RangeTblEntry in case of RTE_TABLEFUNCTION uses 'functions' and 'subquery' fields (other significant fields are null). But 'subquery' is duplicated in TableValueExpr below, so do not jumble it. For T_TableValueExpr - 'subquery' field of TableValueExpr (the subquery that is inside "TABLE()" statement) was added to the jumble. Changes from original commit: 1. Cases for T_GroupId and T_GroupingFunc are already present, no fix needed. 2. Case for T_Integer is not necessary since GPDB 7 uses T_IntList, removed. 3. T_GroupingClause is now named T_GroupingSet. Its case was not correct in GPDB 7, changed to match implementation from GPDB 6. 4. Test renamed to jumble, since there were already tests present in GPDB 7. 5. The tests in pg_stat_statements were disabled because they do not set up shared_preload_libraries and they don't disable optimizer. Added files enable_pg_stat_statements and disable_pg_stat_statements to adapt pg_stat_statements tests to current testing utilities. This allowed to remove some initialization and cleanup from the new test. 6. Fixed test output for GPDB 7. Notably, ROLLUP now outputs 1 row even if the table has no rows in it. (cherry picked from commit fa44e50)
Problem description: if 'pg_stat_statements' extension is enabled and a query containing GROUP BY clause with ROLLUP, CUBE or GROUPING SETS or with GROUPING function is executed, queryId calculation doesn't take into account these grouping clause parameters and GROUPING functions. So, semantically different queries are treated by pg_stat_statements as equivalent queries. Plus warning messages appear during jumbling. Also, jumbling logic doesn't handle properly queries with group_id() function call and calls of functions with 'anytable' parameter. First type of queries causes warnings. Second type of query causes error message "unrecognized RTE kind: 7". Expected correct behavior: queries with different grouping clauses or different usage of the GROUPING function are treated by pg_stat_statements as different queries, and no warning messages appear during the query execution. Cause: In JumbleExpr there is no handling for node tags T_GroupingClause, T_GroupingFunc, T_GroupId and T_TableValueExpr. Jumble hashing is used as queryId. Error message "unrecognized RTE kind: 7" caused by not handled range table type RTE_TABLEFUNCTION in JumbleRangeTable. Fix: Handling logic for missed node tags was added. According to comments in queryjumble.c, main guideline how to handle query tree is: "Rule of thumb for what to include is that we should ignore anything not semantically significant (such as alias names) as well as anything that can be deduced from child nodes (else we'd just be double- hashing that piece of information)." For T_GroupingFunc we append to the jumble the list of arguments. Field 'ngrpcols' is not appended to the jumble, because it can be deduced from the list of groupsets in grouping clause. 'ngrpcols' is the number of unique grouping attributes in grouping clause. Equivalent queries must have the same groupsets in grouping clause. Thus they will have the same 'ngrpcols'. So, adding 'ngrpcols' to jumble is redundant. Plus handling for the T_Integer tag was added, because it is required to parse the list of arguments. List of arguments (field 'args' of structure GroupingFunc) is a list of T_Integer. The T_Integer element is the index of GROUPING function parameter inside the array of unique grouping attributes from grouping clause. We need to jumble value of this T_Integer, because changing parameter in GROUPING function changes this index (and definitely changes semantic of the query). For T_GroupingClause we append to the jumble the grouping type and list of groupsets. Field 'location' is not appended to the jumble because it is the textual location from parser and is not semantically significant. For T_GroupId - this tag was added to switch case inside JumbleExpr to suppress warning. The node tag for it was already handled. Struct GroupId doesn't have any additional fields, that could be added to jumble. For RTE_TABLEFUNCTION - 'functions' field of RangeTblEntry was added to the jumble. RangeTblEntry in case of RTE_TABLEFUNCTION uses 'functions' and 'subquery' fields (other significant fields are null). But 'subquery' is duplicated in TableValueExpr below, so do not jumble it. For T_TableValueExpr - 'subquery' field of TableValueExpr (the subquery that is inside "TABLE()" statement) was added to the jumble. Changes from original commit: 1. Cases for T_GroupId and T_GroupingFunc are already present, no fix needed. 2. Case for T_Integer is not necessary since GPDB 7 uses T_IntList, removed. 3. T_GroupingClause is now named T_GroupingSet. Its case was not correct in GPDB 7, changed to match implementation from GPDB 6. 4. Test renamed to jumble, since there were already tests present in GPDB 7. 5. The tests in pg_stat_statements were disabled because they do not set up shared_preload_libraries and they don't disable optimizer. Added files enable_pg_stat_statements and disable_pg_stat_statements to adapt pg_stat_statements tests to current testing utilities. This allowed to remove some initialization and cleanup from the new test. 6. Fixed test output for GPDB 7. Notably, ROLLUP now outputs 1 row even if the table has no rows in it. (cherry picked from commit fa44e50)
Problem description: if 'pg_stat_statements' extension is enabled and a query containing GROUP BY clause with ROLLUP, CUBE or GROUPING SETS or with GROUPING function is executed, queryId calculation doesn't take into account these grouping clause parameters and GROUPING functions. So, semantically different queries are treated by pg_stat_statements as equivalent queries. Plus warning messages appear during jumbling. Also, jumbling logic doesn't handle properly queries with group_id() function call and calls of functions with 'anytable' parameter. First type of queries causes warnings. Second type of query causes error message "unrecognized RTE kind: 7". Expected correct behavior: queries with different grouping clauses or different usage of the GROUPING function are treated by pg_stat_statements as different queries, and no warning messages appear during the query execution. Cause: In JumbleExpr there is no handling for node tags T_GroupingClause, T_GroupingFunc, T_GroupId and T_TableValueExpr. Jumble hashing is used as queryId. Error message "unrecognized RTE kind: 7" caused by not handled range table type RTE_TABLEFUNCTION in JumbleRangeTable. Fix: Handling logic for missed node tags was added. According to comments in queryjumble.c, main guideline how to handle query tree is: "Rule of thumb for what to include is that we should ignore anything not semantically significant (such as alias names) as well as anything that can be deduced from child nodes (else we'd just be double- hashing that piece of information)." For T_GroupingFunc we append to the jumble the list of arguments. Field 'ngrpcols' is not appended to the jumble, because it can be deduced from the list of groupsets in grouping clause. 'ngrpcols' is the number of unique grouping attributes in grouping clause. Equivalent queries must have the same groupsets in grouping clause. Thus they will have the same 'ngrpcols'. So, adding 'ngrpcols' to jumble is redundant. Plus handling for the T_Integer tag was added, because it is required to parse the list of arguments. List of arguments (field 'args' of structure GroupingFunc) is a list of T_Integer. The T_Integer element is the index of GROUPING function parameter inside the array of unique grouping attributes from grouping clause. We need to jumble value of this T_Integer, because changing parameter in GROUPING function changes this index (and definitely changes semantic of the query). For T_GroupingClause we append to the jumble the grouping type and list of groupsets. Field 'location' is not appended to the jumble because it is the textual location from parser and is not semantically significant. For T_GroupId - this tag was added to switch case inside JumbleExpr to suppress warning. The node tag for it was already handled. Struct GroupId doesn't have any additional fields, that could be added to jumble. For RTE_TABLEFUNCTION - 'functions' field of RangeTblEntry was added to the jumble. RangeTblEntry in case of RTE_TABLEFUNCTION uses 'functions' and 'subquery' fields (other significant fields are null). But 'subquery' is duplicated in TableValueExpr below, so do not jumble it. For T_TableValueExpr - 'subquery' field of TableValueExpr (the subquery that is inside "TABLE()" statement) was added to the jumble. Changes from original commit: 1. Cases for T_GroupId and T_GroupingFunc are already present, no fix needed. 2. Case for T_Integer is not necessary since GPDB 7 uses T_IntList, removed. 3. T_GroupingClause is now named T_GroupingSet. Its case was not correct in GPDB 7, changed to match implementation from GPDB 6. 4. Test renamed to jumble, since there were already tests present in GPDB 7. 5. The tests in pg_stat_statements were disabled because they do not set up shared_preload_libraries and they don't disable optimizer. Added files enable_pg_stat_statements and disable_pg_stat_statements to adapt pg_stat_statements tests to current testing utilities. This allowed to remove some initialization and cleanup from the new test. 6. Fixed test output for GPDB 7. Notably, ROLLUP now outputs 1 row even if the table has no rows in it. (cherry picked from commit fa44e50)
Problem description: if 'pg_stat_statements' extension is enabled and a query containing GROUP BY clause with ROLLUP, CUBE or GROUPING SETS or with GROUPING function is executed, queryId calculation doesn't take into account these grouping clause parameters and GROUPING functions. So, semantically different queries are treated by pg_stat_statements as equivalent queries. Plus warning messages appear during jumbling. Also, jumbling logic doesn't handle properly queries with group_id() function call and calls of functions with 'anytable' parameter. First type of queries causes warnings. Second type of query causes error message "unrecognized RTE kind: 7". Expected correct behavior: queries with different grouping clauses or different usage of the GROUPING function are treated by pg_stat_statements as different queries, and no warning messages appear during the query execution. Cause: In JumbleExpr there is no handling for node tags T_GroupingClause, T_GroupingFunc, T_GroupId and T_TableValueExpr. Jumble hashing is used as queryId. Error message "unrecognized RTE kind: 7" caused by not handled range table type RTE_TABLEFUNCTION in JumbleRangeTable. Fix: Handling logic for missed node tags was added. According to comments in queryjumble.c, main guideline how to handle query tree is: "Rule of thumb for what to include is that we should ignore anything not semantically significant (such as alias names) as well as anything that can be deduced from child nodes (else we'd just be double- hashing that piece of information)." For T_GroupingFunc we append to the jumble the list of arguments. Field 'ngrpcols' is not appended to the jumble, because it can be deduced from the list of groupsets in grouping clause. 'ngrpcols' is the number of unique grouping attributes in grouping clause. Equivalent queries must have the same groupsets in grouping clause. Thus they will have the same 'ngrpcols'. So, adding 'ngrpcols' to jumble is redundant. Plus handling for the T_Integer tag was added, because it is required to parse the list of arguments. List of arguments (field 'args' of structure GroupingFunc) is a list of T_Integer. The T_Integer element is the index of GROUPING function parameter inside the array of unique grouping attributes from grouping clause. We need to jumble value of this T_Integer, because changing parameter in GROUPING function changes this index (and definitely changes semantic of the query). For T_GroupingClause we append to the jumble the grouping type and list of groupsets. Field 'location' is not appended to the jumble because it is the textual location from parser and is not semantically significant. For T_GroupId - this tag was added to switch case inside JumbleExpr to suppress warning. The node tag for it was already handled. Struct GroupId doesn't have any additional fields, that could be added to jumble. For RTE_TABLEFUNCTION - 'functions' field of RangeTblEntry was added to the jumble. RangeTblEntry in case of RTE_TABLEFUNCTION uses 'functions' and 'subquery' fields (other significant fields are null). But 'subquery' is duplicated in TableValueExpr below, so do not jumble it. For T_TableValueExpr - 'subquery' field of TableValueExpr (the subquery that is inside "TABLE()" statement) was added to the jumble. Changes from original commit: 1. Cases for T_GroupId and T_GroupingFunc are already present, no fix needed. 2. Case for T_Integer is not necessary since GPDB 7 uses T_IntList, removed. 3. T_GroupingClause is now named T_GroupingSet. Its case was not correct in GPDB 7, changed to match implementation from GPDB 6. 4. Test renamed to jumble, since there were already tests present in GPDB 7. 5. The tests in pg_stat_statements were disabled because they do not set up shared_preload_libraries and they don't disable optimizer. Added files enable_pg_stat_statements and disable_pg_stat_statements to adapt pg_stat_statements tests to current testing utilities. This allowed to remove some initialization and cleanup from the new test. 6. Fixed test output for GPDB 7. Notably, ROLLUP now outputs 1 row even if the table has no rows in it. (cherry picked from commit fa44e50)
Problem description: if 'pg_stat_statements' extension is enabled and a query containing GROUP BY clause with ROLLUP, CUBE or GROUPING SETS or with GROUPING function is executed, queryId calculation doesn't take into account these grouping clause parameters and GROUPING functions. So, semantically different queries are treated by pg_stat_statements as equivalent queries. Plus warning messages appear during jumbling. Also, jumbling logic doesn't handle properly queries with group_id() function call and calls of functions with 'anytable' parameter. First type of queries causes warnings. Second type of query causes error message "unrecognized RTE kind: 7". Expected correct behavior: queries with different grouping clauses or different usage of the GROUPING function are treated by pg_stat_statements as different queries, and no warning messages appear during the query execution. Cause: In JumbleExpr there is no handling for node tags T_GroupingClause, T_GroupingFunc, T_GroupId and T_TableValueExpr. Jumble hashing is used as queryId. Error message "unrecognized RTE kind: 7" caused by not handled range table type RTE_TABLEFUNCTION in JumbleRangeTable. Fix: Handling logic for missed node tags was added. According to comments in queryjumble.c, main guideline how to handle query tree is: "Rule of thumb for what to include is that we should ignore anything not semantically significant (such as alias names) as well as anything that can be deduced from child nodes (else we'd just be double- hashing that piece of information)." For T_GroupingFunc we append to the jumble the list of arguments. Field 'ngrpcols' is not appended to the jumble, because it can be deduced from the list of groupsets in grouping clause. 'ngrpcols' is the number of unique grouping attributes in grouping clause. Equivalent queries must have the same groupsets in grouping clause. Thus they will have the same 'ngrpcols'. So, adding 'ngrpcols' to jumble is redundant. Plus handling for the T_Integer tag was added, because it is required to parse the list of arguments. List of arguments (field 'args' of structure GroupingFunc) is a list of T_Integer. The T_Integer element is the index of GROUPING function parameter inside the array of unique grouping attributes from grouping clause. We need to jumble value of this T_Integer, because changing parameter in GROUPING function changes this index (and definitely changes semantic of the query). For T_GroupingClause we append to the jumble the grouping type and list of groupsets. Field 'location' is not appended to the jumble because it is the textual location from parser and is not semantically significant. For T_GroupId - this tag was added to switch case inside JumbleExpr to suppress warning. The node tag for it was already handled. Struct GroupId doesn't have any additional fields, that could be added to jumble. For RTE_TABLEFUNCTION - 'functions' field of RangeTblEntry was added to the jumble. RangeTblEntry in case of RTE_TABLEFUNCTION uses 'functions' and 'subquery' fields (other significant fields are null). But 'subquery' is duplicated in TableValueExpr below, so do not jumble it. For T_TableValueExpr - 'subquery' field of TableValueExpr (the subquery that is inside "TABLE()" statement) was added to the jumble. Changes from original commit: 1. Cases for T_GroupId and T_GroupingFunc are already present, no fix needed. 2. Remove case for T_Integer because it is not necessary since GPDB 7 uses T_IntList instead. 3. T_GroupingClause is now named T_GroupingSet. Its case was not correct in GPDB 7, change to match implementation from GPDB 6. 4. Rename test to jumble since there were already tests present in GPDB 7. 5. The tests in pg_stat_statements were disabled before because they do not set up shared_preload_libraries and they don't disable optimizer. Add files enable_pg_stat_statements and disable_pg_stat_statements to adapt pg_stat_statements tests to current testing utilities. This allows to remove some initialization and cleanup from the new test. 6. Fix test output for GPDB 7. Notably, ROLLUP now outputs 1 row even if the table has no rows in it. (cherry picked from commit fa44e50)
Problem description: if 'pg_stat_statements' extension is enabled and a query containing GROUP BY clause with ROLLUP, CUBE or GROUPING SETS or with GROUPING function is executed, queryId calculation doesn't take into account these grouping clause parameters and GROUPING functions. So, semantically different queries are treated by pg_stat_statements as equivalent queries. Plus warning messages appear during jumbling. Also, jumbling logic doesn't handle properly queries with group_id() function call and calls of functions with 'anytable' parameter. First type of queries causes warnings. Second type of query causes error message "unrecognized RTE kind: 7". Expected correct behavior: queries with different grouping clauses or different usage of the GROUPING function are treated by pg_stat_statements as different queries, and no warning messages appear during the query execution. Cause: In JumbleExpr there is no handling for node tags T_GroupingClause, T_GroupingFunc, T_GroupId and T_TableValueExpr. Jumble hashing is used as queryId. Error message "unrecognized RTE kind: 7" caused by not handled range table type RTE_TABLEFUNCTION in JumbleRangeTable. Fix: Handling logic for missed node tags was added. According to comments in queryjumble.c, main guideline how to handle query tree is: "Rule of thumb for what to include is that we should ignore anything not semantically significant (such as alias names) as well as anything that can be deduced from child nodes (else we'd just be double- hashing that piece of information)." For T_GroupingFunc we append to the jumble the list of arguments. Field 'ngrpcols' is not appended to the jumble, because it can be deduced from the list of groupsets in grouping clause. 'ngrpcols' is the number of unique grouping attributes in grouping clause. Equivalent queries must have the same groupsets in grouping clause. Thus they will have the same 'ngrpcols'. So, adding 'ngrpcols' to jumble is redundant. Plus handling for the T_Integer tag was added, because it is required to parse the list of arguments. List of arguments (field 'args' of structure GroupingFunc) is a list of T_Integer. The T_Integer element is the index of GROUPING function parameter inside the array of unique grouping attributes from grouping clause. We need to jumble value of this T_Integer, because changing parameter in GROUPING function changes this index (and definitely changes semantic of the query). For T_GroupingClause we append to the jumble the grouping type and list of groupsets. Field 'location' is not appended to the jumble because it is the textual location from parser and is not semantically significant. For T_GroupId - this tag was added to switch case inside JumbleExpr to suppress warning. The node tag for it was already handled. Struct GroupId doesn't have any additional fields, that could be added to jumble. For RTE_TABLEFUNCTION - 'functions' field of RangeTblEntry was added to the jumble. RangeTblEntry in case of RTE_TABLEFUNCTION uses 'functions' and 'subquery' fields (other significant fields are null). But 'subquery' is duplicated in TableValueExpr below, so do not jumble it. For T_TableValueExpr - 'subquery' field of TableValueExpr (the subquery that is inside "TABLE()" statement) was added to the jumble. Changes from original commit: 1. Cases for T_GroupId and T_GroupingFunc are already present, no fix needed. 2. Remove case for T_Integer because it is not necessary since GPDB 7 uses T_IntList instead. 3. T_GroupingClause is now named T_GroupingSet. Its case was not correct in GPDB 7, change to match implementation from GPDB 6. 4. Rename test to jumble since there were already tests present in GPDB 7. 5. The tests in pg_stat_statements were disabled before because they do not set up shared_preload_libraries and they don't disable optimizer. Add files enable_pg_stat_statements and disable_pg_stat_statements to adapt pg_stat_statements tests to current testing utilities. This allows to remove some initialization and cleanup from the new test. 6. Fix test output for GPDB 7. Notably, ROLLUP now outputs 1 row even if the table has no rows in it. (cherry picked from commit fa44e50)
Problem description: if 'pg_stat_statements' extension is enabled and a query containing GROUP BY clause with ROLLUP, CUBE or GROUPING SETS or with GROUPING function is executed, queryId calculation doesn't take into account these grouping clause parameters and GROUPING functions. So, semantically different queries are treated by pg_stat_statements as equivalent queries. Plus warning messages appear during jumbling. Also, jumbling logic doesn't handle properly queries with group_id() function call and calls of functions with 'anytable' parameter. First type of queries causes warnings. Second type of query causes error message "unrecognized RTE kind: 7". Expected correct behavior: queries with different grouping clauses or different usage of the GROUPING function are treated by pg_stat_statements as different queries, and no warning messages appear during the query execution. Cause: In JumbleExpr there is no handling for node tag and T_TableValueExpr and T_GroupingSet is handled incorrectly. Jumble hashing is used as queryId. Error message "unrecognized RTE kind: 7" caused by not handled range table type RTE_TABLEFUNCTION in JumbleRangeTable. Fix: Handling logic for missed node tags was added. According to comments in queryjumble.c, main guideline how to handle query tree is: "Rule of thumb for what to include is that we should ignore anything not semantically significant (such as alias names) as well as anything that can be deduced from child nodes (else we'd just be double- hashing that piece of information)." For T_GroupingSet we append to the jumble the grouping type and list of groupsets. Field 'location' is not appended to the jumble because it is the textual location from parser and is not semantically significant. For RTE_TABLEFUNCTION - 'functions' field of RangeTblEntry was added to the jumble. RangeTblEntry in case of RTE_TABLEFUNCTION uses 'functions' and 'subquery' fields (other significant fields are null). But 'subquery' is duplicated in TableValueExpr below, so do not jumble it. For T_TableValueExpr - 'subquery' field of TableValueExpr (the subquery that is inside "TABLE()" statement) was added to the jumble. Changes from original commit: 1. Cases for T_GroupId and T_GroupingFunc are already present, no fix needed. 2. Remove case for T_Integer because it is not necessary since GPDB 7 uses T_IntList instead. 3. T_GroupingClause is now named T_GroupingSet. Its case was not correct in GPDB 7, change to match implementation from GPDB 6. 4. Rename test to jumble since there were already tests present in GPDB 7. 5. The tests in pg_stat_statements were disabled before because they do not set up shared_preload_libraries and they don't disable optimizer. Add files enable_pg_stat_statements and disable_pg_stat_statements to adapt pg_stat_statements tests to current testing utilities. This allows to remove some initialization and cleanup from the new test. 6. Fix test output for GPDB 7. Notably, ROLLUP now outputs 1 row even if the table has no rows in it. (cherry picked from commit fa44e50)
Problem description: if 'pg_stat_statements' extension is enabled and a query containing GROUP BY clause with ROLLUP, CUBE or GROUPING SETS or with GROUPING function is executed, queryId calculation doesn't take into account these grouping clause parameters and GROUPING functions. So, semantically different queries are treated by pg_stat_statements as equivalent queries. Plus warning messages appear during jumbling. Also, jumbling logic doesn't handle properly queries with group_id() function call and calls of functions with 'anytable' parameter. First type of queries causes warnings. Second type of query causes error message "unrecognized RTE kind: 7". Expected correct behavior: queries with different grouping clauses or different usage of the GROUPING function are treated by pg_stat_statements as different queries, and no warning messages appear during the query execution. Cause: In JumbleExpr there is no handling for node tag and T_TableValueExpr and T_GroupingSet is handled incorrectly. Jumble hashing is used as queryId. Error message "unrecognized RTE kind: 7" caused by not handled range table type RTE_TABLEFUNCTION in JumbleRangeTable. Fix: Handling logic for missed node tags was added. According to comments in queryjumble.c, main guideline how to handle query tree is: "Rule of thumb for what to include is that we should ignore anything not semantically significant (such as alias names) as well as anything that can be deduced from child nodes (else we'd just be double- hashing that piece of information)." For T_GroupingSet we append to the jumble the grouping type and list of groupsets. Field 'location' is not appended to the jumble because it is the textual location from parser and is not semantically significant. For RTE_TABLEFUNCTION - 'functions' field of RangeTblEntry was added to the jumble. RangeTblEntry in case of RTE_TABLEFUNCTION uses 'functions' and 'subquery' fields (other significant fields are null). But 'subquery' is duplicated in TableValueExpr below, so do not jumble it. For T_TableValueExpr - 'subquery' field of TableValueExpr (the subquery that is inside "TABLE()" statement) was added to the jumble. Changes from original commit: 1. Cases for T_GroupId and T_GroupingFunc are already present, no fix needed. 2. Remove case for T_Integer because it is not necessary since GPDB 7 uses T_IntList instead. 3. T_GroupingClause is now named T_GroupingSet. Its case was not correct in GPDB 7, change to match implementation from GPDB 6. 4. Rename test to jumble since there were already tests present in GPDB 7. 5. The tests in pg_stat_statements were disabled before because they do not set up shared_preload_libraries and they don't disable optimizer. Add files enable_pg_stat_statements and disable_pg_stat_statements to adapt pg_stat_statements tests to current testing utilities. This allows to remove some initialization and cleanup from the new test. 6. Fix test output for GPDB 7. Notably, ROLLUP now outputs 1 row even if the table has no rows in it. (cherry picked from commit fa44e50)
Problem description: if 'pg_stat_statements' extension is enabled and a query containing GROUP BY clause with ROLLUP, CUBE or GROUPING SETS or with GROUPING function is executed, queryId calculation doesn't take into account these grouping clause parameters and GROUPING functions. So, semantically different queries are treated by pg_stat_statements as equivalent queries. Plus warning messages appear during jumbling. Also, jumbling logic doesn't handle properly queries with group_id() function call and calls of functions with 'anytable' parameter. First type of queries causes warnings. Second type of query causes error message "unrecognized RTE kind: 7". Expected correct behavior: queries with different grouping clauses or different usage of the GROUPING function are treated by pg_stat_statements as different queries, and no warning messages appear during the query execution. Cause: In JumbleExpr there is no handling for node tag and T_TableValueExpr and T_GroupingSet is handled incorrectly. Jumble hashing is used as queryId. Error message "unrecognized RTE kind: 7" caused by not handled range table type RTE_TABLEFUNCTION in JumbleRangeTable. Fix: Handling logic for missed node tags was added. According to comments in queryjumble.c, main guideline how to handle query tree is: "Rule of thumb for what to include is that we should ignore anything not semantically significant (such as alias names) as well as anything that can be deduced from child nodes (else we'd just be double- hashing that piece of information)." For T_GroupingSet we append to the jumble the grouping type and list of groupsets. Field 'location' is not appended to the jumble because it is the textual location from parser and is not semantically significant. For RTE_TABLEFUNCTION - 'functions' field of RangeTblEntry was added to the jumble. RangeTblEntry in case of RTE_TABLEFUNCTION uses 'functions' and 'subquery' fields (other significant fields are null). But 'subquery' is duplicated in TableValueExpr below, so do not jumble it. For T_TableValueExpr - 'subquery' field of TableValueExpr (the subquery that is inside "TABLE()" statement) was added to the jumble. Changes from original commit: 1. Cases for T_GroupId and T_GroupingFunc are already present, no fix needed. 2. Remove case for T_Integer because it is not necessary since GPDB 7 uses T_IntList instead. 3. T_GroupingClause is now named T_GroupingSet. Its case was not correct in GPDB 7, change to match implementation from GPDB 6. 4. Rename test to jumble since there were already tests present in GPDB 7. 5. The tests in pg_stat_statements were disabled before because they do not set up shared_preload_libraries and they don't disable optimizer. Add files enable_pg_stat_statements and disable_pg_stat_statements to adapt pg_stat_statements tests to current testing utilities. This allows to remove some initialization and cleanup from the new test. 6. Fix test output for GPDB 7. Notably, ROLLUP now outputs 1 row even if the table has no rows in it. (cherry picked from commit fa44e50)
Problem description:
if 'pg_stat_statements' extension is enabled and a query containing GROUP BY
clause with ROLLUP, CUBE or GROUPING SETS or with GROUPING function is
executed, queryId calculation doesn't take into account these grouping clause
parameters and GROUPING functions. So, semantically different queries are
treated by pg_stat_statements as equivalent queries. Plus warning messages
appear during jumbling.
Also, jumbling logic doesn't handle properly queries with group_id() function
call and calls of functions with 'anytable' parameter.
First type of queries causes warnings.
Second type of query causes error message "unrecognized RTE kind: 7".
Expected correct behavior: queries with different grouping clauses or different
usage of the GROUPING function are treated by pg_stat_statements as different
queries, and no warning messages appear during the query execution.
Cause:
In JumbleExpr there is no handling for node tags T_GroupingClause,
T_GroupingFunc, T_GroupId and T_TableValueExpr. Jumble hashing is used
as queryId. Error message "unrecognized RTE kind: 7" caused by not handled
range table type RTE_TABLEFUNCTION in JumbleRangeTable.
Fix:
Handling logic for missed node tags was added.
According to comments in queryjumble.c, main guideline how to handle
query tree is: "Rule of thumb for what to include is that we should ignore
anything not semantically significant (such as alias names) as well as
anything that can be deduced from child nodes (else we'd just be double-
hashing that piece of information)."
For T_GroupingFunc we append to the jumble the list of
arguments. Field 'ngrpcols' is not appended to the jumble, because it can be
deduced from the list of groupsets in grouping clause. 'ngrpcols' is the
number of unique grouping attributes in grouping clause. Equivalent queries
must have the same groupsets in grouping clause. Thus they will have the same
'ngrpcols'. So, adding 'ngrpcols' to jumble is redundant.
Plus handling for the T_Integer tag was added, because it is required to parse
the list of arguments. List of arguments (field 'args' of structure
GroupingFunc) is a list of T_Integer. The T_Integer element is the index of
GROUPING function parameter inside the array of unique grouping attributes
from grouping clause. We need to jumble value of this T_Integer, because
changing parameter in GROUPING function changes this index (and definitely
changes semantic of the query).
For T_GroupingClause we append to the jumble the grouping type
and list of groupsets. Field 'location' is not appended to the jumble because
it is the textual location from parser and is not semantically significant.
For T_GroupId - this tag was added to switch case inside JumbleExpr to
suppress warning. The node tag for it was already handled. Struct GroupId
doesn't have any additional fields, that could be added to jumble.
For RTE_TABLEFUNCTION - 'functions' field of RangeTblEntry was added to the
jumble. RangeTblEntry in case of RTE_TABLEFUNCTION uses 'functions' and
'subquery' fields (other significant fields are null). But 'subquery' is
duplicated in TableValueExpr below, so do not jumble it.
For T_TableValueExpr - 'subquery' field of TableValueExpr (the subquery
that is inside "TABLE()" statement) was added to the jumble.