Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Malformed SQL query with "SELECT *FROM ... WHERE ..." doesn't cause error #3124

Open
seankao-az opened this issue Oct 25, 2024 · 0 comments
Labels
bug Something isn't working v2.19.0 Issues targeting release v2.19.0

Comments

@seankao-az
Copy link
Collaborator

seankao-az commented Oct 25, 2024

What is the bug?
Instead of using the proper SQL syntax SELECT * FROM ... WHERE ..., when running SELECT *FROM ... WHERE ... (notice missing space between * and FROM), query still executes unexpectedly, and behaves differently from the proper syntax.

Verified same behavior can be observed in https://opensearch.org/versions/opensearch-1-0-0.html, all the way back to the 1.0 version, and in latest version.

Below are some queries on some sample index opensearch_dashboards_sample_data_ecommerce

# Query (1)
POST /_plugins/_sql
{
    "query": "select *from opensearch_dashboards_sample_data_ecommerce where manufacturer in (Elitelligence, Primemaster) limit 10"
}

# Unexpected. Returns 10 rows
{
    "schema": [
        (abbreviated)
    ],
    "datarows": [
        (abbreviated)
    ],
    "total": 10,
    "size": 10,
    "status": 200
}

This result is unexpected.

Notice that if we modify above query (1) into SELECT * FROM ..., then we get an error:

# Query (2)
POST /_plugins/_sql
{
    "query": "select * from opensearch_dashboards_sample_data_ecommerce where manufacturer in (Elitelligence, Primemaster) limit 10"
}

# Expected. Error.
{
    "error": {
        "reason": "Invalid SQL query",
        "details": "can't resolve Symbol(namespace=FIELD_NAME, name=Elitelligence) in type env",
        "type": "SemanticCheckException"
    },
    "status": 400
}

The correct syntax should be the following, with literals quoted

# Query (3)
POST /_plugins/_sql
{
    "query": "select * from opensearch_dashboards_sample_data_ecommerce where manufacturer in (\"Elitelligence\", \"Primemaster\") limit 10"
}

# Expected. Returns 10 rows.
{
    "schema": [
        (abbreviated)
    ],
    "datarows": [
        (abbreviated)
    ],
    "total": 10,
    "size": 10,
    "status": 200
}

Without where clause

Notice, however, that without the where clause, this query errors out:

# Query (4.1)
POST /_plugins/_sql
{
    "query": "select *from opensearch_dashboards_sample_data_ecommerce limit 10"
}

# Expected. Error. However different result from query (1)
{
  "error": {
    "reason": "Invalid SQL query",
    "details": "can't resolve Symbol(namespace=FIELD_NAME, name=*from) in type env",
    "type": "SemanticCheckException"
  },
  "status": 400
}

with json format

Even more bizarre, same malformed query without where clause but querying with json format returns without error

# Query (4.2)
POST /_plugins/_sql?format=json
{
    "query": "select *from opensearch_dashboards_sample_data_ecommerce limit 11"
}

# Unexpected. Returns 11 docs, but hits.total value seems incorrect
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 4675,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      { ... },
      { ... },
      { ... },
      { ... },
      { ... },
      { ... },
      { ... },
      { ... },
      { ... },
      { ... },
      { ... },
    ]
  }
}

How can one reproduce the bug?
Steps to reproduce the behavior:

  1. Go to "Query Workbench"
  2. Run the example queries

Note that in all the released version I tested, the issue can be observed.
However, when tested this on https://playground.opensearch.org/app/home#/ as well, the behavior in playground is slightly different.
In query workbench, the malformed query (1) doesn't do anything. It doesn't return result, and also doesn't error out, unlike other malformed query which yields "Opensearch_dashboards_sample_data_ecommerce: Bad Request, this query is not runnable."

What is the expected behavior?
SELECT *FROM ... WHERE ... is a malformed query and should've resulted in parsing error and not be executed.

What is your host/environment?

  • version: 1.0 / 2.13 / 2.15 / 2.17.1

Do you have any screenshots?
If applicable, add screenshots to help explain your problem.

Do you have any additional context?
Explaining SELECT * FROM ... WHERE ... and SELECT *FROM ... WHERE ... yields different result.
Not all queries in the example are explainable, though.

# Explaining query (3) `select * from opensearch_dashboards_sample_data_ecommerce where manufacturer in ("Elitelligence", "Primemaster") limit 10`
{
  "root": {
    "name": "ProjectOperator",
    "description": {
      "fields": "[type, manufacturer, products, customer_last_name, day_of_week_i, total_quantity, currency, taxless_total_price, total_unique_products, event, sku, email, day_of_week, geoip, customer_first_name, customer_phone, customer_birth_date, customer_full_name, order_date, category, customer_id, order_id, user, customer_gender, taxful_total_price]"
    },
    "children": [
      {
        "name": "OpenSearchIndexScan",
        "description": {
          "request": "OpenSearchQueryRequest(indexName=opensearch_dashboards_sample_data_ecommerce, sourceBuilder={\"from\":0,\"size\":10,\"timeout\":\"1m\",\"query\":{\"bool\":{\"should\":[{\"term\":{\"manufacturer.keyword\":{\"value\":\"Elitelligence\",\"boost\":1.0}}},{\"term\":{\"manufacturer.keyword\":{\"value\":\"Primemaster\",\"boost\":1.0}}}],\"adjust_pure_negative\":true,\"boost\":1.0}},\"_source\":{\"includes\":[\"email\",\"sku\",\"customer_last_name\",\"customer_id\",\"total_quantity\",\"category\",\"taxful_total_price\",\"customer_first_name\",\"currency\",\"order_date\",\"day_of_week_i\",\"type\",\"customer_phone\",\"taxless_total_price\",\"total_unique_products\",\"day_of_week\",\"customer_gender\",\"order_id\",\"customer_full_name\",\"manufacturer\",\"products\",\"customer_birth_date\",\"user\",\"geoip\",\"event\"],\"excludes\":[]},\"sort\":[{\"_doc\":{\"order\":\"asc\"}}]}, searchDone=false)"
        },
        "children": []
      }
    ]
  }
}

# Explaining `select *from opensearch_dashboards_sample_data_ecommerce where manufacturer in ("Elitelligence", "Primemaster") limit 10`
# Note that this is not query (1). The literals are with quotes. Somehow query (1) can be executed but cannot be explained
{
  "from": 0,
  "size": 10,
  "query": {
    "bool": {
      "filter": [
        {
          "bool": {
            "must": [
              {
                "bool": {
                  "should": [
                    {
                      "term": {
                        "manufacturer.keyword": {
                          "value": "Elitelligence",
                          "boost": 1
                        }
                      }
                    },
                    {
                      "term": {
                        "manufacturer.keyword": {
                          "value": "Primemaster",
                          "boost": 1
                        }
                      }
                    }
                  ],
                  "adjust_pure_negative": true,
                  "boost": 1
                }
              }
            ],
            "adjust_pure_negative": true,
            "boost": 1
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1
    }
  }
}
@seankao-az seankao-az added bug Something isn't working untriaged labels Oct 25, 2024
@anasalkouz anasalkouz added v2.19.0 Issues targeting release v2.19.0 and removed untriaged labels Oct 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working v2.19.0 Issues targeting release v2.19.0
Projects
None yet
Development

No branches or pull requests

2 participants