Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql_if handling of "missing" case for if_else translation should use ELSE ? #1573

Open
rplsmn opened this issue Jan 22, 2025 · 0 comments
Open

Comments

@rplsmn
Copy link
Contributor

rplsmn commented Jan 22, 2025

Greetings,

Context:
On the R side, if_else provides a missing argument to explictely assign a value (NA by defaut) to handle the case where the condition of the test yields NA instead of TRUE/FALSE.
In ifelse, where there is no missing arg, when the condition evaluates to NA then ifelse yields NA.

On the SQL side, the traduction for ifelse in dbplyr for the standard DBIConnection relies on CASE WHEN's implicit ELSE NULL when no ELSE clause is mentionned, which works great.

Issue:
But the traduction for if_else perplexes me : instead of explicitely relying on ELSE and passing the missing value, instead it's another WHEN clause that tries to check wether the condition is NA. But, what else could be left after the TRUE and FALSE cases of the condition are handled in a logical vector ?
This implementation (WHEN) then fails on backends that do not support boolean types (e.g Teradata).

Proposed solution
Imo, unless I'm missing something obvious here, it would be cleaner to translate the missing arg into an ELSE missing

Here's the proposed code change for sql_if (which is the function used to translate the if family of functions) :

sql_if <- function(cond, if_true, if_false = quo(NULL), missing = quo(NULL)) {
  enpared_cond <- enpar(cond)
  enpared_if_true <- enpar(if_true)
  enpared_if_false <- enpar(if_false)
  enpared_missing <- enpar(missing)
  con <- sql_current_con()

  out <- "CASE WHEN {.val enpared_cond} THEN {.val enpared_if_true}"

  # `ifelse()` and `if_else()` have a three value logic: they return `NA` resp.
  # `missing` if `cond` is `NA`. To get the same in SQL it is necessary to
  # translate to
  # CASE
  #   WHEN <cond> THEN `if_true`
  #   WHEN NOT <cond> THEN `if_false`
  #   ELSE `missing`
  # END
  #
  # Together these cases cover every possible case. So, if `if_false` and
  # `missing` are identical they can be simplified to `ELSE <if_false>`
  if (!quo_is_null(if_false) && identical(if_false, missing)) {
    out <- glue_sql2(con, out, " ELSE {.val enpared_if_false} END")
    return(out)
  }

  if (!quo_is_null(if_false)) {
    false_sql <- " WHEN NOT {.val enpared_cond} THEN {.val enpared_if_false}"
    out <- paste0(out, false_sql)
  }

  if (!quo_is_null(missing)) {
    missing_cond <- translate_sql(is.na(!!cond), con = con)
    missing_sql <- " ELSE {.val enpared_missing}"
    out <- paste0(out, missing_sql)
  }

  glue_sql2(con, out, " END")
}

Additional info:
I have a PR ready to submit if you think these changes are acceptable.

Thank you for reading, have a great day.
Cheers

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant