-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support query composition #24
Comments
I was curious about trying this out. I'm a complete Live Book newb so I apologize in advance :). Not sure how much you already planned/decided...these are some of the questions I was considering
|
Oh also one more thing, did you already consider allowing I'm not totally clear how this works yet but I thought maybe if the cell evaluates to a query struct or maybe can be supplied in the UI as a variable. |
Hi @greg-rychlewski! 👋 Here is what I was thinking:
It will become something like:
Some notes:
@greg-rychlewski does this answer your questions? :) |
Makes sense, thank you! I'll give it a shot. |
I started playing around with this. One thing that came up is that each database will have different placeholders for the parameters: Some ways I can think to handle it:
I think (3) makes it easier for the dev but worse UX. 2 seems the best to me as long as there are no databases that use |
@greg-rychlewski We can do option 3 for now and then, instead of having an option in the select to be lazy, we can have a checkbox that makes the query as lazy. :) |
@greg-rychlewski actually, ignore me. Even if we use $1, $2, $3 for Postgres, we will have to rewrite those to $3, $4, $5 depending on the number of queries we have, so going down with 3 won't make life easier for Postgrex. |
I think we will need to go one level of abstraction higher. For example, we could emit this code: {[sample_cte_query], params} = Kino.DB.to_ctes([sample_cte])
query(
"""
WITH "sample_cte" AS (#{sample_cte_query})
SELECT * FROM foo FROM sample_cte AS s join s.id == id and status == $2
""",
params ++ [status]
) or this code: {ctes, cte_params} = Kino.DB.to_ctes([sample_cte: sample_cte])
query(
"""
WITH #{ctes}
SELECT * FROM foo FROM sample_cte AS s join s.id == id and status == $2
""",
cte_params ++ [status]
) And then for PG we have a slightly different version which is called Thoughts? |
Sorry for the delayed response @josevalim. If I'm understanding correctly, then I think what you proposed is a very nice solution. Do I understand correctly: Someone creates 2 CTEs (i'm putting the variable declaration together with the queries just to make it easier to write, but I understand they will be in different cells): name1 = "bob"
name2 = "joe"
select * from users where name = {{ name1 }} limit 1
select * from users where name = {{ name2 }} limit 1 These will get converted into the following structs: cte1 = %Kino.DB.CTE{parts: [query: "select * from users where id = ", param: "bob", query: " limit 1"]}
cte2 = %Kino.DB.CTE{parts: [query: "select * from users where id = ", param: "joe", query: " limit 1"]} Then when interpolated into a new query salary_threshold = 100
select * from {{ cte1 }} join {{ cte2 }} where cte1.salary > {{ salary_threshold }} the {ctes, cte_params} = Kino.DB.to_ctes([cte1: cte1, cte2: cte2])
# ctes = """
# WITH cte1 AS (
# select * from users where name = $1 limit 1
# ), cte2 AS (
# select * from users where name = $2 limit 1
# )
#
# cte_params = ["bob", "joe"]
query(
"""
#{ctes}
SELECT * from cte1 join cte2 where cte1.salary > $3,
""",
["bob", "joe", 100]
) If no misunderstanding above, the only other situation I think we need to consider is nested CTEs. Because it's pretty common to have one CTE reference another. I don't think we want to create the And then there is the situation of duplicate names. It could be the case that a nested CTE is saved with some variable name. And then later on that variable name is reused for a different CTE. And then both of these are used in the same query. i.e: cte1 = select * from users where name = "bob" limit 1
cte2 = select * from {{ cte1 }} limit 1
cte1 = select * from users where name = "dan" limit 2
query = select * from {{ cte2 }} join {{ cte1 }} So it's a bit of a weird situation because it's fine to re-use the name as long as the query is the same. I can think of 2 choices:
|
Great points about nested CTEs! The name is defined in an input of the smart cell, so we should be able to forbid redefinitions or error/warn in said cases! |
About nested CTEs, your proposed structure needs to change a tiny bit but otherwise it is spot on. Instead of this: cte1 = %Kino.DB.CTE{parts: [query: "select * from users where id = ", param: "bob", query: " limit 1"]} We will need to have something like: query: binary | param: {value, maybe_cte_name} Where Then: {ctes, cte_params} = Kino.DB.to_ctes([cte1: cte1, cte2: cte2]) can recursively traverse and handle any nested CTE. |
Oh and sorry I had a newb question somewhat related to this. Are all the database adapters contained in this repo or is there some way for people to create, say, a Cassandra adapter and plug it into this? If no way currently is that something you think is doable/desirable? |
Everything is contained in this repo. But anyone can create they own smart cells. |
But we will be glad to support other databases too! |
Sorry you might have already said no to this, just not 100% sure :P. Would you be interested in a PR to create a Not even sure if possible. Just enjoying working on this repo and trying to think if there's anything I can contribute after the CTE stuff. |
@greg-rychlewski I am not sure because it is not only about the adapter behaviour, but also the generated code, and the interface, and I am not sure all of those can be encapsulated behind an adapter with adding potentially a lot more complexity (such as encapsulating the forms themselves). :) I am also not even sure this package will fully exist as is in the long term. Explorer is definitely a better tool for data exploration/processing and, if this becomes a reality, then it will be the best tool for data processing/exploration available. So I would go baby steps for now until we have a complete vision. WDYT? |
Oh yes I see what you're saying about Explorer. I'm a bit out of touch with this space but that makes a lot of sense. And I didn't consider what you said about the interface/generated code but that's very true. I'll stick to this PR :). |
We can make it so a query is "lazy", so it returns a query construct that can be interpolated into other queries in order to build CTEs.
The text was updated successfully, but these errors were encountered: