Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Printer for the plan, ie: EXPLAIN #2075

Draft
wants to merge 5 commits into
base: master
Choose a base branch
from
Draft

Conversation

mamcx
Copy link
Contributor

@mamcx mamcx commented Dec 19, 2024

Description of Changes

Incipient support for EXPLAIN with close plan output as in Postgres (but not yet conformant).

It has the extra capability of showing extra metadata about the schema & indexes, useful for testing & debugging.

NOTE: In draft because we want to coordinate with @joshua-spacetime about what will be the desirable testing changes...

Example:

-- Without metadata
Nested Loop
  -> Index Scan using Index id 0: (employee) on m
    -> Index Cond: (m.employee = U64(1))
  -> Seq Scan on p
  Output: id, name
Planning Time: 541.5µs

-- With metadata
Query: SELECT m.* FROM m CROSS JOIN p WHERE m.employee = 1
Nested Loop
  -> Index Scan using Index id 0: (employee) on m:1
    -> Index Cond: (m.employee = U64(1))
  -> Seq Scan on p:2
  Output: id, name
Planning Time: 527.666µs
-------
Schema:

Label m: 1
  Columns: employee, manager
  Indexes: Unique(m.employee)
Label p: 2
  Columns: id, name
  Indexes: Unique(p.id)

Closes #2058.

API and ABI breaking changes

None

Expected complexity level and risk

1

Testing

  • Adding testing for both optimized or not variations of the plan
  • WILL NEED more test but is likely that we need to do another pr

@mamcx mamcx added no runtime change This change does not affect the final binaries release-1.0 labels Dec 19, 2024
@mamcx mamcx self-assigned this Dec 19, 2024
@mamcx mamcx force-pushed the mamcx/planner-printer branch from be18b08 to 81adebc Compare December 24, 2024 15:58
@mamcx mamcx force-pushed the mamcx/planner-printer branch from 1ca3567 to 75e1cbd Compare January 6, 2025 17:24
@mamcx mamcx changed the base branch from master to joshua/query-engine-integration January 6, 2025 17:26
@mamcx mamcx changed the base branch from joshua/query-engine-integration to master January 6, 2025 17:27
@mamcx mamcx force-pushed the mamcx/planner-printer branch 4 times, most recently from 1d25e4f to 305e0bc Compare January 13, 2025 18:14
@mamcx mamcx force-pushed the mamcx/planner-printer branch from 305e0bc to 33c8978 Compare January 14, 2025 15:21
@@ -166,10 +166,12 @@ pub fn type_subscription(ast: SqlSelect, tx: &impl SchemaView) -> TypingResult<P

/// Parse and type check a *subscription* query into a `StatementCtx`
pub fn compile_sql_sub<'a>(sql: &'a str, tx: &impl SchemaView) -> TypingResult<StatementCtx<'a>> {
let planning_time = std::time::Instant::now();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make planning time external? So that we don't compute it during normal execution.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So create a wrapper? Because the alternative is to parse EXPLAIN

"select * from t",
expect![[r#"
Seq Scan on t
Output: t.id, t.x"#]],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove the Output section, or at least make it optional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last Output is always show in pg. BTW It also show the output in each node of the plan.

It has help me to sport problems so I think is a good check to keep.

Comment on lines -1126 to -1130
/// x: join
/// p: project
/// s: select
/// ix: index scan
/// rx: right index semijoin
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: A double space at the end of a doc comment line is interpreted as a new line by the formatter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This spaces are removed by rustfmt

Comment on lines 1261 to 1264
-> Index Scan using Index id 0: (identity) on u
-> Index Cond: (u.identity = U64(5))
-> Inner Unique: true
-> Index Cond: (u.entity_id = p.entity_id)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These arrows should all have the same indentation, right?

Copy link
Contributor Author

@mamcx mamcx Jan 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which ones?:

Only the first two?

-> Index Cond: (u.identity = U64(5))
-> Inner Unique: true
-> Index Cond: (u.entity_id = p.entity_id)
-> Inner Unique: false
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have outer joins, so should this just be Unique: false instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well this is what the IxJoin pass.

&db,
sql,
expect![[r#"
Hash Join: All
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Hash Join: All
Hash Join

All means it's not a semijoin. So it would probably be clearer to just call it HashJoin in that case.

-> Seq Scan on m
-> Seq Scan on n
-> Inner Unique: false
-> Hash Cond: (m.manager = n.manager)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
-> Hash Cond: (m.manager = n.manager)
-> Join Cond: (m.manager = n.manager)

-> Inner Unique: false
-> Index Cond: (p.chunk = q.chunk)
Inner Unique: true
Index Cond: (q.entity_id = b.entity_id)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Index Cond: (q.entity_id = b.entity_id)
Join Cond: (q.entity_id = b.entity_id)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is consistent to how pg show the plan:

SELECT * FROM auth_user a JOIN auth_user b ON a.username = b.username where a.username = 'a'

 Nested Loop  (cost=0.29..16.34 rows=1 width=386) (actual time=21.830..21.831 rows=0 loops=1)
   Output: a.username, a.name, a.email, a.password, a.role, a.is_active, a.dot, b.username, b.name, b.email, b.password, b.role, b.is_active, b.dot
   ->  Index Scan using auth_user_pkey on betatest.auth_user a  (cost=0.15..8.17 rows=1 width=193) (actual time=21.829..21.830 rows=0 loops=1)
         Output: a.username, a.name, a.email, a.password, a.role, a.is_active, a.dot
         Index Cond: (a.username = 'a'::text COLLATE nd)
   ->  Index Scan using auth_user_pkey on betatest.auth_user b  (cost=0.15..8.17 rows=1 width=193) (never executed)
         Output: b.username, b.name, b.email, b.password, b.role, b.is_active, b.dot
         Index Cond: (b.username = 'a'::text COLLATE nd)

sql,
expect![
r#"
Index Scan using Index id 2: (x, y, z) on t
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Index Scan using Index id 2: (x, y, z) on t
Index Scan using index id: 2

Can we must mention the index id in this part, and then have the index info in the metadata section?

-------
Schema:

Label m: 1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to mention the Label ids. The names are sufficient.

But can we add the TableId?

@mamcx mamcx force-pushed the mamcx/planner-printer branch from 33c8978 to f8700bc Compare January 15, 2025 16:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
no runtime change This change does not affect the final binaries release-1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Query plan printer
2 participants