Implemented InPlaceUpdate to be used for updates made on non-distribution columns.

Currently, ORCA uses Split Update for updates on both distribution
and non-distribution columns. With this commit, ORCA uses an
InPlaceUpdate when none of the modified columns is a distribution or
partition key, and Split Update if any of the modified columns is a
distribution or partition key.
Consider the setup below, where we update the non-distribution
column b in the table foo.

```
create table foo(a int, b int);
explain update foo set b=4;
```

Before this commit, ORCA produces a plan with Split and Update nodes:

```
Update on public.foo
   ->  Result
         Output: foo_1.a, foo_1.b, (DMLAction), foo_1.ctid, foo_1.gp_segment_id
         ->  Split
               Output: foo_1.a, foo_1.b, foo_1.ctid, foo_1.gp_segment_id, DMLAction
               ->  Seq Scan on public.foo foo_1
                     Output: foo_1.a, foo_1.b, 4, foo_1.ctid, foo_1.gp_segment_id
```

There is no point in using Split and Update here, because we are updating a
non-distribution column, which does not require any redistribution. With
this commit, ORCA uses an InPlaceUpdate for updates on non-distribution
columns, as Planner does. Below is the new plan produced with this commit.

New plan:
```
Update on public.foo
   ->  Seq Scan on public.foo foo_1
         Output: foo_1.a, 4, foo_1.ctid, foo_1.gp_segment_id
 Optimizer: Pivotal Optimizer (GPORCA)
```
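
The choice between the two update strategies described above can be sketched as a small standalone predicate. This is only an illustration of the rule, not ORCA's code: the helper name `NeedsSplitUpdate` and the use of plain column-name sets are assumptions made for this example.

```cpp
#include <set>
#include <string>

// Hypothetical helper (not part of ORCA): returns true when an UPDATE has to
// be planned as Split Update, i.e. when at least one modified column is a
// distribution key or a partition key. Otherwise InPlaceUpdate is sufficient.
bool
NeedsSplitUpdate(const std::set<std::string> &modified_cols,
				 const std::set<std::string> &distribution_keys,
				 const std::set<std::string> &partition_keys)
{
	for (const std::string &col : modified_cols)
	{
		if (distribution_keys.count(col) > 0 || partition_keys.count(col) > 0)
		{
			return true;
		}
	}
	return false;
}
```

Assuming foo is distributed by a, `update foo set b=4` modifies only b, so the predicate returns false and an InPlaceUpdate is chosen.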

Greenplum 6 specific changes:
1. Some constructors have been changed because the argument lists in 6X and 7X
   are different.
2. Fixed a bug in CParseHandlerPhysicalDML::startElement where preserve_oids_xml
   was used instead of fSplit, which could lead to SIGSEGV during DML node parsing.
3. Changed the create_index_hot test: removed disabling the optimizer before the
   update, since ORCA no longer uses Split Update in this case.

(cherry picked from commit 3ced85b)
Sanath97 authored and KnightMurloc committed Jan 15, 2024
1 parent 107c9f4 commit 0031781
Showing 52 changed files with 1,409 additions and 1,695 deletions.
23 changes: 16 additions & 7 deletions src/backend/gpopt/translate/CTranslatorDXLToPlStmt.cpp
@@ -4100,6 +4100,7 @@ CTranslatorDXLToPlStmt::TranslateDXLDml(
DML *dml = MakeNode(DML);
Plan *plan = &(dml->plan);
AclMode acl_mode = ACL_NO_RIGHTS;
BOOL isSplit = phy_dml_dxlop->FSplit();

switch (phy_dml_dxlop->GetDmlOpType())
{
@@ -4186,11 +4187,21 @@ CTranslatorDXLToPlStmt::TranslateDXLDml(
dml_target_list = target_list_with_dropped_cols;
}

// Extract column numbers of the action and ctid columns from the
// target list.
dml->actionColIdx = AddTargetEntryForColId(&dml_target_list, &child_context,
phy_dml_dxlop->ActionColId(),
true /*is_resjunk*/);
// Doesn't needed for in place update
if (isSplit || CMD_UPDATE != m_cmd_type)
{
// Extract column numbers of the action and ctid columns from the
// target list.
dml->actionColIdx = AddTargetEntryForColId(
&dml_target_list, &child_context, phy_dml_dxlop->ActionColId(),
true /*is_resjunk*/);
GPOS_ASSERT(0 != dml->actionColIdx);
}
else
{
dml->actionColIdx = 0;
}

dml->ctidColIdx = AddTargetEntryForColId(&dml_target_list, &child_context,
phy_dml_dxlop->GetCtIdColId(),
true /*is_resjunk*/);
@@ -4205,8 +4216,6 @@ CTranslatorDXLToPlStmt::TranslateDXLDml(
dml->tupleoidColIdx = 0;
}

GPOS_ASSERT(0 != dml->actionColIdx);

plan->targetlist = dml_target_list;

plan->lefttree = child_plan;
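
The condition guarding `actionColIdx` in the hunks above can be read as a simple rule: the action column is needed for every DML command except a non-split (in-place) UPDATE. A minimal sketch of that rule follows; `CmdType` and `NeedsActionColumn` are hypothetical names introduced for illustration and do not exist in the translator.

```cpp
// Simplified stand-in for the command types seen by the translator
// (an assumption for this sketch, not the real CmdType enum).
enum class CmdType
{
	Insert,
	Update,
	Delete
};

// Hypothetical helper mirroring `isSplit || CMD_UPDATE != m_cmd_type`:
// only an in-place (non-split) UPDATE can skip the DML action column.
bool
NeedsActionColumn(bool is_split, CmdType cmd_type)
{
	return is_split || cmd_type != CmdType::Update;
}
```

When the helper returns false, the translator in this commit sets `actionColIdx` to 0 instead of adding a target entry for the action column.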
81 changes: 25 additions & 56 deletions src/backend/gporca/data/dxl/minidump/SelfUpdate.mdp
@@ -216,17 +216,17 @@ update t1 set b = c;
</dxl:LogicalUpdate>
</dxl:Query>
<dxl:Plan Id="0" SpaceSize="1">
<dxl:DMLUpdate Columns="0,1,2" ActionCol="10" CtidCol="3" SegmentIdCol="9" PreserveOids="false">
<dxl:DMLUpdate Columns="0,2,2" ActionCol="10" CtidCol="3" SegmentIdCol="9" IsSplitUpdate="false" PreserveOids="false">
<dxl:Properties>
<dxl:Cost StartupCost="0" TotalCost="431.067764" Rows="1.000000" Width="1"/>
<dxl:Cost StartupCost="0" TotalCost="431.023462" Rows="1.000000" Width="1"/>
</dxl:Properties>
<dxl:DirectDispatchInfo/>
<dxl:ProjList>
<dxl:ProjElem ColId="0" Alias="a">
<dxl:Ident ColId="0" ColName="a" TypeMdid="0.23.1.0"/>
</dxl:ProjElem>
<dxl:ProjElem ColId="1" Alias="b">
<dxl:Ident ColId="1" ColName="b" TypeMdid="0.23.1.0"/>
<dxl:ProjElem ColId="2" Alias="c">
<dxl:Ident ColId="2" ColName="c" TypeMdid="0.23.1.0"/>
</dxl:ProjElem>
<dxl:ProjElem ColId="2" Alias="c">
<dxl:Ident ColId="2" ColName="c" TypeMdid="0.23.1.0"/>
@@ -248,14 +248,14 @@ update t1 set b = c;
</dxl:TableDescriptor>
<dxl:Assert ErrorCode="23502">
<dxl:Properties>
<dxl:Cost StartupCost="0" TotalCost="431.000056" Rows="2.000000" Width="26"/>
<dxl:Cost StartupCost="0" TotalCost="431.000025" Rows="1.000000" Width="18"/>
</dxl:Properties>
<dxl:ProjList>
<dxl:ProjElem ColId="0" Alias="a">
<dxl:Ident ColId="0" ColName="a" TypeMdid="0.23.1.0"/>
</dxl:ProjElem>
<dxl:ProjElem ColId="1" Alias="b">
<dxl:Ident ColId="1" ColName="b" TypeMdid="0.23.1.0"/>
<dxl:ProjElem ColId="2" Alias="c">
<dxl:Ident ColId="2" ColName="c" TypeMdid="0.23.1.0"/>
</dxl:ProjElem>
<dxl:ProjElem ColId="2" Alias="c">
<dxl:Ident ColId="2" ColName="c" TypeMdid="0.23.1.0"/>
@@ -266,9 +266,6 @@ update t1 set b = c;
<dxl:ProjElem ColId="9" Alias="gp_segment_id">
<dxl:Ident ColId="9" ColName="gp_segment_id" TypeMdid="0.23.1.0"/>
</dxl:ProjElem>
<dxl:ProjElem ColId="10" Alias="ColRef_0010">
<dxl:Ident ColId="10" ColName="ColRef_0010" TypeMdid="0.23.1.0"/>
</dxl:ProjElem>
</dxl:ProjList>
<dxl:AssertConstraintList>
<dxl:AssertConstraint ErrorMessage="Not null constraint for column b of table t1 was violated">
@@ -279,17 +276,14 @@ update t1 set b = c;
</dxl:Not>
</dxl:AssertConstraint>
</dxl:AssertConstraintList>
<dxl:Split DeleteColumns="0,1,2" InsertColumns="0,2,2" ActionCol="10" CtidCol="3" SegmentIdCol="9" PreserveOids="false">
<dxl:TableScan>
<dxl:Properties>
<dxl:Cost StartupCost="0" TotalCost="431.000039" Rows="2.000000" Width="26"/>
<dxl:Cost StartupCost="0" TotalCost="431.000008" Rows="1.000000" Width="18"/>
</dxl:Properties>
<dxl:ProjList>
<dxl:ProjElem ColId="0" Alias="a">
<dxl:Ident ColId="0" ColName="a" TypeMdid="0.23.1.0"/>
</dxl:ProjElem>
<dxl:ProjElem ColId="1" Alias="b">
<dxl:Ident ColId="1" ColName="b" TypeMdid="0.23.1.0"/>
</dxl:ProjElem>
<dxl:ProjElem ColId="2" Alias="c">
<dxl:Ident ColId="2" ColName="c" TypeMdid="0.23.1.0"/>
</dxl:ProjElem>
@@ -299,48 +293,23 @@ update t1 set b = c;
<dxl:ProjElem ColId="9" Alias="gp_segment_id">
<dxl:Ident ColId="9" ColName="gp_segment_id" TypeMdid="0.23.1.0"/>
</dxl:ProjElem>
<dxl:ProjElem ColId="10" Alias="ColRef_0010">
<dxl:DMLAction/>
</dxl:ProjElem>
</dxl:ProjList>
<dxl:TableScan>
<dxl:Properties>
<dxl:Cost StartupCost="0" TotalCost="431.000008" Rows="1.000000" Width="22"/>
</dxl:Properties>
<dxl:ProjList>
<dxl:ProjElem ColId="0" Alias="a">
<dxl:Ident ColId="0" ColName="a" TypeMdid="0.23.1.0"/>
</dxl:ProjElem>
<dxl:ProjElem ColId="1" Alias="b">
<dxl:Ident ColId="1" ColName="b" TypeMdid="0.23.1.0"/>
</dxl:ProjElem>
<dxl:ProjElem ColId="2" Alias="c">
<dxl:Ident ColId="2" ColName="c" TypeMdid="0.23.1.0"/>
</dxl:ProjElem>
<dxl:ProjElem ColId="3" Alias="ctid">
<dxl:Ident ColId="3" ColName="ctid" TypeMdid="0.27.1.0"/>
</dxl:ProjElem>
<dxl:ProjElem ColId="9" Alias="gp_segment_id">
<dxl:Ident ColId="9" ColName="gp_segment_id" TypeMdid="0.23.1.0"/>
</dxl:ProjElem>
</dxl:ProjList>
<dxl:Filter/>
<dxl:TableDescriptor Mdid="6.47297780.1.1" TableName="t1">
<dxl:Columns>
<dxl:Column ColId="0" Attno="1" ColName="a" TypeMdid="0.23.1.0"/>
<dxl:Column ColId="1" Attno="2" ColName="b" TypeMdid="0.23.1.0"/>
<dxl:Column ColId="2" Attno="3" ColName="c" TypeMdid="0.23.1.0"/>
<dxl:Column ColId="3" Attno="-1" ColName="ctid" TypeMdid="0.27.1.0"/>
<dxl:Column ColId="4" Attno="-3" ColName="xmin" TypeMdid="0.28.1.0"/>
<dxl:Column ColId="5" Attno="-4" ColName="cmin" TypeMdid="0.29.1.0"/>
<dxl:Column ColId="6" Attno="-5" ColName="xmax" TypeMdid="0.28.1.0"/>
<dxl:Column ColId="7" Attno="-6" ColName="cmax" TypeMdid="0.29.1.0"/>
<dxl:Column ColId="8" Attno="-7" ColName="tableoid" TypeMdid="0.26.1.0"/>
<dxl:Column ColId="9" Attno="-8" ColName="gp_segment_id" TypeMdid="0.23.1.0"/>
</dxl:Columns>
</dxl:TableDescriptor>
</dxl:TableScan>
</dxl:Split>
<dxl:Filter/>
<dxl:TableDescriptor Mdid="6.47297780.1.1" TableName="t1">
<dxl:Columns>
<dxl:Column ColId="0" Attno="1" ColName="a" TypeMdid="0.23.1.0"/>
<dxl:Column ColId="1" Attno="2" ColName="b" TypeMdid="0.23.1.0"/>
<dxl:Column ColId="2" Attno="3" ColName="c" TypeMdid="0.23.1.0"/>
<dxl:Column ColId="3" Attno="-1" ColName="ctid" TypeMdid="0.27.1.0"/>
<dxl:Column ColId="4" Attno="-3" ColName="xmin" TypeMdid="0.28.1.0"/>
<dxl:Column ColId="5" Attno="-4" ColName="cmin" TypeMdid="0.29.1.0"/>
<dxl:Column ColId="6" Attno="-5" ColName="xmax" TypeMdid="0.28.1.0"/>
<dxl:Column ColId="7" Attno="-6" ColName="cmax" TypeMdid="0.29.1.0"/>
<dxl:Column ColId="8" Attno="-7" ColName="tableoid" TypeMdid="0.26.1.0"/>
<dxl:Column ColId="9" Attno="-8" ColName="gp_segment_id" TypeMdid="0.23.1.0"/>
</dxl:Columns>
</dxl:TableDescriptor>
</dxl:TableScan>
</dxl:Assert>
</dxl:DMLUpdate>
</dxl:Plan>