## PostgreSQL nestloop Materialize cost algorithm

### Author
digoal

### Date
2022-01-26

### Tags
PostgreSQL , Materialize

----

## Background


```
select ... from a join b using (xx) where xx;

nest loop join:
a (xx rows)
  b
    full table scan , cost depends on the number of loops
    index scan , cost depends on the number of loops
    materialize (run cost + rescan cost) , the materialization cost itself does not depend on the number of loops
```

Typically, when the inner table is smaller than work_mem and the join column has no index, materializing the inner table can be the cheaper choice.
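
As a rough sanity check you can compare the inner relation's size against work_mem. Note that the planner actually works from its own tuples × width estimate (relation_byte_size), so the on-disk size is only a convenient proxy; a minimal sketch, assuming the inner table is named b:

```
-- Rough check (assumption: b is the inner relation): will the materialized data fit in work_mem?
-- The planner estimates tuples * width; pg_relation_size is just an approximation of that.
show work_mem;
select pg_relation_size('b')                  as inner_rel_bytes,
       pg_size_pretty(pg_relation_size('b'))  as inner_rel_size;
```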

```
/*
 * cost_material
 *   Determines and returns the cost of materializing a relation, including
 *   the cost of reading the input data.
 *
 * If the total volume of data to materialize exceeds work_mem, we will need
 * to write it to disk, so the cost is much higher in that case.
 *
 * Note that here we are estimating the costs for the first scan of the
 * relation, so the materialization is all overhead --- any savings will
 * occur only on rescan, which is estimated in cost_rescan.
 */
void
cost_material(Path *path,
              Cost input_startup_cost, Cost input_total_cost,
              double tuples, int width)
{
    Cost        startup_cost = input_startup_cost;
    Cost        run_cost = input_total_cost - input_startup_cost;
    double      nbytes = relation_byte_size(tuples, width);
    long        work_mem_bytes = work_mem * 1024L;

    path->rows = tuples;

    /*
     * Whether spilling or not, charge 2x cpu_operator_cost per tuple to
     * reflect bookkeeping overhead. (This rate must be more than what
     * cost_rescan charges for materialize, ie, cpu_operator_cost per tuple;
     * if it is exactly the same then there will be a cost tie between
     * nestloop with A outer, materialized B inner and nestloop with B outer,
     * materialized A inner. The extra cost ensures we'll prefer
     * materializing the smaller rel.) Note that this is normally a good deal
     * less than cpu_tuple_cost; which is OK because a Material plan node
     * doesn't do qual-checking or projection, so it's got less overhead than
     * most plan nodes.
     */
    run_cost += 2 * cpu_operator_cost * tuples;

    /*
     * If we will spill to disk, charge at the rate of seq_page_cost per page.
     * This cost is assumed to be evenly spread through the plan run phase,
     * which isn't exactly accurate but our cost model doesn't allow for
     * nonuniform costs within the run phase.
     */
    if (nbytes > work_mem_bytes)
    {
        double      npages = ceil(nbytes / BLCKSZ);

        run_cost += seq_page_cost * npages;
    }

    path->startup_cost = startup_cost;
    path->total_cost = startup_cost + run_cost;
}
```
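
Plugging the numbers from the demo below into cost_material: the input Seq Scan on the 100-row table b has a total cost of 2.00, the bookkeeping overhead is 2 * cpu_operator_cost * 100 = 0.5, and since 100 rows of width 45 are far below work_mem there is no spill charge, which yields the Materialize node's cost of 0.00..2.50. A small sketch of that arithmetic, assuming the default cost GUCs (cpu_operator_cost = 0.0025, cpu_tuple_cost = 0.01, seq_page_cost = 1):

```
-- Materialize on top of "Seq Scan on b (cost=0.00..2.00 rows=100 width=45)", default GUCs assumed
select 2.00                     as input_total_cost,
       2 * 0.0025 * 100         as material_overhead,    -- 2 * cpu_operator_cost * tuples = 0.5
       2.00 + 2 * 0.0025 * 100  as material_total_cost;  -- matches Materialize cost=0.00..2.50
```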


```
/*
 * cost_rescan
 *      Given a finished Path, estimate the costs of rescanning it after
 *      having done so the first time. For some Path types a rescan is
 *      cheaper than an original scan (if no parameters change), and this
 *      function embodies knowledge about that. The default is to return
 *      the same costs stored in the Path. (Note that the cost estimates
 *      actually stored in Paths are always for first scans.)
 *
 * This function is not currently intended to model effects such as rescans
 * being cheaper due to disk block caching; what we are concerned with is
 * plan types wherein the executor caches results explicitly, or doesn't
 * redo startup calculations, etc.
 */
static void
cost_rescan(PlannerInfo *root, Path *path,
            Cost *rescan_startup_cost,  /* output parameters */
            Cost *rescan_total_cost)
{
.........
        case T_Material:
        case T_Sort:
            {
                /*
                 * These plan types not only materialize their results, but do
                 * not implement qual filtering or projection. So they are
                 * even cheaper to rescan than the ones above. We charge only
                 * cpu_operator_cost per tuple. (Note: keep that in sync with
                 * the run_cost charge in cost_sort, and also see comments in
                 * cost_material before you change it.)
                 */
                Cost        run_cost = cpu_operator_cost * path->rows;
                double      nbytes = relation_byte_size(path->rows,
                                                        path->pathtarget->width);
                long        work_mem_bytes = work_mem * 1024L;

                if (nbytes > work_mem_bytes)
                {
                    /* It will spill, so account for re-read cost */
                    double      npages = ceil(nbytes / BLCKSZ);

                    run_cost += seq_page_cost * npages;
                }
                *rescan_startup_cost = 0;
                *rescan_total_cost = run_cost;
            }
            break;
```
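
So for the same 100-row Material node, every rescan is charged only cpu_operator_cost * 100 = 0.25 (no spill), while rescanning a plain Seq Scan costs the full scan again: 2.00 for b, or 19346.00 for the million-row table a. A sketch of the per-rescan charges, again assuming default cost GUCs and the page counts shown in the demo below (1 page for b, 9346 pages for a):

```
-- Per-rescan cost under default GUCs (assumed): Material node vs. plain seq scans
select 0.0025 * 100               as material_rescan,   -- cpu_operator_cost * rows = 0.25
       1 * 1 + 0.01 * 100         as seqscan_b_rescan,  -- seq_page_cost * 1 page + cpu_tuple_cost * 100 = 2.0
       1 * 9346 + 0.01 * 1000000  as seqscan_a_rescan;  -- 9346 pages + 1,000,000 tuples = 19346.0
```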


When does the planner choose materialization? The example below shows that it ultimately comes down to cost: as long as the materialized plan has the lower cost, it gets picked.

```
With 1,000,000 loops, materialization adds only 0.5 to the cost, which confirms the algorithm above.
Materialization cost (entirely in memory, since the data is smaller than work_mem):
run_cost += 2 * cpu_operator_cost * tuples
2 * 0.0025 * 100 = 0.5
Cost of a full seq scan of the inner table:
cpu_tuple_cost * 100 + seq_page_cost * 1 = 2
```


```
create table a (id int, info text, crt_time timestamp);

create table b (id int, info text, crt_time timestamp);

insert into a select generate_series(1,1000000), md5(random()::text), clock_timestamp();
insert into b select generate_series(1,100), md5(random()::text), clock_timestamp();

explain select a.*,b.* from a join b using (id);

postgres=# explain select a.*,b.* from a join b using (id);
                            QUERY PLAN
------------------------------------------------------------------
 Nested Loop  (cost=0.00..1519348.25 rows=100 width=90)
   Join Filter: (a.id = b.id)
   ->  Seq Scan on a  (cost=0.00..19346.00 rows=1000000 width=45)
   ->  Materialize  (cost=0.00..2.50 rows=100 width=45)
         ->  Seq Scan on b  (cost=0.00..2.00 rows=100 width=45)
(5 rows)

postgres=# set enable_material =off;
SET
postgres=# explain select a.*,b.* from a join b using (id);
                            QUERY PLAN
------------------------------------------------------------------
 Nested Loop  (cost=0.00..3184602.00 rows=100 width=90)
   Join Filter: (a.id = b.id)
   ->  Seq Scan on b  (cost=0.00..2.00 rows=100 width=45)
   ->  Seq Scan on a  (cost=0.00..19346.00 rows=1000000 width=45)
(4 rows)

postgres=# explain (analyze,verbose,timing,costs,buffers) select a.*,b.* from a join b using (id);
                                                         QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..1519348.25 rows=100 width=90) (actual time=0.028..15746.781 rows=100 loops=1)
   Output: a.id, a.info, a.crt_time, b.id, b.info, b.crt_time
   Join Filter: (a.id = b.id)
   Rows Removed by Join Filter: 99999900
   Buffers: shared hit=9347
   ->  Seq Scan on public.a  (cost=0.00..19346.00 rows=1000000 width=45) (actual time=0.011..120.138 rows=1000000 loops=1)
         Output: a.id, a.info, a.crt_time
         Buffers: shared hit=9346
   ->  Materialize  (cost=0.00..2.50 rows=100 width=45) (actual time=0.000..0.007 rows=100 loops=1000000)
         Output: b.id, b.info, b.crt_time
         Buffers: shared hit=1
         ->  Seq Scan on public.b  (cost=0.00..2.00 rows=100 width=45) (actual time=0.008..0.025 rows=100 loops=1)
               Output: b.id, b.info, b.crt_time
               Buffers: shared hit=1
 Query Identifier: 5474919903957737384
 Planning:
   Buffers: shared hit=5
 Planning Time: 0.299 ms
 Execution Time: 15746.819 ms
(19 rows)

postgres=# create index idx_b_1 on b (id);
CREATE INDEX


postgres=# explain select a.*,b.* from a join b using (id);
                                QUERY PLAN
------------------------------------------------------------------------
 Nested Loop  (cost=0.14..189349.30 rows=100 width=90)
   ->  Seq Scan on a  (cost=0.00..19346.00 rows=1000000 width=45)
   ->  Index Scan using idx_b_1 on b  (cost=0.14..0.16 rows=1 width=45)
         Index Cond: (id = a.id)
(4 rows)

postgres=# explain (analyze,verbose,timing,costs,buffers) select a.*,b.* from a join b using (id);
                                                           QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.14..189349.30 rows=100 width=90) (actual time=0.078..673.957 rows=100 loops=1)
   Output: a.id, a.info, a.crt_time, b.id, b.info, b.crt_time
   Buffers: shared hit=1009445 read=1
   I/O Timings: read=0.023
   ->  Seq Scan on public.a  (cost=0.00..19346.00 rows=1000000 width=45) (actual time=0.011..93.566 rows=1000000 loops=1)
         Output: a.id, a.info, a.crt_time
         Buffers: shared hit=9346
   ->  Index Scan using idx_b_1 on public.b  (cost=0.14..0.16 rows=1 width=45) (actual time=0.000..0.000 rows=0 loops=1000000)
         Output: b.id, b.info, b.crt_time
         Index Cond: (b.id = a.id)
         Buffers: shared hit=1000099 read=1
         I/O Timings: read=0.023
 Query Identifier: 5474919903957737384
 Planning Time: 0.125 ms
 Execution Time: 674.002 ms
(15 rows)
```
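
Both nestloop totals above can be reconstructed from the pieces we have seen. This is a rough back-of-the-envelope under default cost GUCs; the term for the 1,000,000 × 100 join-filter comparisons (roughly cpu_tuple_cost + cpu_operator_cost per tuple pair) comes from the nestloop costing itself, not from cost_material or cost_rescan:

```
-- Reconstructing the two nestloop totals shown above (default GUCs assumed)
-- With Materialize: outer seq scan on a + first Materialize scan + 999,999 rescans + per-pair CPU
select 19346 + 2.5 + 999999 * 0.25 + 100000000 * (0.01 + 0.0025)  as nestloop_with_material;     -- 1519348.25
-- Without Materialize (b outer, a inner): outer seq scan on b + 100 full scans of a + per-pair CPU
select 2 + 100 * 19346 + 100000000 * (0.01 + 0.0025)              as nestloop_without_material;  -- 3184602.00
```

The materialized plan wins because the 999,999 rescans of the 100-row inner rel cost only 0.25 each instead of a full 2.00 seq scan. Once the index on b.id exists, the join filter and its 100,000,000 comparisons disappear entirely, which is why the index-scan plan (189349.30) is cheaper still.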



#### [What features do you want PostgreSQL to add?](https://github.com/digoal/blog/issues/76 "269ac3d1c492e938c0191101c7238216")


#### [PolarDB for PostgreSQL: a cloud-native, distributed, open-source database](https://github.com/ApsaraDB/PolarDB-for-PostgreSQL "57258f76c37864c6e6d23383d05714ea")


#### [A collection of PostgreSQL solutions](https://yq.aliyun.com/topic/118 "40cff096e9ed7122c512b35d8561d9c8")


#### [德哥 / digoal's github - public welfare is a lifelong cause.](https://github.com/digoal/blog/blob/master/README.md "22709685feb7cab07d30f30387f0a9ae")