new doc

digoal · digoal · commit cc6693838dd2 · 2018-02-04T17:00:20.000+08:00
diff --git a/201802/20180204_02.md b/201802/20180204_02.md
@@ -0,0 +1,368 @@
+## PostgreSQL 11 preview - 新功能, 分区表全局索引管理    
+       
+### 作者        
+digoal           
+           
+### 日期        
+2018-02-04        
+       
+### 标签        
+PostgreSQL , 分区表 , 全局索引 , 分区索引 , 壳子索引         
+           
+----          
+          
+## 背景       
+PostgreSQL 10开始引入了原生支持的分区表，支持了range, list分区。  
+  
+PostgreSQL 11对分区表的功能进行了增强，首先增加了hash分区表。其次，增加了分区表全局索引管理(PostgreSQL 10 分区表的主表上不支持索引，如果要建索引，不得不在每个分区上去定义，不太方便的。)。  
+  
+以list分区为例，我们看看如何使用分区表的全局管理索引。  
+  
+```  
+CREATE TABLE cities (  
+    city_id      bigserial not null,  
+    name         text not null,  
+    population   bigint  
+) PARTITION BY LIST (left(lower(name), 1));  
+  
+CREATE TABLE cities_ab  
+    PARTITION OF cities (  
+    CONSTRAINT city_id_nonzero CHECK (city_id != 0)  
+) FOR VALUES IN ('a', 'b');  
+```  
+  
+## 创建分区表全局索引  
+1、创建全局索引  
+  
+```  
+create index idx_cities_1 on cities(name);  
+```  
+  
+对现有分区，会自动添加这个索引  
+  
+```  
+postgres=# \d+ cities  
+                                                    Table "public.cities"  
+   Column   |  Type  | Collation | Nullable |                 Default                 | Storage  | Stats target | Description   
+------------+--------+-----------+----------+-----------------------------------------+----------+--------------+-------------  
+ city_id    | bigint |           | not null | nextval('cities_city_id_seq'::regclass) | plain    |              |   
+ name       | text   |           | not null |                                         | extended |              |   
+ population | bigint |           |          |                                         | plain    |              |   
+Partition key: LIST ("left"(lower(name), 1))  
+Indexes:  
+    "idx_cities_1" btree (name)  
+Partitions: cities_ab FOR VALUES IN ('a', 'b')  
+  
+postgres=# \d cities_ab  
+                               Table "public.cities_ab"  
+   Column   |  Type  | Collation | Nullable |                 Default                   
+------------+--------+-----------+----------+-----------------------------------------  
+ city_id    | bigint |           | not null | nextval('cities_city_id_seq'::regclass)  
+ name       | text   |           | not null |   
+ population | bigint |           |          |   
+Partition of: cities FOR VALUES IN ('a', 'b')  
+Indexes:  
+    "cities_ab_name_idx" btree (name)  
+Check constraints:  
+    "city_id_nonzero" CHECK (city_id <> 0)  
+```  
+  
+元数据中，主表索引显示为'I'类型，分区上的索引显示为'i'类型。  
+  
+```  
+postgres=# select relname,relkind from pg_class where relname ~ 'citi';  
+      relname       | relkind   
+--------------------+---------  
+ idx_cities_1       | I    # 分区表 - 主表上的索引  
+ cities_ab          | r    # 分区表 - 分区表  
+ cities_ab_name_idx | i    # 分区表 - 分区表上的索引  
+ cities_city_id_seq | S    # 序列  
+ cities             | p    # 分区表 - 主表  
+(5 rows)  
+```  
+  
+2、新增分区，自动添加索引（如果这个分区上已经包含了同样定义的索引，那么会自动将这个索引attach到主表的索引中，而不会新建这个索引）  
+  
+```  
+CREATE TABLE cities_cd  
+    PARTITION OF cities (  
+    CONSTRAINT city_id_nonzero CHECK (city_id != 0)  
+) FOR VALUES IN ('c', 'd');  
+```  
+  
+自动索引如下（cities_cd_name_idx）：  
+  
+```  
+postgres=# \d+ cities_cd  
+                                                   Table "public.cities_cd"  
+   Column   |  Type  | Collation | Nullable |                 Default                 | Storage  | Stats target | Description   
+------------+--------+-----------+----------+-----------------------------------------+----------+--------------+-------------  
+ city_id    | bigint |           | not null | nextval('cities_city_id_seq'::regclass) | plain    |              |   
+ name       | text   |           | not null |                                         | extended |              |   
+ population | bigint |           |          |                                         | plain    |              |   
+Partition of: cities FOR VALUES IN ('c', 'd')  
+Partition constraint: (("left"(lower(name), 1) IS NOT NULL) AND ("left"(lower(name), 1) = ANY (ARRAY['c'::text, 'd'::text])))  
+Indexes:  
+    "cities_cd_name_idx" btree (name)  
+Check constraints:  
+    "city_id_nonzero" CHECK (city_id <> 0)  
+```  
+  
+## 创建分区表 主表全局壳子索引  
+有些时候，不希望所有的分区表都自动创建某些索引，但是又想统一管理全局索引，怎么办？  
+  
+1、我们可以定义壳子索引，使用ONLY这个关键字，表示这个索引只建在当前这个表上。  
+  
+```  
+postgres=# create index idx_cities_2 on only cities (population);  
+CREATE INDEX  
+```  
+  
+这种方法创建出来的索引为INVALID索引。  
+  
+```  
+postgres=# \d+ cities  
+                                                    Table "public.cities"  
+   Column   |  Type  | Collation | Nullable |                 Default                 | Storage  | Stats target | Description   
+------------+--------+-----------+----------+-----------------------------------------+----------+--------------+-------------  
+ city_id    | bigint |           | not null | nextval('cities_city_id_seq'::regclass) | plain    |              |   
+ name       | text   |           | not null |                                         | extended |              |   
+ population | bigint |           |          |                                         | plain    |              |   
+Partition key: LIST ("left"(lower(name), 1))  
+Indexes:  
+    "idx_cities_1" btree (name)  
+    "idx_cities_2" btree (population) INVALID  
+Partitions: cities_ab FOR VALUES IN ('a', 'b'),  
+            cities_cd FOR VALUES IN ('c', 'd')  
+```  
+  
+并且这个索引，也不会在分区上构建  
+  
+```  
+postgres=# \d cities_ab  
+                               Table "public.cities_ab"  
+   Column   |  Type  | Collation | Nullable |                 Default                   
+------------+--------+-----------+----------+-----------------------------------------  
+ city_id    | bigint |           | not null | nextval('cities_city_id_seq'::regclass)  
+ name       | text   |           | not null |   
+ population | bigint |           |          |   
+Partition of: cities FOR VALUES IN ('a', 'b')  
+Indexes:  
+    "cities_ab_name_idx" btree (name)  
+Check constraints:  
+    "city_id_nonzero" CHECK (city_id <> 0)  
+```  
+  
+## 将分区表分区索引绑定到主表全局索引  
+  
+1、比如我们有一些分区想构建某个字段的索引，而其他分区确不想构建（比如对于某些分区的数据，我们不想按population字段查询，没有必要建立那些分区的索引）  
+  
+```  
+postgres=# create index idx_cities_ab_2 on cities_ab (population);  
+CREATE INDEX  
+```  
+  
+2、将这个分区上的索引，绑定到INVALID的全局壳子索引下面  
+  
+```  
+postgres=# alter index idx_cities_2 attach partition idx_cities_ab_2;  
+ALTER INDEX  
+```  
+  
+```  
+postgres=# \d+ cities  
+                                                    Table "public.cities"  
+   Column   |  Type  | Collation | Nullable |                 Default                 | Storage  | Stats target | Description   
+------------+--------+-----------+----------+-----------------------------------------+----------+--------------+-------------  
+ city_id    | bigint |           | not null | nextval('cities_city_id_seq'::regclass) | plain    |              |   
+ name       | text   |           | not null |                                         | extended |              |   
+ population | bigint |           |          |                                         | plain    |              |   
+Partition key: LIST ("left"(lower(name), 1))  
+Indexes:  
+    "idx_cities_1" btree (name)  
+    "idx_cities_2" btree (population) INVALID  
+Partitions: cities_ab FOR VALUES IN ('a', 'b'),  
+            cities_cd FOR VALUES IN ('c', 'd')  
+```  
+  
+3、那么现在的索引结构是这样的  
+  
+```  
+postgres=# select relname,relkind from pg_class where relname ~ 'citi';  
+      relname       | relkind   
+--------------------+---------  
+ idx_cities_1       | I  
+ cities_ab          | r  
+ cities_ab_name_idx | i  
+ cities_cd_name_idx | i  
+ cities_cd          | r  
+ idx_cities_2       | I  
+ idx_cities_ab_2    | i  
+ cities_city_id_seq | S  
+ cities             | p  
+(9 rows)  
+```  
+  
+全局壳子索引的好处：  
+  
+1、新增的分区表，不会自动创建这个索引，而是需要手工添加。满足一些特殊需求（例如有些分区不会检索某个字段，而其他分区需要检索，那么可以区别对待。）、  
+  
+## 全局索引管理的好处  
+  
+1、新增分区自动创建全局索引一样的索引（如果分区中已经包含了一个定义一致的本地索引，那么这个索引会自动attach到全局索引下，并被其管理）。  
+  
+2、删除全局索引时，自动删除已经attach在这个全局索引下面的所有索引。  
+  
+```  
+postgres=# drop index idx_cities_1;  
+DROP INDEX  
+postgres=# \d cities  
+                                Table "public.cities"  
+   Column   |  Type  | Collation | Nullable |                 Default                   
+------------+--------+-----------+----------+-----------------------------------------  
+ city_id    | bigint |           | not null | nextval('cities_city_id_seq'::regclass)  
+ name       | text   |           | not null |   
+ population | bigint |           |          |   
+Partition key: LIST ("left"(lower(name), 1))  
+Indexes:  
+    "idx_cities_2" btree (population) INVALID  
+Number of partitions: 2 (Use \d+ to list them.)  
+  
+postgres=# \d cities_ab  
+                               Table "public.cities_ab"  
+   Column   |  Type  | Collation | Nullable |                 Default                   
+------------+--------+-----------+----------+-----------------------------------------  
+ city_id    | bigint |           | not null | nextval('cities_city_id_seq'::regclass)  
+ name       | text   |           | not null |   
+ population | bigint |           |          |   
+Partition of: cities FOR VALUES IN ('a', 'b')  
+Indexes:  
+    "idx_cities_ab_2" btree (population)  
+Check constraints:  
+    "city_id_nonzero" CHECK (city_id <> 0)  
+  
+postgres=# \d cities_cd  
+                               Table "public.cities_cd"  
+   Column   |  Type  | Collation | Nullable |                 Default                   
+------------+--------+-----------+----------+-----------------------------------------  
+ city_id    | bigint |           | not null | nextval('cities_city_id_seq'::regclass)  
+ name       | text   |           | not null |   
+ population | bigint |           |          |   
+Partition of: cities FOR VALUES IN ('c', 'd')  
+Check constraints:  
+    "city_id_nonzero" CHECK (city_id <> 0)  
+  
+```  
+  
+在另一个分区上，创建一个本地索引。并且不将这个索引attach到主表的全局壳子索引上。  
+  
+那么在删除主表索引时，这个本地索引不会被自动删除。  
+  
+```  
+postgres=# create index idx_cities_cd_2 on cities_cd (population );  
+CREATE INDEX  
+postgres=# drop index idx_cities_2;  
+DROP INDEX  
+postgres=# \d cities  
+                                Table "public.cities"  
+   Column   |  Type  | Collation | Nullable |                 Default                   
+------------+--------+-----------+----------+-----------------------------------------  
+ city_id    | bigint |           | not null | nextval('cities_city_id_seq'::regclass)  
+ name       | text   |           | not null |   
+ population | bigint |           |          |   
+Partition key: LIST ("left"(lower(name), 1))  
+Number of partitions: 2 (Use \d+ to list them.)  
+  
+postgres=# \d cities_ab  
+                               Table "public.cities_ab"  
+   Column   |  Type  | Collation | Nullable |                 Default                   
+------------+--------+-----------+----------+-----------------------------------------  
+ city_id    | bigint |           | not null | nextval('cities_city_id_seq'::regclass)  
+ name       | text   |           | not null |   
+ population | bigint |           |          |   
+Partition of: cities FOR VALUES IN ('a', 'b')  
+Check constraints:  
+    "city_id_nonzero" CHECK (city_id <> 0)  
+  
+postgres=# \d cities_cd  
+                               Table "public.cities_cd"  
+   Column   |  Type  | Collation | Nullable |                 Default                   
+------------+--------+-----------+----------+-----------------------------------------  
+ city_id    | bigint |           | not null | nextval('cities_city_id_seq'::regclass)  
+ name       | text   |           | not null |   
+ population | bigint |           |          |   
+Partition of: cities FOR VALUES IN ('c', 'd')  
+Indexes:  
+    "idx_cities_cd_2" btree (population)  
+Check constraints:  
+    "city_id_nonzero" CHECK (city_id <> 0)  
+```  
+  
+## 分区索引好处  
+  
+分区索引不被全局管理，比较灵活，但是管理麻烦一些。  
+  
+好处是不同的分区可以拥有不同的索引定义。在一些数据特殊管理的场景很灵活。  
+  
+例如某些分区会经常检索A字段，但是某些分区确经常检索B字段，那么就可以为不同的分区创建不同的本地索引，而不是全局一起构建。可以节约一些成本，但是管理成本可能会上升。  
+  
+## 元信息  
+  
+元数据中，主表索引显示为'I'类型，分区上的索引显示为'i'类型。  
+  
+```  
+postgres=# select relname,relkind from pg_class where relname ~ 'citi';  
+      relname       | relkind   
+--------------------+---------  
+ idx_cities_1       | I    # 分区表 - 主表上的索引  
+ cities_ab          | r    # 分区表 - 分区表  
+ cities_ab_name_idx | i    # 分区表 - 分区表上的索引  
+ cities_city_id_seq | S    # 序列  
+ cities             | p    # 分区表 - 主表  
+(5 rows)  
+```  
+  
+解释relkind中的含义：  
+  
+```  
+I = partiton table global index,  分区表 - 主表上的索引  
+r = ordinary table,               普通表  
+i = index,                        普通表的索引，或者分区表分区上的本地索引  
+S = sequence,                     序列  
+t = TOAST table,                  切片表（变长字段压缩后超过BLOCK的1/4时，会存到切片表中）  
+v = view,                         视图  
+m = materialized view,            物化视图  
+c = composite type,               自定义复合类型  
+f = foreign table,                外部表  
+p = partitioned table             分区表 - 主表  
+```  
+  
+  
+## 小结  
+PostgreSQL 11 允许对分区表的主表创建索引，并内部支持了全局索引的管理机制（包括依赖、自动attach等）。  
+  
+全局索引带来的好处，  
+  
+1、当删除全局索引时，会将所有分区的索引自动删除。  
+  
+2、当新增分区时，会自动创建与全局索引一致的索引（假设这个分区已有这样的索引，那么会自动将已有索引attach到全局索引下，便于管理）  
+  
+3、对分区表创建索引时，会自动在所有分区下创建本地索引，并将分区本地索引ATTACH到全局索引下便于管理。  
+  
+另外，全局索引还有一种壳子模式，即使用only关键字创建，它的好处：  
+  
+1、使用ONLY关键字在分区表-主表上创建索引时，索引状态为INVALID，表示新建分区时，不会自动添加这个同类结构的索引，也不会自动ATTACH已有索引到这个全局索引下面。  
+  
+2、使用全局索引的壳子模式，对不同的分区采用不同的索引，仅对需要用于管理的索引ATTACH到这个壳子中来管理。可以满足一些特殊业务场景（例如某些分区会经常检索A字段，但是某些分区确经常检索B字段，那么就可以为不同的分区创建不同的本地索引，而不是全局一起构建。可以节约一些成本，但是管理成本可能会上升。）。  
+  
+PostgreSQL 11的全局索引使用非常灵活。  
+  
+## 参考  
+  
+https://www.postgresql.org/docs/devel/static/sql-alterindex.html  
+  
+https://www.postgresql.org/docs/devel/static/sql-createindex.html  
+  
+https://www.postgresql.org/docs/devel/static/catalog-pg-class.html  
+  
+  
diff --git a/201802/readme.md b/201802/readme.md
@@ -1,5 +1,6 @@
 ### 文章列表  
 ----  
+##### 20180204_02.md   [《PostgreSQL 11 preview - 新功能, 分区表全局索引管理》](20180204_02.md)  
 ##### 20180204_01.md   [《PostgreSQL 11 preview - 并行排序、并行索引 (性能线性暴增) 单实例100亿TOP-K仅40秒》](20180204_01.md)  
 ##### 20180202_02.md   [《PostgreSQL 11 preview - 分区表智能并行JOIN (已类似MPP架构，性能暴增)》](20180202_02.md)  
 ##### 20180202_01.md   [《PostgreSQL 相似搜索设计与性能 - 地址、QA、POI等文本 毫秒级相似搜索实践》](20180202_01.md)  
diff --git a/README.md b/README.md
@@ -29,6 +29,7 @@ digoal's|PostgreSQL|文章|归类
   
 ### 未归类文档如下  
 ----  
+##### 201802/20180204_02.md   [《PostgreSQL 11 preview - 新功能, 分区表全局索引管理》](201802/20180204_02.md)  
 ##### 201802/20180204_01.md   [《PostgreSQL 11 preview - 并行排序、并行索引 (性能线性暴增) 单实例100亿TOP-K仅40秒》](201802/20180204_01.md)  
 ##### 201802/20180202_02.md   [《PostgreSQL 11 preview - 分区表智能并行JOIN (已类似MPP架构，性能暴增)》](201802/20180202_02.md)  
 ##### 201802/20180202_01.md   [《PostgreSQL 相似搜索设计与性能 - 地址、QA、POI等文本 毫秒级相似搜索实践》](201802/20180202_01.md)