Skip to content

Commit cc66938

Browse files
committed
new doc
1 parent 849ac19 commit cc66938

File tree

3 files changed

+370
-0
lines changed

3 files changed

+370
-0
lines changed

201802/20180204_02.md

Lines changed: 368 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,368 @@
1+
## PostgreSQL 11 preview - 新功能, 分区表全局索引管理
2+
3+
### 作者
4+
digoal
5+
6+
### 日期
7+
2018-02-04
8+
9+
### 标签
10+
PostgreSQL , 分区表 , 全局索引 , 分区索引 , 壳子索引
11+
12+
----
13+
14+
## 背景
15+
PostgreSQL 10开始引入了原生支持的分区表,支持了range, list分区。
16+
17+
PostgreSQL 11对分区表的功能进行了增强,首先增加了hash分区表。其次,增加了分区表全局索引管理(PostgreSQL 10 分区表的主表上不支持索引,如果要建索引,不得不在每个分区上去定义,不太方便的。)。
18+
19+
以list分区为例,我们看看如何使用分区表的全局管理索引。
20+
21+
```
22+
CREATE TABLE cities (
23+
city_id bigserial not null,
24+
name text not null,
25+
population bigint
26+
) PARTITION BY LIST (left(lower(name), 1));
27+
28+
CREATE TABLE cities_ab
29+
PARTITION OF cities (
30+
CONSTRAINT city_id_nonzero CHECK (city_id != 0)
31+
) FOR VALUES IN ('a', 'b');
32+
```
33+
34+
## 创建分区表全局索引
35+
1、创建全局索引
36+
37+
```
38+
create index idx_cities_1 on cities(name);
39+
```
40+
41+
对现有分区,会自动添加这个索引
42+
43+
```
44+
postgres=# \d+ cities
45+
Table "public.cities"
46+
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
47+
------------+--------+-----------+----------+-----------------------------------------+----------+--------------+-------------
48+
city_id | bigint | | not null | nextval('cities_city_id_seq'::regclass) | plain | |
49+
name | text | | not null | | extended | |
50+
population | bigint | | | | plain | |
51+
Partition key: LIST ("left"(lower(name), 1))
52+
Indexes:
53+
"idx_cities_1" btree (name)
54+
Partitions: cities_ab FOR VALUES IN ('a', 'b')
55+
56+
postgres=# \d cities_ab
57+
Table "public.cities_ab"
58+
Column | Type | Collation | Nullable | Default
59+
------------+--------+-----------+----------+-----------------------------------------
60+
city_id | bigint | | not null | nextval('cities_city_id_seq'::regclass)
61+
name | text | | not null |
62+
population | bigint | | |
63+
Partition of: cities FOR VALUES IN ('a', 'b')
64+
Indexes:
65+
"cities_ab_name_idx" btree (name)
66+
Check constraints:
67+
"city_id_nonzero" CHECK (city_id <> 0)
68+
```
69+
70+
元数据中,主表索引显示为'I'类型,分区上的索引显示为'i'类型。
71+
72+
```
73+
postgres=# select relname,relkind from pg_class where relname ~ 'citi';
74+
relname | relkind
75+
--------------------+---------
76+
idx_cities_1 | I # 分区表 - 主表上的索引
77+
cities_ab | r # 分区表 - 分区表
78+
cities_ab_name_idx | i # 分区表 - 分区表上的索引
79+
cities_city_id_seq | S # 序列
80+
cities | p # 分区表 - 主表
81+
(5 rows)
82+
```
83+
84+
2、新增分区,自动添加索引(如果这个分区上已经包含了同样定义的索引,那么会自动将这个索引attach到主表的索引中,而不会新建这个索引)
85+
86+
```
87+
CREATE TABLE cities_cd
88+
PARTITION OF cities (
89+
CONSTRAINT city_id_nonzero CHECK (city_id != 0)
90+
) FOR VALUES IN ('c', 'd');
91+
```
92+
93+
自动索引如下(cities_cd_name_idx):
94+
95+
```
96+
postgres=# \d+ cities_cd
97+
Table "public.cities_cd"
98+
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
99+
------------+--------+-----------+----------+-----------------------------------------+----------+--------------+-------------
100+
city_id | bigint | | not null | nextval('cities_city_id_seq'::regclass) | plain | |
101+
name | text | | not null | | extended | |
102+
population | bigint | | | | plain | |
103+
Partition of: cities FOR VALUES IN ('c', 'd')
104+
Partition constraint: (("left"(lower(name), 1) IS NOT NULL) AND ("left"(lower(name), 1) = ANY (ARRAY['c'::text, 'd'::text])))
105+
Indexes:
106+
"cities_cd_name_idx" btree (name)
107+
Check constraints:
108+
"city_id_nonzero" CHECK (city_id <> 0)
109+
```
110+
111+
## 创建分区表 主表全局壳子索引
112+
有些时候,不希望所有的分区表都自动创建某些索引,但是又想统一管理全局索引,怎么办?
113+
114+
1、我们可以定义壳子索引,使用ONLY这个关键字,表示这个索引只建在当前这个表上。
115+
116+
```
117+
postgres=# create index idx_cities_2 on only cities (population);
118+
CREATE INDEX
119+
```
120+
121+
这种方法创建出来的索引为INVALID索引。
122+
123+
```
124+
postgres=# \d+ cities
125+
Table "public.cities"
126+
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
127+
------------+--------+-----------+----------+-----------------------------------------+----------+--------------+-------------
128+
city_id | bigint | | not null | nextval('cities_city_id_seq'::regclass) | plain | |
129+
name | text | | not null | | extended | |
130+
population | bigint | | | | plain | |
131+
Partition key: LIST ("left"(lower(name), 1))
132+
Indexes:
133+
"idx_cities_1" btree (name)
134+
"idx_cities_2" btree (population) INVALID
135+
Partitions: cities_ab FOR VALUES IN ('a', 'b'),
136+
cities_cd FOR VALUES IN ('c', 'd')
137+
```
138+
139+
并且这个索引,也不会在分区上构建
140+
141+
```
142+
postgres=# \d cities_ab
143+
Table "public.cities_ab"
144+
Column | Type | Collation | Nullable | Default
145+
------------+--------+-----------+----------+-----------------------------------------
146+
city_id | bigint | | not null | nextval('cities_city_id_seq'::regclass)
147+
name | text | | not null |
148+
population | bigint | | |
149+
Partition of: cities FOR VALUES IN ('a', 'b')
150+
Indexes:
151+
"cities_ab_name_idx" btree (name)
152+
Check constraints:
153+
"city_id_nonzero" CHECK (city_id <> 0)
154+
```
155+
156+
## 将分区表分区索引绑定到主表全局索引
157+
158+
1、比如我们有一些分区想构建某个字段的索引,而其他分区确不想构建(比如对于某些分区的数据,我们不想按population字段查询,没有必要建立那些分区的索引)
159+
160+
```
161+
postgres=# create index idx_cities_ab_2 on cities_ab (population);
162+
CREATE INDEX
163+
```
164+
165+
2、将这个分区上的索引,绑定到INVALID的全局壳子索引下面
166+
167+
```
168+
postgres=# alter index idx_cities_2 attach partition idx_cities_ab_2;
169+
ALTER INDEX
170+
```
171+
172+
```
173+
postgres=# \d+ cities
174+
Table "public.cities"
175+
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
176+
------------+--------+-----------+----------+-----------------------------------------+----------+--------------+-------------
177+
city_id | bigint | | not null | nextval('cities_city_id_seq'::regclass) | plain | |
178+
name | text | | not null | | extended | |
179+
population | bigint | | | | plain | |
180+
Partition key: LIST ("left"(lower(name), 1))
181+
Indexes:
182+
"idx_cities_1" btree (name)
183+
"idx_cities_2" btree (population) INVALID
184+
Partitions: cities_ab FOR VALUES IN ('a', 'b'),
185+
cities_cd FOR VALUES IN ('c', 'd')
186+
```
187+
188+
3、那么现在的索引结构是这样的
189+
190+
```
191+
postgres=# select relname,relkind from pg_class where relname ~ 'citi';
192+
relname | relkind
193+
--------------------+---------
194+
idx_cities_1 | I
195+
cities_ab | r
196+
cities_ab_name_idx | i
197+
cities_cd_name_idx | i
198+
cities_cd | r
199+
idx_cities_2 | I
200+
idx_cities_ab_2 | i
201+
cities_city_id_seq | S
202+
cities | p
203+
(9 rows)
204+
```
205+
206+
全局壳子索引的好处:
207+
208+
1、新增的分区表,不会自动创建这个索引,而是需要手工添加。满足一些特殊需求(例如有些分区不会检索某个字段,而其他分区需要检索,那么可以区别对待。)、
209+
210+
## 全局索引管理的好处
211+
212+
1、新增分区自动创建全局索引一样的索引(如果分区中已经包含了一个定义一致的本地索引,那么这个索引会自动attach到全局索引下,并被其管理)。
213+
214+
2、删除全局索引时,自动删除已经attach在这个全局索引下面的所有索引。
215+
216+
```
217+
postgres=# drop index idx_cities_1;
218+
DROP INDEX
219+
postgres=# \d cities
220+
Table "public.cities"
221+
Column | Type | Collation | Nullable | Default
222+
------------+--------+-----------+----------+-----------------------------------------
223+
city_id | bigint | | not null | nextval('cities_city_id_seq'::regclass)
224+
name | text | | not null |
225+
population | bigint | | |
226+
Partition key: LIST ("left"(lower(name), 1))
227+
Indexes:
228+
"idx_cities_2" btree (population) INVALID
229+
Number of partitions: 2 (Use \d+ to list them.)
230+
231+
postgres=# \d cities_ab
232+
Table "public.cities_ab"
233+
Column | Type | Collation | Nullable | Default
234+
------------+--------+-----------+----------+-----------------------------------------
235+
city_id | bigint | | not null | nextval('cities_city_id_seq'::regclass)
236+
name | text | | not null |
237+
population | bigint | | |
238+
Partition of: cities FOR VALUES IN ('a', 'b')
239+
Indexes:
240+
"idx_cities_ab_2" btree (population)
241+
Check constraints:
242+
"city_id_nonzero" CHECK (city_id <> 0)
243+
244+
postgres=# \d cities_cd
245+
Table "public.cities_cd"
246+
Column | Type | Collation | Nullable | Default
247+
------------+--------+-----------+----------+-----------------------------------------
248+
city_id | bigint | | not null | nextval('cities_city_id_seq'::regclass)
249+
name | text | | not null |
250+
population | bigint | | |
251+
Partition of: cities FOR VALUES IN ('c', 'd')
252+
Check constraints:
253+
"city_id_nonzero" CHECK (city_id <> 0)
254+
255+
```
256+
257+
在另一个分区上,创建一个本地索引。并且不将这个索引attach到主表的全局壳子索引上。
258+
259+
那么在删除主表索引时,这个本地索引不会被自动删除。
260+
261+
```
262+
postgres=# create index idx_cities_cd_2 on cities_cd (population );
263+
CREATE INDEX
264+
postgres=# drop index idx_cities_2;
265+
DROP INDEX
266+
postgres=# \d cities
267+
Table "public.cities"
268+
Column | Type | Collation | Nullable | Default
269+
------------+--------+-----------+----------+-----------------------------------------
270+
city_id | bigint | | not null | nextval('cities_city_id_seq'::regclass)
271+
name | text | | not null |
272+
population | bigint | | |
273+
Partition key: LIST ("left"(lower(name), 1))
274+
Number of partitions: 2 (Use \d+ to list them.)
275+
276+
postgres=# \d cities_ab
277+
Table "public.cities_ab"
278+
Column | Type | Collation | Nullable | Default
279+
------------+--------+-----------+----------+-----------------------------------------
280+
city_id | bigint | | not null | nextval('cities_city_id_seq'::regclass)
281+
name | text | | not null |
282+
population | bigint | | |
283+
Partition of: cities FOR VALUES IN ('a', 'b')
284+
Check constraints:
285+
"city_id_nonzero" CHECK (city_id <> 0)
286+
287+
postgres=# \d cities_cd
288+
Table "public.cities_cd"
289+
Column | Type | Collation | Nullable | Default
290+
------------+--------+-----------+----------+-----------------------------------------
291+
city_id | bigint | | not null | nextval('cities_city_id_seq'::regclass)
292+
name | text | | not null |
293+
population | bigint | | |
294+
Partition of: cities FOR VALUES IN ('c', 'd')
295+
Indexes:
296+
"idx_cities_cd_2" btree (population)
297+
Check constraints:
298+
"city_id_nonzero" CHECK (city_id <> 0)
299+
```
300+
301+
## 分区索引好处
302+
303+
分区索引不被全局管理,比较灵活,但是管理麻烦一些。
304+
305+
好处是不同的分区可以拥有不同的索引定义。在一些数据特殊管理的场景很灵活。
306+
307+
例如某些分区会经常检索A字段,但是某些分区确经常检索B字段,那么就可以为不同的分区创建不同的本地索引,而不是全局一起构建。可以节约一些成本,但是管理成本可能会上升。
308+
309+
## 元信息
310+
311+
元数据中,主表索引显示为'I'类型,分区上的索引显示为'i'类型。
312+
313+
```
314+
postgres=# select relname,relkind from pg_class where relname ~ 'citi';
315+
relname | relkind
316+
--------------------+---------
317+
idx_cities_1 | I # 分区表 - 主表上的索引
318+
cities_ab | r # 分区表 - 分区表
319+
cities_ab_name_idx | i # 分区表 - 分区表上的索引
320+
cities_city_id_seq | S # 序列
321+
cities | p # 分区表 - 主表
322+
(5 rows)
323+
```
324+
325+
解释relkind中的含义:
326+
327+
```
328+
I = partiton table global index, 分区表 - 主表上的索引
329+
r = ordinary table, 普通表
330+
i = index, 普通表的索引,或者分区表分区上的本地索引
331+
S = sequence, 序列
332+
t = TOAST table, 切片表(变长字段压缩后超过BLOCK的1/4时,会存到切片表中)
333+
v = view, 视图
334+
m = materialized view, 物化视图
335+
c = composite type, 自定义复合类型
336+
f = foreign table, 外部表
337+
p = partitioned table 分区表 - 主表
338+
```
339+
340+
341+
## 小结
342+
PostgreSQL 11 允许对分区表的主表创建索引,并内部支持了全局索引的管理机制(包括依赖、自动attach等)。
343+
344+
全局索引带来的好处,
345+
346+
1、当删除全局索引时,会将所有分区的索引自动删除。
347+
348+
2、当新增分区时,会自动创建与全局索引一致的索引(假设这个分区已有这样的索引,那么会自动将已有索引attach到全局索引下,便于管理)
349+
350+
3、对分区表创建索引时,会自动在所有分区下创建本地索引,并将分区本地索引ATTACH到全局索引下便于管理。
351+
352+
另外,全局索引还有一种壳子模式,即使用only关键字创建,它的好处:
353+
354+
1、使用ONLY关键字在分区表-主表上创建索引时,索引状态为INVALID,表示新建分区时,不会自动添加这个同类结构的索引,也不会自动ATTACH已有索引到这个全局索引下面。
355+
356+
2、使用全局索引的壳子模式,对不同的分区采用不同的索引,仅对需要用于管理的索引ATTACH到这个壳子中来管理。可以满足一些特殊业务场景(例如某些分区会经常检索A字段,但是某些分区确经常检索B字段,那么就可以为不同的分区创建不同的本地索引,而不是全局一起构建。可以节约一些成本,但是管理成本可能会上升。)。
357+
358+
PostgreSQL 11的全局索引使用非常灵活。
359+
360+
## 参考
361+
362+
https://www.postgresql.org/docs/devel/static/sql-alterindex.html
363+
364+
https://www.postgresql.org/docs/devel/static/sql-createindex.html
365+
366+
https://www.postgresql.org/docs/devel/static/catalog-pg-class.html
367+
368+

201802/readme.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,6 @@
11
### 文章列表
22
----
3+
##### 20180204_02.md [《PostgreSQL 11 preview - 新功能, 分区表全局索引管理》](20180204_02.md)
34
##### 20180204_01.md [《PostgreSQL 11 preview - 并行排序、并行索引 (性能线性暴增) 单实例100亿TOP-K仅40秒》](20180204_01.md)
45
##### 20180202_02.md [《PostgreSQL 11 preview - 分区表智能并行JOIN (已类似MPP架构,性能暴增)》](20180202_02.md)
56
##### 20180202_01.md [《PostgreSQL 相似搜索设计与性能 - 地址、QA、POI等文本 毫秒级相似搜索实践》](20180202_01.md)

README.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -29,6 +29,7 @@ digoal's|PostgreSQL|文章|归类
2929

3030
### 未归类文档如下
3131
----
32+
##### 201802/20180204_02.md [《PostgreSQL 11 preview - 新功能, 分区表全局索引管理》](201802/20180204_02.md)
3233
##### 201802/20180204_01.md [《PostgreSQL 11 preview - 并行排序、并行索引 (性能线性暴增) 单实例100亿TOP-K仅40秒》](201802/20180204_01.md)
3334
##### 201802/20180202_02.md [《PostgreSQL 11 preview - 分区表智能并行JOIN (已类似MPP架构,性能暴增)》](201802/20180202_02.md)
3435
##### 201802/20180202_01.md [《PostgreSQL 相似搜索设计与性能 - 地址、QA、POI等文本 毫秒级相似搜索实践》](201802/20180202_01.md)

0 commit comments

Comments
 (0)