Skip to content

Commit 706429a

Browse files
committed
new doc
1 parent 168c782 commit 706429a

File tree

5 files changed

+282
-3
lines changed

5 files changed

+282
-3
lines changed

201708/20170802_02.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
## (新零售)商户网格化运营 - 阿里云RDS PostgreSQL最佳实践
1+
## (新零售)商户网格化运营 - 阿里云RDS PostgreSQL、HybridDB for PostgreSQL最佳实践
22
33
### 作者
44
digoal

201708/20170803_01.md

Lines changed: 277 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,277 @@
1+
## 菜鸟末端轨迹 面面、点判断 空间索引性能 - 阿里云RDS PostgreSQL最佳实践
2+
3+
### 作者
4+
digoal
5+
6+
### 日期
7+
2017-08-03
8+
9+
### 标签
10+
PostgreSQL , PostGIS , 多边形 , 面 , 点 , 面点判断 , 菜鸟
11+
12+
----
13+
14+
## 背景
15+
菜鸟末端轨迹项目中涉及的一个关键需求,面面判断。
16+
17+
在数据库中存储了一些多边形记录,约几百万到千万条记录,例如一个小区,在地图上是一个多边形。
18+
19+
不同的快递公司,会有各自不同的多边形划分方法(网点负责的片区(多边形),某个快递员负责的片区(多边形))。
20+
21+
用户在寄件时,根据用户的位置,查找对应快递公司负责这个片区的网点、或者负责该片区的快递员。
22+
23+
![pic](20170803_01_pic_001.jpg)
24+
25+
## 一、需求
26+
1、在数据库中存储了一些静态的面信息,代表小区、园区、写字楼等等。所有的面不相交。
27+
28+
2、为了支持不同的业务类型,对一个地图,可能划分为不同的多边形组成。
29+
30+
例如不同的快递公司,会有各自不同的多边形划分方法(网点负责的片区(多边形),某个快递员负责的片区(多边形))。
31+
32+
因此在一张地图上,有多个图层,每个图层的多边形划分方法可能不一样。
33+
34+
3、快速的根据快递公司、客户的位置,求包含这个点的多边形(即得到对应快递公司负责这个片区的网点、或者负责该片区的快递员)。
35+
36+
## 二、架构设计
37+
用到阿里云的RDS PostgreSQL,以及PG提供的PostGIS插件。
38+
39+
我们需要用到PostGIS的函数有两个
40+
41+
http://postgis.net/docs/manual-2.3/ST_Within.html
42+
43+
1、ST_within
44+
45+
ST_Within — Returns true if the geometry A is completely inside geometry B
46+
47+
boolean ST_Within(geometry A, geometry B);
48+
49+
Returns TRUE if geometry A is completely inside geometry B. For this function to make sense, the source geometries must both be of the same coordinate projection, having the same SRID. It is a given that if ST_Within(A,B) is true and ST_Within(B,A) is true, then the two geometries are considered spatially equal.
50+
51+
This function call will automatically include a bounding box comparison that will make use of any indexes that are available on the geometries. To avoid index use, use the function _ST_Within.
52+
53+
```
54+
-- a circle within a circle
55+
SELECT ST_Within(smallc,smallc) As smallinsmall,
56+
ST_Within(smallc, bigc) As smallinbig,
57+
ST_Within(bigc,smallc) As biginsmall,
58+
ST_Within(ST_Union(smallc, bigc), bigc) as unioninbig,
59+
ST_Within(bigc, ST_Union(smallc, bigc)) as biginunion,
60+
ST_Equals(bigc, ST_Union(smallc, bigc)) as bigisunion
61+
FROM
62+
(
63+
SELECT ST_Buffer(ST_GeomFromText('POINT(50 50)'), 20) As smallc,
64+
ST_Buffer(ST_GeomFromText('POINT(50 50)'), 40) As bigc) As foo;
65+
-- Result
66+
smallinsmall | smallinbig | biginsmall | unioninbig | biginunion | bigisunion
67+
--------------+------------+------------+------------+------------+------------
68+
t | t | f | t | t | t
69+
(1 row)
70+
```
71+
72+
2、ST_Contains
73+
74+
ST_Contains — Returns true if and only if no points of B lie in the exterior of A, and at least one point of the interior of B lies in the interior of A.
75+
76+
boolean ST_Contains(geometry geomA, geometry geomB);
77+
78+
Returns TRUE if geometry B is completely inside geometry A. For this function to make sense, the source geometries must both be of the same coordinate projection, having the same SRID. ST_Contains is the inverse of ST_Within. So ST_Contains(A,B) implies ST_Within(B,A) except in the case of invalid geometries where the result is always false regardless or not defined.
79+
80+
This function call will automatically include a bounding box comparison that will make use of any indexes that are available on the geometries. To avoid index use, use the function _ST_Contains.
81+
82+
```
83+
-- A circle within a circle
84+
SELECT ST_Contains(smallc, bigc) As smallcontainsbig,
85+
ST_Contains(bigc,smallc) As bigcontainssmall,
86+
ST_Contains(bigc, ST_Union(smallc, bigc)) as bigcontainsunion,
87+
ST_Equals(bigc, ST_Union(smallc, bigc)) as bigisunion,
88+
ST_Covers(bigc, ST_ExteriorRing(bigc)) As bigcoversexterior,
89+
ST_Contains(bigc, ST_ExteriorRing(bigc)) As bigcontainsexterior
90+
FROM (SELECT ST_Buffer(ST_GeomFromText('POINT(1 2)'), 10) As smallc,
91+
ST_Buffer(ST_GeomFromText('POINT(1 2)'), 20) As bigc) As foo;
92+
93+
-- Result
94+
smallcontainsbig | bigcontainssmall | bigcontainsunion | bigisunion | bigcoversexterior | bigcontainsexterior
95+
------------------+------------------+------------------+------------+-------------------+---------------------
96+
f | t | t | t | t | f
97+
98+
-- Example demonstrating difference between contains and contains properly
99+
SELECT ST_GeometryType(geomA) As geomtype, ST_Contains(geomA,geomA) AS acontainsa, ST_ContainsProperly(geomA, geomA) AS acontainspropa,
100+
ST_Contains(geomA, ST_Boundary(geomA)) As acontainsba, ST_ContainsProperly(geomA, ST_Boundary(geomA)) As acontainspropba
101+
FROM (VALUES ( ST_Buffer(ST_Point(1,1), 5,1) ),
102+
( ST_MakeLine(ST_Point(1,1), ST_Point(-1,-1) ) ),
103+
( ST_Point(1,1) )
104+
) As foo(geomA);
105+
106+
geomtype | acontainsa | acontainspropa | acontainsba | acontainspropba
107+
--------------+------------+----------------+-------------+-----------------
108+
ST_Polygon | t | f | f | f
109+
ST_LineString | t | f | f | f
110+
ST_Point | t | t | f | f
111+
```
112+
113+
![pic](../201708/20170802_02_pic_005.jpg)
114+
115+
![pic](../201708/20170802_02_pic_006.jpg)
116+
117+
## 三、DEMO与性能
118+
为了简化测试,采样PG内置的几何类型进行测试,用法与PostGIS是类似的。
119+
120+
1、创建测试表
121+
122+
```
123+
postgres=# create table po(id int, typid int, po polygon);
124+
CREATE TABLE
125+
```
126+
127+
2、创建分区表或分区索引
128+
129+
```
130+
create extension btree_gist;
131+
create index idx_po_1 on po using gist(typid, po);
132+
```
133+
134+
3、创建空间排他约束,可选
135+
136+
如果要求单个typid内的po不重叠,可以创建空间排他约束
137+
138+
```
139+
create table tbl_po(id int, typid int, po polygon)
140+
PARTITION BY LIST (typid);
141+
142+
CREATE TABLE tbl_po_1
143+
PARTITION OF tbl_po (
144+
EXCLUDE USING gist (po WITH &&)
145+
) FOR VALUES IN (1);
146+
147+
...
148+
149+
CREATE TABLE tbl_po_20
150+
PARTITION OF tbl_po (
151+
EXCLUDE USING gist (po WITH &&)
152+
) FOR VALUES IN (20);
153+
154+
查看某分区表的空间排他约束如下
155+
156+
postgres=# \d tbl_po_1
157+
Table "postgres.tbl_po_1"
158+
Column | Type | Collation | Nullable | Default
159+
--------+---------+-----------+----------+---------
160+
id | integer | | |
161+
typid | integer | | |
162+
po | polygon | | |
163+
Partition of: tbl_po FOR VALUES IN (1)
164+
Indexes:
165+
"tbl_po_1_po_excl" EXCLUDE USING gist (po WITH &&)
166+
```
167+
168+
4、写入1000万多边形测试数据
169+
170+
```
171+
insert into po select id, random()*20, polygon('(('||x1||','||y1||'),('||x2||','||y2||'),('||x3||','||y3||'))') from (select id, 180-random()*180 x1, 180-random()*180 x2, 180-random()*180 x3, 90-random()*90 y1, 90-random()*90 y2, 90-random()*90 y3 from generate_series(1,10000000) t(id)) t;
172+
```
173+
174+
5、测试面点判断性能
175+
176+
查询包含point(1,1)的多边形,响应时间0.3毫秒。
177+
178+
```
179+
postgres=# explain (analyze,verbose,timing,costs,buffers) select * from po where po @> polygon('((1,1),(1,1),(1,1))') limit 1;
180+
QUERY PLAN
181+
-------------------------------------------------------------------------------------------------------------------------------------
182+
Limit (cost=0.42..1.71 rows=1 width=93) (actual time=0.326..0.326 rows=1 loops=1)
183+
Output: id, typid, po
184+
Buffers: shared hit=71
185+
-> Index Scan using idx_po_1 on postgres.po (cost=0.42..12940.12 rows=10000 width=93) (actual time=0.325..0.325 rows=1 loops=1)
186+
Output: id, typid, po
187+
Index Cond: (po.po @> '((1,1),(1,1),(1,1))'::polygon)
188+
Rows Removed by Index Recheck: 13
189+
Buffers: shared hit=71
190+
Planning time: 0.032 ms
191+
Execution time: 0.338 ms
192+
(10 rows)
193+
```
194+
195+
6、压测
196+
197+
```
198+
vi test.sql
199+
\set x random(-180,180)
200+
\set y random(-90,90)
201+
select * from po where po @> polygon('((:x,:y),(:x,:y),(:x,:y))') limit 1;
202+
203+
pgbench -M simple -n -r -P 1 -f ./test.sql -c 64 -j 64 -T 100
204+
transaction type: ./test.sql
205+
scaling factor: 1
206+
query mode: simple
207+
number of clients: 64
208+
number of threads: 64
209+
duration: 100 s
210+
number of transactions actually processed: 30106928
211+
latency average = 0.213 ms
212+
latency stddev = 0.159 ms
213+
tps = 301049.099821 (including connections establishing)
214+
tps = 301089.965566 (excluding connections establishing)
215+
script statistics:
216+
- statement latencies in milliseconds:
217+
0.002 \set x random(-180,180)
218+
0.001 \set y random(-90,90)
219+
0.218 select * from po where po @> polygon('((:x,:y),(:x,:y),(:x,:y))') limit 1;
220+
```
221+
222+
**TPS:30万 ,平均响应时间:0.2毫秒**
223+
224+
## 四、技术点
225+
226+
1、空间排他约束
227+
228+
这个约束可以用于强制记录中的多边形不相交。例如地图这类严谨数据,绝对不可能出现两个多边形相交的,否则就有领土纷争了。
229+
230+
PostgreSQL就是这么严谨,意不意外。
231+
232+
2、分区表
233+
234+
本例中不同的快递公司,对应不同的图层,每个快递公司根据网点、快递员负责的片区(多边形)划分为多个多边形。
235+
236+
使用LIST分区,每个分区对应一家快递公司。
237+
238+
3、空间索引
239+
240+
GiST空间索引,支持KNN、包含、相交、上下左右等空间搜索。
241+
242+
效率极高。
243+
244+
4、空间分区索引
245+
246+
[《分区索引的应用和实践 - 阿里云RDS PostgreSQL最佳实践》](../201707/20170721_01.md)
247+
248+
5、面面、点判断
249+
250+
面面判断或面点判断是本例的主要需求,用户在寄包裹时,根据用户位置在数据库的一千万多边形中找出覆盖这个点的多边形。
251+
252+
## 五、云端产品
253+
254+
[阿里云 RDS PostgreSQL](https://www.aliyun.com/product/rds/postgresql)
255+
256+
## 六、类似场景、案例
257+
258+
[《PostgreSQL 物流轨迹系统数据库需求分析与设计 - 包裹侠实时跟踪与召回》](../201704/20170418_01.md)
259+
260+
## 七、小结
261+
菜鸟末端轨迹项目中涉及的一个关键需求,面面判断。
262+
263+
在数据库中存储了一些多边形记录,约几百万到千万条记录,例如一个小区,在地图上是一个多边形。
264+
265+
不同的快递公司,会有各自不同的多边形划分方法(网点负责的片区(多边形),某个快递员负责的片区(多边形))。
266+
267+
用户在寄件时,根据用户的位置,查找对应快递公司负责这个片区的网点、或者负责该片区的快递员。
268+
269+
**使用阿里云RDS PostgreSQL,用户存放约1千万的多边形数据,单库实现了每秒30万的处理请求,单次请求平均响应时间约0.2毫秒。**
270+
271+
惊不惊喜、意不意外。
272+
273+
## 八、参考
274+
275+
http://postgis.net/docs/manual-2.3/ST_Within.html
276+
277+
[《分区索引的应用和实践 - 阿里云RDS PostgreSQL最佳实践》](../201707/20170721_01.md)

201708/20170803_01_pic_001.jpg

113 KB
Loading

201708/readme.md

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,7 @@
11
### 文章列表
22
----
3-
##### 20170802_02.md [《(新零售)商户网格化运营 - 阿里云RDS PostgreSQL最佳实践》](20170802_02.md)
3+
##### 20170803_01.md [《菜鸟末端轨迹 面面、点判断 空间索引性能 - 阿里云RDS PostgreSQL最佳实践》](20170803_01.md)
4+
##### 20170802_02.md [《(新零售)商户网格化运营 - 阿里云RDS PostgreSQL、HybridDB for PostgreSQL最佳实践》](20170802_02.md)
45
##### 20170802_01.md [《plpgsql 编程 - JSON数组循环》](20170802_01.md)
56
##### 20170801_03.md [[招聘] [鲁邦通] PostgreSQL DBA》](20170801_03.md)
67
##### 20170801_02.md [[招聘] [HelloBike] PostgreSQL DBA》](20170801_02.md)

README.md

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -29,7 +29,8 @@ digoal's|PostgreSQL|文章|归类
2929

3030
### 未归类文档如下
3131
----
32-
##### 201708/20170802_02.md [《(新零售)商户网格化运营 - 阿里云RDS PostgreSQL最佳实践》](201708/20170802_02.md)
32+
##### 201708/20170803_01.md [《菜鸟末端轨迹 面面、点判断 空间索引性能 - 阿里云RDS PostgreSQL最佳实践》](201708/20170803_01.md)
33+
##### 201708/20170802_02.md [《(新零售)商户网格化运营 - 阿里云RDS PostgreSQL、HybridDB for PostgreSQL最佳实践》](201708/20170802_02.md)
3334
##### 201708/20170802_01.md [《plpgsql 编程 - JSON数组循环》](201708/20170802_01.md)
3435
##### 201708/20170801_03.md [[招聘] [鲁邦通] PostgreSQL DBA》](201708/20170801_03.md)
3536
##### 201708/20170801_02.md [[招聘] [HelloBike] PostgreSQL DBA》](201708/20170801_02.md)

0 commit comments

Comments
 (0)