digoal · digoal · commit 706429a602e9 · 2017-08-03T15:21:11.000+08:00
diff --git a/201708/20170802_02.md b/201708/20170802_02.md
@@ -1,4 +1,4 @@
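-## (New Retail) Grid-Based Merchant Operations - Alibaba Cloud RDS PostgreSQL Best Practices    
+## (New Retail) Grid-Based Merchant Operations - Alibaba Cloud RDS PostgreSQL and HybridDB for PostgreSQL Best Practices    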
                          
 ### Author                          
 digoal                         
diff --git a/201708/20170803_01.md b/201708/20170803_01.md
@@ -0,0 +1,277 @@
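+## Cainiao Last-Mile Tracking: Polygon-Polygon and Point-in-Polygon Tests, Spatial Index Performance - Alibaba Cloud RDS PostgreSQL Best Practices      
+                           
+### Author                            
+digoal                           
+                             
+### Date                             
+2017-08-03                       
+                                      
+### Tags                      
+PostgreSQL , PostGIS , polygon , area , point , point-in-polygon test , Cainiao       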
+                      
+----                      
+                       
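+## Background   
+One key requirement of the Cainiao last-mile tracking project is polygon-to-polygon (area) testing.  
+  
+The database stores polygon records, from a few million up to about ten million rows. A residential community, for example, is one polygon on the map.  
+  
+Each courier company divides the map into polygons in its own way (the area covered by a service outlet is a polygon, and so is the area covered by an individual courier).  
+  
+When a user sends a parcel, the user's location is used to look up the outlet of the chosen courier company that covers that area, or the courier responsible for it.  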
+  
+![pic](20170803_01_pic_001.jpg)  
+  
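+## 1. Requirements  
+1. The database stores static areas (polygons) that represent residential communities, campuses, office buildings, and so on. The areas do not intersect each other.  
+  
+2. To support different business types, the same map may be divided into different sets of polygons.  
+  
+For example, each courier company divides the map into polygons in its own way (the area covered by a service outlet, or the area covered by an individual courier).  
+  
+One map therefore carries multiple layers, and each layer may partition the map differently.  
+  
+3. Given a courier company and a customer's location, quickly find the polygon that contains the point (that is, the company's outlet covering that area, or the courier responsible for it).  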
+  
+## 2. Architecture  
+The design uses Alibaba Cloud RDS PostgreSQL together with the PostGIS extension available for PostgreSQL.  
+  
+Two PostGIS functions are needed:    
+    
+http://postgis.net/docs/manual-2.3/ST_Within.html    
+    
+1. ST_Within    
+    
+ST_Within — Returns true if the geometry A is completely inside geometry B    
+    
+boolean ST_Within(geometry A, geometry B);    
+    
+Returns TRUE if geometry A is completely inside geometry B. For this function to make sense, the source geometries must both be of the same coordinate projection, having the same SRID. It is a given that if ST_Within(A,B) is true and ST_Within(B,A) is true, then the two geometries are considered spatially equal.    
+    
+This function call will automatically include a bounding box comparison that will make use of any indexes that are available on the geometries. To avoid index use, use the function _ST_Within.    
+    
+```    
+-- a circle within a circle    
+SELECT ST_Within(smallc,smallc) As smallinsmall,    
+        ST_Within(smallc, bigc) As smallinbig,    
+        ST_Within(bigc,smallc) As biginsmall,    
+        ST_Within(ST_Union(smallc, bigc), bigc) as unioninbig,    
+        ST_Within(bigc, ST_Union(smallc, bigc)) as biginunion,    
+        ST_Equals(bigc, ST_Union(smallc, bigc)) as bigisunion    
+FROM    
+(    
+SELECT ST_Buffer(ST_GeomFromText('POINT(50 50)'), 20) As smallc,    
+        ST_Buffer(ST_GeomFromText('POINT(50 50)'), 40) As bigc) As foo;    
+-- Result    
+ smallinsmall | smallinbig | biginsmall | unioninbig | biginunion | bigisunion    
+--------------+------------+------------+------------+------------+------------    
+ t            | t          | f          | t          | t          | t    
+(1 row)    
+```    
+    
+2. ST_Contains    
+    
+ST_Contains — Returns true if and only if no points of B lie in the exterior of A, and at least one point of the interior of B lies in the interior of A.    
+    
+boolean ST_Contains(geometry geomA, geometry geomB);    
+    
+Returns TRUE if geometry B is completely inside geometry A. For this function to make sense, the source geometries must both be of the same coordinate projection, having the same SRID. ST_Contains is the inverse of ST_Within. So ST_Contains(A,B) implies ST_Within(B,A) except in the case of invalid geometries where the result is always false regardless or not defined.    
+    
+This function call will automatically include a bounding box comparison that will make use of any indexes that are available on the geometries. To avoid index use, use the function _ST_Contains.    
+    
+```    
+-- A circle within a circle    
+SELECT ST_Contains(smallc, bigc) As smallcontainsbig,    
+           ST_Contains(bigc,smallc) As bigcontainssmall,    
+           ST_Contains(bigc, ST_Union(smallc, bigc)) as bigcontainsunion,    
+           ST_Equals(bigc, ST_Union(smallc, bigc)) as bigisunion,    
+           ST_Covers(bigc, ST_ExteriorRing(bigc)) As bigcoversexterior,    
+           ST_Contains(bigc, ST_ExteriorRing(bigc)) As bigcontainsexterior    
+FROM (SELECT ST_Buffer(ST_GeomFromText('POINT(1 2)'), 10) As smallc,    
+                         ST_Buffer(ST_GeomFromText('POINT(1 2)'), 20) As bigc) As foo;    
+    
+-- Result    
+ smallcontainsbig | bigcontainssmall | bigcontainsunion | bigisunion | bigcoversexterior | bigcontainsexterior    
+------------------+------------------+------------------+------------+-------------------+---------------------    
+ f                | t                | t                | t          | t                 | f    
+    
+-- Example demonstrating difference between contains and contains properly    
+SELECT ST_GeometryType(geomA) As geomtype, ST_Contains(geomA,geomA) AS acontainsa, ST_ContainsProperly(geomA, geomA) AS acontainspropa,    
+   ST_Contains(geomA, ST_Boundary(geomA)) As acontainsba, ST_ContainsProperly(geomA, ST_Boundary(geomA)) As acontainspropba    
+FROM (VALUES ( ST_Buffer(ST_Point(1,1), 5,1) ),    
+                         ( ST_MakeLine(ST_Point(1,1), ST_Point(-1,-1) ) ),    
+                         ( ST_Point(1,1) )    
+          ) As foo(geomA);    
+    
+  geomtype    | acontainsa | acontainspropa | acontainsba | acontainspropba    
+--------------+------------+----------------+-------------+-----------------    
+ST_Polygon    | t          | f              | f           | f    
+ST_LineString | t          | f              | f           | f    
+ST_Point      | t          | t              | f           | f    
+```    
+    
+![pic](../201708/20170802_02_pic_005.jpg)    
+    
+![pic](../201708/20170802_02_pic_006.jpg)    
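+  
+As a rough sketch of how these functions map onto the requirement, the statements below use a hypothetical `delivery_area` table. The table, column, and index names, the SRID 4326, and the sample coordinates are illustrative assumptions, not part of the original design.  
+  
+```sql  
+-- Hypothetical schema: one row per service area, one layer per courier company.  
+create extension if not exists postgis;  
+create extension if not exists btree_gist;  -- lets an int column join a GiST index  
+  
+create table delivery_area (  
+  id         bigint primary key,  
+  company_id int not null,                    -- layer: which courier company  
+  geom       geometry(Polygon, 4326) not null -- the service area  
+);  
+  
+create index idx_delivery_area on delivery_area using gist (company_id, geom);  
+  
+-- Which area of company 1 contains the sender's location (lon 120.15, lat 30.28)?  
+select id  
+from delivery_area  
+where company_id = 1  
+  and ST_Contains(geom, ST_SetSRID(ST_MakePoint(120.15, 30.28), 4326));  
+```  
+  
+As the `bigcontainsexterior` column in the example above shows, ST_Contains does not match a geometry that lies only on the boundary. If a point exactly on an area's edge should count as inside, ST_Covers may be the better fit.  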
+  
+## 3. Demo and Performance  
+To keep the test simple, it uses PostgreSQL's built-in geometric types; usage is similar to PostGIS.  
+  
+1. Create the test table  
+  
+```  
+postgres=# create table po(id int, typid int, po polygon);  
+CREATE TABLE  
+```  
+  
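+2. Create a partitioned table or a partition index (here, a composite GiST index with typid as the leading column)  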
+  
+```  
+create extension btree_gist;  
+create index idx_po_1 on po using gist(typid, po);  
+```  
+  
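+3. Create a spatial exclusion constraint (optional)  
+  
+If the polygons within a single typid must not overlap, a spatial exclusion constraint can be created.  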
+  
+```  
+create table tbl_po(id int, typid int, po polygon)  
+PARTITION BY LIST (typid);  
+  
+CREATE TABLE tbl_po_1  
+    PARTITION OF tbl_po (  
+    EXCLUDE USING gist (po WITH &&)  
+) FOR VALUES IN (1);  
+  
+...  
+  
+CREATE TABLE tbl_po_20  
+    PARTITION OF tbl_po (  
+    EXCLUDE USING gist (po WITH &&)  
+) FOR VALUES IN (20);  
+  
+-- The exclusion constraint on one of the partitions looks like this:  
+  
+postgres=# \d tbl_po_1  
+             Table "postgres.tbl_po_1"  
+ Column |  Type   | Collation | Nullable | Default   
+--------+---------+-----------+----------+---------  
+ id     | integer |           |          |   
+ typid  | integer |           |          |   
+ po     | polygon |           |          |   
+Partition of: tbl_po FOR VALUES IN (1)  
+Indexes:  
+    "tbl_po_1_po_excl" EXCLUDE USING gist (po WITH &&)  
+```  
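+  
+To see the constraint in action, a short check could look like the following. This is a sketch: the coordinates are made up, and the last statement assumes the partition for typid 2 was created in the elided steps above.  
+  
+```sql  
+-- First square in layer 1: accepted.  
+insert into tbl_po values (1, 1, polygon('((0,0),(0,2),(2,2),(2,0))'));  
+  
+-- Overlaps the first square in the same layer: expected to be rejected  
+-- with an exclusion-constraint violation on tbl_po_1.  
+insert into tbl_po values (2, 1, polygon('((1,1),(1,3),(3,3),(3,1))'));  
+  
+-- Same shape in a different layer (typid 2): expected to be accepted,  
+-- because each partition carries its own constraint.  
+insert into tbl_po values (3, 2, polygon('((1,1),(1,3),(3,3),(3,1))'));  
+```  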
+  
+4. Load 10 million test polygons  
+  
+```  
+insert into po select id, random()*20, polygon('(('||x1||','||y1||'),('||x2||','||y2||'),('||x3||','||y3||'))') from (select id, 180-random()*180 x1, 180-random()*180 x2, 180-random()*180 x3, 90-random()*90 y1, 90-random()*90 y2, 90-random()*90 y3 from generate_series(1,10000000) t(id)) t;  
+```  
+  
+5. Test point-in-polygon performance  
+  
+Query for a polygon that contains point(1,1); the response time is about 0.3 ms.  
+  
+```  
+postgres=# explain (analyze,verbose,timing,costs,buffers) select * from po where po @> polygon('((1,1),(1,1),(1,1))') limit 1;  
+                                                             QUERY PLAN                                                                
+-------------------------------------------------------------------------------------------------------------------------------------  
+ Limit  (cost=0.42..1.71 rows=1 width=93) (actual time=0.326..0.326 rows=1 loops=1)  
+   Output: id, typid, po  
+   Buffers: shared hit=71  
+   ->  Index Scan using idx_po_1 on postgres.po  (cost=0.42..12940.12 rows=10000 width=93) (actual time=0.325..0.325 rows=1 loops=1)  
+         Output: id, typid, po  
+         Index Cond: (po.po @> '((1,1),(1,1),(1,1))'::polygon)  
+         Rows Removed by Index Recheck: 13  
+         Buffers: shared hit=71  
+ Planning time: 0.032 ms  
+ Execution time: 0.338 ms  
+(10 rows)  
+```  
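+  
+The query above searches all layers at once. In the use case described earlier the courier company is known, and `idx_po_1` leads with `typid`, so a per-layer lookup would plausibly add an equality filter on that column. The sketch below uses `typid = 1` as an arbitrary example value; it was not part of the original test, and no timing is claimed for it. The point is still written as a degenerate triangle, presumably so that the polygon-to-polygon `@>` operator can be served by the GiST index.  
+  
+```sql  
+-- Restrict the search to a single layer (courier company).  
+select *  
+from po  
+where typid = 1  
+  and po @> polygon('((1,1),(1,1),(1,1))')  
+limit 1;  
+```  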
+  
+6. Stress test  
+  
+```  
+vi test.sql  
+\set x random(-180,180)  
+\set y random(-90,90)  
+select * from po where po @> polygon('((:x,:y),(:x,:y),(:x,:y))') limit 1;  
+  
+pgbench -M simple -n -r -P 1 -f ./test.sql -c 64 -j 64 -T 100  
+transaction type: ./test.sql  
+scaling factor: 1  
+query mode: simple  
+number of clients: 64  
+number of threads: 64  
+duration: 100 s  
+number of transactions actually processed: 30106928  
+latency average = 0.213 ms  
+latency stddev = 0.159 ms  
+tps = 301049.099821 (including connections establishing)  
+tps = 301089.965566 (excluding connections establishing)  
+script statistics:  
+ - statement latencies in milliseconds:  
+         0.002  \set x random(-180,180)  
+         0.001  \set y random(-90,90)  
+         0.218  select * from po where po @> polygon('((:x,:y),(:x,:y),(:x,:y))') limit 1;  
+```  
+  
+**TPS: about 300,000; average response time: about 0.2 ms**  
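+  
+One detail worth keeping in mind when reading these numbers: the data loaded in step 4 places every vertex at x between 0 and 180 and y between 0 and 90, while the script samples x from -180 to 180 and y from -90 to 90, so a large share of the lookups probe regions that hold no polygons. A variant that probes only the populated range might look like this (the file name `test_hit.sql` is hypothetical; this variant was not run for the article and no result is claimed for it):  
+  
+```sql  
+-- test_hit.sql: sample points only inside the range covered by the test data  
+\set x random(0,180)  
+\set y random(0,90)  
+select * from po where po @> polygon('((:x,:y),(:x,:y),(:x,:y))') limit 1;  
+```  
+  
+It could be run with the same options as above, for example `pgbench -M simple -n -r -P 1 -f ./test_hit.sql -c 64 -j 64 -T 100`.  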
+  
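+## 4. Technical Points  
+  
+1. Spatial exclusion constraints  
+  
+This constraint forces the polygons stored in a table not to intersect. Map data is a typical case that demands this rigor: two polygons must never overlap, otherwise there would be a territorial dispute.  
+  
+PostgreSQL is exactly that rigorous. Surprised?  
+  
+2. Partitioned tables  
+  
+In this example each courier company corresponds to its own layer, and each company divides the map into polygons according to the areas its outlets and couriers cover.  
+  
+With LIST partitioning, each partition corresponds to one courier company.  
+  
+3. Spatial indexes  
+  
+GiST spatial indexes support KNN (nearest-neighbor), containment, intersection, and directional (above, below, left, right) searches.  
+  
+They are very efficient: in the test above, an index lookup over 10 million polygons took about 0.3 ms.  
+  
+4. Spatial partition indexes  
+  
+[《Applying Partition Indexes in Practice - Alibaba Cloud RDS PostgreSQL Best Practices》](../201707/20170721_01.md)    
+  
+5. Polygon-polygon and point-in-polygon tests  
+  
+Polygon-to-polygon or point-in-polygon testing is the main requirement in this example: when a user sends a parcel, the database finds, among ten million polygons, the one that covers the user's location.  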
+  
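+## 5. Cloud Products  
+  
+[Alibaba Cloud RDS PostgreSQL](https://www.aliyun.com/product/rds/postgresql)     
+  
+## 6. Similar Scenarios and Cases  
+  
+[《PostgreSQL Logistics Tracking System: Database Requirements Analysis and Design - 包裹侠 Real-Time Tracking and Recall》](../201704/20170418_01.md)    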
+  
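+## 7. Summary  
+One key requirement of the Cainiao last-mile tracking project is polygon-to-polygon (area) testing.  
+  
+The database stores polygon records, from a few million up to about ten million rows. A residential community, for example, is one polygon on the map.  
+  
+Each courier company divides the map into polygons in its own way (the area covered by a service outlet is a polygon, and so is the area covered by an individual courier).  
+  
+When a user sends a parcel, the user's location is used to look up the outlet of the chosen courier company that covers that area, or the courier responsible for it.  
+  
+**With Alibaba Cloud RDS PostgreSQL holding about 10 million polygons, a single database instance handled roughly 300,000 requests per second, with an average response time of about 0.2 ms per request.**  
+  
+A pleasant surprise, isn't it?  
+  
+## 8. References  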
+  
+http://postgis.net/docs/manual-2.3/ST_Within.html    
+  
+[《Applying Partition Indexes in Practice - Alibaba Cloud RDS PostgreSQL Best Practices》](../201707/20170721_01.md)    
diff --git a/201708/20170803_01_pic_001.jpg b/201708/20170803_01_pic_001.jpg
diff --git a/201708/readme.md b/201708/readme.md
@@ -1,6 +1,7 @@
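 ### Article list  
 ----  
-##### 20170802_02.md   [《(New Retail) Grid-Based Merchant Operations - Alibaba Cloud RDS PostgreSQL Best Practices》](20170802_02.md)  
+##### 20170803_01.md   [《Cainiao Last-Mile Tracking: Polygon-Polygon and Point-in-Polygon Tests, Spatial Index Performance - Alibaba Cloud RDS PostgreSQL Best Practices》](20170803_01.md)  
+##### 20170802_02.md   [《(New Retail) Grid-Based Merchant Operations - Alibaba Cloud RDS PostgreSQL and HybridDB for PostgreSQL Best Practices》](20170802_02.md)  
 ##### 20170802_01.md   [《plpgsql Programming - Looping over JSON Arrays》](20170802_01.md)  
 ##### 20170801_03.md   [《[Hiring] [鲁邦通] PostgreSQL DBA》](20170801_03.md)  
 ##### 20170801_02.md   [《[Hiring] [HelloBike] PostgreSQL DBA》](20170801_02.md)  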
diff --git a/README.md b/README.md
@@ -29,7 +29,8 @@ digoal's|PostgreSQL|Articles|Categories
   
 ### Uncategorized documents  
 ----  
-##### 201708/20170802_02.md   [《(New Retail) Grid-Based Merchant Operations - Alibaba Cloud RDS PostgreSQL Best Practices》](201708/20170802_02.md)  
+##### 201708/20170803_01.md   [《Cainiao Last-Mile Tracking: Polygon-Polygon and Point-in-Polygon Tests, Spatial Index Performance - Alibaba Cloud RDS PostgreSQL Best Practices》](201708/20170803_01.md)  
+##### 201708/20170802_02.md   [《(New Retail) Grid-Based Merchant Operations - Alibaba Cloud RDS PostgreSQL and HybridDB for PostgreSQL Best Practices》](201708/20170802_02.md)  
 ##### 201708/20170802_01.md   [《plpgsql Programming - Looping over JSON Arrays》](201708/20170802_01.md)  
 ##### 201708/20170801_03.md   [《[Hiring] [鲁邦通] PostgreSQL DBA》](201708/20170801_03.md)  
 ##### 201708/20170801_02.md   [《[Hiring] [HelloBike] PostgreSQL DBA》](201708/20170801_02.md)  