|
| 1 | +## 分库架构全局唯一序列生成方案: Snowflake ID implementation for PostgreSQL |
| 2 | + |
| 3 | +### 作者 |
| 4 | +digoal |
| 5 | + |
| 6 | +### 日期 |
| 7 | +2024-01-08 |
| 8 | + |
| 9 | +### 标签 |
| 10 | +PostgreSQL , PolarDB , DuckDB , 分库分表 , 全局唯一 , snowflake ID |
| 11 | + |
| 12 | +---- |
| 13 | + |
| 14 | +## 背景 |
| 15 | + |
| 16 | +Snowflake ID implementation for PostgreSQL |
| 17 | +- https://github.com/pgEdge/snowflake |
| 18 | + |
| 19 | +当业务使用了分库特性时, 如果业务有全局唯一的需求, 需要能保证全局唯一的序列生成器, 传统方法为 UUID. |
| 20 | + |
| 21 | +也可以使用多个字段来实现, 例如一个nodeid字段, 一个dbid字段, 一个schemaid字段, 一个partitionid字段, 一个sequence字段. |
| 22 | + |
| 23 | +nodeid代表实例 ID(实例之间唯一), dbid代表数据库ID(同一台实例的不同数据库之间唯一), schemaid在同一个数据库内不同schema之间唯一, partitionId代表表分区ID(同一个schema下不同分区之间唯一). |
| 24 | + |
| 25 | +不论是使用Uuid还是多字段实现, 都有其缺陷, uuid属于字符串, 占用空间更大, 由于其无序性容易造成索引膨胀. 多字段需要集中管理元数据, 也比较容易出错, 配置错误会导致不唯一. |
| 26 | + |
| 27 | +snowflake id是int8类型的全局唯一序列生成器, 支持1023个实例. |
| 28 | +- bit 63 - unused (sign of int8) |
| 29 | +- bits 22-62 - timestamp with millisecond precision |
| 30 | +- bits 10-21 - counter within one millisecond |
| 31 | +- bits 0-9 - unique PostgreSQL node number set in postgresql.conf |
| 32 | + |
| 33 | +除了snowflake id, 也有其他的方案, 例如 Instagram 的方案: |
| 34 | +- [《PostgreSQL 全局ID分配(数据字典化)服务 设计实践》](../201802/20180227_02.md) |
| 35 | +- [《PostgreSQL sharding有序UUID最佳实践 - serial global uuid stored in 64bit int8》](../201301/20130122_01.md) |
| 36 | + |
| 37 | +安装snowflake id: |
| 38 | +``` |
| 39 | +git clone --depth 1 https://github.com/pgEdge/snowflake |
| 40 | +cd snowflake |
| 41 | +USE_PGXS=1 make install |
| 42 | +``` |
| 43 | + |
| 44 | +已集成到学习镜像: |
| 45 | +- [《2023-PostgreSQL Docker镜像学习环境 ARM64版, 已集成热门插件和工具》](../202308/20230814_02.md) |
| 46 | +- [《2023-PostgreSQL Docker镜像学习环境 AMD64版, 已集成热门插件和工具》](../202307/20230710_03.md) |
| 47 | + |
| 48 | +# Snowflake Sequences for PostgreSQL |
| 49 | + |
| 50 | +* [Implementation](#implementation) |
| 51 | +* [Installation](#installation) |
| 52 | + * [Installation from binaries](#installation-from-pgedge-binaries) |
| 53 | + * [Installation from source code](#installation-from-source-code) |
| 54 | +* [Configuration](#configuration) |
| 55 | +* [API](#api) |
| 56 | +* [Examples](#examples) |
| 57 | + * [New table using a snowflake as a PK](#new-table-using-a-snowflake-as-pk) |
| 58 | + * [Converting an existing bigserial column into a snowflake](#converting-an-existing-bigserial-column-into-a-snowflake) |
| 59 | + |
| 60 | +## Implementation |
| 61 | + |
| 62 | +**Snowflake** is a PostgreSQL extension providing an `int8` |
| 63 | +and `sequence` based unique ID solution to optionally replace |
| 64 | +the PostgreSQL built-in `bigserial` data type. |
| 65 | + |
| 66 | +Internally **Snowflakes** are 64 bit integers represented externally as `bigint` values. The 64 bits are divided into bit fields |
| 67 | + |
| 68 | +``` |
| 69 | +bit 63 - unused (sign of int8) |
| 70 | +bits 22-62 - timestamp with millisecond precision |
| 71 | +bits 10-21 - counter within one millisecond |
| 72 | +bits 0-9 - unique PostgreSQL node number set in postgresql.conf |
| 73 | +``` |
| 74 | + |
| 75 | +* The timestamp is a 41-bit unsigned value representing millisecond |
| 76 | + precision and an epoch of 2023-01-01. |
| 77 | + |
| 78 | +* The counter is a 12-bit unsigned value that increments per ID allocation. |
| 79 | + This provides for 4096 unique IDs per millisecond, or 4 million IDs per |
| 80 | + second. Even the most aggressive allocation of sequences cannot reach |
| 81 | + one million per second and a modified pgbench with a **Snowflake** based |
| 82 | + primary key on the `history` table can only generate IDs in the double |
| 83 | + digit per millisecond range. |
| 84 | + |
| 85 | + Should it be possible in the future to generate more than 4096 |
| 86 | + **snowflakes** per millisecond the algorithm is going to bump the |
| 87 | + timestamp one millisecond into the future to keep **snowflakes** |
| 88 | + unique. |
| 89 | + |
| 90 | +* The `node number` is a 10-bit unique identifier of the PostgreSQL |
| 91 | + instance inside a global cluster. This value must be set with the |
| 92 | + GUC `snowflake.node` in the `postgresql.conf` file. |
| 93 | + |
| 94 | +With this design a **Snowflake** ID is unique within one `sequence` |
| 95 | +across multiple PostgreSQL instances in a distributed cluster. |
| 96 | + |
| 97 | +## Installation |
| 98 | + |
| 99 | +### Installation from pgEdge binaries |
| 100 | + |
| 101 | +Go to `pgedge.org` and run `install.py` (takes two seconds) to install pgEdge `nodectl` |
| 102 | +After running the installer simply run the following command, as a non root user, from your home directory |
| 103 | + |
| 104 | +``` |
| 105 | +./nodectl install pg17 --start : install snowflake |
| 106 | +``` |
| 107 | + |
| 108 | +### Installation from source code |
| 109 | + |
| 110 | +For installation from source code it is assumed that the user is |
| 111 | +familiar with how to build standard PostgreSQL extensions from a |
| 112 | +source. |
| 113 | + |
| 114 | +``` |
| 115 | +cd contrib |
| 116 | +git clone https://github.com/pgEdge/snowflake.git |
| 117 | +cd snowflake |
| 118 | +make |
| 119 | +make install |
| 120 | +``` |
| 121 | + |
| 122 | +## Configuration |
| 123 | + |
| 124 | +The **Snowflake** extension uses a custom GUC `snowflake.node`. |
| 125 | +This configuration variable determines the `node` part of every |
| 126 | +**snowflake**, generated by this PostgreSQL instance. The |
| 127 | +permitted values are 1 thru 1023. |
| 128 | + |
| 129 | +This configuration option has an invalid default value (on purpose). |
| 130 | +If not set, the **Snowflake** extension will throw an exception on |
| 131 | +a call to `snowflake.nextval()`. This (hopefully) prevents you from |
| 132 | +accidentally missing this GUC in your postgresql.conf file. |
| 133 | + |
| 134 | +If you ever intend to use **Snowflakes** in a multi-node, distributed |
| 135 | +or replicated setup, it is important to set the GUC to a unique value |
| 136 | +for each PostgreSQL instance. There is nothing |
| 137 | +in place to prevent you from shooting yourself in the foot by |
| 138 | +assigning multiple PostgreSQL instances in a multi-master cluster |
| 139 | +the same node number. |
| 140 | + |
| 141 | +## API |
| 142 | + |
| 143 | +After creating the `extension` via |
| 144 | +``` |
| 145 | +CREATE EXTENSION snowflake; |
| 146 | +``` |
| 147 | +the following functions become available: |
| 148 | + |
| 149 | +* `snowflake.nextval([sequence regclass])` |
| 150 | + Generates the next **snowflake** for the given sequence. If no |
| 151 | + sequence is specified the internal, database-wide sequence |
| 152 | + `snowflake.id_seq` will be used. |
| 153 | + |
| 154 | + **NOTE:** **snowflakes** are only unique per database, per sequence. |
| 155 | + If you need **snowflakes** to be unique across all **snowflake** |
| 156 | + columns within a database, you need to use the built-in sequence |
| 157 | + `snowflake.id_seq`. However, that may interfere with the usage |
| 158 | + of `currval()` of your application logic. |
| 159 | + |
| 160 | +* `snowflake.currval([sequence regclass])` |
| 161 | + Returns the current value of the given sequence (or the default, internal |
| 162 | + sequence). Like for PostgreSQL sequences this value is undefined until |
| 163 | + the function `snowflake.nextval()` has been called for the sequence in |
| 164 | + the current session. |
| 165 | + |
| 166 | +* `snowflake.get_epoch(snowflake int8)` |
| 167 | + Returns the timestamp part of the given **snowflake** as EPOCH |
| 168 | + (seconds since 2023-01-01) as a NUMERIC value with precision of |
| 169 | + three digits. One can use `to_timestamp(snowflake.get_epoch(<value>))` |
| 170 | + to convert this into an actual timestamp. |
| 171 | + |
| 172 | +* `snowflake.get_count(snowflake int8)` |
| 173 | + Returns the count part of the given **snowflake** as int4 value. |
| 174 | + This is a unique value within the milliseconds of the **snowflake**'s |
| 175 | + timestamp. |
| 176 | + |
| 177 | +* `snowflake.get_node(snowflake int8)` |
| 178 | + Returns the setting of GUC `snowflake.node` in postgresql.conf at |
| 179 | + the time, this **snowflake** was allocated. |
| 180 | + |
| 181 | +* `snowflake.format(snowflake int8)` |
| 182 | + Returns a `jsonb` object of the given **snowflake** like: |
| 183 | + `{"node": 1, "ts": "2023-10-16 17:57:26.361+00", "count": 0}` |
| 184 | + |
| 185 | +## Examples |
| 186 | + |
| 187 | +### New table using a **snowflake** as PK |
| 188 | +This SQL code shows how to use **snowflake** in a newly created table. |
| 189 | +``` |
| 190 | +-- Assuming extension snowflake has been installed |
| 191 | + |
| 192 | +CREATE TABLE table1 ( |
| 193 | + id bigint PRIMARY KEY DEFAULT snowflake.nextval(), |
| 194 | + some_data text |
| 195 | +); |
| 196 | + |
| 197 | +INSERT INTO table1 (some_data) VALUES ('first row'); |
| 198 | +INSERT INTO table1 (some_data) VALUES ('second row'); |
| 199 | + |
| 200 | +SELECT id, snowflake.format(id), some_data FROM table1; |
| 201 | +``` |
| 202 | +Result: |
| 203 | +``` |
| 204 | + id | format | some_data |
| 205 | +-------------------+-----------------------------------------------------------+------------ |
| 206 | + 18014518154714241 | {"node": 1, "ts": "2023-10-16 18:47:12.257+00", "count": 0} | first row |
| 207 | + 18014518154714242 | {"node": 1, "ts": "2023-10-16 18:47:12.258+00", "count": 0} | second row |
| 208 | +(2 rows) |
| 209 | +``` |
| 210 | + |
| 211 | +### Converting an existing bigserial column into a snowflake |
| 212 | +This SQL code demonstrates how to convert an existing `bigserial` or |
| 213 | +`serial8` column into a **snowflake**. |
| 214 | +``` |
| 215 | +CREATE TABLE table2 ( |
| 216 | + id bigserial PRIMARY KEY, |
| 217 | + some_data text |
| 218 | +); |
| 219 | + |
| 220 | +INSERT INTO table2 (some_data) VALUES ('first row'); |
| 221 | +INSERT INTO table2 (some_data) VALUES ('second row'); |
| 222 | + |
| 223 | +ALTER TABLE table2 ALTER COLUMN id SET DEFAULT snowflake.nextval(); |
| 224 | + |
| 225 | +-- Alternatively, if you need to retain the ability to retrieve |
| 226 | +-- the snowflake.currval('table2_id_seq') of the sequence individually, use |
| 227 | +-- ALTER TABLE table2 ALTER COLUMN id SET DEFAULT snowflake.nextval('table2_id_seq'); |
| 228 | + |
| 229 | +INSERT INTO table2 (some_data) VALUES ('third row'); |
| 230 | +INSERT INTO table2 (some_data) VALUES ('fourth row'); |
| 231 | + |
| 232 | +SELECT id, snowflake.format(id), some_data FROM table2; |
| 233 | +``` |
| 234 | +Result: |
| 235 | +``` |
| 236 | + id | format | some_data |
| 237 | +-------------------+-----------------------------------------------------------+------------ |
| 238 | + 1 | {"node": 0, "ts": "2023-01-01 00:00:00.001+00", "count": 0} | first row |
| 239 | + 2 | {"node": 0, "ts": "2023-01-01 00:00:00.002+00", "count": 0} | second row |
| 240 | + 18014518155600128 | {"node": 1, "ts": "2023-10-16 19:01:58.144+00", "count": 0} | third row |
| 241 | + 18014518155600129 | {"node": 1, "ts": "2023-10-16 19:01:58.145+00", "count": 0} | fourth row |
| 242 | +(4 rows) |
| 243 | +``` |
| 244 | +**NOTE:** The `int8` value of ID remains unique, although it does |
| 245 | +jump ahead quite a bit (which is in compliance with standard PostgreSQL |
| 246 | +sequences that may have gaps). Because the **snowflake** EPOCH is |
| 247 | +2023-01-01; your existing database would have had to use over 18 **trillion** |
| 248 | +sequence numbers before this conversion to cause any trouble with |
| 249 | +possible duplicate key values. |
| 250 | + |
| 251 | + |
0 commit comments