
Commit 814384c

new doc
1 parent 4e0f2cc commit 814384c

File tree

12 files changed: +1720 -4 lines changed


202106/readme.md

Lines changed: 1 addition & 1 deletion
@@ -52,7 +52,7 @@
 ##### 20210614_03.md [《MacOS Virtual Machines with VirtualBox》](20210614_03.md)
 ##### 20210614_02.md [《MacOS: Checking SSD Lifespan (smartctl) and Viewing Sensor Data such as Fans and Temperature (istats)》](20210614_02.md)
 ##### 20210614_01.md [《Rediscovering the Beauty of PostgreSQL - 24: Sliding Window Analysis, 2000x》](20210614_01.md)
-##### 20210613_02.md [《Rediscovering the Beauty of PostgreSQL - 23: Peng Zu's Secret to Longevity](20210613_02.md)
+##### 20210613_02.md [《Rediscovering the Beauty of PostgreSQL - 23: Applying Occam's Razor to Optimize Large NOT IN Lists](20210613_02.md)
 ##### 20210613_01.md [《Rediscovering the Beauty of PostgreSQL - 22: The Yellow Emperor's Classic of Internal Medicine》](20210613_01.md)
 ##### 20210612_03.md [《Rediscovering the Beauty of PostgreSQL - 21: A Look at Astronauts' Food》](20210612_03.md)
 ##### 20210612_02.md [《Traditional Databases Have Stagnated for 30 Years While PG Has Blazed a New Trail》](20210612_02.md)

202311/readme.md

Lines changed: 1 addition & 1 deletion
@@ -19,5 +19,5 @@
 ##### 20231111_01.md [《Open-source PolarDB|PostgreSQL Open Course for Application Developers & DBAs - 3.2 Essential Community PostgreSQL Knowledge - Database Architecture, Applications, Administration, Optimization and Other Fundamentals》](20231111_01.md)
 ##### 20231105_03.md [《chrome|firefox: Disable DASH so YouTube Supports Continuous, Unlimited Video Buffering》](20231105_03.md)
 ##### 20231105_02.md [《OpenWRT|lede|asuswrt|merlin|koolshare|koolcenter|Soft-Router Primer》](20231105_02.md)
-##### 20231105_01.md [Fixing the Extra `0~` `~1` Around Text Pasted into the Terminal? The Terminal Copy Not Working Problem?》](20231105_01.md)
+##### 20231105_01.md [Fixing the Extra `0~` `~1` Around Text Pasted into macOS and Linux Terminals? The Terminal Copy Not Working Problem?》](20231105_01.md)
 ##### 20231104_01.md [《De Shuo - Episode 265: Master Super Information-Retrieval Skills and Create an Information Edge - the Only Cheat Code for Ordinary People to Level Up | A Full Set of Practical Tips》](20231104_01.md)

202312/20231213_01.md

Lines changed: 18 additions & 0 deletions
@@ -225,3 +225,21 @@ postgres=# select solve_24_game(1,2,3,4);
 
 If you ask me, productivity tools ultimately come down to programming languages, which is why I strongly recommend databases like PostgreSQL|PolarDB that support functions written in high-level programming languages; how else could digoal pull off his "kill a man within seven steps"?
 
+
+#### [What features would you like PostgreSQL|open-source PolarDB to add?](https://github.com/digoal/blog/issues/76 "269ac3d1c492e938c0191101c7238216")
+
+
+#### [PolarDB: a cloud-native, distributed, open-source database](https://github.com/ApsaraDB "57258f76c37864c6e6d23383d05714ea")
+
+
+#### [PolarDB learning map: bootcamps, training & certification, interactive online labs, solutions, open courses on kernel development, ecosystem cooperation, write up your experience and win prizes](https://www.aliyun.com/database/openpolardb/activity "8642f60e04ed0c814bf9cb9677976bd4")
+
+
+#### [Collection of PostgreSQL solutions](../201706/20170601_02.md "40cff096e9ed7122c512b35d8561d9c8")
+
+
+#### [digoal's GitHub - Public welfare is a lifelong commitment.](https://github.com/digoal/blog/blob/master/README.md "22709685feb7cab07d30f30387f0a9ae")
+
+
+![digoal's wechat](../pic/digoal_weixin.jpg "f7ad92eeba24523fd47a6e1a0e691b59")
+

202312/20231214_01.md

Lines changed: 301 additions & 0 deletions
@@ -0,0 +1,301 @@
## PostgreSQL: pg_easy_replicate, an open-source project for minimal-downtime major version upgrades via logical replication

### Author
digoal

### Date
2023-12-14

### Tags
PostgreSQL , PolarDB , DuckDB , pg_easy_replicate , major version upgrade , logical replication

----

## Background

https://github.com/shayonj/pg_easy_replicate

Easily set up logical replication and switch over to a new database with minimal downtime.

# pg_easy_replicate

[![CI](https://github.com/shayonj/pg_easy_replicate/actions/workflows/ci.yaml/badge.svg?branch=main)](https://github.com/shayonj/pg_easy_replicate/actions/workflows/ci.yaml)
[![Smoke spec](https://github.com/shayonj/pg_easy_replicate/actions/workflows/smoke.yaml/badge.svg?branch=main)](https://github.com/shayonj/pg_easy_replicate/actions/workflows/ci.yaml)
[![Gem Version](https://badge.fury.io/rb/pg_easy_replicate.svg?2)](https://badge.fury.io/rb/pg_easy_replicate)

`pg_easy_replicate` is a CLI orchestrator tool that simplifies the process of setting up [logical replication](https://www.postgresql.org/docs/current/logical-replication.html) between two PostgreSQL databases. `pg_easy_replicate` also supports switchover: after the source (primary) database is fully replicated, `pg_easy_replicate` puts it into read-only mode and, via logical replication, flushes all remaining data to the new target database. This ensures zero data loss and minimal downtime for the application. The method is useful for minimal-downtime major version upgrades between two PostgreSQL databases (often under a minute, depending on the workload), load testing with a blue/green database setup, and other similar use cases.

Battle tested in production at [Tines](https://www.tines.com/) 🚀
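
The workflow the tool automates is built on PostgreSQL's native publication/subscription machinery. As a rough, hand-rolled sketch of the same idea (the table name `users`, the connection strings, and the assumption that the target schema has already been copied are illustrative, not taken from the tool):

```bash
# On the source (old version): publish the tables that should be replicated.
psql "$SOURCE_DB_URL" -c "CREATE PUBLICATION upgrade_pub FOR TABLE users;"

# On the target (new version): subscribe; the initial table data is copied,
# then ongoing changes stream in.
psql "$TARGET_DB_URL" -c "CREATE SUBSCRIPTION upgrade_sub
  CONNECTION '$SOURCE_DB_URL'
  PUBLICATION upgrade_pub;"

# On the source: watch the replication catch up.
psql "$SOURCE_DB_URL" -c "SELECT application_name, state, sent_lsn, write_lsn FROM pg_stat_replication;"
```

`pg_easy_replicate` wraps these primitives, plus the bookkeeping around them (dedicated roles, groups, stats), behind its CLI.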

- [Installation](#installation)
- [Requirements](#requirements)
- [Limits](#limits)
- [Usage](#usage)
- [CLI](#cli)
- [Replicating all tables with a single group](#replicating-all-tables-with-a-single-group)
- [Config check](#config-check)
- [Bootstrap](#bootstrap)
- [Bootstrap and Config Check with special user role in AWS or GCP](#bootstrap-and-config-check-with-special-user-role-in-aws-or-gcp)
- [Config Check](#config-check)
- [Bootstrap](#bootstrap-1)
- [Start sync](#start-sync)
- [Stats](#stats)
- [Performing switchover](#performing-switchover)
- [Replicating single database with custom tables](#replicating-single-database-with-custom-tables)
- [Switchover strategies with minimal downtime](#switchover-strategies-with-minimal-downtime)
- [Rolling restart strategy](#rolling-restart-strategy)
- [DNS Failover strategy](#dns-failover-strategy)
- [FAQ](#faq)
- [Adding internal user to pgBouncer `userlist`](#adding-internal-user-to-pgbouncer-userlist)
- [Contributing](#contributing)
53+
54+
Add this line to your application's Gemfile:
55+
56+
```ruby
57+
gem "pg_easy_replicate"
58+
```
59+
60+
And then execute:
61+
62+
$ bundle install
63+
64+
Or install it yourself as:
65+
66+
$ gem install pg_easy_replicate
67+
68+
This will include all dependencies accordingly as well. Make sure the following requirements are satisfied.
69+
70+
Or via Docker:
71+
72+
docker pull shayonj/pg_easy_replicate:latest
73+
74+
https://hub.docker.com/r/shayonj/pg_easy_replicate
75+
76+
## Requirements

- PostgreSQL 10 and later
- Ruby 2.7 and later
- Database users should have `SUPERUSER` permissions, or pass in a special user with privileges to create the needed roles, schemas, publications and subscriptions on both databases. More on this in the `--special-user-role` section below.

## Limits

All [Logical Replication Restrictions](https://www.postgresql.org/docs/current/logical-replication-restrictions.html) apply.
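
Among those restrictions, the one that most often needs attention up front is that `UPDATE` and `DELETE` on published tables require a replica identity (by default, a primary key). A quick way to spot affected tables on the source before starting a sync, sketched with plain psql (this check is not part of the tool itself):

```bash
# List ordinary tables that have no usable replica identity
# (no primary key and no REPLICA IDENTITY FULL / USING INDEX configured).
psql "$SOURCE_DB_URL" -c "
  SELECT c.oid::regclass AS table_name, c.relreplident
  FROM pg_class c
  JOIN pg_namespace n ON n.oid = c.relnamespace
  WHERE c.relkind = 'r'
    AND n.nspname NOT IN ('pg_catalog', 'information_schema')
    AND (c.relreplident = 'n'
         OR (c.relreplident = 'd'
             AND NOT EXISTS (SELECT 1 FROM pg_index i
                             WHERE i.indrelid = c.oid AND i.indisprimary)));"
```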
## Usage

Ensure `SOURCE_DB_URL` and `TARGET_DB_URL` are present as environment variables in the runtime environment. The URLs use the postgres connection string format. Example:

```bash
$ export SOURCE_DB_URL="postgres://USERNAME:PASSWORD@localhost:5432/DATABASE_NAME"
$ export TARGET_DB_URL="postgres://USERNAME:PASSWORD@localhost:5433/DATABASE_NAME"
```

Any `pg_easy_replicate` command can be run the same way with the Docker image as well, as long as the container is running in an environment where it has access to both databases. Example:

```bash
docker run -e SOURCE_DB_URL="postgres://USERNAME:PASSWORD@localhost:5432/DATABASE_NAME" \
  -e TARGET_DB_URL="postgres://USERNAME:PASSWORD@localhost:5433/DATABASE_NAME" \
  -it --rm shayonj/pg_easy_replicate:latest \
  pg_easy_replicate config_check
```

## CLI

```bash
$ pg_easy_replicate
pg_easy_replicate commands:
  pg_easy_replicate bootstrap -g, --group-name=GROUP_NAME   # Sets up temporary tables for information required during runtime
  pg_easy_replicate cleanup -g, --group-name=GROUP_NAME     # Cleans up all bootstrapped data for the respective group
  pg_easy_replicate config_check                            # Prints if source and target database have the required config
  pg_easy_replicate help [COMMAND]                          # Describe available commands or one specific command
  pg_easy_replicate start_sync -g, --group-name=GROUP_NAME  # Starts the logical replication from source database to target database provisioned in the group
  pg_easy_replicate stats -g, --group-name=GROUP_NAME       # Prints the statistics in JSON for the group
  pg_easy_replicate stop_sync -g, --group-name=GROUP_NAME   # Stops the logical replication from source database to target database provisioned in the group
  pg_easy_replicate switchover -g, --group-name=GROUP_NAME  # Puts the source database in read-only mode after all the data is flushed and written
  pg_easy_replicate version                                 # Prints the version
```
## Replicating all tables with a single group

You can create as many groups as you want for a single database. Groups are just a logical isolation of a single replication.

### Config check

```bash
$ pg_easy_replicate config_check

✅ Config is looking good.
```

### Bootstrap

Every sync needs to be bootstrapped before you can set up the sync between the two databases. Bootstrap creates a new superuser to perform the orchestration required during the rest of the process. It also creates some internal metadata tables for record keeping.

```bash
$ pg_easy_replicate bootstrap --group-name database-cluster-1 --copy-schema

{"name":"pg_easy_replicate","hostname":"PKHXQVK6DW","pid":21485,"level":30,"time":"2023-06-19T15:51:11.015-04:00","v":0,"msg":"Setting up schema","version":"0.1.0"}
...
```

### Bootstrap and Config Check with special user role in AWS or GCP

If you don't want your primary login user to have `superuser` privileges, or you are on AWS or GCP, you will need to pass in the special user role that has the privileges to create roles, schemas, publications and subscriptions. This is required so that `pg_easy_replicate` can create a dedicated user for replication, which is granted the respective special user role so it can carry out its work.

For AWS the special user role is `rds_superuser`, and for GCP it is `cloudsqlsuperuser`. Please refer to the docs for the most up-to-date information.

**Note**: The user in the connection URL must be part of the special user role being supplied.

#### Config Check

```bash
$ pg_easy_replicate config_check --special-user-role="rds_superuser" --copy-schema

✅ Config is looking good.
```

#### Bootstrap

```bash
$ pg_easy_replicate bootstrap --group-name database-cluster-1 --special-user-role="rds_superuser" --copy-schema

{"name":"pg_easy_replicate","hostname":"PKHXQVK6DW","pid":21485,"level":30,"time":"2023-06-19T15:51:11.015-04:00","v":0,"msg":"Setting up schema","version":"0.1.0"}
...
```

### Start sync

Once the bootstrap is complete, you can start the sync. Starting the sync sets up the publication and subscription and performs other minor housekeeping.

```bash
$ pg_easy_replicate start_sync --group-name database-cluster-1

{"name":"pg_easy_replicate","hostname":"PKHXQVK6DW","pid":22113,"level":30,"time":"2023-06-19T15:54:54.874-04:00","v":0,"msg":"Setting up publication","publication_name":"pger_publication_database_cluster_1","version":"0.1.0"}
...
```
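
To see what `start_sync` actually created, you can peek at the replication catalogs on both sides; a quick check with plain psql (these catalog views are standard PostgreSQL, not part of the tool):

```bash
# On the source: the publication created for the group.
psql "$SOURCE_DB_URL" -c "SELECT pubname, puballtables FROM pg_publication;"

# On the target: the subscription, and the per-table copy/streaming state.
psql "$TARGET_DB_URL" -c "SELECT subname, subenabled FROM pg_subscription;"
psql "$TARGET_DB_URL" -c "SELECT srrelid::regclass AS table_name, srsubstate FROM pg_subscription_rel;"
```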
### Stats

You can inspect or watch the stats at any time during the sync process. The stats give you an idea of when the sync started, the current flush/write lag, how many tables are in the `replicating`, `copying` or other stages, and more.

You can poll these stats to drive any follow-up actions once the switchover is done. The stats include a `switchover_completed_at`, which is updated once the switchover is complete.

```bash
$ pg_easy_replicate stats --group-name database-cluster-1

{
  "lag_stats": [
    {
      "pid": 66,
      "client_addr": "192.168.128.2",
      "user_name": "jamesbond",
      "application_name": "pger_subscription_database_cluster_1",
      "state": "streaming",
      "sync_state": "async",
      "write_lag": "0.0",
      "flush_lag": "0.0",
      "replay_lag": "0.0"
    }
  ],
  "message_lsn_receipts": [
    {
      "received_lsn": "0/1674688",
      "last_msg_send_time": "2023-06-19 19:56:35 UTC",
      "last_msg_receipt_time": "2023-06-19 19:56:35 UTC",
      "latest_end_lsn": "0/1674688",
      "latest_end_time": "2023-06-19 19:56:35 UTC"
    }
  ],
  "sync_started_at": "2023-06-19 19:54:54 UTC",
  "sync_failed_at": null,
  "switchover_completed_at": null

....
```
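
Since the stats come back as JSON, a small wait loop is easy to script. For example, a sketch assuming `jq` is installed and the JSON shape shown above:

```bash
#!/usr/bin/env bash
# Poll the group's stats until switchover_completed_at is set, then continue.
set -euo pipefail

GROUP="database-cluster-1"

while true; do
  completed=$(pg_easy_replicate stats --group-name "$GROUP" | jq -r '.switchover_completed_at')
  if [[ -n "$completed" && "$completed" != "null" ]]; then
    echo "switchover completed at: $completed"
    break
  fi
  echo "switchover not complete yet; sleeping"
  sleep 10
done
```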
### Performing switchover

`pg_easy_replicate` doesn't kick off the switchover on its own. When you start the sync via `start_sync`, it starts the replication between the two databases. Once you have had time to monitor the stats and any other key metrics, you can kick off the `switchover`.

`switchover` will wait until all tables in the group are replicating and the lag delta is <200kb (calculated as the `pg_wal_lsn_diff` between `sent_lsn` and `write_lsn`), and then perform the switch.

The switch is made by putting the user on the source database in `READ ONLY` mode, so that it is not accepting any more writes, and waiting for the flush lag to reach `0`. It's up to the user to kick off a rolling restart of the application containers or a DNS failover (more on these strategies below) after the switchover is complete, so that the application isn't sending any read/write requests to the old/source database.

```bash
$ pg_easy_replicate switchover --group-name database-cluster-1

{"name":"pg_easy_replicate","hostname":"PKHXQVK6DW","pid":24192,"level":30,"time":"2023-06-19T16:05:23.033-04:00","v":0,"msg":"Watching lag stats","version":"0.1.0"}
...
```
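
The lag check described above can be reproduced by hand on the source database. A sketch of the same calculation, plus one common way to flip a role to read-only (the role name `app_user` is hypothetical, and the tool's own read-only mechanism may differ):

```bash
# Bytes between what the source has sent and what the subscriber has written.
psql "$SOURCE_DB_URL" -c "
  SELECT application_name,
         pg_wal_lsn_diff(sent_lsn, write_lsn) AS lag_bytes
  FROM pg_stat_replication;"

# New sessions for this role will default to read-only transactions
# (already-open sessions keep their previous setting).
psql "$SOURCE_DB_URL" -c "ALTER ROLE app_user SET default_transaction_read_only = on;"
```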
## Replicating single database with custom tables

By default all tables are added for replication, but you can create multiple groups with custom tables for the same database. Example:

```bash
$ pg_easy_replicate bootstrap --group-name database-cluster-1 --copy-schema
$ pg_easy_replicate start_sync --group-name database-cluster-1 --schema-name public --tables "users, posts, events"

...

$ pg_easy_replicate bootstrap --group-name database-cluster-2 --copy-schema
$ pg_easy_replicate start_sync --group-name database-cluster-2 --schema-name public --tables "comments, views"

...
$ pg_easy_replicate switchover --group-name database-cluster-1
$ pg_easy_replicate switchover --group-name database-cluster-2
...
```
## Switchover strategies with minimal downtime

For minimal downtime, it's best to watch/tail the stats and wait until `switchover_completed_at` is updated with a timestamp. Once that happens you can perform any of the following strategies. Note: these are just suggestions, and `pg_easy_replicate` doesn't provide any functionality for them.

### Rolling restart strategy

In this strategy, you have a change ready to go which instructs your application to start connecting to the new database, for example via an environment variable or similar. Depending on the application type, it may or may not require a rolling restart.

Next, you can set up a program that watches the `stats` and waits until `switchover_completed_at` is reporting a timestamp. Once that happens, it kicks off a rolling restart of your application containers so they can start making connections to the DNS of the new database.
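
Combined with a polling loop like the one sketched in the Stats section, that final step can be a couple of commands. For instance, on Kubernetes (the Deployment name `myapp` and the `DATABASE_URL` variable are purely illustrative):

```bash
# Point the application at the new database and roll the pods.
kubectl set env deployment/myapp DATABASE_URL="$TARGET_DB_URL"
kubectl rollout restart deployment/myapp
kubectl rollout status deployment/myapp
```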
### DNS Failover strategy

In this strategy, you have a weight-based DNS setup (for example [AWS Route53 weighted records](https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/resource-record-sets-values-weighted.html)) where 100% of traffic goes to a primary origin and 0% to a secondary origin. The primary origin here is the DNS host for your source database and the secondary origin is the DNS host for your target database. You can set up your application ahead of time to interact with the database via DNS from the weighted group.

Next, you can set up a program that watches the `stats` and waits until `switchover_completed_at` is reporting a timestamp. Once that happens, it updates the weights in the DNS weighted group so that 100% of the requests go to the new/target database. Note: keeping a low `ttl` is recommended.
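
With Route53, for example, flipping the weights is a single API call. A sketch in which the hosted zone ID, record name and record targets are all hypothetical:

```bash
# flip-weights.json sends 100% of traffic to the target database's record.
cat > flip-weights.json <<'EOF'
{
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "db.example.com",
        "Type": "CNAME",
        "SetIdentifier": "source",
        "Weight": 0,
        "TTL": 30,
        "ResourceRecords": [{ "Value": "source-db.internal.example.com" }]
      }
    },
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "db.example.com",
        "Type": "CNAME",
        "SetIdentifier": "target",
        "Weight": 100,
        "TTL": 30,
        "ResourceRecords": [{ "Value": "target-db.internal.example.com" }]
      }
    }
  ]
}
EOF

aws route53 change-resource-record-sets \
  --hosted-zone-id Z0000000EXAMPLE \
  --change-batch file://flip-weights.json
```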
## FAQ

### Adding internal user to pgBouncer `userlist`

`pg_easy_replicate` creates a special user to orchestrate the replication. If you use pgBouncer, you may need to allow `pger_su_h1a4fb` as a user that can perform login by adding it to the `userlist`.
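
pgBouncer's `auth_file` is a list of `"username" "password-hash"` pairs, so one way to add the internal user is to copy its stored hash out of the source database. A sketch in which the auth file path, admin port and the exact generated username are assumptions (the username on your setup may differ):

```bash
# Append the internal user's stored password hash to pgBouncer's auth_file
# (requires an auth_type compatible with the stored hash, e.g. md5 or scram-sha-256).
psql "$SOURCE_DB_URL" -At -c \
  "SELECT format('\"%s\" \"%s\"', usename, passwd) FROM pg_shadow WHERE usename = 'pger_su_h1a4fb';" \
  >> /etc/pgbouncer/userlist.txt

# Reload pgBouncer so it picks up the new entry.
psql -p 6432 -U pgbouncer pgbouncer -c "RELOAD;"
```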
## Contributing

PRs are most welcome. You can get started locally by:

- `docker compose down -v && docker compose up --remove-orphans --build`
- Install ruby `3.1.4` using RVM ([instructions](https://rvm.io/rvm/install#any-other-system))
- `bundle exec rspec` for specs


#### [What features would you like PostgreSQL|open-source PolarDB to add?](https://github.com/digoal/blog/issues/76 "269ac3d1c492e938c0191101c7238216")


#### [PolarDB: a cloud-native, distributed, open-source database](https://github.com/ApsaraDB "57258f76c37864c6e6d23383d05714ea")


#### [PolarDB learning map: bootcamps, training & certification, interactive online labs, solutions, open courses on kernel development, ecosystem cooperation, write up your experience and win prizes](https://www.aliyun.com/database/openpolardb/activity "8642f60e04ed0c814bf9cb9677976bd4")


#### [Collection of PostgreSQL solutions](../201706/20170601_02.md "40cff096e9ed7122c512b35d8561d9c8")


#### [digoal's GitHub - Public welfare is a lifelong commitment.](https://github.com/digoal/blog/blob/master/README.md "22709685feb7cab07d30f30387f0a9ae")


![digoal's wechat](../pic/digoal_weixin.jpg "f7ad92eeba24523fd47a6e1a0e691b59")
