Skip to content

Files

Latest commit

Dec 15, 2023
f400df2 · Dec 15, 2023

History

History
335 lines (274 loc) · 12.4 KB

20210715_03.md

File metadata and controls

335 lines (274 loc) · 12.4 KB

PostgreSQL 数据块恢复 - pg_filedump

作者

digoal

日期

2021-07-15

标签

PostgreSQL , pg_filedump


背景

原文:
https://www.highgo.ca/2021/07/14/first-contact-with-the-pg_filedump/

During my most recent expedition of going through PostgreSQL source code, I ran into some mysterious variables in the page structure of some indexes. I will explain later why i am calling these mysterious but first have a look at these variables…

typedef struct HashPageOpaqueData  
{  
  ...  
  uint16    hasho_page_id;  /* for identification of hash indexes */  
} HashPageOpaqueData;  

or

typedef struct SpGistPageOpaqueData  
{  
  ...  
  uint16    spgist_page_id; /* for identification of SP-GiST indexes */  
} SpGistPageOpaqueData;  

You may find that in Postgres’s kernel code, they just give them a constant, but they don’t really use them anywhere (well as far as i have looked). That was my reason for calling these variables mysterious but finally found a clue in the code comment of the macro definition SPGIST_PAGE_ID.

The page ID is for the convenience of pg_filedump and similar utilities, which otherwise would have a hard time telling pages of different index types apart.

So here comes my reason for writing this blog, this is about the pg_filedump utility, it will provide with an introduction to the utility and demo of basic functionality it provides.

Understanding and using pg_filedump

pg_filedump is a utility to format PostgreSQL heap/index/control files into a human-readable form.

So without further ado, let’s dig into it by first operating it according to the introduction in the README to intuitively understand its functions:

Create test data:

CREATE TABLE test(id int, name text);  
  
INSERT INTO test values(001, 'Aerith'), (002, 'Cloud'), (003, 'Tifa');  

View file path:

SELECT pg_relation_filepath(‘test’);  
pg_relation_filepath  
———————-  
base/14236/164333  
(1 row)  

Important: remember to execute checkpoint to ensure that the file contents are written to disk.

Using pg_filedump viewing files:

% pg_filedump $PGDATA/base/14236/164333  
​  
*******************************************************************  
* PostgreSQL File/Block Formatted Dump Utility  
*  
* File: $PGDATA/base/14236/164333  
* Options used: None  
*******************************************************************  
​  
Block    0 ********************************************************  
<Header> -----  
 Block Offset: 0x00000000         Offsets: Lower      36 (0x0024)  
 Block: Size 8192  Version    4            Upper    8072 (0x1f88)  
 LSN:  logid      0 recoff 0x618242d8      Special  8192 (0x2000)  
 Items:    3                      Free Space: 8036  
 Checksum: 0x0000  Prune XID: 0x00000000  Flags: 0x0000 ()  
 Length (including item array): 36  
​  
<Data> -----  
 Item   1 -- Length:   35  Offset: 8152 (0x1fd8)  Flags: NORMAL  
 Item   2 -- Length:   34  Offset: 8112 (0x1fb0)  Flags: NORMAL  
 Item   3 -- Length:   33  Offset: 8072 (0x1f88)  Flags: NORMAL  
​  
​  
*** End of File Encountered. Last Block Read: 0 ***  

You can see the header information of the page and you can see that there are 3 items in the data section, but unfortunately, the actual content of the data can not be read directly. So we add the – D option:

-D Decode tuples using given comma separated list of types  
% pg_filedump -D int,text $PGDATA/base/14236/164333  
​  
*******************************************************************  
* PostgreSQL File/Block Formatted Dump Utility  
*  
* File: $PGDATA/base/14236/164333  
* Options used: -D int,text  
*******************************************************************  
​  
Block    0 ********************************************************  
<Header> -----  
 Block Offset: 0x00000000         Offsets: Lower      36 (0x0024)  
 Block: Size 8192  Version    4            Upper    8072 (0x1f88)  
 LSN:  logid      0 recoff 0x618242d8      Special  8192 (0x2000)  
 Items:    3                      Free Space: 8036  
 Checksum: 0x0000  Prune XID: 0x00000000  Flags: 0x0000 ()  
 Length (including item array): 36  
​  
<Data> -----  
 Item   1 -- Length:   35  Offset: 8152 (0x1fd8)  Flags: NORMAL  
COPY: 1 Aerith  
 Item   2 -- Length:   34  Offset: 8112 (0x1fb0)  Flags: NORMAL  
COPY: 2 Cloud  
 Item   3 -- Length:   33  Offset: 8072 (0x1f88)  Flags: NORMAL  
COPY: 3 Tifa  
​  
​  
*** End of File Encountered. Last Block Read: 0 ***  

Now we can see the actual content, but if a tuple is deleted, will it be decoded here? To verify it, we try to execute an update statement:

UPDATE test SET name = 'Yuffie' WHERE id = 1;  

Don’t forget checkpoint

Then we look at the results again:

% ./pg_filedump -D int,text $PGDATA/base/14236/164333  
​  
*******************************************************************  
* PostgreSQL File/Block Formatted Dump Utility  
*  
* File: $PGDATA/base/14236/164333  
* Options used: -D int,text  
*******************************************************************  
​  
Block    0 ********************************************************  
<Header> -----  
 Block Offset: 0x00000000         Offsets: Lower      40 (0x0028)  
 Block: Size 8192  Version    4            Upper    8032 (0x1f60)  
 LSN:  logid      0 recoff 0x61824530      Special  8192 (0x2000)  
 Items:    4                      Free Space: 7992  
 Checksum: 0x0000  Prune XID: 0x0001b3a7  Flags: 0x0000 ()  
 Length (including item array): 40  
​  
<Data> -----  
 Item   1 -- Length:   35  Offset: 8152 (0x1fd8)  Flags: NORMAL  
COPY: 1 Aerith  
 Item   2 -- Length:   34  Offset: 8112 (0x1fb0)  Flags: NORMAL  
COPY: 2 Cloud  
 Item   3 -- Length:   33  Offset: 8072 (0x1f88)  Flags: NORMAL  
COPY: 3 Tifa  
 Item   4 -- Length:   35  Offset: 8032 (0x1f60)  Flags: NORMAL  
COPY: 1 Yuffie  
​  
​  
*** End of File Encountered. Last Block Read: 0 ***  

Hmmm, although the new tuple is updated here, the old tuple that should have been deleted also exists. So we added the -o option

-o Do not dump old values.  
% ./pg_filedump -D int,text -o $PADATA/base/14236/164333  
​  
*******************************************************************  
* PostgreSQL File/Block Formatted Dump Utility  
*  
* File: $PGDATA/base/14236/164333  
* Options used: -D int,text -o  
*******************************************************************  
​  
Block    0 ********************************************************  
<Header> -----  
 Block Offset: 0x00000000         Offsets: Lower      40 (0x0028)  
 Block: Size 8192  Version    4            Upper    8032 (0x1f60)  
 LSN:  logid      0 recoff 0x61824530      Special  8192 (0x2000)  
 Items:    4                      Free Space: 7992  
 Checksum: 0x0000  Prune XID: 0x0001b3a7  Flags: 0x0000 ()  
 Length (including item array): 40  
​  
<Data> -----  
 Item   1 -- Length:   35  Offset: 8152 (0x1fd8)  Flags: NORMAL  
tuple was removed by transaction #111527  
 Item   2 -- Length:   34  Offset: 8112 (0x1fb0)  Flags: NORMAL  
COPY: 2 Cloud  
 Item   3 -- Length:   33  Offset: 8072 (0x1f88)  Flags: NORMAL  
COPY: 3 Tifa  
 Item   4 -- Length:   35  Offset: 8032 (0x1f60)  Flags: NORMAL  
COPY: 1 Yuffie  
​  
​  
*** End of File Encountered. Last Block Read: 0 ***  

Great. Now the deleted tuple has been identified and the transaction number has been intimately prompted.

Let’s also take a look at the index page:

create index ON test using btree (id);  
​  
\d test  
                Table "public.test"  
 Column |  Type   | Collation | Nullable | Default   
--------+---------+-----------+----------+---------  
 id     | integer |           |          |   
 name   | text    |           |          |   
Indexes:  
    "test_id_idx" btree (id)  
      
select pg_relation_filepath('test_id_idx');  
 pg_relation_filepath   
----------------------  
 base/14236/164337  
(1 row)  
​  
checkpoint;  
% ./pg_filedump -D int,text -o $PGDATA/base/14236/164337  
​  
*******************************************************************  
* PostgreSQL File/Block Formatted Dump Utility  
*  
* File: $PGDATA/base/14236/164337  
* Options used: -D int,text -o  
*******************************************************************  
​  
Block    0 ********************************************************  
<Header> -----  
 Block Offset: 0x00000000         Offsets: Lower      72 (0x0048)  
 Block: Size 8192  Version    4            Upper    8176 (0x1ff0)  
 LSN:  logid      0 recoff 0x61845ee0      Special  8176 (0x1ff0)  
 Items:   12                      Free Space: 8104  
 Checksum: 0x0000  Prune XID: 0x00000000  Flags: 0x0000 ()  
 Length (including item array): 72  
​  
 BTree Meta Data:  Magic (0x00053162)   Version (4)  
                   Root:     Block (1)  Level (0)  
                   FastRoot: Block (1)  Level (0)  
​  
<Special Section> -----  
 BTree Index Section:  
  Flags: 0x0008 (META)  
  Blocks: Previous (0)  Next (0)  Level (0)  CycleId (0)  
​  
​  
Block    1 ********************************************************  
<Header> -----  
 Block Offset: 0x00002000         Offsets: Lower      36 (0x0024)  
 Block: Size 8192  Version    4            Upper    8128 (0x1fc0)  
 LSN:  logid      0 recoff 0x61845e50      Special  8176 (0x1ff0)  
 Items:    3                      Free Space: 8092  
 Checksum: 0x0000  Prune XID: 0x00000000  Flags: 0x0000 ()  
 Length (including item array): 36  
​  
<Data> -----  
 Item   1 -- Length:   16  Offset: 8160 (0x1fe0)  Flags: NORMAL  
tuple was removed by transaction #1048577  
 Item   2 -- Length:   16  Offset: 8144 (0x1fd0)  Flags: NORMAL  
tuple was removed by transaction #1048578  
 Item   3 -- Length:   16  Offset: 8128 (0x1fc0)  Flags: NORMAL  
tuple was removed by transaction #1048579  
​  
<Special Section> -----  
 BTree Index Section:  
  Flags: 0x0003 (LEAF|ROOT)  
  Blocks: Previous (0)  Next (0)  Level (0)  CycleId (0)  
​  
​  
*** End of File Encountered. Last Block Read: 1 ***  

Cool, It shows the information in the index file in detail, and also shows the special section on the index page.

summary

Of course, the functions of pg_filedump are very rich. Here I just show the most basic usage. It can help us with a lot of things, such as:

In the learning stage, with the help of it, we can more easily grasp the content of the page, as well as the differences between different types of pages.

We can also use it to repair data, as mentioned in this slides — https://pgday.ru/presentation/146/59649714ee40a.pdf

Data recovery using pg_filedump

More cool things, have a fun with it!

您的愿望将传达给PG kernel hacker、数据库厂商等, 帮助提高数据库产品质量和功能, 说不定下一个PG版本就有您提出的功能点. 针对非常好的提议,奖励限量版PG文化衫、纪念品、贴纸、PG热门书籍等,奖品丰富,快来许愿。开不开森.

digoal's wechat