fix coredump when I create table in coordinator restore mode #1

xuegang · 2015-01-27T08:45:54Z

Hi Koichi,
In coordinator restore mode, I get a core dump when I create a table.

Steps to reproduce:
1. pg_ctl start -Z restoremode -D /rdbdata/bcrdb_data/coord
2. psql -hzhcx5i -p20015 cxdb
3. Create the table:

CREATE TABLE cm_busi_handle_201301 (
so_nbr bigint,
region_code integer,
process_id integer,
process_result integer,
handle_seq integer,
op_id integer,
oper_date timestamp without time zone,
oper_end_date timestamp without time zone,
invoice_no character varying(20),
property character varying(20),
oper_desc text
)
DISTRIBUTE BY MODULO (region_code)
TO NODE (datanode1,datanode2,datanode3,datanode4,datanode5,datanode6,datanode7,datanode8)

4. The backend crashes with a core dump:

cxdb=# CREATE TABLE cm_busi_handle_201301 (
cxdb(# so_nbr bigint,
cxdb(# region_code integer,
cxdb(# process_id integer,
cxdb(# process_result integer,
cxdb(# handle_seq integer,
cxdb(# op_id integer,
cxdb(# oper_date timestamp without time zone,
cxdb(# oper_end_date timestamp without time zone,
cxdb(# invoice_no character varying(20),
cxdb(# property character varying(20),
cxdb(# oper_desc text
cxdb(# )
cxdb-# DISTRIBUTE BY MODULO (region_code)
cxdb-# TO NODE (datanode1,datanode2,datanode3,datanode4,datanode5,datanode6,datanode7,datanode8);
The connection to the server was lost. Attempting reset: Failed.
!>

5. Stack trace from the core dump:

gdb /rdbdata/bcrdb_install/bin/postgres /tmp/corefile/core.postgres.48524
(gdb) bt
#0 0x00000036a48328a5 in raise () from /lib64/libc.so.6
#1 0x00000036a4834085 in abort () from /lib64/libc.so.6
#2 0x00000036a486fa37 in __libc_message () from /lib64/libc.so.6
#3 0x00000036a4875366 in malloc_printerr () from /lib64/libc.so.6
#4 0x00000036a4877e93 in _int_free () from /lib64/libc.so.6
#5 0x0000000000769879 in AllocSetDelete (context=<value optimized out>) at aset.c:551
#6 0x0000000000769dad in MemoryContextDelete (context=0x12231e8) at mcxt.c:193
#7 0x000000000076aa70 in PortalDrop (portal=0x122d0c0, isTopCommit=<value optimized out>) at portalmem.c:588
#8 0x000000000067ddaa in exec_simple_query (query_string=0x114c1e0 "CREATE TABLE cm_busi_handle_201301 (\n    so_nbr bigint,\n    region_code integer,\n    process_id integer,\n    process_result integer,\n    handle_seq integer,\n    op_id integer,\n    oper_date timestamp "...) at postgres.c:1149
#9 0x000000000067f82f in PostgresMain (argc=<value optimized out>, argv=<value optimized out>, dbname=0x1166708 "cxdb", username=<value optimized out>) at postgres.c:4243

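#10 0x000000000063b84a in BackendRun (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:4202
#11 BackendStartup (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:3891
#12 ServerLoop (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:1702
#13 PostmasterMain (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:1369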
#14 0x00000000005d1420 in main (argc=4, argv=0x1131c70) at main.c:206

My analysis is as follows:
When the coordinator is started in restore mode, the pooler process is not running, so NumDataNodes is zero.
This causes a problem in the function BuildRelationDistributionNodes:

{

// In restore mode NumDataNodes is 0, so the size passed to palloc0 is 0 and palloc0 gives nodeoids only the smallest chunk.

// The statement names more nodes than that chunk can hold, so writing their OIDs overruns the buffer. The damage is detected when postgres frees the memory, which produces the core dump.

  nodeoids = (Oid *) palloc0(NumDataNodes * sizeof(Oid));

}
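The size mismatch described above can be checked with a small standalone sketch. This is plain C rather than PGXC code: the `Oid` typedef is a stand-in for the PostgreSQL type, and the helper names `oid_array_bytes`, `oid_array_fits` and `demo_restore_mode` are hypothetical, introduced only for illustration. It compares the bytes requested when the array is sized by a node count with the bytes needed for the eight nodes in the TO NODE clause.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for PostgreSQL's Oid type (assumption for this sketch). */
typedef unsigned int Oid;

/* Bytes requested when the OID array is sized by a node count. */
static size_t oid_array_bytes(size_t node_count)
{
    return node_count * sizeof(Oid);
}

/* Returns 1 when an array sized for alloc_count OIDs can hold used_count OIDs. */
static int oid_array_fits(size_t alloc_count, size_t used_count)
{
    return oid_array_bytes(used_count) <= oid_array_bytes(alloc_count);
}

/* Hypothetical demo: restore mode reports 0 data nodes, the statement lists 8. */
static void demo_restore_mode(void)
{
    assert(!oid_array_fits(0, 8)); /* sized by NumDataNodes: too small */
    assert(oid_array_fits(8, 8));  /* sized by the list length: large enough */
}
```

With a count of zero the request is zero bytes, so any node written into the array lies outside what was asked for.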

The corrected code is as follows:

BuildRelationDistributionNodes(List *nodes, int *numnodes)
{
    Oid *nodeoids;
    ListCell *item;
    int numdatanotes;
    *numnodes = 0;

    /* Size the array by the number of nodes named in the statement, not by NumDataNodes. */
    numdatanotes = list_length(nodes);
    nodeoids = (Oid *) palloc0(numdatanotes * sizeof(Oid));

}
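The idea behind the fix, sizing the output array from the input list itself, can be sketched outside the backend as follows. This is only an illustration under stated assumptions: `build_node_oids` is a hypothetical helper, `calloc` stands in for `palloc0`, a plain array stands in for the `List` of node names, and the `Oid` typedef is a stand-in for the PostgreSQL type. It is not the code in the pull request.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for PostgreSQL's Oid type (assumption for this sketch). */
typedef unsigned int Oid;

/*
 * Hypothetical helper: copy the requested node OIDs into a new array whose
 * capacity comes from the request itself, never from a global node count.
 * Returns NULL on allocation failure; the caller frees the result.
 */
static Oid *build_node_oids(const Oid *requested, size_t nrequested,
                            size_t *numnodes)
{
    /* Allocate at least one slot so an empty request still gets a valid pointer. */
    size_t capacity = nrequested > 0 ? nrequested : 1;
    Oid *nodeoids = calloc(capacity, sizeof(Oid));
    size_t i;

    *numnodes = 0;
    if (nodeoids == NULL)
        return NULL;

    for (i = 0; i < nrequested; i++)
    {
        /* Every write stays inside the allocated capacity. */
        assert(*numnodes < capacity);
        nodeoids[(*numnodes)++] = requested[i];
    }
    return nodeoids;
}
```

In the backend the allocation would stay with palloc0, as in the patch; calloc is used here only so the sketch compiles outside PostgreSQL.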

I have pushed the changed code to the git repo and sent you a pull request. Please review it, and let me know if there is any problem.

Thanks

koichi-szk · 2015-05-07T03:43:05Z

Hello;

I'm very sorry that I've left this thread for a long time. I think this fix can be put into the master and related stable branches.

I'm also renewing the whole PGXC project. At present, the master repository is on SourceForge, and there are slight conflicts between the SourceForge and GitHub repositories. So I'm resetting the GitHub repository and pushing everything from SourceForge to make GitHub the master repository.

The patch will be kept in the current SourceForge repository and brought back here afterwards.

You may see this repository disappear and then come back in an initialized state. Please understand that the patch is kept elsewhere and will be restored.

I'd like you to continue to contribute to this project.

Thank you.

Koichi Suzuki

fix coredump when I create table in coordinator restore mode

55f0842
