COPY FROM crash

From: Zhang Mingli <zmlpostgres(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: COPY FROM crash
Date: 2024-07-30 03:50:54
Message-ID: f722b8fb-1962-4015-8578-e2bd77818ac9@Spark
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi, all

I got a crash when copy partition tables with mass data in Cloudberry DB[0](based on Postgres14.4, Greenplum 7).

I have a test on Postgres and it has the similar issue(different places but same function).

However it’s a little hard to reproduce because it happened when inserting next tuple after a previous copy multi insert buffer is flushed.

To reproduce easily, change the Macros to:

#define MAX_BUFFERED_TUPLES 1
#define MAX_PARTITION_BUFFERS 0

Config and make install, when initdb, a core dump will be as:

#0 0x000055de617211b9 in CopyMultiInsertInfoNextFreeSlot (miinfo=0x7ffce496d360, rri=0x55de6368ba88)
 at copyfrom.c:592
#1 0x000055de61721ff1 in CopyFrom (cstate=0x55de63592ce8) at copyfrom.c:985
#2 0x000055de6171dd86 in DoCopy (pstate=0x55de63589e00, stmt=0x55de635347d8, stmt_location=0, stmt_len=195,
 processed=0x7ffce496d590) at copy.c:306
#3 0x000055de61ad7ce8 in standard_ProcessUtility (pstmt=0x55de635348a8,
 queryString=0x55de63533960 "COPY information_schema.sql_features (feature_id, feature_name, sub_feature_id, sub
_feature_name, is_supported, comments) FROM E'/home/gpadmin/install/pg17/share/postgresql/sql_features.txt';\n",
 readOnlyTree=false, context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x55de620b0ce0 <debugtupDR>,
 qc=0x7ffce496d910) at utility.c:735
#4 0x000055de61ad7614 in ProcessUtility (pstmt=0x55de635348a8,
 queryString=0x55de63533960 "COPY information_schema.sql_features (feature_id, feature_name, sub_feature_id, sub
_feature_name, is_supported, comments) FROM E'/home/gpadmin/install/pg17/share/postgresql/sql_features.txt';\n",
 readOnlyTree=false, context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x55de620b0ce0 <debugtupDR>,
 qc=0x7ffce496d910) at utility.c:523
#5 0x000055de61ad5e8f in PortalRunUtility (portal=0x55de633dd7a0, pstmt=0x55de635348a8, isTopLevel=true,
 setHoldSnapshot=false, dest=0x55de620b0ce0 <debugtupDR>, qc=0x7ffce496d910) at pquery.c:1158
#6 0x000055de61ad6106 in PortalRunMulti (portal=0x55de633dd7a0, isTopLevel=true, setHoldSnapshot=false,
 dest=0x55de620b0ce0 <debugtupDR>, altdest=0x55de620b0ce0 <debugtupDR>, qc=0x7ffce496d910) at pquery.c:1315
#7 0x000055de61ad5550 in PortalRun (portal=0x55de633dd7a0, count=9223372036854775807, isTopLevel=true,
 run_once=true, dest=0x55de620b0ce0 <debugtupDR>, altdest=0x55de620b0ce0 <debugtupDR>, qc=0x7ffce496d910)
 at pquery.c:791```

The root cause is:  we may call CopyMultiInsertInfoFlush() to flush buffer during COPY tuples, ex: insert from next tuple,
CopyMultiInsertInfoNextFreeSlot() will get a crash due to null pointer of buffer.

To fix it: instead of call CopyMultiInsertInfoSetupBuffer() outside, I put it into CopyMultiInsertInfoNextFreeSlot() to avoid such issues.

[0] https://github.com/cloudberrydb/cloudberrydb

Zhang Mingli
www.hashdata.xyz

Attachment Content-Type Size
v0-0001-Fix-COPY-FROM-crash-due-to-buffer-flush.patch application/octet-stream 1.6 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2024-07-30 03:54:52 Re: 040_pg_createsubscriber.pl is slow and unstable (was Re: speed up a logical replica setup)
Previous Message Sutou Kouhei 2024-07-30 02:58:24 Re: Make COPY format extendable: Extract COPY TO format implementations