From: | Zhang Mingli <zmlpostgres(at)gmail(dot)com> |
---|---|
To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | COPY FROM crash |
Date: | 2024-07-30 03:50:54 |
Message-ID: | f722b8fb-1962-4015-8578-e2bd77818ac9@Spark |
Lists: | pgsql-hackers |
Hi, all
I got a crash when copying mass data into partitioned tables in Cloudberry DB [0] (based on Postgres 14.4, Greenplum 7).
I tested Postgres and it has a similar issue (a different place, but the same function).
However, it is a little hard to reproduce, because it happens when the next tuple is inserted after a previous COPY multi-insert buffer has been flushed.
To reproduce it easily, change the macros to:
#define MAX_BUFFERED_TUPLES 1
#define MAX_PARTITION_BUFFERS 0
Configure and make install; during initdb, a core dump is produced with this backtrace:
```
#0 0x000055de617211b9 in CopyMultiInsertInfoNextFreeSlot (miinfo=0x7ffce496d360, rri=0x55de6368ba88)
    at copyfrom.c:592
#1 0x000055de61721ff1 in CopyFrom (cstate=0x55de63592ce8) at copyfrom.c:985
#2 0x000055de6171dd86 in DoCopy (pstate=0x55de63589e00, stmt=0x55de635347d8, stmt_location=0, stmt_len=195,
    processed=0x7ffce496d590) at copy.c:306
#3 0x000055de61ad7ce8 in standard_ProcessUtility (pstmt=0x55de635348a8,
    queryString=0x55de63533960 "COPY information_schema.sql_features (feature_id, feature_name, sub_feature_id, sub_feature_name, is_supported, comments) FROM E'/home/gpadmin/install/pg17/share/postgresql/sql_features.txt';\n",
    readOnlyTree=false, context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x55de620b0ce0 <debugtupDR>,
    qc=0x7ffce496d910) at utility.c:735
#4 0x000055de61ad7614 in ProcessUtility (pstmt=0x55de635348a8,
    queryString=0x55de63533960 "COPY information_schema.sql_features (feature_id, feature_name, sub_feature_id, sub_feature_name, is_supported, comments) FROM E'/home/gpadmin/install/pg17/share/postgresql/sql_features.txt';\n",
    readOnlyTree=false, context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x55de620b0ce0 <debugtupDR>,
    qc=0x7ffce496d910) at utility.c:523
#5 0x000055de61ad5e8f in PortalRunUtility (portal=0x55de633dd7a0, pstmt=0x55de635348a8, isTopLevel=true,
    setHoldSnapshot=false, dest=0x55de620b0ce0 <debugtupDR>, qc=0x7ffce496d910) at pquery.c:1158
#6 0x000055de61ad6106 in PortalRunMulti (portal=0x55de633dd7a0, isTopLevel=true, setHoldSnapshot=false,
    dest=0x55de620b0ce0 <debugtupDR>, altdest=0x55de620b0ce0 <debugtupDR>, qc=0x7ffce496d910) at pquery.c:1315
#7 0x000055de61ad5550 in PortalRun (portal=0x55de633dd7a0, count=9223372036854775807, isTopLevel=true,
    run_once=true, dest=0x55de620b0ce0 <debugtupDR>, altdest=0x55de620b0ce0 <debugtupDR>, qc=0x7ffce496d910)
    at pquery.c:791
```
The root cause is that CopyMultiInsertInfoFlush() may be called to flush the buffer while COPY is still processing tuples. When the next tuple is then inserted, CopyMultiInsertInfoNextFreeSlot() crashes because the buffer pointer is NULL.
To fix it, instead of calling CopyMultiInsertInfoSetupBuffer() outside, I moved the call into CopyMultiInsertInfoNextFreeSlot() to avoid such issues.
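The idea can be sketched with a small stand-alone model. This is only an illustration, not the PostgreSQL source and not the attached patch: every `Model*` / `model_*` name is hypothetical, and the flush is assumed to release the buffer and leave a NULL pointer behind, as in the crash described above. The point is that the slot lookup sets the buffer up on demand, so a flush between two tuples cannot leave it dereferencing NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Mirrors the lowered MAX_BUFFERED_TUPLES used for the reproduction. */
#define MODEL_MAX_BUFFERED_TUPLES 1

/* Hypothetical stand-in for a per-relation multi-insert buffer. */
typedef struct ModelBuffer
{
    int slots[MODEL_MAX_BUFFERED_TUPLES + 1];
    int nused;
} ModelBuffer;

/* Hypothetical stand-in for the relation that owns the buffer. */
typedef struct ModelRel
{
    ModelBuffer *buffer;        /* NULL when no buffer is set up */
} ModelRel;

/* Allocate an empty buffer for the relation. */
static void
model_setup_buffer(ModelRel *rel)
{
    rel->buffer = calloc(1, sizeof(ModelBuffer));
    assert(rel->buffer != NULL);
}

/*
 * "Write out" the buffered tuples and release the buffer.
 * Returns the number of tuples flushed; leaves rel->buffer NULL.
 */
static int
model_flush(ModelRel *rel)
{
    int flushed = 0;

    if (rel->buffer != NULL)
    {
        flushed = rel->buffer->nused;
        free(rel->buffer);
        rel->buffer = NULL;
    }
    return flushed;
}

/* Return the next free slot, setting the buffer up first if it is missing. */
static int *
model_next_free_slot(ModelRel *rel)
{
    if (rel->buffer == NULL)
        model_setup_buffer(rel);
    assert(rel->buffer->nused <= MODEL_MAX_BUFFERED_TUPLES);
    return &rel->buffer->slots[rel->buffer->nused];
}

/*
 * Store one tuple and flush once more than MODEL_MAX_BUFFERED_TUPLES
 * are buffered. Returns the number of tuples flushed by this call.
 */
static int
model_insert(ModelRel *rel, int tuple)
{
    int *slot = model_next_free_slot(rel);

    *slot = tuple;
    rel->buffer->nused++;
    if (rel->buffer->nused > MODEL_MAX_BUFFERED_TUPLES)
        return model_flush(rel);
    return 0;
}
```

In this model, inserting a third tuple right after a flush works because `model_next_free_slot()` recreates the buffer itself. Without the NULL check there, the same call would dereference a NULL pointer, which is the shape of the crash in frame #0.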
[0] https://github.com/cloudberrydb/cloudberrydb
Zhang Mingli
www.hashdata.xyz
Attachment | Content-Type | Size |
---|---|---|
v0-0001-Fix-COPY-FROM-crash-due-to-buffer-flush.patch | application/octet-stream | 1.6 KB |