From: <obouda(at)email(dot)cz>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Alvaro Herrera" <alvherre(at)2ndquadrant(dot)com>, <pgsql-general(at)lists(dot)postgresql(dot)org>, "Karsten Hilbert" <Karsten(dot)Hilbert(at)gmx(dot)net>
Subject: Re: backend crash on DELETE, reproducible locally
Date: 2018-11-03 19:19:19
Message-ID: 4kp.XAMr.3k5V4ORoB1t.1RtVIt@seznam.cz
Lists: pgsql-general pgsql-hackers
Hello,
we hit exactly the same problem after upgrading to PostgreSQL 11 - the
server crashed on a DELETE statement on a table with a trigger. We also
observed an AFTER DELETE trigger receiving NULL values in OLD. Now I see the
problem seems to have been solved (theoretically). Unfortunately, we are not
able to build the server with the patch, so we cannot confirm the fix.
However, when we simply copied the database (within the same server), the
same DELETE ran on the copy with no problems.
I would like to ask, however: could the same problem arise from an UPDATE
statement (also on a table with an AFTER trigger), or would that be a
different bug (technical details below)?
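For context, the backtrace below passes through ProjIndexIsUnchanged, which,
as far as I understand the code, is only reached when the updated table has
an index on an expression. My guess at the minimal shape is the following -
a simplified, hypothetical sketch, not our real schema:

    CREATE TABLE t (a int, b int);
    INSERT INTO t VALUES (1, 1);
    CREATE INDEX ON t ((a + 0));       -- an expression ("projection") index
    ALTER TABLE t ADD COLUMN c int;    -- the row above now predates this column
    UPDATE t SET b = 2;                -- this kind of statement segfaults for us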
As the bug makes our production servers segfault several times a day, we
would like to solve the problem as soon as possible. Do I understand
correctly that if we dump and restore the database, the bug should not occur
(until the next ALTER TABLE ADD COLUMN - which we will avoid until the patch
is released)?
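If I understand the new fast-default mechanism correctly, the tables at risk
can be listed from the bookkeeping version 11 keeps in pg_attribute
(atthasmissing/attmissingval); this query is only a sketch based on that
assumption:

    -- columns added by a "fast" ALTER TABLE ADD COLUMN, whose tables may
    -- therefore still contain tuples predating the column (my assumption)
    SELECT c.relname, a.attname
    FROM pg_attribute a
    JOIN pg_class c ON c.oid = a.attrelid
    WHERE a.atthasmissing
    ORDER BY 1, 2;

On a freshly restored database I would expect this to return no rows, since
the restore creates every column in CREATE TABLE itself.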
With the UPDATE statement, we caught the backtrace below (we also have the
core dump and can provide it privately if useful):
#0 __memcmp_sse4_1 () at ../sysdeps/x86_64/multiarch/memcmp-sse4.S:793
No locals.
#1 0x00007f6a6b993296 in datumIsEqual (value1=7, value2=7, typByVal=<optimized out>, typLen=16) at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/utils/adt/datum.c:249
size1 = 16
size2 = <optimized out>
s1 = 0x7 <error: Cannot access memory at address 0x7>
s2 = 0x7 <error: Cannot access memory at address 0x7>
res = <optimized out>
#2 0x00007f6a6b6b89de in ProjIndexIsUnchanged (newtup=0x7f6a6d504e80, oldtup=0x7ffc9c1c1270, relation=0x7f6a6b4c1738) at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/access/heap/heapam.c:4539
att = <optimized out>
indexOid = <optimized out>
indexDesc = 0x7f6a6b4c5008
indexInfo = 0x7f6a6d505ca8
i = 0
indexoidlist = <optimized out>
econtext = 0x7f6a6d4c5080
new_isnull = {false, false, false, 156, 252, 127, false, false, 44,
29, 128, 107, 106, 127, false, false, 116, 54, 76, 107, 106, 127, false,
false, 123, false, false, false, false, false, false, false}
slot = 0x7f6a6d507b78
equals = true
old_isnull = {false, false, false, 109, 106, 127, false, false, 165,
199, 80, 109, 106, 127, false, false, 160, 199, 80, 109, 106, 127, false,
false, 168, 199, 80, 109, 106, 127, false, false}
new_values = {7, 4, 140095077240522, 16, 140722927572064,
140095077271456, 140095077272720, 140722927572079, 0, 0, 140722927572048,
140095054460096, 140095077060032, 140095077242520, 5, 4, 140722927572096,
140095046814123,
140722927572096, 140095046825162, 140095043410520,
140095077060880, 140722927572224, 140095046865098, 60129542543, 23274985272,
140095043410520, 140095077060032, 140095043409720, 140095077060304,
140095077237568,
140095043414504}
indexno = 1
l = 0x7f6a6d50d180
estate = 0x7f6a6d4c4e70
old_values = {7, 5, 140078657614882, 1, 140095077060032, 1024,
140095077271456, 0, 140722927571760, 140095049517261, 140722927571808,
140722927571776, 140722927571792, 140095046814329, 18, 140095077271464,
140095076986224,
140095077237568, 22, 100, 140722927571856, 140722927571920,
140095046831813, 140722927571856, 140095046842624, 0, 140095077265656, 3,
140095077271456, 140095077271792, 140095077271456, 140095077242520}
#3 heap_update (relation=relation(at)entry=0x7f6a6b4c1738, otid=otid(at)entry=0x7ffc9c1c1690, newtup=newtup(at)entry=0x7f6a6d504e80, cid=0, crosscheck=<optimized out>, wait=wait(at)entry=true, hufd=hufd(at)entry=0x7ffc9c1c15b0, lockmode=lockmode(at)entry=0x7ffc9c1c15a4) at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/access/heap/heapam.c:4230
result = <optimized out>
xid = 245028971
hot_attrs = 0x7f6a6d50d0a0
proj_idx_attrs = 0x7f6a6d50d0b8
key_attrs = 0x7f6a6d50d0d0
id_attrs = 0x7f6a6d50d0e8
interesting_attrs = 0x7f6a6d50d100
lp = <optimized out>
oldtup = {t_len = 81, t_self = {ip_blkid = {bi_hi = 0, bi_lo = 9}, ip_posid = 71}, t_tableOid = 65471913, t_data = 0x7f669aa0b3f0}
heaptup = 0x7f6a6d504e80
old_key_tuple = 0x0
old_key_copied = false
page = 0x7f669aa0b180 "=="
block = <optimized out>
mxact_status = <optimized out>
buffer = <optimized out>
newbuf = 17858
vmbuffer = 0
vmbuffer_new = 0
need_toast = <optimized out>
newtupsize = <optimized out>
pagefree = <optimized out>
have_tuple_lock = false
iscombo = false
use_hot_update = false
hot_attrs_checked = <optimized out>
key_intact = <optimized out>
all_visible_cleared = false
all_visible_cleared_new = false
checked_lockers = <optimized out>
locker_remains = <optimized out>
xmax_new_tuple = <optimized out>
xmax_old_tuple = 245028971
infomask_old_tuple = 0
infomask2_old_tuple = 8192
infomask_new_tuple = 144
infomask2_new_tuple = 0
__func__ = "heap_update"
#4 0x00007f6a6b82f2c9 in ExecUpdate (mtstate=mtstate(at)entry=0x7f6a6d4da3f0, tupleid=0x7ffc9c1c1690, oldtuple=0x0, slot=0x7f6a6d504340, planSlot=planSlot(at)entry=0x7f6a6d5685e8, epqstate=epqstate(at)entry=0x7f6a6d4da4b0, estate=estate(at)entry=0x7f6a6d4d8ed0, canSetTag=true) at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/executor/nodeModifyTable.c:1209
lockmode = LockTupleNoKeyExclusive
partition_constraint_failed = false
tuple = 0x7f6a6d504e80
resultRelInfo = 0x7f6a6d4d9110
resultRelationDesc = 0x7f6a6b4c1738
result = <optimized out>
hufd = {ctid = {ip_blkid = {bi_hi = 5616, bi_lo = 39964}, ip_posid = 32764}, xmax = 1803594893, cmax = 32618}
recheckIndexes = 0x0
saved_tcs_map = 0x0
__func__ = "ExecUpdate"
#5 0x00007f6a6b82f978 in ExecModifyTable (pstate=0x7f6a6d4da3f0) at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/executor/nodeModifyTable.c:2166
node = 0x7f6a6d4da3f0
proute = 0x0
estate = 0x7f6a6d4d8ed0
operation = CMD_UPDATE
saved_resultRelInfo = 0x0
resultRelInfo = 0x7f6a6d4d9110
subplanstate = 0x7f6a6d5679f8
junkfilter = 0x7f6a6d504218
slot = <optimized out>
tupleid = <optimized out>
tuple_ctid = {ip_blkid = {bi_hi = 0, bi_lo = 9}, ip_posid = 71}
oldtupdata = {t_len = 2619086544, t_self = {ip_blkid = {bi_hi = 32764, bi_lo = 0}, ip_posid = 9204}, t_tableOid = 32618, t_data = 0x7ffc9c1c16d0}
oldtuple = <optimized out>
__func__ = "ExecModifyTable"
#6 0x00007f6a6b80b25b in ExecProcNode (node=0x7f6a6d4da3f0) at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/include/executor/executor.h:237
No locals.
#7 ExecutePlan (execute_once=<optimized out>, dest=0x7f6a6d5ed298, direction=<optimized out>, numberTuples=0, sendTuples=false, operation=CMD_UPDATE, use_parallel_mode=<optimized out>, planstate=0x7f6a6d4da3f0, estate=0x7f6a6d4d8ed0) at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/executor/execMain.c:1723
slot = <optimized out>
current_tuple_count = 0
#8 standard_ExecutorRun (queryDesc=0x7f6a6d5f5b80, direction=<optimized out>, count=0, execute_once=<optimized out>) at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/executor/execMain.c:364
estate = 0x7f6a6d4d8ed0
operation = CMD_UPDATE
dest = 0x7f6a6d5ed298
sendTuples = <optimized out>
__func__ = "standard_ExecutorRun"
#9 0x00007f6a6b95b41d in ProcessQuery (plan=<optimized out>, sourceText=0x7f6a6d3b45a0 "UPDATE public.schedulecard SET ext_ident=NULL, finalreport_absence_cnt=NULL, period_day=4, period_num=1, rotates=false, rotationgroup_id=NULL, schedulecard_id=NULL, schedulecardblock_id=NULL, schedule"..., params=0x0, queryEnv=0x0, dest=0x7f6a6d5ed298, completionTag=0x7ffc9c1c19d0 "") at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/tcop/pquery.c:161
queryDesc = 0x7f6a6d5f5b80
#10 0x00007f6a6b95b668 in PortalRunMulti (portal=portal(at)entry=0x7f6a6d47f230, isTopLevel=isTopLevel(at)entry=true, setHoldSnapshot=setHoldSnapshot(at)entry=false, dest=dest(at)entry=0x7f6a6d5ed298, altdest=altdest(at)entry=0x7f6a6d5ed298, completionTag=completionTag(at)entry=0x7ffc9c1c19d0 "") at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/tcop/pquery.c:1286
pstmt = 0x7f6a6d5ed138
active_snapshot_set = true
stmtlist_item = 0x7f6a6d5ed248
#11 0x00007f6a6b95c2d5 in PortalRun (portal=portal(at)entry=0x7f6a6d47f230, count=count(at)entry=9223372036854775807, isTopLevel=isTopLevel(at)entry=true, run_once=run_once(at)entry=true, dest=dest(at)entry=0x7f6a6d5ed298, altdest=altdest(at)entry=0x7f6a6d5ed298, completionTag=completionTag(at)entry=0x7ffc9c1c19d0 "") at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/tcop/pquery.c:799
save_exception_stack = 0x7ffc9c1c1c00
save_context_stack = 0x0
local_sigjmp_buf = {{__jmpbuf = {140722927574016, 7153736072036697610, 140095075869696, 140722927573456, 140095076692528, 140095075869744, 7153736071992657418, 7091693995611473418}, __mask_was_saved = 0, __saved_mask = {__val = {0, 140723125634669, 0, 140095051070678, 64, 112, 140095075869696, 140095075861648, 140095078191720, 140722927573328, 140095049517496, 140722927573360, 2, 140095075869696, 2, 140722927573360}}}}
result = <optimized out>
nprocessed = <optimized out>
saveTopTransactionResourceOwner = 0x7f6a6d4182c8
saveTopTransactionContext = 0x7f6a6d490150
saveActivePortal = 0x0
saveResourceOwner = 0x7f6a6d4182c8
savePortalContext = 0x0
saveMemoryContext = 0x7f6a6d490150
__func__ = "PortalRun"
#12 0x00007f6a6b957f7f in exec_simple_query (query_string=0x7f6a6d3b45a0 "UPDATE public.schedulecard SET ext_ident=NULL, finalreport_absence_cnt=NULL, period_day=4, period_num=1, rotates=false, rotationgroup_id=NULL, schedulecard_id=NULL, schedulecardblock_id=NULL, schedule"...) at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/tcop/postgres.c:1122
parsetree = 0x7f6a6d3b6400
portal = 0x7f6a6d47f230
snapshot_set = <optimized out>
commandTag = <optimized out>
completionTag = "\000\032\034\234\374\177\000\000\341\356\247kj\177\000\000\000\000\000\000\000\000\000\000\060\316\367kj\177\000\000\340&\360kj\177\000\000\234U\357kj\177\000\000\240E;mj\177\000\000Q\000\000\000\000\000\000"
querytree_list = <optimized out>
plantree_list = <optimized out>
receiver = 0x7f6a6d5ed298
format = 0
dest = DestRemote
parsetree_list = 0x7f6a6d3b6450
parsetree_item = 0x7f6a6d3b6430
save_log_statement_stats = false
was_logged = false
use_implicit_block = false
msec_str = "\000\032\034\234\374\177\000\000\341\356\247kj\177\000\000\000\000\000\000\000\000\000\000\060\316\367kj\177\000"
__func__ = "exec_simple_query"
#13 0x00007f6a6b95930a in PostgresMain (argc=<optimized out>, argv=argv(at)entry=0x7f6a6d400220, dbname=0x7f6a6d4001b0 "mydb", username=<optimized out>) at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/tcop/postgres.c:4159
query_string = 0x7f6a6d3b45a0 "UPDATE public.schedulecard SET ext_ident=NULL, finalreport_absence_cnt=NULL, period_day=4, period_num=1, rotates=false, rotationgroup_id=NULL, schedulecard_id=NULL, schedulecardblock_id=NULL, schedule"...
firstchar = 81
input_message = {data = 0x7f6a6d3b45a0 "UPDATE public.schedulecard SET ext_ident=NULL, finalreport_absence_cnt=NULL, period_day=4, period_num=1, rotates=false, rotationgroup_id=NULL, schedulecard_id=NULL, schedulecardblock_id=NULL, schedule"..., len = 319, maxlen = 1024, cursor = 319}
local_sigjmp_buf = {{__jmpbuf = {140095054460088, 7153736071902479882, 140095054460096, 1, 0, 140095076133088, 7153736071933937162, 7091693993451144714}, __mask_was_saved = 1, __saved_mask = {__val = {0, 140095076133088, 140095004765576, 844424930131970, 206158430256, 140722927574360, 140722927574144, 293, 140095076133088, 0, 0, 140095075861648, 1024, 140722927574468, 0, 140722927574240}}}}
send_ready_for_query = false
disable_idle_in_transaction_timeout = false
__func__ = "PostgresMain"
#14 0x00007f6a6b67867d in BackendRun (port=0x7f6a6d3f68e0) at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/postmaster/postmaster.c:4361
ac = 1
secs = 594476990
usecs = 933865
i = 1
av = 0x7f6a6d400220
maxac = <optimized out>
#15 BackendStartup (port=0x7f6a6d3f68e0) at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/postmaster/postmaster.c:4033
bn = <optimized out>
pid = <optimized out>
#16 ServerLoop () at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/postmaster/postmaster.c:1706
rmask = {fds_bits = {8, 0 <repeats 15 times>}}
selres = <optimized out>
now = <optimized out>
readmask = {fds_bits = {200, 0 <repeats 15 times>}}
last_lockfile_recheck_time = 1541161751
last_touch_time = 1541161147
__func__ = "ServerLoop"
#17 0x00007f6a6b8e7644 in PostmasterMain (argc=5, argv=<optimized out>) at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/postmaster/postmaster.c:1379
opt = <optimized out>
status = <optimized out>
userDoption = <optimized out>
listen_addr_saved = true
i = <optimized out>
output_config_variable = <optimized out>
__func__ = "PostmasterMain"
#18 0x00007f6a6b679616 in main (argc=5, argv=0x7f6a6d3af1f0) at /build/postgresql-11-oDwOcQ/postgresql-11-11.0/build/../src/backend/main/main.c:228
No locals.
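For what it is worth, my reading of frames #0-#2: datumIsEqual is called
with value1 = value2 = 7, yet typByVal is false and typLen is 16, so both
datums are taken to be pointers and memcmp dereferences address 0x7 - hence
the segfault. A simplified sketch of that logic (paraphrased from
src/backend/utils/adt/datum.c, not the literal code):

    #include <stdbool.h>
    #include <string.h>

    typedef unsigned long Datum;    /* stand-in for the real typedef */

    /* Simplified datumIsEqual: by-value datums are compared directly, while
     * by-reference datums are treated as pointers and compared byte-wise.
     * A garbage "pointer" such as 7 makes the memcmp fault, as in frame #0. */
    static bool
    datum_is_equal_sketch(Datum value1, Datum value2, bool typByVal, int typLen)
    {
        if (typByVal)
            return value1 == value2;
        return memcmp((const void *) value1,
                      (const void *) value2, (size_t) typLen) == 0;
    }

So the garbage must already be present in old_values[]/new_values[] when
ProjIndexIsUnchanged compares them, which would fit an old tuple deformed
against a tuple descriptor that has more attributes than the on-disk tuple
really contains.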
Best regards,
Ondřej Bouda
---------- Original e-mail ----------
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>
Date: 2018-11-03 04:57:19
Subject: Re: backend crash on DELETE, reproducible locally
"Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net> writes:
> On Thu, Nov 01, 2018 at 11:43:56AM -0400, Tom Lane wrote:
>> Yeah, apparently we've passed a null OLD tuple to an RI_FKey_cascade_del
>> trigger, which surely shouldn't happen. It'd be interesting to look at
>> the set of triggers on this table. I don't entirely trust psql's \d
>> to show us reality if there's something screwed up about the triggers,
>> so in addition to \d output, could we see
>> select * from pg_trigger where tgrelid = 'ref.auto_hint'::regclass;
> [ pretty normal-looking trigger entries ]
I was feeling baffled about this, but it suddenly occurs to me that maybe
the bug fixed in 040a1df61/372102b81 explains this. If the particular
tuple you're trying to delete predates the last ALTER TABLE ADD COLUMN
operation on the table, then this bug would result in t_self getting
set to "invalid", which'd cause AfterTriggerSaveEvent to save "ate_ctid1"
as "invalid", which'd lead to a null tuple getting passed when the trigger
eventually gets invoked.
regards, tom lane
"