missing ML message (Re: v12.0: segfault in reindex CONCURRENTLY)

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: pgsql-www(at)postgresql(dot)org
Subject: missing ML message (Re: v12.0: segfault in reindex CONCURRENTLY)
Date: 2019-10-13 15:20:52
Message-ID: 20191013152052.GE4475@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-www

I was going to add a link to this thread in a commit message for our internal
project's repo, but found that (even 2.5 hours later) the HTTP archives don't
show my last message; I'd expect it to be here:
https://www.postgresql.org/message-id/flat/20191012004446.GT10470%40telsasoft.com

Quoted message follows, sent with headers like so:

From pryzby(at)telsasoft(dot)com Sun Oct 13 08:03:21 2019
Date: Sun, 13 Oct 2019 08:03:21 -0500
From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org,
Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>,
Andreas Karlsson <andreas(at)proxel(dot)se>,
Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Re: v12.0: segfault in reindex CONCURRENTLY
Message-ID: <20191013130321(dot)GD4475(at)telsasoft(dot)com>
References: <20191012004446(dot)GT10470(at)telsasoft(dot)com>
<20191013090643(dot)GA1434(at)paquier(dot)xyz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20191013090643(dot)GA1434(at)paquier(dot)xyz>

On Sun, Oct 13, 2019 at 08:03:21AM -0500, Justin Pryzby wrote:
> On Sun, Oct 13, 2019 at 06:06:43PM +0900, Michael Paquier wrote:
> > On Fri, Oct 11, 2019 at 07:44:46PM -0500, Justin Pryzby wrote:
> > > Unfortunately, there was no core file, and I'm still trying to reproduce it.
> >
> > Forgot to set ulimit -c? Having a backtrace would surely help.
>
> Fortunately (?) another server hit crashed last night.
> (Doesn't appear to be relevant, but this table has no inheritence/partition-ness).
>
> Looks like it's a race condition and dereferencing *holder=NULL. The first
> crash was probably the same bug, due to report query running during "reindex
> CONCURRENTLY", and probably finished at nearly the same time as another locker.
>
> Relevant code introduced here:
>
> commit ab0dfc961b6a821f23d9c40c723d11380ce195a6
> Author: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
> Date: Tue Apr 2 15:18:08 2019 -0300
>
> Report progress of CREATE INDEX operations
>
> Needs to be conditionalized (as anticipated by the comment)
>
> + if (holder)
> pgstat_progress_update_param(PROGRESS_WAITFOR_CURRENT_PID,
> holder->pid);
>
>
> Core was generated by `postgres: postgres ts [local] REINDEX '.
> Program terminated with signal 11, Segmentation fault.
>
> #0 WaitForLockersMultiple (locktags=locktags(at)entry=0x1d30548, lockmode=lockmode(at)entry=5, progress=progress(at)entry=true) at lmgr.c:911
> #1 0x00000000005c2ac8 in ReindexRelationConcurrently (relationOid=relationOid(at)entry=17618, options=options(at)entry=0) at indexcmds.c:3090
> #2 0x00000000005c328a in ReindexIndex (indexRelation=<optimized out>, options=0, concurrent=<optimized out>) at indexcmds.c:2352
> #3 0x00000000007657fe in standard_ProcessUtility (pstmt=pstmt(at)entry=0x1d05468, queryString=queryString(at)entry=0x1d046e0 "REINDEX INDEX CONCURRENTLY loaded_cdr_files_filename",
> context=context(at)entry=PROCESS_UTILITY_TOPLEVEL, params=params(at)entry=0x0, queryEnv=queryEnv(at)entry=0x0, dest=dest(at)entry=0x1d05548,
> completionTag=completionTag(at)entry=0x7ffc05e6c0a0 "") at utility.c:787
> #4 0x00007f21517204ef in pgss_ProcessUtility (pstmt=0x1d05468, queryString=0x1d046e0 "REINDEX INDEX CONCURRENTLY loaded_cdr_files_filename", context=PROCESS_UTILITY_TOPLEVEL,
> params=0x0, queryEnv=0x0, dest=0x1d05548, completionTag=0x7ffc05e6c0a0 "") at pg_stat_statements.c:1006
> #5 0x0000000000762816 in PortalRunUtility (portal=0x1d7a4e0, pstmt=0x1d05468, isTopLevel=<optimized out>, setHoldSnapshot=<optimized out>, dest=0x1d05548,
> completionTag=0x7ffc05e6c0a0 "") at pquery.c:1175
> #6 0x0000000000763267 in PortalRunMulti (portal=portal(at)entry=0x1d7a4e0, isTopLevel=isTopLevel(at)entry=true, setHoldSnapshot=setHoldSnapshot(at)entry=false, dest=dest(at)entry=0x1d05548,
> altdest=altdest(at)entry=0x1d05548, completionTag=completionTag(at)entry=0x7ffc05e6c0a0 "") at pquery.c:1328
> #7 0x0000000000763e45 in PortalRun (portal=<optimized out>, count=9223372036854775807, isTopLevel=<optimized out>, run_once=<optimized out>, dest=0x1d05548, altdest=0x1d05548,
> completionTag=0x7ffc05e6c0a0 "") at pquery.c:796
> #8 0x000000000075ff45 in exec_simple_query (query_string=<optimized out>) at postgres.c:1215
> #9 0x0000000000761212 in PostgresMain (argc=<optimized out>, argv=<optimized out>, dbname=<optimized out>, username=<optimized out>) at postgres.c:4236
> #10 0x0000000000483d02 in BackendRun (port=<optimized out>, port=<optimized out>) at postmaster.c:4431
> #11 BackendStartup (port=0x1d2b340) at postmaster.c:4122
> #12 ServerLoop () at postmaster.c:1704
> #13 0x00000000006f0b1f in PostmasterMain (argc=argc(at)entry=3, argv=argv(at)entry=0x1cff280) at postmaster.c:1377
> #14 0x0000000000484c93 in main (argc=3, argv=0x1cff280) at main.c:228
>
> bt f
>
> #0 WaitForLockersMultiple (locktags=locktags(at)entry=0x1d30548, lockmode=lockmode(at)entry=5, progress=progress(at)entry=true) at lmgr.c:911
> holder = 0x0
> lockholders = 0x1d9b778
> holders = <optimized out>
> lc = 0x1d9bf80
> total = <optimized out>
> done = 1
> #1 0x00000000005c2ac8 in ReindexRelationConcurrently (relationOid=relationOid(at)entry=17618, options=options(at)entry=0) at indexcmds.c:3090
> heapRelationIds = 0x1d30360
> indexIds = 0x1d303b0
> newIndexIds = <optimized out>
> relationLocks = <optimized out>
> lockTags = <optimized out>
> lc = 0x0
> lc2 = 0x0
> private_context = <optimized out>
> oldcontext = <optimized out>
> relkind = 105 'i'
> relationName = 0x0
> relationNamespace = 0x0
> ru0 = {tv = {tv_sec = 30592544, tv_usec = 7232025}, ru = {ru_utime = {tv_sec = 281483566645394, tv_usec = 75668733820930}, ru_stime = {tv_sec = 0, tv_usec = 30592272}, {
> ru_maxrss = 0, __ru_maxrss_word = 0}, {ru_ixrss = 0, __ru_ixrss_word = 0}, {ru_idrss = 105, __ru_idrss_word = 105}, {ru_isrss = -926385342574214656,
> __ru_isrss_word = -926385342574214656}, {ru_minflt = 8924839, __ru_minflt_word = 8924839}, {ru_majflt = 0, __ru_majflt_word = 0}, {ru_nswap = 17618,
> __ru_nswap_word = 17618}, {ru_inblock = 139781327898864, __ru_inblock_word = 139781327898864}, {ru_oublock = 30430312, __ru_oublock_word = 30430312}, {
> ru_msgsnd = 139781327898864, __ru_msgsnd_word = 139781327898864}, {ru_msgrcv = 139781327898864, __ru_msgrcv_word = 139781327898864}, {ru_nsignals = 139781327898864,
> __ru_nsignals_word = 139781327898864}, {ru_nvcsw = 139781327898864, __ru_nvcsw_word = 139781327898864}, {ru_nivcsw = 139781327898864,
> __ru_nivcsw_word = 139781327898864}}}
> __func__ = "ReindexRelationConcurrently"
> #2 0x00000000005c328a in ReindexIndex (indexRelation=<optimized out>, options=0, concurrent=<optimized out>) at indexcmds.c:2352
> state = {concurrent = true, locked_table_oid = 17608}
> indOid = 17618
> irel = <optimized out>
> persistence = 112 'p'
> ...

--
Justin Pryzby
System Administrator
Telsasoft
+1-952-707-8581

Responses

Browse pgsql-www by date

  From Date Subject
Next Message Tom Lane 2019-10-13 16:21:12 Re: missing ML message (Re: v12.0: segfault in reindex CONCURRENTLY)
Previous Message Rodrigo Ramírez Norambuena 2019-10-07 15:10:01 Re: PGweb: Patches and tests