Re: BUG #8579: CoreDump of background writer process

From: Rene Grün <rene(dot)gruen(at)cslab(dot)de>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Alvaro Herrera" <alvherre(at)2ndquadrant(dot)com>
Cc: <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #8579: CoreDump of background writer process
Date: 2013-11-06 16:25:53
Message-ID: 5AD33821D1C0AD479A69349F17DA5F555003B5@exchange.mbs.internal
Lists: pgsql-bugs

Thank you very much for your help.

I will try to give more information.

We are using gcc 4.4.2 as provided by QNX. It is the default compiler for this system.
I have compiled the port in a pkgsrc-environment, so I can't tell you which optimizations are used.

There are no other matches for "background" in the logfiles.

We are using a normal QNX4 filesystem directly on the hard disk (AHCI).

The SIGHUP only appears in the first example.

From the QNX documentation for the function lseek():

Classification:

lseek() is POSIX 1003.1;

Safety:

  Cancellation point   No
  Interrupt handler    No
  Signal handler       Yes
  Thread               Yes

With best regards from Krefeld,

CS-Lab GmbH
i. A. René Grün

E-Mail: rgr(at)cslab(dot)de
Phone:  +49 2151 72949-0
Fax:    +49 2151 72949-9
---
CS-Lab GmbH (Creativ Software Labor GmbH)
Römerstr. 15
D-47809 Krefeld
Managing Director: Dieter Schmitz
Registered at Krefeld Register Court, HRB 12257, VAT ID: DE 263 834 180

-----Original Message-----
From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
Sent: Wednesday, November 6, 2013 16:57
To: Alvaro Herrera
Cc: rgr(at)cslab(dot)de; pgsql-bugs(at)postgresql(dot)org
Subject: Re: [BUGS] BUG #8579: CoreDump of background writer process

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> I wonder if the problem is mishandling of signals -- i.e. perhaps the
> port is at fault, or maybe it was right in 8.3 but we changed
> something that would affect the port, and it wasn't properly updated to match.

The postmaster log looked like the problem was triggered by SIGHUP arriving while the bgwriter was doing an lseek(). It's not usual for seeks to be interruptible, though, unless maybe you're running the database over NFS? I tend to not trust that kind of arrangement much, mainly because NFS exposes you to all sorts of poorly-tested error recovery paths. Like this one.

Anyway, in theory the bgwriter ought to be able to recover from such an error. Somehow the local state of BackgroundWriterMain is getting messed up, though.
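
To make the suspected failure mode concrete, here is a minimal sketch of a seek wrapper that retries when a signal interrupts the call. It is an illustration only: lseek_retry() is a hypothetical helper, not PostgreSQL code and not something posted in this thread, and it assumes the platform's lseek() can fail with EINTR, which POSIX does not list among lseek()'s errors.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of PostgreSQL): repeat lseek() for as
 * long as it fails with EINTR, i.e. a signal such as SIGHUP arrived
 * while the call was in progress.  Any other error is returned to the
 * caller unchanged, with errno still set by lseek().
 */
static off_t
lseek_retry(int fd, off_t offset, int whence)
{
    off_t pos;

    do
    {
        errno = 0;
        pos = lseek(fd, offset, whence);
    } while (pos == (off_t) -1 && errno == EINTR);

    return pos;
}
```

A wrapper like this would only hide the interrupted seek. It would not explain why the bgwriter's local state ends up corrupted once the error is reported.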

>> #0 0x00000000 in ?? ()
>> #1 0x08205ef4 in BackgroundWriterMain ()
>> #2 0x080e2759 in AuxiliaryProcessMain ()

> Not very useful, is it :-(

The OP did provide a stack trace with debug symbols, further down.

regards, tom lane
