Re: [bug fix] Produce a crash dump before main() on Windows

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [bug fix] Produce a crash dump before main() on Windows
Date: 2018-02-20 15:43:49
Message-ID: CABUevEwBLHHrf=YZd31P8aD6xDGCaok60RL-8_+bvWKq95j3Gg@mail.gmail.com
Lists: pgsql-hackers

On Tue, Feb 20, 2018 at 3:18 PM, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:

> On 20 February 2018 at 21:47, Magnus Hagander <magnus(at)hagander(dot)net> wrote:
>
>>
>>
>> On Fri, Feb 16, 2018 at 8:28 AM, Tsunakawa, Takayuki <
>> tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com> wrote:
>>
>>> Hello,
>>>
>>> postgres.exe on Windows doesn't output a crash dump when it crashes
>>> before main() is called. The attached patch fixes this. I'd like this to
>>> be back-patched. I'll add this to the next CF.
>>>
>>> The original problem happened on our customer's production system.
>>> Their application sometimes failed to connect to the database. That was
>>> because postgres.exe crashed due to an access violation (exception code
>>> C0000005). But there was no crash dump, so we had difficulty in finding
>>> the cause. The frequency was low -- about ten times during half a year.
>>>
>>> What caused the access violation was Symantec's antivirus software. It
>>> seems that sysfer.dll of the software intercepts registry access, during C
>>> runtime library initialization, before main() is called. So, the direct
>>> cause of this problem is not PostgreSQL.
>>>
>>> On the other hand, it's PostgreSQL's problem that we can't get the crash
>>> dump, which makes the investigation difficult. The cause is that
>>> postmaster calls SetErrorMode() to disable the outputting of crash dumps by
>>> WER (Windows Error Reporting). This error mode is inherited from
>>> postmaster to its children. If a crash happens before the child sets up
>>> the exception handler, no crash dump is produced.
>>>
>>
>> The original call to SetErrorMode() was put in there to make sure we
>> didn't show a popup message which would then make everything freeze (see
>> very old commit 27bff7502f04ee01237ed3f5a997748ae43d3a81). Doesn't this
>> turn that back on, so that if you are not actually there to monitor
>> something you can end up with stuck processes and exactly the issues we had
>> before that one?
>>
>
> Ha, I just went digging for the same.
>
> We should not disable WER when running as a service (no UI access); in
> that case it will not display an interactive dialog.
>
> I'm not convinced we should disable it at all personally. Things have come
> a long way from drwatson.exe. Disabling WER makes it hard to debug
> postgres by installing Visual Studio Debugger as the handler (I always
> wondered why that didn't work!) and is generally just painful. It prevents
> us from collecting data via Microsoft about crashes, should we wish to do
> so. And who runs Pg on windows except as a service?!
>
>
I've seen a number of use cases where apps start it alongside the app
instead of as a service. I'm not sure how recent those apps are though, and
I'm not sure it's better than using a service in the first place (but it
does let you install things without being an admin).

We really shouldn't *break* that scenario for people. But making it work
well for the service use case should definitely be the priority.
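
To make the trade-off concrete, here is a rough sketch of how the error
mode could depend on whether we run as a service. This is not the attached
patch and it is untested on Windows. The helper names choose_error_mode()
and apply_error_mode() are made up for illustration, and it assumes that
SEM_NOGPFAULTERRORBOX is the flag that keeps WER from handling the crash,
while SEM_FAILCRITICALERRORS only suppresses the critical-error popups:

```c
#include <stdbool.h>

#ifdef _WIN32
#include <windows.h>
#else
/* Windows SDK values, repeated only so the sketch compiles elsewhere. */
#define SEM_FAILCRITICALERRORS 0x0001u
#define SEM_NOGPFAULTERRORBOX  0x0002u
#endif

/*
 * Hypothetical helper: decide which error-mode flags to set.
 *
 * Always suppress the critical-error message boxes. Only suppress the
 * GP fault handling (assumed here to be what disables WER) when we are
 * not a service, because an interactive session could otherwise hang
 * on a dialog that nobody is there to dismiss.
 */
unsigned int
choose_error_mode(bool running_as_service)
{
    unsigned int mode = SEM_FAILCRITICALERRORS;

    if (!running_as_service)
        mode |= SEM_NOGPFAULTERRORBOX;

    return mode;
}

/*
 * Hypothetical helper: apply the chosen mode. The mode is inherited by
 * child processes, so it also covers crashes before a child's main().
 */
void
apply_error_mode(bool running_as_service)
{
#ifdef _WIN32
    SetErrorMode(choose_error_mode(running_as_service));
#else
    (void) running_as_service;  /* nothing to do off Windows */
#endif
}
```

With something like this, a service would leave WER enabled and could get a
dump for a crash before main(), while an app-launched postmaster would keep
today's behavior.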

--
Magnus Hagander
Me: https://www.hagander.net/ <http://www.hagander.net/>
Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/>
