From: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)postgreSQL(dot)org |
Subject: | Re: Tracing down buildfarm "postmaster does not shut down" failures |
Date: | 2016-02-08 20:41:23 |
Message-ID: | 20160208204123.GA325488@alvherre.pgsql |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Tom Lane wrote:
> Of late, by far the majority of the random-noise failures we see in the
> buildfarm have come from failure to shut down the postmaster in a
> reasonable timeframe.
I noticed that.
> An example is this current failure on hornet:
>
> http://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=hornet&dt=2016-02-08%2013%3A41%3A55
FWIW you seem to have edited this URL -- it returns a blank page.
The right one is
http://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=hornet&dt=2016-02-08%2013%3A41%3A55&stg=stopdb-C-1
> What I'd like to do to investigate this is put in a temporary HEAD-only
> patch that makes ShutdownXLOG() and its subroutines much chattier about
> how far they've gotten and what time it is, and also makes pg_ctl print
> out the current time if it gives up waiting. A few failed runs with
> that in place will at least allow us to confirm or deny whether it's
> just that the shutdown checkpoint is sometimes really slow, or whether
> there's a bug lurking.
>
> Any objections? Anybody have another idea for data to collect?
Seems like a reasonable place to start. I assume you'll be installing
some debug-elog calls, enabled by default, and then once the problem is
fixed, we'll change the default to disabled but keep the actual calls?
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Stephen Frost | 2016-02-08 20:42:06 | Re: [ADMIN] 9.5 new setting "cluster name" and logging |
Previous Message | Thomas Munro | 2016-02-08 20:36:17 | Re: [ADMIN] 9.5 new setting "cluster name" and logging |