From: | "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com> |
---|---|
To: | 'Tom Lane' <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [RFC] Should we fix postmaster to avoid slow shutdown? |
Date: | 2016-11-24 05:41:08 |
Message-ID: | 0A3221C70F24FB45833433255569204D1F65A8D6@G01JPEXMBYT05 |
Lists: | pgsql-hackers |
> From: pgsql-hackers-owner(at)postgresql(dot)org [mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Tom Lane
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> > I agree. However, in many cases, the major cost of a fast shutdown is
> > getting the dirty data already in the operating system buffers down to
> > disk, not in writing out shared_buffers itself. The latter is
> > probably a single-digit number of gigabytes, or maybe double-digit.
> > The former might be a lot more, and the write of the pgstat file may
> > back up behind it. I've seen cases where an 8kB buffered write from
> > Postgres takes tens of seconds to complete because the OS buffer cache
> > is already saturated with dirty data, and the stats files could easily
> > be a lot more than that.
>
> I think this is mostly FUD, because we don't fsync the stats files. Maybe
> we should, but we don't today. So even if we have managed to get the system
> into a state where physical writes are heavily backlogged, that's not a
> reason to assume that the stats collector will be unable to do its thing
> promptly. All it has to do is push a relatively small amount of data into
> kernel buffers.
Sorry for the late reply; yesterday was a national holiday in Japan.
It's not FUD. As I understand it, you hit the slow stats file write problem yourself during a regression test run. You said that writing the stats file took 57 seconds during postmaster shutdown, which caused pg_ctl stop to fail because of its 60-second timeout. So the problem appears even in a regression test environment.
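To make the distinction under discussion concrete (a write that only has to reach kernel buffers versus one that also waits for fsync), here is a minimal timing sketch. It is not PostgreSQL code: the function name `timed_write` and the file name `stats.tmp` are hypothetical, and the 8kB payload is only borrowed from the figure quoted above.

```python
import os
import tempfile
import time


def timed_write(path, data, sync=False):
    """Write data to path and return the elapsed wall-clock seconds.

    With sync=False the call returns once the data has been handed to the
    kernel; with sync=True it additionally waits for os.fsync() to finish.
    (Hypothetical helper for illustration, not part of PostgreSQL.)
    """
    start = time.monotonic()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        view = memoryview(data)
        # os.write may write fewer bytes than requested, so loop until done.
        while len(view) > 0:
            written = os.write(fd, view)
            view = view[written:]
        if sync:
            os.fsync(fd)
    finally:
        os.close(fd)
    return time.monotonic() - start


if __name__ == "__main__":
    payload = b"x" * (8 * 1024)  # 8kB, the write size mentioned in the thread
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "stats.tmp")  # hypothetical file name
        print("buffered only:", timed_write(target, payload))
        print("with fsync:   ", timed_write(target, payload, sync=True))
```

The interesting comparison is to run this while the OS buffer cache is already saturated with dirty data, and to see whether the buffered-only write still returns promptly on the system in question.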
Regards
Takayuki Tsunakawa