Re: [HACKERS] Re: [HACKERS] 答复:[HACKERS] 答复:[HACKERS] about fsync in CLOG buffer write

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, 周正中(德歌) <dege(dot)zzz(at)alibaba-inc(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, 张广舟(明虚) <guangzhou(dot)zgz(at)alibaba-inc(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, 范孝剑(康贤) <funnyxj(dot)fxj(at)alibaba-inc(dot)com>, 曾文旌(义从) <wenjing(dot)zwj(at)alibaba-inc(dot)com>, 窦贤明(执白) <xianming(dot)dxm(at)alibaba-inc(dot)com>, 萧少聪(铁庵) <shaocong(dot)xsc(at)alibaba-inc(dot)com>, 陈新坚(惧留孙) <xinjian(dot)chen(at)alibaba-inc(dot)com>
Subject: Re: [HACKERS] Re: [HACKERS] 答复:[HACKERS] 答复:[HACKERS] about fsync in CLOG buffer write
Date: 2015-09-09 14:46:47
Message-ID: CA+TgmobEGkAuaUENr6TUNO+gOouvnOn8cKqe9J_hiV10YinGag@mail.gmail.com
Lists: pgsql-hackers

On Wed, Sep 9, 2015 at 10:35 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> ... How often such a workload actually has to replace a *dirty* clog
>> buffer obviously depends on how often you checkpoint, but if you're
>> getting ~28k TPS you can completely fill 32 clog buffers (1 million
>> transactions) in less than 40 seconds, and you're probably not
>> checkpointing nearly that often.
>
> But by the same token, at that kind of transaction rate, no clog page is
> actively getting dirtied for more than a couple of seconds. So while it
> might get swapped in and out of the SLRU arena pretty often after that,
> this scenario seems unconvincing as a source of repeated fsyncs.

Well, if you're filling ~1 clog page per second, you're doing ~1 fsync
per second too. Or if you are not, then you are thrashing the
progressively smaller and smaller number of clean slots ever-harder
until no clean pages remain and you're forced to fsync something -
probably, a bunch of things all at once.
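
For anyone who wants to check the arithmetic behind "1 million
transactions", "less than 40 seconds" and "~1 clog page per second", here
is a rough sketch. It assumes the default 8kB block size, 2 status bits
per transaction in clog, 32 clog buffers, and that every transaction
consumes one XID; the 28k TPS figure is the one quoted above. The names
are only illustrative.

```python
# Back-of-the-envelope clog fill rate. Assumptions (not measured here):
# default 8kB pages, 2 status bits per transaction, 32 clog buffers,
# one XID consumed per transaction, steady 28k TPS.
BLCKSZ = 8192
CLOG_BITS_PER_XACT = 2
CLOG_XACTS_PER_BYTE = 8 // CLOG_BITS_PER_XACT
CLOG_XACTS_PER_PAGE = BLCKSZ * CLOG_XACTS_PER_BYTE
NUM_CLOG_BUFFERS = 32
TPS = 28_000


def seconds_to_fill(pages, tps=TPS):
    """Seconds needed to consume every transaction slot in `pages` clog pages."""
    return pages * CLOG_XACTS_PER_PAGE / tps


xacts_in_buffers = NUM_CLOG_BUFFERS * CLOG_XACTS_PER_PAGE

if __name__ == "__main__":
    print(f"xacts per clog page:     {CLOG_XACTS_PER_PAGE}")
    print(f"xacts in all buffers:    {xacts_in_buffers}")
    print(f"seconds to fill 1 page:  {seconds_to_fill(1):.2f}")
    print(f"seconds to fill buffers: {seconds_to_fill(NUM_CLOG_BUFFERS):.2f}")
```

Under those assumptions one page holds 32768 transactions, so a page
fills in a bit over a second and all 32 buffers in roughly 37 seconds.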

> Like Andres, I'd want to see a more realistic problem case before
> expending a lot of work here.

I think the question here isn't whether the problem case is realistic
- I am quite sure that a pgbench workload is - but rather how much of
a problem it actually causes. I'm very sure that individual pgbench
backends experience multi-second stalls as a result of this. What
I'm not sure about is how frequently it happens, and how much of an
effect it has on overall latency. I think it would be worth someone's
time to try to write some good instrumentation code here and figure
that out.
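
To illustrate the kind of numbers I have in mind (how often a stall
happens, and how long the worst ones are), here is a toy sketch. It is
not PostgreSQL code: it just times fsync calls on a scratch file and
summarizes them. Real instrumentation would have to live in the server's
SLRU write path and be written in C. The function names and the 1-second
stall threshold are made up for the example.

```python
import os
import tempfile
import time


def timed_fsync(fd, samples):
    """fsync `fd` and append the elapsed wall-clock seconds to `samples`."""
    start = time.perf_counter()
    os.fsync(fd)
    samples.append(time.perf_counter() - start)


def summarize(samples, stall_threshold=1.0):
    """Report count, worst case, and how many calls exceeded the threshold."""
    ordered = sorted(samples)
    if not ordered:
        return {"count": 0, "max": 0.0, "p99": 0.0,
                "stalls": 0, "stall_fraction": 0.0}
    stalls = sum(1 for s in ordered if s >= stall_threshold)
    p99_index = min(len(ordered) - 1, int(0.99 * len(ordered)))
    return {
        "count": len(ordered),
        "max": ordered[-1],
        "p99": ordered[p99_index],
        "stalls": stalls,
        "stall_fraction": stalls / len(ordered),
    }


def run(pages=16, page_size=8192):
    """Write `pages` page-sized chunks to a scratch file, fsyncing after each."""
    samples = []
    with tempfile.TemporaryFile() as f:
        for _ in range(pages):
            f.write(b"\0" * page_size)
            f.flush()  # push Python's buffer to the OS before fsync
            timed_fsync(f.fileno(), samples)
    return summarize(samples)


if __name__ == "__main__":
    print(run())
```

The point of keeping both the stall count and the maximum is that they
answer the two separate questions above: frequency, and effect on
latency.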

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
