Re: checkpointer continuous flushing

From:	Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To:	Andres Freund <andres(at)anarazel(dot)de>
Cc:	PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: checkpointer continuous flushing
Date:	2015-11-12 18:09:57
Message-ID:	alpine.DEB.2.10.1511121851510.20444@sto
Lists:	pgsql-hackers

Hello,

>> Basically yes, I'm suggesting a mutex in the VFD struct.
>
> I can't see that being ok. I mean what would that thing even do? VFD
> isn't shared between processes, and if we get a smgr flush we have to
> apply it, or risk breaking other things.

Probably something is eluding my comprehension:-)

My basic assumption is that the open file & its fd are per process, so we
only have to deal with the ones in the checkpointer process. Is it not
enough that the checkpointer does not close the file while it is flushing
things to it?

>>> * my laptop, 16 GB Ram, 840 EVO 1TB as storage. With 2GB
>>> shared_buffers. Tried checkpoint timeouts from 60 to 300s.
>>
>> Hmmm. This is quite short.
>
> Indeed. I'd never do that in a production scenario myself. But
> nonetheless it showcases a problem.

I would say that it would render sorting ineffective because all the
rewriting is done by bgwriter or workers. That does not totally explain
why the throughput would be worse than before, though: I would expect it
to be as bad as before...

>>> Well, you can't easily sort bgwriter/backend writes stemming from cache
>>> replacement. Unless your access patterns are entirely sequential the
>>> data in shared buffers will be laid out in a nearly entirely random
>>> order. We could try sorting the data, but with any reasonable window,
>>> for many workloads the likelihood of actually achieving much with that
>>> seems low.
>>
>> Maybe the sorting could be shared with others so that everybody uses the
>> same order?
>>
>> That would suggest to have one global sorting of buffers, maybe maintained
>> by the checkpointer, which could be used by all processes that need to scan
>> the buffers (in file order), instead of scanning them in memory order.
>
> Uh. Cache replacement is based on an approximated LRU, you can't just
> remove that without serious regressions.

I understand that, but there is a balance to find. Generating random I/Os
is very bad for performance, so the decision process must combine LRU/LFU
heuristics with considering things in some order as well.
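
As a rough illustration of what such a combined decision could look like
(a toy sketch, not PostgreSQL code: the Buffer tuple, its fields and
pick_write_batch are hypothetical names made up for this example), one
could first select candidates with the LRU heuristic and only then order
the resulting writes by file position:

```python
from typing import NamedTuple


class Buffer(NamedTuple):
    file_id: int    # hypothetical: which file the buffer belongs to
    block: int      # hypothetical: block number inside that file
    last_used: int  # hypothetical clock tick; smaller = less recently used


def pick_write_batch(buffers, batch_size):
    """Choose victims by LRU, then return them in file order."""
    # Step 1: replacement heuristic, keep the least recently used buffers.
    victims = sorted(buffers, key=lambda b: b.last_used)[:batch_size]
    # Step 2: issue the writes in (file, block) order instead of LRU order.
    return sorted(victims, key=lambda b: (b.file_id, b.block))
```

The replacement choice is still driven by recency; only the order in which
the chosen buffers are written changes.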

>>>> Hmmm. The shorter the timeout, the more likely the sorting NOT to be
>>>> effective
>>>
>>> You mean, as evidenced by the results, or is that what you'd actually
>>> expect?
>>
>> What I would expect...
>
> I don't see why then? If you very quickly write lots of data the OS
> will continuously flush dirty data to the disk, in which case sorting is
> rather important?

What I have in mind is: the shorter the timeout, the fewer neighboring
buffers will be touched, so the fewer nice sequential writes will be found
by sorting them, so the smaller the positive impact on performance...
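
To make this expectation concrete, here is a toy model (an assumption-laden
sketch, not a measurement: it supposes dirty blocks are spread uniformly at
random over a file, and the function name and sizes are invented for the
example). It counts which fraction of the sorted dirty blocks directly
follow their predecessor, i.e. could be written sequentially:

```python
import random


def sequential_fraction(total_blocks, dirty_count, seed=0):
    """Fraction of sorted dirty blocks that are adjacent to the previous one."""
    rng = random.Random(seed)
    # Assumption: dirty blocks are uniformly distributed over the file.
    dirty = sorted(rng.sample(range(total_blocks), dirty_count))
    if len(dirty) < 2:
        return 0.0
    adjacent = sum(1 for a, b in zip(dirty, dirty[1:]) if b == a + 1)
    return adjacent / (len(dirty) - 1)


# Few dirty blocks per checkpoint (short timeout) vs. many (long timeout).
short_timeout = sequential_fraction(100_000, 1_000)
long_timeout = sequential_fraction(100_000, 50_000)
print(short_timeout, long_timeout)
```

Under this model the adjacent fraction grows roughly with the share of the
file that is dirty, so a checkpoint that collects few dirty buffers has
little for sorting to exploit.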

--
Fabien.
