From: | Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: checkpointer continuous flushing |
Date: | 2015-06-02 16:59:05 |
Message-ID: | alpine.DEB.2.10.1506021848420.19484@sto |
Lists: | pgsql-hackers |
>>> IMO this feature, if done correctly, should result in better performance
>>> in 95+% of the workloads
>>
>> To demonstrate that would require time...
>
> Well, that's part of the contribution process. Obviously you can't test
> 100% of the problems, but you can work hard with coming up with very
> adversarial scenarios and evaluate performance for those.
I did spend time (well, a machine spent time, really) collecting some
convincing data for the simple version without sorting, to demonstrate that
it brings clear value, which seems not to be enough...
> I don't think we want yet another tuning knob that's hard to tune
> because it's critical for one factor (latency) but bad for another
> (throughput); especially when completely unnecessarily.
Hmmm.
My opinion is that throughput is given too much attention in general, but
if both can be kept/improved, this would be easier to sell, obviously.
>>> It's also not just the sequential writes making this important, it's also
>>> that it allows to do the final fsync() of the individual segments as soon
>>> as their last buffer has been written out.
>>
>> Hmmm... I'm not sure this would have a large impact. The writes are
>> throttled as much as possible, so fsync will catch plenty other writes
>> anyway, if there are some.
>
> That might be the case in a database with a single small table;
> i.e. where all the writes go to a single file. But as soon as you have
> large tables (i.e. many segments) or multiple tables, a significant part
> of the writes issued independently from checkpointing will be outside
> the processing of the individual segment.
Statistically, I think that it would reduce the number of unrelated writes
caught by an fsync by about half: the last table to be written on a
tablespace, at the end of the checkpoint, will have accumulated
checkpoint-unrelated writes (bgwriter, whatever) over the whole checkpoint
time, while the first table will have avoided most of them.
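To make the "about half" estimate concrete, here is a back-of-the-envelope
sketch in Python. It is only a toy model, not a measurement: it assumes the
checkpoint writes n segments one after the other at an even pace, and that
checkpoint-unrelated writes arrive at a constant rate spread uniformly over
all segments. The function name and the model itself are my assumptions for
illustration.

```python
from fractions import Fraction


def early_fsync_ratio(n_segments):
    """Unrelated writes caught by per-segment early fsyncs, relative to
    fsyncing every segment at the end of the checkpoint (toy model)."""
    n = n_segments
    # Time is normalized so the checkpoint lasts 1 unit, and the total
    # volume of unrelated writes over that time is 1 unit, 1/n per segment.
    # Segment i (1-based) is fsynced at time i/n, so by then it has
    # accumulated (i/n) * (1/n) of the unrelated writes.
    early = sum(Fraction(i, n) * Fraction(1, n) for i in range(1, n + 1))
    # With all fsyncs at the end, every segment has accumulated unrelated
    # writes for the whole checkpoint: n * (1/n) = 1.
    late = Fraction(1)
    return early / late


if __name__ == "__main__":
    for n in (1, 2, 10, 100):
        print(n, early_fsync_ratio(n))
```

Under these assumptions the ratio is (n+1)/(2n): 11/20 for 10 segments,
101/200 for 100 segments, so it tends to one half as the number of segments
grows. A skewed write distribution would change the figure.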
--
Fabien.