Re: increasing the default WAL segment size

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: David Steele <david(at)pgmasters(dot)net>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: increasing the default WAL segment size
Date: 2017-04-06 22:52:08
Message-ID: 82c8bb32-d897-01e6-5ac8-adc01f5a9393@2ndquadrant.com
Lists: pgsql-hackers

On 04/06/2017 11:45 PM, David Steele wrote:
> On 4/6/17 5:05 PM, Tomas Vondra wrote:
>> On 04/06/2017 08:33 PM, David Steele wrote:
>>> On 4/5/17 7:29 AM, Simon Riggs wrote:
>>>
>>>> 2. It's not clear to me the advantage of being able to pick varying
>>>> filesizes. I see great disadvantage in having too many options, which
>>>> greatly increases the chance of incompatibility, annoyance and
>>>> breakage. I favour a small number of values that have been shown by
>>>> testing to be sweet spots in performance and usability. (1GB has been
>>>> suggested)
>>>
>>> I'm in favor of 16,64,256,1024.
>>>
>>
>> I don't see a particular reason for this, TBH. The sweet spots will
>> likely depend on hardware / OS configuration etc. Assuming there
>> actually are sweet spots - no one has demonstrated that yet.
>
> Fair enough, but my feeling is that this patch has never been about
> server performance, per se. Rather, it is about archive management and
> trying to stem the tide of WAL as servers get bigger and busier.
> Generally, archive commands have to make a remote connection to offload
> WAL and that has a cost per segment.
>

Perhaps, although Robert also mentioned that the fsync at the end of
each WAL segment is noticeable. But the thread is a bit difficult to
follow, as different people have different ideas about the motivation
for the patch, etc.
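
To put rough numbers on the per-segment cost argument: the archive
command runs once per completed segment, so the number of invocations
(and of end-of-segment fsyncs) falls in proportion to the segment size.
A minimal sketch, assuming a hypothetical WAL volume of 1TB per day
(an illustrative figure, not a measurement from this thread):

```python
MB = 1024 * 1024

def segments_per_day(wal_bytes_per_day: int, segment_size_bytes: int) -> int:
    """Number of WAL segments needed to hold a day's WAL (ceiling division)."""
    return -(-wal_bytes_per_day // segment_size_bytes)

# Hypothetical workload: 1TB of WAL per day (assumption for illustration).
daily_wal = 1024 * 1024 * MB

# Segment sizes proposed upthread (16, 64, 256, 1024 MB).
for size_mb in (16, 64, 256, 1024):
    count = segments_per_day(daily_wal, size_mb * MB)
    print(f"{size_mb:>5} MB segments: {count:>6} archive_command calls/day")
```

Whether the saving matters depends on the fixed cost of each archive
call (connection setup, etc.), which this sketch does not model.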

>> Also, I don't see how supporting additional WAL sizes increases chance
>> of incompatibility. We already allow that, so either the tools (e.g.
>> backup solutions) assume WAL segments are always 16MB (in which case they
>> are essentially broken) or support valid file sizes (in which case they
>> should have no issues with the new ones).
>
> I don't see how a compile-time option counts as "supporting that" in
> practice. How many people in the field are running custom builds of
> Postgres? And of those, how many have changed the WAL segment size?
> I've never encountered a non-standard segment size or talked to anyone
> who has. I'm not saying it has *never* happened but I would venture to
> say it's rare.
>

I agree it's rare, but I don't think that means we can just consider the
option as 'unsupported'. We're even mentioning it in the docs as a valid
way to customize granularity of the WAL archival.

I certainly know people who run custom builds, and some of them run with
custom WAL segment size. Some of them are our customers, some are not.
And yes, some of them actually patched the code to allow 256MB WAL segments.

>> If we're going to do this, I'm in favor of deciding some reasonable
>> upper limit (say, 1GB or 2GB sounds good), and allowing all 2^n values
>> up to that limit.
>
> I'm OK with that. I'm also OK with providing a few reasonable choices.
> I guess that means I'll just go with the majority opinion.
>
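
For reference, the "all 2^n values up to a limit" rule is cheap to
check. A minimal sketch, assuming a 1MB lower bound and the 1GB upper
limit floated above; both bounds and the function name are
illustrative, not what the patch implements:

```python
MB = 1024 * 1024
MIN_SEGMENT_SIZE = 1 * MB      # assumed lower bound, for illustration
MAX_SEGMENT_SIZE = 1024 * MB   # 1GB, one of the upper limits suggested above

def is_valid_segment_size(size_bytes: int) -> bool:
    """True if size_bytes is a power of two within the assumed bounds."""
    is_power_of_two = size_bytes > 0 and (size_bytes & (size_bytes - 1)) == 0
    return is_power_of_two and MIN_SEGMENT_SIZE <= size_bytes <= MAX_SEGMENT_SIZE

# Enumerate every size (in MB) the rule would accept.
valid_sizes_mb = [
    size // MB
    for size in (MB << n for n in range(12))
    if is_valid_segment_size(size)
]
print(valid_sizes_mb)
```

With these bounds the rule admits eleven sizes, compared to the four in
the 16,64,256,1024 proposal.
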
>>>> 3. New file allocation has been a problem raised with this patch for
>>>> some months now.
>>>
>>> I've been playing around with this and I don't think short tests show
>>> larger sizes off to advantage. Larger segments will definitely perform
>>> more poorly until Postgres starts recycling WAL. Once that happens I
>>> think performance differences should be negligible, though of course
>>> this needs to be verified with longer-running tests.
>>>
>> I'm willing to do some extensive performance testing on the patch. I
>> don't see how that could happen in the next few days (before the feature
>> freeze), particularly considering we're interested in long tests.
>
> Cool. I've been thinking about how to do some meaningful tests for this
> (mostly pgbench related). I'd like to hear what you are thinking.
>

My plan was to do some pgbench tests with different workloads, scales
(in shared buffers, in RAM, exceeds RAM), and different storage
configurations (SSD vs. HDD, WAL/datadir on the same/different
device/fs, possibly also ext4/xfs).
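
To give an idea of the size of that matrix, here is a rough sketch that
enumerates the combinations. The dimension labels and the choice of
workloads and segment sizes are placeholders for illustration, not a
finalized plan:

```python
from itertools import product

# All labels below are illustrative placeholders.
workloads = ["tpcb-like", "simple-update"]          # pgbench built-in scripts
scales = ["fits-shared-buffers", "fits-ram", "exceeds-ram"]
storage = ["ssd", "hdd"]
wal_placement = ["same-device", "separate-device"]
filesystems = ["ext4", "xfs"]
segment_sizes_mb = [16, 64, 256, 1024]

# One dict per test run, covering every combination of the dimensions.
test_matrix = [
    {
        "workload": w,
        "scale": s,
        "storage": d,
        "wal_placement": p,
        "filesystem": fs,
        "segment_size_mb": seg,
    }
    for w, s, d, p, fs, seg in product(
        workloads, scales, storage, wal_placement, filesystems, segment_sizes_mb
    )
]

print(len(test_matrix), "runs")
```

Given that each run has to be long enough for WAL recycling to kick in,
the full cross product is probably too large and would need pruning.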

>> The question however is whether we need to do this testing when we don't
>> actually change the default (at least the patch submitted on 3/27 does
>> seem to keep the 16MB). I assume people specifying a custom value when
>> calling initdb are expected to know what they are doing (and I don't see
>> how we can prevent distros from choosing a bad value in their packages -
>> they could already do that with the configure-time option).
>
> Just because we don't change the default doesn't mean that others won't.
> I still think testing for sizes other than 16MB is severely lacking and
> I don't believe caveat emptor is the way to go.
>

Aren't you mixing regression and performance testing? I agree we need to
be sure all segment sizes are handled correctly, no argument here.

>> Do we actually have any infrastructure for that? Or do you plan to add
>> some new animals with different WAL segment sizes?
>
> I don't have plans to add animals. I think we'd need a way to tell
> 'make check' to use a different segment size for tests and then
> hopefully reconfigure some of the existing animals.
>

OK. My point was that we don't have that capability now, and the latest
patch is not adding it either.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
