From: | Stephen Frost <sfrost(at)snowman(dot)net> |
---|---|
To: | David Steele <david(at)pgmasters(dot)net> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Beena Emerson <memissemerson(at)gmail(dot)com>, tushar <tushar(dot)ahuja(at)enterprisedb(dot)com>, Prabhat Sahu <prabhat(dot)sahu(at)enterprisedb(dot)com>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: increasing the default WAL segment size |
Date: | 2017-03-23 00:08:57 |
Message-ID: | 20170323000857.GE9812@tamriel.snowman.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
David,
* David Steele (david(at)pgmasters(dot)net) wrote:
> On 3/22/17 3:45 PM, Robert Haas wrote:
> >On Wed, Mar 22, 2017 at 3:24 PM, David Steele <david(at)pgmasters(dot)net> wrote:
> >>>One of the reasons to go with the LSN is that we would actually be
> >>>maintaining what happens when the WAL files are 16MB in size.
> >>>
> >>>David's initial expectation was this for 64MB WAL files:
> >>>
> >>>000000010000000000000040
> >>>000000010000000000000080
> >>>0000000100000000000000CO
> >>>000000010000000100000000
> >>
> >>
> >>This is the 1GB sequence, actually, but idea would be the same for 64MB
> >>files.
> >
> >Wait, really? I thought you abandoned this approach because there's
> >then no principled way to handle WAL segments of less than the default
> >size.
>
> I did say that, but I thought I had hit on a compromise.
Strikes me as one, at least.
> But, as I originally pointed out the hex characters in the filename
> are not aligned correctly for > 8 bits (< 16MB segments) and using
> different alignments just made it less consistent.
>
> It would be OK if we were willing to drop the 1,2,4,8 segment sizes
> because then the alignment would make sense and not change the
> current 16MB sequence.
For my 2c, at least, it seems extremely unlikely that people are using
smaller-than-16MB segments. Also, we don't have to actually drop
support for those sizes, just handle the numbering differently, if we
feel like they're useful enough to keep- in particular I was thinking we
could make the filename one digit longer, or shift the numbers up one
position, but my general feeling is that it wouldn't ever be an
exercised use-case and therefore we should just drop support for them.
Perhaps I'm being overly paranoid, but I share David's concern about
non-standard/non-default WAL sizes being a serious risk due to lack of
exposure for those code paths, which is another reason that we should
change the default if we feel it's valuable to have a large WAL segment,
not just create this option which users can set at initdb time but which
we very rarely actually test to ensure it's working.
With any of these we need to have some buildfarm systems which are at
*least* running our regression tests against the different options, if
we would consider telling users to use them.
> Even then, there are some interesting side effects. For 1GB
> segments the "0000000100000001000000C0" segment would include LSNs
> 1/C0000000 through 1/FFFFFFFF. This is correct but is not an
> obvious filename to LSN mapping, at least for LSNs that appear later
> in the segment.
That doesn't seem unreasonable to me. If we're going to use the
starting LSN of the segment then it's going to skip when you start
varying the size of the segment. Even keeping the current scheme we end
up with skipping happening, it just different skipping and goes
"1, 2, 3, skip to the next!" where how high the count goes depends on
the size. With this approach, we're consistently skipping the same
amount which is exactly the size divided by 16MB, always.
I do also like Peter's suggestion also of using separator between the
components of the WAL filename, but that would change the naming for
everyone, which is a concern I can understand us wishing to avoid.
From a user-experience point of view, keeping the mapping from the WAL
filename to the starting LSN is quite nice, even if this change might
complicate the backend code a bit.
Thanks!
Stephen
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2017-03-23 00:14:41 | Re: Hash support for grouping sets |
Previous Message | Peter Geoghegan | 2017-03-22 23:55:28 | Re: WIP: [[Parallel] Shared] Hash |