Re: gitlab post-mortem: pg_basebackup waiting for checkpoint

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Michael Banck <michael(dot)banck(at)credativ(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: gitlab post-mortem: pg_basebackup waiting for checkpoint
Date: 2017-02-18 00:43:27
Message-ID: CAKFQuwZQU6z6ei99oAjMQJJx5t1qphJkF-PjiEMp=hAUJV053Q@mail.gmail.com
Lists: pgsql-hackers

On Fri, Feb 17, 2017 at 4:22 PM, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
wrote:

> What about adding a paragraph into pg_basebackup docs, explaining that
> with 'fast' it does immediate checkpoint, while with 'spread' it'll wait
> for a spread checkpoint.
>

I agree that a better, and self-contained, explanation of the behaviors
that fast and spread invoke on the server should be included directly in
the pg_basebackup docs.

Additionally, a primary benefit of pg_basebackup is hiding the low-level
details from the user and in that spirit the cross-reference link to
Section 25.3.3 "Making a Base Backup Using the Low Level API" should be
removed. If there is specific information there that a user of
pg_basebackup needs it should be presented properly in the application
documentation.

The top of pg_basebackup points to the entire 25.3 chapter but the flow
from there is solid - coverage of pg_basebackup occurs and points out the
low level API for those whose needs are not fully served by the bundled
application. If one uses pg_basebackup they should be able to stop at that
point, go back to the app page, continue reading, and skip all of 25.3.3.

The term "spread checkpoint" isn't actually a defined term in our
docs...and aside from the word spread itself describing how a checkpoint
works, it isn't used outside of pg_basebackup docs. So "it will wait for a
spread checkpoint" doesn't really work - "it will start and then wait for a
normal checkpoint to complete" does.
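
To make the two modes concrete, here is a minimal sketch of how a backup
script might choose between them. The helper name basebackup_command and
the target directory are hypothetical, made up for illustration; the only
pieces taken from pg_basebackup itself are the -D option and
--checkpoint=fast|spread, with spread being the default.

```python
import shlex


def basebackup_command(target_dir, checkpoint="spread"):
    """Build a pg_basebackup argv for the requested checkpoint mode.

    'fast'   asks the server for an immediate checkpoint.
    'spread' starts a normal checkpoint and waits for it to complete.
    """
    if checkpoint not in ("fast", "spread"):
        raise ValueError("checkpoint must be 'fast' or 'spread'")
    return ["pg_basebackup", "-D", target_dir, f"--checkpoint={checkpoint}"]


if __name__ == "__main__":
    # Hypothetical target directory, used only to print example commands.
    for mode in ("fast", "spread"):
        print(shlex.join(basebackup_command("/tmp/base_backup", mode)))
```

The sketch only prints the command lines; it does not run a backup.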

More holistically (i.e., feel free to skip)

This paragraph from 25.3.3:

"""
This is because it performs a checkpoint, and the I/O required for the
checkpoint will be spread out over a significant period of time, by default
half your inter-checkpoint interval (see the configuration parameter
checkpoint_completion_target). This is usually what you want, because it
minimizes the impact on query processing. If you want to start the backup
as soon as possible, change the second parameter to true.
"""

is good but buried and seems like it would be more visible in Chapter 30,
"Reliability and the Write-Ahead Log". Both the internals and basebackup
pages could point the reader there. There isn't a chapter dedicated
to checkpoints - nor does there need to be - but a section in 30 seems
warranted as being the official reference. Right now you have to skim the
configuration variables and "WAL Configuration" and "CHECKPOINT" and "base
backup API and pg_basebackup" to cover everything. A checkpoint section
with that paragraph as a focus would allow the other items to simply say
"immediate or normal checkpoint" as needed and redirect the reader for
additional context as to the trade-offs of each - whether done manually or
during some form of backup script.
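
As a rough illustration of the trade-off in the quoted paragraph, the
sketch below estimates how long a normal checkpoint's I/O is spread out.
It assumes the checkpoint is triggered by the timeout alone (not by WAL
volume) and uses a 5 minute checkpoint_timeout and a
checkpoint_completion_target of 0.5 as assumed settings; the function name
is hypothetical.

```python
def spread_checkpoint_seconds(checkpoint_timeout_s=300.0,
                              completion_target=0.5):
    """Estimate how long a normal checkpoint's writes are spread out.

    Assumes checkpoints are triggered purely by checkpoint_timeout; a
    busy server that checkpoints on WAL volume will see shorter intervals.
    """
    if not 0.0 <= completion_target <= 1.0:
        raise ValueError("completion_target must be between 0 and 1")
    return checkpoint_timeout_s * completion_target


if __name__ == "__main__":
    # With the assumed settings, a backup using a normal checkpoint may
    # wait on the order of this many seconds before copying begins.
    print(spread_checkpoint_seconds())
```

An immediate checkpoint skips that wait at the cost of a burst of I/O.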

David J.
