From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: "Bossart, Nathan" <bossartn(at)amazon(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: parallelizing the archiver
Date: 2021-09-10 17:09:51
Message-ID: CA+TgmoZXBnH65OWB1L4cUCjZp2MX-t7s4e27BHhy9-6S8pCHYA@mail.gmail.com
Lists: pgsql-hackers

On Fri, Sep 10, 2021 at 11:49 AM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
> I totally agree that batching as many files as possible in a single
> command is probably what's going to achieve the best performance. But
> if the archiver only gets an answer from the archive_command once it
> has tried to process all of the files, it also means that postgres
> won't be able to remove any WAL file until all of them have been
> processed. It means that users will likely have to limit the batch
> size and therefore pay more startup overhead than they would like. In
> the case of archiving to a server with high latency / connection
> overhead, it may be better to be able to run multiple commands in
> parallel. I may be overthinking here, and feedback from people with
> more experience around that would definitely be welcome.

That's a fair point. I'm not sure how much it matters, though. I think
you want to imagine a system where there are, let's say, 10 WAL files
being archived per second. Using fork() + exec() to spawn a shell
command 10 times per second is a bit expensive, whether you do it
serially or in parallel, and even if the command is something with a
less-insane startup overhead than scp. If we start a shell command,
say, every 3 seconds and give it 30 files each time, we can reduce the
startup costs we're paying by ~97% at the price of having to wait up
to 3 additional seconds to know that archiving succeeded for any
particular file. That sounds like a pretty good trade-off, because the
main benefit of removing old files is that it keeps us from running
out of disk space, and you should not be running a busy system in such
a way that it is ever within 3 seconds of running out of disk space,
so whatever.
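
To spell out that back-of-the-envelope arithmetic, here's a quick
sketch (the numbers are the hypothetical ones from above, and none of
this corresponds to actual archiver code):

    # Hypothetical workload from above: 10 WAL files/second, one
    # archive_command invocation every 3 seconds (30 files per batch).
    wal_files_per_sec = 10
    batch_interval_sec = 3
    files_per_batch = wal_files_per_sec * batch_interval_sec  # 30

    # One command per file vs. one command per batch of 30 files.
    commands_per_file_unbatched = 1.0
    commands_per_file_batched = 1.0 / files_per_batch

    reduction = 1.0 - commands_per_file_batched / commands_per_file_unbatched
    print(f"startup invocations cut by ~{reduction:.0%}")  # ~97%
    print(f"worst-case extra wait to learn a file was archived: "
          f"{batch_interval_sec}s")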

If, on the other hand, you imagine a system that's not very busy, say 1
WAL file being archived every 10 seconds, then using a batch size of
30 would very significantly delay removal of old files. However, on
this system, batching probably isn't really needed. The rate of WAL
file generation is low enough that if you pay the startup cost of your
archive_command for every file, you're probably still doing just fine.

Probably, any kind of parallelism or batching needs to take this kind
of time-based thinking into account. For batching, the rate at which
files are generated should affect the batch size. For parallelism, it
should affect the number of processes used.
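
Just to illustrate what I mean by letting the rate drive the sizing,
here's a rough sketch (hypothetical names and limits, not proposed
settings or actual code):

    # Hypothetical sketch: derive the batch size and the number of
    # parallel archive commands from the observed WAL generation rate,
    # capping how long any file waits just to fill a batch.

    def choose_batch_size(wal_files_per_sec: float,
                          max_wait_sec: float = 3.0,
                          max_batch: int = 64) -> int:
        # Never make a file wait more than max_wait_sec to fill a batch.
        size = int(wal_files_per_sec * max_wait_sec)
        return max(1, min(size, max_batch))

    def choose_parallelism(wal_files_per_sec: float,
                           avg_command_sec: float,
                           max_workers: int = 8) -> int:
        # Enough concurrent commands to keep up with the incoming rate.
        needed = int(wal_files_per_sec * avg_command_sec) + 1
        return max(1, min(needed, max_workers))

    print(choose_batch_size(10))     # busy system: batches of 30
    print(choose_batch_size(0.1))    # quiet system: effectively no batching
    print(choose_parallelism(10, avg_command_sec=0.5))  # 6 workers
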
--
Robert Haas
EDB: http://www.enterprisedb.com