Re: parallelizing the archiver

From: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
To: "Bossart, Nathan" <bossartn(at)amazon(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: parallelizing the archiver
Date: 2021-09-10 05:28:20
Message-ID: FA9D5725-17CA-4476-8B3D-919BD32AFD96@yandex-team.ru
Lists: pgsql-hackers

> On 8 Sep 2021, at 03:36, Bossart, Nathan <bossartn(at)amazon(dot)com> wrote:
>
> Anyway, I'm curious what folks think about this. I think it'd help
> simplify server administration for many users.

BTW this thread is also related [0].

My 2 cents.
It's OK if an external tool is responsible for concurrency. Do we want this complexity in core? Many users do not enable archiving at all.
Maybe just add a parallelism API for the external tool?
It's much easier to control concurrency in an external tool than in PostgreSQL core. Maintaining a parallel worker is tremendously harder than spawning a goroutine, thread, task or whatever.
The external tool needs to know when an xlog segment is ready and needs to report when it's done. Postgres should just ensure that the external archiver/restorer is running.
For example, the external tool could read xlog names from stdin and report finished files on stdout. I can prototype such a tool swiftly :)
E.g. postgres runs ```wal-g wal-archiver``` and pushes ready segment filenames on stdin. And no more listing of archive_status and hacky algorithms to predict the next WAL name and completion time!
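
To make the stdin/stdout idea concrete, here is a rough sketch of what the tool side could look like. It is not how wal-g works today and not a proposed wire format: the one-name-per-line protocol, the names `run_archiver` and `archive_segment`, and the choice to stay silent on failure are all assumptions made for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def archive_segment(name):
    """Hypothetical placeholder: a real tool would upload the segment here."""
    return True


def run_archiver(inp, out, archive=archive_segment, workers=4):
    """Read ready segment names from `inp` (one per line), archive them
    concurrently, and write each name to `out` once it is archived."""
    lock = threading.Lock()

    def handle(name):
        # Assumption: only successes are reported; a segment whose
        # archive() returns False is simply never acknowledged.
        if archive(name):
            with lock:  # keep each report on its own line
                out.write(name + "\n")
                out.flush()

    with ThreadPoolExecutor(max_workers=workers) as pool:
        for line in inp:
            name = line.strip()
            if name:  # ignore blank lines
                pool.submit(handle, name)
    # Leaving the `with` block waits for all in-flight segments.
```

A real tool would call something like `run_archiver(sys.stdin, sys.stdout)` and keep running until postgres closes the pipe. Reports can come back in a different order than the names were sent, so the server side would have to match them by filename.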

Thoughts?

Best regards, Andrey Borodin.

[0] https://www.postgresql.org/message-id/flat/CA%2BTgmobhAbs2yabTuTRkJTq_kkC80-%2Bjw%3DpfpypdOJ7%2BgAbQbw%40mail.gmail.com
