Re: Logical archiving

From: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
To: Euler Taveira <euler(dot)taveira(at)2ndquadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, boris(dot)novikov(at)acm(dot)org
Subject: Re: Logical archiving
Date: 2020-12-04 17:36:25
Message-ID: 51EBF3DB-98F7-4828-8A43-491726E06384@yandex-team.ru
Lists: pgsql-hackers

Hi Euler!

Thanks for your response.

> On 4 Dec 2020, at 22:14, Euler Taveira <euler(dot)taveira(at)2ndquadrant(dot)com> wrote:
>
> On Fri, 4 Dec 2020 at 04:33, Andrey Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
>
> I was discussing problems of CDC with scientific community and they asked this simple question: "So you have efficient WAL archive on a very cheap storage, why don't you have a logical archive too?"
>
> WAL archive doesn't process data; it just copies from one location into another one. However, "logical archive" must process data.
WAL archiving does process data: it performs compression, encryption and digesting. Only a minimal, impractical setup copies data as is. However, I agree that all of this processing is done outside Postgres.
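
To illustrate the kind of processing meant here, a minimal Python sketch of an archive step that compresses a segment and records a digest. This is only an assumption about how such a tool might look, not the code of any real archiver; the function names `process_segment` and `restore_segment` are hypothetical, and encryption is left out to keep the sketch dependency-free.

```python
import gzip
import hashlib
from typing import Tuple


def process_segment(raw: bytes) -> Tuple[bytes, str]:
    """Compress a WAL segment and compute a SHA-256 digest of the original bytes.

    Hypothetical helper: a real archiver would also encrypt and upload.
    """
    compressed = gzip.compress(raw)
    digest = hashlib.sha256(raw).hexdigest()
    return compressed, digest


def restore_segment(compressed: bytes, expected_digest: str) -> bytes:
    """Decompress an archived segment and verify it against the stored digest."""
    raw = gzip.decompress(compressed)
    if hashlib.sha256(raw).hexdigest() != expected_digest:
        raise ValueError("digest mismatch: archived segment is corrupted")
    return raw
```

The point of the sketch is that all of this happens in the external tool; Postgres itself only hands over the file.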

> If we could just run archive command ```archive-tool wal-push 0000000900000F2C000000E1.logical``` with contents of logical replication - this would be super cool for OLAP. I'd prefer even avoid writing 0000000900000F2C000000E1.logical to disk, i.e. push data on stdio or something like that.
>
> The most time consuming process is logical decoding, mainly due to long running transactions.
Currently I am not experiencing a problem with high CPU utilisation.

> In order to minimize your issue, we should improve the logical decoding mechanism.
No, the issue I'm facing comes from the fact that the corner cases of failover are not solved properly for logical replication: timelines, partial segments, archiving along with streaming, starting from an arbitrary LSN (within available WAL), rewind, named restore points, cascading replication, etc. All these nice things are there for WAL and are missing for LR. I'm just trying to find the shortest path through this to make CDC (change data capture) work.
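
As a reminder of how timelines and LSNs are already encoded for physical WAL, here is a small Python sketch that decodes a segment file name such as the one quoted above. It assumes the default 16 MB segment size; the `.logical` suffix is only the hypothetical name from this thread, and `parse_wal_name` and `format_lsn` are illustrative helpers, not Postgres functions.

```python
from typing import Tuple

# Assumption: default wal_segment_size of 16 MB.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def parse_wal_name(name: str, segsize: int = WAL_SEGMENT_SIZE) -> Tuple[int, int]:
    """Return (timeline, start LSN as an integer) for a WAL segment file name."""
    base = name.split(".", 1)[0]  # drop suffixes such as ".partial" or ".logical"
    if len(base) != 24:
        raise ValueError("expected 24 hex digits, got %r" % base)
    timeline = int(base[0:8], 16)
    log = int(base[8:16], 16)    # high 32 bits of the LSN
    seg = int(base[16:24], 16)   # segment index within that 4 GB range
    segments_per_log = 0x100000000 // segsize
    segno = log * segments_per_log + seg
    return timeline, segno * segsize


def format_lsn(lsn: int) -> str:
    """Format an integer LSN in the usual X/X notation."""
    return "%X/%X" % (lsn >> 32, lsn & 0xFFFFFFFF)


timeline, start = parse_wal_name("0000000900000F2C000000E1.logical")
print(timeline, format_lsn(start))  # prints: 9 F2C/E1000000
```

A logical archive would need an equivalent addressing scheme for the items listed above to carry over.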

> There was a discussion about allowing logical decoding on the replica that would probably help your use case a lot.
I will look at that more closely, thanks! But it's only part of the solution.

Best regards, Andrey Borodin.
