From: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com> |
---|---|
To: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Cc: | Önder Kalacı <onderkalaci(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "andres(at)anarazel(dot)de" <andres(at)anarazel(dot)de>, "vignesh21(at)gmail(dot)com" <vignesh21(at)gmail(dot)com>, "shveta(dot)malik(at)gmail(dot)com" <shveta(dot)malik(at)gmail(dot)com>, "Takamichi Osumi (Fujitsu)" <osumi(dot)takamichi(at)fujitsu(dot)com>, "dilipbalaut(at)gmail(dot)com" <dilipbalaut(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, "euler(at)eulerto(dot)com" <euler(at)eulerto(dot)com>, "m(dot)melihmutlu(at)gmail(dot)com" <m(dot)melihmutlu(at)gmail(dot)com>, "marcos(at)f10(dot)com(dot)br" <marcos(at)f10(dot)com(dot)br>, 'Masahiko Sawada' <sawada(dot)mshk(at)gmail(dot)com> |
Subject: | RE: Time delayed LR (WAS Re: logical replication restrictions) |
Date: | 2023-03-17 13:11:58 |
Message-ID: | TYAPR01MB5866D871F60DDFD8FAA2CDE4F5BD9@TYAPR01MB5866.jpnprd01.prod.outlook.com |
Lists: | pgsql-hackers |
Hi hackers,
I have made a rough prototype, based on the v30 patch, that serializes changes to a
permanent file and applies them after the delay has elapsed. I think the 2PC and
restore mechanisms need more analysis, but I can share the code for discussion.
What do you think?
## Interfaces
Unchanged from previous versions. The subscription parameter "min_apply_delay" is
used to enable time-delayed logical replication.
## Advantages
Two big problems are solved.
* The apply worker can respond to the walsender's keepalive messages while delaying
application. This is because the process does not sleep.
* The publisher can recycle WAL even if a transaction related to that WAL has not
been applied yet. This is because the apply worker flushes all the changes to a
file and replies that the WAL has been flushed.
## Disadvantages
Code complexity.
## Basic design
The basic idea is quite simple: create a new file when the apply worker receives a
BEGIN message, write the received changes to it, and flush them when the COMMIT
message arrives. The commit time of each delayed transaction is checked on every
iteration of the main loop, and the transaction is applied once the elapsed time
exceeds min_apply_delay.
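To make the per-loop check concrete, here is a minimal sketch in Python (not the patch's C code). The names DelayedTxn, DelayQueue and pop_ready are hypothetical, and ordering the transactions in a heap is my assumption; the patch may simply keep a list.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class DelayedTxn:
    commit_time: float                # commit timestamp sent by the publisher (seconds)
    xid: int = field(compare=False)   # hypothetical key identifying the spool file


class DelayQueue:
    """Committed-but-not-yet-applied transactions, oldest commit first."""

    def __init__(self, min_apply_delay):
        self.min_apply_delay = min_apply_delay
        self._heap = []

    def add(self, xid, commit_time):
        # Called when the COMMIT message has been written and flushed.
        heapq.heappush(self._heap, DelayedTxn(commit_time, xid))

    def pop_ready(self, now):
        """Return the xids whose delay has elapsed; meant to run once per main-loop pass."""
        ready = []
        while self._heap and now - self._heap[0].commit_time >= self.min_apply_delay:
            ready.append(heapq.heappop(self._heap).xid)
        return ready
```

Because pop_ready() returns immediately, the loop that calls it stays free to answer keepalives between checks.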
Files are handled with APIs that use plain kernel FDs. This approach is similar to
the physical walreceiver process. Unlike the physical one, the worker does not
flush after every message; flushing is done once at the end of the transaction.
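The write path described above could look roughly like the following Python sketch. It is only an illustration: TxnSpoolFile, the length-prefixed record layout and the file permissions are assumptions of mine, not the format used in the patch.

```python
import os


def _write_all(fd, data):
    # os.write() may write fewer bytes than requested, so loop until done.
    offset = 0
    while offset < len(data):
        offset += os.write(fd, data[offset:])


class TxnSpoolFile:
    """Append-only file for one transaction, accessed through a raw file descriptor."""

    def __init__(self, path):
        self.path = path
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)

    def write_change(self, payload):
        # Hypothetical record layout: 4-byte little-endian length, then the payload.
        # No fsync here: individual changes are not flushed.
        _write_all(self.fd, len(payload).to_bytes(4, "little") + payload)

    def commit(self, payload):
        # Write the final message, then flush once for the whole transaction.
        self.write_change(payload)
        os.fsync(self.fd)
        os.close(self.fd)
        self.fd = None
```

Only after commit() returns would it be safe for the worker to report the corresponding WAL position as flushed.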
### For 2PC
The delay starts when COMMIT PREPARED arrives. But to avoid the
long-lock-holding issue, the prepared transaction is just written to the file
without being applied.
When BEGIN PREPARE is received, the worker creates a file and starts writing
changes, the same as for normal transactions. When the PREPARE message is reached,
the worker just writes the message to the file, flushes it, and closes it. This
means that no transactions are prepared on the subscriber. When COMMIT PREPARED is
received, the worker opens the file again and writes the message. After that, the
transaction is treated the same as a normal committed transaction.
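The 2PC handling above can be read as a small state machine per prepared transaction. The Python sketch below is one such reading; SpoolState, TwoPhaseSpool and the in-memory list standing in for the file are hypothetical, and ROLLBACK PREPARED is left out because the description above does not cover it.

```python
from enum import Enum, auto


class SpoolState(Enum):
    OPEN = auto()       # file open, changes being appended
    PREPARED = auto()   # PREPARE written and flushed, file closed, nothing applied
    DELAYING = auto()   # COMMIT PREPARED written; delay counted from this point


class TwoPhaseSpool:
    def __init__(self):
        self.state = {}   # gid -> SpoolState
        self.log = {}     # gid -> messages written so far (stand-in for the file)

    def handle(self, gid, msg):
        if msg == "BEGIN PREPARE":
            self.state[gid] = SpoolState.OPEN
            self.log[gid] = [msg]
            return
        current = self.state.get(gid)
        if msg == "COMMIT PREPARED":
            expected, new_state = SpoolState.PREPARED, SpoolState.DELAYING
        elif msg == "PREPARE":
            expected, new_state = SpoolState.OPEN, SpoolState.PREPARED
        else:
            # Ordinary change inside the transaction.
            expected, new_state = SpoolState.OPEN, SpoolState.OPEN
        if current is not expected:
            raise ValueError(f"unexpected {msg!r} for {gid!r} in state {current}")
        self.log[gid].append(msg)
        self.state[gid] = new_state
```

A transaction in the PREPARED state holds no locks on the subscriber, since nothing has been applied yet.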
### For streamed transaction
Nothing special is done when a streamed transaction arrives. When it is committed
or prepared, all the changes are read and written into the permanent file.
apply_spooled_changes() is used to read and write the changes, which means the
basic workflow is not changed.
### Restore from files
To check the time elapsed since the commit, the commit_time of every delayed
transaction must be kept in memory. Normally it can be stored when the worker
handles the COMMIT message, but restarts need special treatment.
When an apply worker receives a COMMIT/PREPARE/COMMIT PREPARED message, it writes
the message, flushes it, and caches the commit_time. When the worker restarts, it
opens the files, checks the final message (this is done by seeking back some bytes
from the end of the file), and then caches the commit_time written there.
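As an illustration of the seek-from-end step, here is a minimal Python sketch. It assumes the final message is a fixed-size trailer (one byte for the message kind plus an 8-byte commit time in microseconds); that layout and the function names are hypothetical and may not match the patch.

```python
import os
import struct

# Hypothetical trailer: 1-byte message kind + 8-byte commit time, little-endian.
TRAILER = struct.Struct("<cq")


def append_final_message(path, kind, commit_time_us):
    # Write the final message and flush it before caching commit_time in memory.
    with open(path, "ab") as f:
        f.write(TRAILER.pack(kind, commit_time_us))
        f.flush()
        os.fsync(f.fileno())


def read_final_message(path):
    """On restart: return (kind, commit_time_us) from the end of the file, or None."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        if f.tell() < TRAILER.size:
            return None   # too short to contain a final message
        f.seek(-TRAILER.size, os.SEEK_END)
        return TRAILER.unpack(f.read(TRAILER.size))
```

Note that this sketch cannot tell a complete file from one whose final message was never written; a real implementation would need some way to detect that case, for example a marker or checksum in the trailer.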
## Restrictions
* The combination with ALTER SUBSCRIPTION .. SKIP LSN is not considered.
Thanks to Osumi-san for helping with the implementation.
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Attachment | Content-Type | Size |
---|---|---|
0001-WIP-Time-delayed-logical-replication-by-serializing-.patch | application/octet-stream | 105.4 KB |