Re: Logical replication timeout

From: Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>
To: RECHTÉ Marc <marc(dot)rechte(at)meteo(dot)fr>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Logical replication timeout
Date: 2024-12-12 07:16:47
Message-ID: CANhcyEVV=LseM2CKyQ72KMWzOV2LpqTRAS08yp1A5qM_gsGNkw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 11 Dec 2024 at 14:29, RECHTÉ Marc <marc(dot)rechte(at)meteo(dot)fr> wrote:
>
> This how to reproduce the problem.
>
> Session 1:
>
> psql -c "CREATE TABLE test (i int)" -c "INSERT INTO test SELECT generate_series(1, 2_000_000)"
>
> Session 2:
>
> pg_recvlogical -d postgres --slot=test --create-slot
> pg_recvlogical -d postgres --slot=test --start -f -
>
>
> Session 3:
>
> cd data/pg_repslots
> watch 'ls test | wc -l'
>
>
> Session 1:
>
> date
> time psql -c "BEGIN" -c "
> DO LANGUAGE plpgsql
> \$\$
> DECLARE
> cur CURSOR FOR SELECT * FROM test FOR UPDATE;
> rec record;
> BEGIN
> FOR rec IN cur LOOP
> BEGIN
> UPDATE test SET i = i + 1 WHERE CURRENT OF cur;
> EXCEPTION
> WHEN no_data_found THEN
> RAISE NOTICE 'no data found exception';
> END;
> END LOOP;
> END;
> \$\$
> " -c "ROLLBACK"
>
> date
> mer. 11 déc. 2024 08:59:03 CET
> BEGIN
> DO
> ROLLBACK
>
> real 0m17,071s
> user 0m0,003s
> sys 0m0,000s
> mer. 11 déc. 2024 08:59:21 CET
>
>
> Session 3: Watch session
>
> Count increases up to
>
> Wed Dec 11 09:00:02 2024
> 1434930
>
> Then decreases down to 1
>
> Wed Dec 11 09:03:17 2024
> 1
>
> Session 2:
>
> Appears last (after spill files deleted)
>
> BEGIN 12874409
> COMMIT 12874409
>
>
> Conclusion:
>
> - The exception block is responsible for generating subtransactions
> - Although the transaction lasted 17s, one can see that the decoding was a bit late (40 seconds), but
> - spent an extra 200s to delete the spill files !
>

Hi,

Thanks for sharing the test case.
Unfortunately I donot have a powerful machine which would generate
such large number of spill files. But I created a patch as per your
suggestion in point(2) in thread [1]. Can you test with this patch on
your machine?

With this patch instead of calling unlink for every wal segment, we
are first reading the directory and filtering the files related to our
transaction and then unlinking those files.
You can apply the patch on your publisher source code and check. I
have created this patch on top of Postgres 15.6.

[1]: https://www.postgresql.org/message-id/1430556325.185731745.1731484846410.JavaMail.zimbra@meteo.fr

Thanks and Regards,
Shlok Kyal

Attachment Content-Type Size
scan_dir.patch application/x-patch 2.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message jian he 2024-12-12 07:40:32 Re: attndims, typndims still not enforced, but make the value within a sane threshold
Previous Message Sutou Kouhei 2024-12-12 06:25:41 Re: confusing / inefficient "need_transcoding" handling in copy