Re: Logical replication timeout

From: Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>
To: RECHTÉ Marc <marc(dot)rechte(at)meteo(dot)fr>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Logical replication timeout
Date: 2024-12-19 06:27:12
Message-ID: CANhcyEVbT2N-=Dwg-EENq4U=Eqf-rGf6ozfXrtvkc+jw_euYLQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, 12 Dec 2024 at 19:20, RECHTÉ Marc <marc(dot)rechte(at)meteo(dot)fr> wrote:
>
> Hi,
>
> Thanks for sharing the test case.
> Unfortunately I do not have a powerful machine that could generate
> such a large number of spill files. But I created a patch as per your
> suggestion in point (2) in thread [1]. Can you test with this patch on
> your machine?
>
> With this patch, instead of calling unlink for every WAL segment, we
> first read the directory, filter the files related to our
> transaction, and then unlink those files.
> You can apply the patch to your publisher source code and check. I
> created this patch on top of Postgres 15.6.
>
> [1]: https://www.postgresql.org/message-id/1430556325.185731745.1731484846410.JavaMail.zimbra@meteo.fr
>
>
> Thanks and Regards,
> Shlok Kyal
>
> Thanks for the patch.
>
> I tried it, but it does not compile.
>
> Attached another version that I tested on PostgreSQL 17.2.
>
> This is much worse: it deletes only 3 files/s!
>
> The problem here is that for a given xid, there is just one spill file to delete.
> ReorderBufferRestoreCleanup is called over a million times, so on each call
> it has to open the directory and filter for the one file to delete.
>
> By the way, you don't need a powerful machine to test, as spill files are very small.
>

Thanks for sharing the analysis.

I tested that patch on my machine too, and it performs worse for me
as well.
I came up with an alternate approach. In this approach we keep track
of the WAL segments the transaction is part of. This lets cleanup
iterate through only the required files.
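
To illustrate the idea only (this is a standalone sketch, not the
attached patch): the names SegNo, SpillSegList, spill_seg_list_add,
spill_seg_list_cleanup and count_removal below are hypothetical and do
not exist in reorderbuffer.c. It assumes segments are recorded in
nondecreasing order as the transaction spills, and it uses a callback
in place of the real unlink of the spill file.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a WAL segment number. */
typedef uint64_t SegNo;

/* Hypothetical per-transaction list of segments that received spill data. */
typedef struct SpillSegList
{
	SegNo	   *segs;			/* distinct segment numbers, in spill order */
	size_t		nsegs;			/* number of entries in use */
	size_t		cap;			/* allocated entries */
} SpillSegList;

/* Callback that removes the spill file for one segment (e.g. via unlink). */
typedef int (*remove_fn) (SegNo seg, void *arg);

/*
 * Remember that a spill file was written for "seg".
 * Assumption: segments arrive in nondecreasing order, so a duplicate can
 * only match the last entry. Returns 0 on success, -1 on out of memory.
 */
int
spill_seg_list_add(SpillSegList *list, SegNo seg)
{
	if (list->nsegs > 0 && list->segs[list->nsegs - 1] == seg)
		return 0;

	if (list->nsegs == list->cap)
	{
		size_t		newcap = list->cap ? list->cap * 2 : 4;
		SegNo	   *tmp = realloc(list->segs, newcap * sizeof(SegNo));

		if (tmp == NULL)
			return -1;
		list->segs = tmp;
		list->cap = newcap;
	}
	list->segs[list->nsegs++] = seg;
	return 0;
}

/*
 * Visit only the recorded segments instead of every segment between the
 * transaction's first and last LSN, then reset the list.
 * Returns the number of callback invocations.
 */
size_t
spill_seg_list_cleanup(SpillSegList *list, remove_fn remove_cb, void *arg)
{
	size_t		calls = 0;

	for (size_t i = 0; i < list->nsegs; i++)
	{
		remove_cb(list->segs[i], arg);
		calls++;
	}
	free(list->segs);
	list->segs = NULL;
	list->nsegs = 0;
	list->cap = 0;
	return calls;
}

/* Example callback: counts removals instead of touching the filesystem. */
int
count_removal(SegNo seg, void *arg)
{
	(void) seg;
	(*(size_t *) arg)++;
	return 0;
}
```

With a list like this, a transaction that spilled into two segments
far apart results in two removal attempts at cleanup, rather than one
attempt per segment in the whole range between them.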

On my machine, I am running the test case you provided in [1]. It
generates ~1.9 million spill files. For me the transaction completed
in 56 sec.
Cleanup (deletion of spill files) took approximately:
With HEAD: ~5 min
With the latest patch (attached here): ~2 min

Can you test if this improves performance for you?

The patch applies on HEAD.

Thanks and Regards,
Shlok Kyal

Attachment: track_wal_segments.patch (application/octet-stream, 7.7 KB)