Re: Logical replication timeout

From: RECHTÉ Marc <marc(dot)rechte(at)meteo(dot)fr>
To: Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Logical replication timeout
Date: 2024-12-12 13:50:49
Message-ID: 648326112.211129709.1734011449684.JavaMail.zimbra@meteo.fr
Lists: pgsql-hackers

> Hi,
>
> Thanks for sharing the test case.
> Unfortunately I do not have a powerful machine which would generate
> such a large number of spill files. But I created a patch as per your
> suggestion in point (2) in thread [1]. Can you test with this patch on
> your machine?
>
> With this patch, instead of calling unlink for every WAL segment, we
> first read the directory, filter the files related to our
> transaction, and then unlink those files.
> You can apply the patch on your publisher source code and check. I
> have created this patch on top of Postgres 15.6.
>
> [1]: https://www.postgresql.org/message-id/1430556325.185731745.1731484846410.JavaMail.zimbra@meteo.fr
>
> Thanks and Regards,
> Shlok Kyal

Thanks for the patch.

I tried it, but it does not compile.

Attached another version that I tested on PostgreSQL 17.2.

This is much worse: it deletes only 3 files/s!

The problem here is that for a given xid there is just one spill file to delete.
ReorderBufferRestoreCleanup is called over a million times, so each call
has to open the directory and scan it to find the one file to delete.

By the way, you don't need a powerful machine to test, as spill files are very small.

Marc

Attachment: scan_dir2.patch (text/x-patch, 1.7 KB)
