Re: [PATCH v1] parallel pg_restore: avoid disk seeks when jumping short distance forward

From: Dimitrios Apostolou <jimis(at)gmx(dot)net>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: [PATCH v1] parallel pg_restore: avoid disk seeks when jumping short distance forward
Date: 2025-04-01 22:25:25
Message-ID: 5B5099C6-A424-4F6C-886F-EF545C7FA7E8@gmx.net
Lists: pgsql-hackers

Thanks. This is the first value I tried and it works well. In the archive I have, all blocks seem to be between 8KB and 20KB, so the forward jump before the change never even got close to 1MB. Could it be bigger in an uncompressed archive? Or in a future pg_dump that raises the block size? I don't really know, so it is difficult to test such a scenario, but it made sense to guard against these cases too.
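For readers following the thread, the idea of "read instead of seek for short forward jumps" can be sketched roughly as below. This is not the code from the patch: the function name skip_forward, the macro SKIP_READ_THRESHOLD, and the 8KB scratch buffer are all hypothetical, chosen only for illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* for fseeko() and off_t */
#endif
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical threshold; 1MB is the value discussed in this thread. */
#define SKIP_READ_THRESHOLD (1024 * 1024)

/*
 * Skip "len" bytes forward in "fp" (hypothetical helper, not from the patch).
 * Jumps up to the threshold are read and discarded, so the access pattern
 * stays sequential; longer jumps fall back to fseeko().
 * Returns 0 on success, -1 on error or premature end of file.
 */
static int
skip_forward(FILE *fp, off_t len)
{
	char		buf[8192];	/* assumed scratch buffer size */

	if (len > SKIP_READ_THRESHOLD)
		return fseeko(fp, len, SEEK_CUR);

	while (len > 0)
	{
		size_t		want = len < (off_t) sizeof(buf) ? (size_t) len : sizeof(buf);
		size_t		got = fread(buf, 1, want, fp);

		if (got == 0)
			return -1;		/* read error or EOF before len bytes */
		len -= (off_t) got;
	}
	return 0;
}
```

With blocks of 8KB to 20KB, as in the archive described above, every skip in this sketch would take the read-and-discard path; the fseeko() branch only matters for much larger blocks.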

I chose 1MB by doing a very crude calculation in my head: when would it be worth seeking forward instead of reading? On very slow drives, 60MB/s sequential throughput and 60 IOPS for random reads are plausible figures. At those speeds one seek costs about as much time as reading 1MB sequentially (60MB/s / 60 IOPS = 1MB), so even in that worst case it is better to seek() forward only for lengths over 1MB.
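The back-of-the-envelope estimate above can be written out as a small calculation. This is only a sketch of the arithmetic under the stated assumption that one random read costs 1/IOPS seconds; the function names are hypothetical and nothing here comes from the patch itself.

```c
#include <stdbool.h>

/*
 * Number of bytes that can be read sequentially in the time one random
 * seek takes, assuming a seek costs 1/random_iops seconds.
 */
static double
seek_break_even_bytes(double seq_bytes_per_sec, double random_iops)
{
	return seq_bytes_per_sec / random_iops;
}

/*
 * True when seeking past "len" bytes is expected to be faster than
 * reading and discarding them, under the same simple cost model.
 */
static bool
seek_is_faster(double len, double seq_bytes_per_sec, double random_iops)
{
	return len > seek_break_even_bytes(seq_bytes_per_sec, random_iops);
}
```

For the slow-drive figures in the message, seek_break_even_bytes(60e6, 60) gives 1,000,000 bytes, i.e. roughly 1MB. Faster sequential throughput or slower seeks would push the break-even point higher, which is the hardware dependence the quoted question below asks about.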

On 1 April 2025 22:04:00 CEST, Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote:
>On Tue, Apr 01, 2025 at 09:33:32PM +0200, Dimitrios Apostolou wrote:
>> It didn't break any test, but I also don't see any difference, the
>> performance boost is noticeable only when restoring a huge archive that is
>> missing offsets.
>
>This seems generally reasonable to me, but how did you decide on 1MB as the
>threshold? Have you tested other values? Could the best threshold vary
>based on the workload and hardware?
>
