Re: WAL prefetch

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Sean Chittenden <seanc(at)joyent(dot)com>
Subject: Re: WAL prefetch
Date: 2018-06-14 03:45:53
Message-ID: CAA4eK1+cZM0yVb=d000_5B0++-P+QE+wvbgpWMYaK_c+mt1Rcw@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jun 13, 2018 at 6:39 PM, Konstantin Knizhnik
<k(dot)knizhnik(at)postgrespro(dot)ru> wrote:
> There was a very interesting presentation at PGCon about pg_prefaulter:
>
> http://www.pgcon.org/2018/schedule/events/1204.en.html
>
> But it is implemented in Go and uses pg_waldump.
> I tried to do the same but using built-in Postgres WAL traverse functions.
> I have implemented it as an extension for simplicity of integration.
> In principle it can be started as a background worker.
>

Right, or in other words, it could work like autoprewarm [1], which
would allow a more user-friendly interface for this utility if we
decide to include it.

> First of all, I tried to estimate the effect of preloading data.
> I have implemented a prefetch utility, which is also attached to this mail.
> It performs random reads of blocks of some large file and spawns some number
> of prefetch threads:
>
> Just normal read without prefetch:
> ./prefetch -n 0 SOME_BIG_FILE
>
> One prefetch thread which uses pread:
> ./prefetch SOME_BIG_FILE
>
> One prefetch thread which uses posix_fadvise:
> ./prefetch -f SOME_BIG_FILE
>
> Four prefetch threads which use posix_fadvise:
> ./prefetch -f -n 4 SOME_BIG_FILE
>
> Based on these experiments (on my desktop), I drew the following conclusions:
>
> 1. Prefetch on HDD doesn't have any positive effect.
> 2. Using posix_fadvise speeds up random reads on SSD by up to 2
> times.
> 3. posix_fadvise(WILLNEED) is more efficient than performing normal reads.
> 4. Calling posix_fadvise in more than one thread makes no sense.
>
> I have tested wal_prefetch on two powerful servers with 24 cores, 3TB NVMe
> RAID 10 storage and 256GB of RAM, connected using InfiniBand.
> The speed of synchronous replication between the two nodes increased from 56k
> TPS to 60k TPS (on pgbench with scale 1000).
>

That's a reasonable improvement.
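
For readers without the attachment, the posix_fadvise(WILLNEED) approach
quoted above can be sketched roughly as follows. This is not the attached
prefetch utility: it is a single-threaded illustration in Python, and the
names BLOCK_SIZE, prefetch_blocks, read_blocks and random_read are
hypothetical. It assumes a platform where os.posix_fadvise is available
(e.g. Linux) and a non-empty input file.

```python
import os
import random

BLOCK_SIZE = 8192  # assumption: PostgreSQL's default page size


def prefetch_blocks(fd, block_numbers, block_size=BLOCK_SIZE):
    """Hint the kernel that the given blocks will be needed soon.

    POSIX_FADV_WILLNEED asks the kernel to start reading the range into
    the page cache; the call normally returns without waiting for the data.
    """
    for blkno in block_numbers:
        os.posix_fadvise(fd, blkno * block_size, block_size,
                         os.POSIX_FADV_WILLNEED)


def read_blocks(fd, block_numbers, block_size=BLOCK_SIZE):
    """Read the blocks with pread() and return the number of bytes read."""
    total = 0
    for blkno in block_numbers:
        total += len(os.pread(fd, block_size, blkno * block_size))
    return total


def random_read(path, nreads, use_fadvise=True, seed=0):
    """Read nreads random blocks of path, optionally hinting them first."""
    nblocks = os.path.getsize(path) // BLOCK_SIZE
    rng = random.Random(seed)
    blocks = [rng.randrange(nblocks) for _ in range(nreads)]
    fd = os.open(path, os.O_RDONLY)
    try:
        if use_fadvise:
            prefetch_blocks(fd, blocks)
        return read_blocks(fd, blocks)
    finally:
        os.close(fd)
```

Timing random_read(path, n, use_fadvise=True) against use_fadvise=False on
a file larger than RAM would be one way to approximate the comparison
described above, although the quoted utility issues its hints from separate
prefetch threads rather than inline.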

> Usage:
> 1. At master: create extension wal_prefetch
> 2. At replica: Call pg_wal_prefetch() function: it will not return until you
> interrupt it.
>

I think it is not a very user-friendly interface, but the idea sounds
good to me; it can help some other workloads. I think this can help
in recovery as well.

[1] - https://www.postgresql.org/docs/devel/static/pgprewarm.html

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

  • WAL prefetch at 2018-06-13 13:09:45 from Konstantin Knizhnik
