Re: Avoiding smgrimmedsync() during nbtree index builds

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: Avoiding smgrimmedsync() during nbtree index builds
Date: 2021-11-23 20:51:51
Message-ID: CAAKRu_aRDndtaxmV4JNoUa6c_nUoKxsXnX59YtET4YAtUxYTQQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Nov 19, 2021 at 3:11 PM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
>
> On Mon, May 3, 2021 at 5:24 PM Melanie Plageman
> <melanieplageman(at)gmail(dot)com> wrote:
> >
> > So, I've written a patch which avoids doing the immediate fsync for
> > index builds either by using shared buffers or by queueing sync requests
> > for the checkpointer. If a checkpoint starts during the index build and
> > the backend is not using shared buffers for the index build, it will
> > need to do the fsync.
>
> I've attached a rebased version of the patch (old patch doesn't apply).
>
> With the patch applied (compiled at O2), creating twenty empty tables in
> a transaction with a text column and an index on another column (like in
> the attached SQL [make a test_idx schema first]) results in a fairly
> consistent 15-30% speedup on my laptop (timings still in tens of ms -
> avg 50 ms to avg 65 ms so run variation affects the % a lot).
> Reducing the number of fsync calls from 40 to 1 is likely what causes
> this difference.

Correction to the above: I haven't worked on a Mac in a while and didn't
realize that wal_sync_method=fsync is not enough to ensure that all
buffered data is actually flushed to disk on macOS (which my test
requires).
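
To illustrate the distinction, here is a minimal Python sketch (not code
from the patch). It assumes that a plain fsync() on macOS may leave data
in the drive's write cache and that the F_FULLFSYNC fcntl is what forces
it to stable storage, which is my understanding of what
fsync_writethrough relies on there. The names full_sync and
write_durably are hypothetical helpers for this example only, and the
fcntl module is Unix-only.

```python
import fcntl
import os


def full_sync(fd: int) -> str:
    """Flush fd as durably as the platform allows; return the method used."""
    # F_FULLFSYNC is only defined on macOS; elsewhere fall back to fsync().
    if hasattr(fcntl, "F_FULLFSYNC"):
        fcntl.fcntl(fd, fcntl.F_FULLFSYNC)
        return "F_FULLFSYNC"
    os.fsync(fd)
    return "fsync"


def write_durably(path: str, data: bytes) -> str:
    """Write data to path and sync it before closing (hypothetical helper)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        # Loop because os.write() may write fewer bytes than requested.
        view = memoryview(data)
        while len(view) > 0:
            written = os.write(fd, view)
            view = view[written:]
        return full_sync(fd)
    finally:
        os.close(fd)
```

On Linux this takes the fsync() branch; on macOS it takes the
F_FULLFSYNC branch, which is the slower, write-through behavior the
test needed.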

With wal_sync_method set to fsync_writethrough, my small test shows a
5-6x improvement in time taken - from a 1 second average to a 0.2
second average. And running Andres' "createlots.sql" test, I see around
a 16x improvement - from around 11 minutes to around 40 seconds. I ran
it on a laptop running macOS and, other than wal_sync_method, I only
changed shared_buffers (to 1GB).
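
As a quick check of the ratios, a small Python sketch using only the
rounded averages quoted above (so the results are approximate; speedup
is a hypothetical helper name):

```python
def speedup(before_s: float, after_s: float) -> float:
    """Return how many times faster the 'after' run is than the 'before' run."""
    return before_s / after_s


# Small test: about 1 second average down to about 0.2 seconds average.
small_test = speedup(1.0, 0.2)

# createlots.sql: about 11 minutes down to about 40 seconds.
createlots = speedup(11 * 60, 40)

print(f"small test: {small_test:.1f}x, createlots.sql: {createlots:.1f}x")
```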

- Melanie
