Re: Reduce pinning in btree indexes

From: Kevin Grittner <kgrittn(at)ymail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: "pg(at)heroku(dot)com" <pg(at)heroku(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Reduce pinning in btree indexes
Date: 2015-03-16 13:23:58
Message-ID: 1380407963.252852.1426512238114.JavaMail.yahoo@mail.yahoo.com
Lists: pgsql-hackers

Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:

> Thank you for rewriting.

Thank *you* for pointing out where more work was needed in the
comments and README!

> I attached your latest patch to this mail as
> bt-nopin-v4.patch for now. Please check that there's no problem
> in it.

I checked out master, applied the patch and checked it against my
latest code (merged with master), and it matched. So, looks good.

> - With this patch, an index scan releases buffer pins while
> fetching index tuples in a page, so it should reduce the chance
> of long-running index scans blocking vacuum, because
> vacuum can now easily overtake the current position of an
> index scan. I didn't actually measure how effective it is,
> though.

That part got pretty thorough testing on end-user software. The
whole reason for writing the patch was that, after applying the
snapshot-too-old PoC patch, they still saw massive bloat because all
autovacuum workers were blocked behind cursors which had been left
idle. The initial version of this patch fixed that, preventing (in
combination with the other patch) uncontrolled bloat in a 48-hour
test.

> - It causes no performance deterioration; on the contrary, it
> accelerates index scans. This seems to be due to the removal of
> the lock and unlock surrounding _bt_steppage in bt_next.

I fear that the performance improvement may only show up on forward
scans -- because the whole L&Y algorithm depends on splits only
moving right, walking right is almost trivial to perform correctly
with minimal locking and pinning. Our left pointers help with
scanning backward, but there are a lot of conditions that can
complicate that, leaving us with the choice between keeping some of
the locking or potentially scanning more pages than we now do on a
backward scan. Since it wasn't clear how many cases would benefit
from a change and how many would lose, I pretty much left the
backward scan locking alone in this patch. Changing the code to
work the other way would not be outrageously difficult; but
benchmarking to determine whether the alternative was a net win
would be pretty tricky and very time-consuming. If that is to be
done, I strongly feel it should be a separate patch.
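
To make the asymmetry concrete, here is a toy sketch in Python. It
is not the nbtree code: Page, split, step_right and step_left are
hypothetical names, page deletion is ignored, and there is no
locking at all. It only shows why a right link read earlier stays
usable after a split, while a left link read earlier may be stale
and has to be re-validated by walking right.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Page:
    """Toy leaf page: sorted keys plus sibling links (hypothetical model)."""
    keys: List[int]
    left: Optional[str] = None
    right: Optional[str] = None


def split(pages: Dict[str, Page], name: str, new_name: str) -> None:
    """Split page `name`; the upper half of its keys moves to a NEW RIGHT sibling."""
    old = pages[name]
    mid = len(old.keys) // 2
    new = Page(keys=old.keys[mid:], left=name, right=old.right)
    old.keys = old.keys[:mid]
    if old.right is not None:
        pages[old.right].left = new_name  # old right sibling now points at the new page
    old.right = new_name
    pages[new_name] = new


def step_right(pages: Dict[str, Page], current: str) -> Optional[str]:
    """Forward step: keys only ever move right, so just follow the right link."""
    return pages[current].right


def step_left(pages: Dict[str, Page], current: str,
              remembered_left: Optional[str]) -> Optional[str]:
    """Backward step from a left link that was read earlier and may be stale.

    If the remembered page has split since, it no longer points back at
    `current`, so walk right until we reach the page that does.
    """
    candidate = remembered_left
    while candidate is not None and pages[candidate].right != current:
        candidate = pages[candidate].right
    return candidate


# Three leaf pages: A <-> B <-> C
pages = {
    "A": Page([1, 2, 3, 4], right="B"),
    "B": Page([5, 6], left="A", right="C"),
    "C": Page([7, 8], left="B"),
}

stale_left = pages["B"].left   # a backward scan on B remembers "A" ...
split(pages, "A", "A2")        # ... and then A splits into A and A2

assert step_left(pages, "B", stale_left) == "A2"  # recovered by walking right
assert step_right(pages, "A") == "A2"             # forward step needs no recheck
```

The extra loop in step_left is the part that has no counterpart in
the forward direction; in the real code it is also where the
remaining locking questions live.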

Because this patch makes a forward scan faster without having much
effect on a backward scan, the performance difference between them
(which has always existed) will get a little wider. I wonder
whether this difference should perhaps be reflected in plan
costing.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
