Re: Deleting older versions in unique indexes to avoid page splits

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Victor Yegorov <vyegorov(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Deleting older versions in unique indexes to avoid page splits
Date: 2021-01-20 13:33:23
Message-ID: CAA4eK1JDqt4Qe2jFNgv2nTo0w_g_HRNObqKxDPCciN9GK1VJDA@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jan 20, 2021 at 10:58 AM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Tue, Jan 19, 2021 at 7:54 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > The worst cases could be (a) when there is just one such duplicate
> > (indexval logically unchanged) on the page and that happens to be the
> > last item and others are new insertions, (b) same as (a) and along
> > with it lets say there is an open transaction due to which we can't
> > remove even that duplicate version. Have we checked the overhead or
> > results by simulating such workloads?
>
> There is no such thing as a workload that has page splits caused by
> non-HOT updaters, but almost no actual version churn from the same
> non-HOT updaters. It's possible that a small number of individual page
> splits will work out like that, of course, but they'll be extremely
> rare, and impossible to see in any kind of consistent way.
>
> That just leaves long running transactions. Of course it's true that
> eventually a long-running transaction will make it impossible to
> perform any cleanup, for the usual reasons. And at that point this
> mechanism is bound to fail (which costs additional cycles -- the
> wasted access to a single heap page, some CPU cycles). But it's still
> a bargain to try. Even with a long running transactions there will be
> a great many bottom-up deletion passes that still succeed earlier on
> (because at least some of the dups are deletable, and we can still
> delete those that became garbage right before the long running
> snapshot was acquired).
>

How many of the dups are deletable when there is an open long-running
transaction in the system that started before the transaction that
performed the update? I tried a simple test to check this.

create table t1(c1 int, c2 int, c3 char(10));
create index idx_t1 on t1(c1);
create index idx_t2 on t1(c2);

insert into t1 values(generate_series(1,5000),1,'aaaaaa');
update t1 set c2 = 2;

The update will try to remove tuples via the bottom-up cleanup
mechanism for index 'idx_t1', but it won't be able to remove any tuple
because the duplicates are from the same transaction.
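
To make that visible, one could check the size and leaf density of the
logically unchanged index before and after the update, for example with
the pgstattuple extension (just a way to verify it, not part of the test
above):

create extension if not exists pgstattuple;
-- size of the logically unchanged index
select pg_size_pretty(pg_relation_size('idx_t1')) as idx_t1_size;
-- leaf_pages / avg_leaf_density show whether leaf pages had to split
select leaf_pages, avg_leaf_density from pgstatindex('idx_t1');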

> Victor independently came up with a benchmark that ran over several
> hours, with cleanup consistently held back by ~5 minutes by a long
> running transaction:
>

AFAICS, the long-running transaction used in the test is the one below:
SELECT abalance, pg_sleep(300) FROM pgbench_accounts WHERE mtime >
now() - INTERVAL '15min' ORDER BY aid LIMIT 1;

This shouldn't generate a transaction ID, so would it be sufficient to
hold back the cleanup?
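
One way to check what that query actually holds while it is sleeping is
to look at pg_stat_activity from another session; a read-only query
should show no backend_xid, only a backend_xmin from its snapshot (just
a quick sanity check one could do, not something from the test):

select pid, backend_xid, backend_xmin, state
from pg_stat_activity
where query like '%pg_sleep%'
  and pid <> pg_backend_pid();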

> https://www.postgresql.org/message-id/CAGnEbogATZS1mWMVX8FzZHMXzuDEcb10AnVwwhCtXtiBpg3XLQ@mail.gmail.com
>
> This was actually one of the most favorable cases of all for the patch
> -- the patch prevented logically unchanged indexes from growing (this
> is a mix of inserts, updates, and deletes, not just updates, so it was
> less than perfect -- we did see the indexes grow by a half of one
> percent). Whereas without the patch each of the same 3 indexes grew by
> 30% - 60%.
>
> So yes, I did think about long running transactions, and no, the
> possibility of wasting one heap block access here and there when the
> database is melting down anyway doesn't seem like a big deal to me.
>

First, it is not clear to me whether that properly simulates a
long-running transaction. But even if it does, what I intended to
suggest was keeping a long-running transaction open for the entire
duration of the test; if we do that, we will come to know whether there
is any overhead and, if so, how much.
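
For example, something like the following in a separate psql session,
left idle until the workload finishes, should hold a snapshot for the
whole run (a sketch of what I have in mind, not a script from any of
the tests above):

begin transaction isolation level repeatable read;
select 1;   -- forces the snapshot to be taken and held
-- ... run the update workload from another session ...
-- commit;  -- only after the test is over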

> > I feel unlike LP_DEAD optimization this new bottom-up scheme can cost
> > us extra CPU and I/O because there seems to be not much consideration
> > given to the fact that we might not be able to delete any item (or
> > very few) due to long-standing open transactions except that we limit
> > ourselves when we are not able to remove even one tuple from any
> > particular heap page.
>
> There was plenty of consideration given to that. It was literally
> central to the design, and something I poured over at length. Why
> don't you go read some of that now? Or, why don't you demonstrate an
> actual regression using a tool like pgbench?
>
> I do not appreciate being accused of having acted carelessly. You
> don't have a single shred of evidence.
>

I think you have done a good job and I am just trying to see if there
are any loose ends which we can tighten up. Anyway, here are results
from some simple performance tests:

Test with 2 unmodified indexes
===============================
create table t1(c1 int, c2 int, c3 int, c4 char(10));
create index idx_t1 on t1(c1);
create index idx_t2 on t1(c2);
create index idx_t3 on t1(c3);

insert into t1 values(generate_series(1,5000000), 1, 10, 'aaaaaa');
update t1 set c2 = 2;

Without nbtree mod (without commit d168b66682)
===================================================
postgres=# update t1 set c2 = 2;
UPDATE 5000000
Time: 46533.530 ms (00:46.534)

With HEAD
==========
postgres=# update t1 set c2 = 2;
UPDATE 5000000
Time: 52529.839 ms (00:52.530)

I have dropped and recreated the table after each update in the test.
Some non-default configurations:
autovacuum = off; checkpoint_timeout = 35min; shared_buffers = 10GB;
min_wal_size = 10GB; max_wal_size = 20GB;
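
Roughly, each measured run looked like this (a sketch reconstructed
from the setup described above, with timing taken via psql's \timing;
the exact script may differ slightly):

drop table if exists t1;
create table t1(c1 int, c2 int, c3 int, c4 char(10));
create index idx_t1 on t1(c1);
create index idx_t2 on t1(c2);
create index idx_t3 on t1(c3);
insert into t1 values(generate_series(1,5000000), 1, 10, 'aaaaaa');
\timing on
update t1 set c2 = 2;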

There seems to be a 12-13% regression in the above test, and I think we
can reproduce a similar or higher regression with a long-running open
transaction. At the moment I don't have access to a performance
machine, so I ran these tests on a CentOS VM. The results could vary,
but I have repeated the runs enough times to reduce that possibility.

> The design is counter-intuitive. I think that you simply don't understand it.
>

I have read your patch and have a decent understanding of it, but
obviously you and Victor will have a better idea. I am not sure what I
wrote in my previous email that made you say so. Anyway, I hope I have
made my point clear this time.

>
> > > I personally will
> > > never vote for a theoretical risk with only a theoretical benefit. And
> > > right now that's what the idea of doing bottom-up deletions in more
> > > marginal cases (the page flag thing) looks like.
> > >
> >
> > I don't think we can say that it is purely theoretical because I have
> > shown some basic computation where it can lead to fewer splits.
>
> I'm confused. You realize that this makes it *more* likely that
> bottom-up deletion passes will take place, right?
>

Yes.

> It sounds like
> you're arguing both sides of the issue at the same time.
>

No, I am sure bottom-up deletion is a good technique to get rid of
bloat. I am just trying to see if there are more cases where we can
take advantage of it, and also trying to avoid any regression, if there
is one.

--
With Regards,
Amit Kapila.
