Re: Index bloat and REINDEX/VACUUM optimization for partial index

From: jayaprabhakar k <jayaprabhakar(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: psql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Index bloat and REINDEX/VACUUM optimization for partial index
Date: 2023-08-29 16:47:18
Message-ID: CA+t=SiJGPasfKRxt8v=P5c4X+JuiWMjZcWWjy4p31CV8OhT+WQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Thanks Peter. It is *14.4*, But on AWS RDS Aurora instance. I am trying to
read the links you shared - B-Tree Deletion and deduplication, etc. I still
don't fully understand what I need to do. In the BTree documentation,

> The average and worst-case number of versions per logical row can be kept
> low purely through targeted incremental deletion passes. It's quite
> possible that the on-disk size of certain indexes will never increase by
> even one single page/block despite *constant* version churn from UPDATEs.

In our case, almost all the tuples stop being covered by the index as they
fail the predicate, and only a tiny 1000s of rows pass the index predicate
at any point in time. But, we still see the index size continue to
increase, index lookups become slow over time, and vacuum (non full)
doesn't reduce the index size much.

Do we need to do anything specific to better utilize the targeted
incremental deletion passes?

SELECT VERSION();
version

-------------------------------------------------------------------------------------------------
PostgreSQL 14.4 on x86_64-pc-linux-gnu, compiled by
x86_64-pc-linux-gnu-gcc (GCC) 7.4.0, 64-bit
(1 row)

On Mon, 28 Aug 2023 at 18:49, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:

> On Mon, Aug 28, 2023 at 5:33 PM jayaprabhakar k <jayaprabhakar(at)gmail(dot)com>
> wrote:
> > REINDEX requires a full table scan
> >
> > Roughly create a new index, rename index, drop old index.
> > REINDEX is not incremental. running reindex frequently does not reduce
> the future reindex time.
>
> You didn't say which Postgres version you're on. Note that Postgres 14
> can deal with index bloat a lot better than earlier versions could.
> This is known to work well with partial indexes. See:
>
>
> https://www.postgresql.org/message-id/flat/CAL9smLAjt9mZC2%3DqBeJwuNPq7KMAYGTWWQw_hvA-Lfo0b3ycow%40mail.gmail.com
>
> --
> Peter Geoghegan
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Rondat Flyag 2023-08-29 17:47:19 Queries containing ORDER BY and LIMIT started to work slowly
Previous Message David Rowley 2023-08-29 09:38:05 Re: Range partitioning query performance with date_trunc (vs timescaledb)