Re: Frequetly updated partial index leads to bloat on index for Postresql 11

From: Tom Dearman <tom(dot)dearman(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: pgsql-general General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Frequetly updated partial index leads to bloat on index for Postresql 11
Date: 2021-07-16 14:24:01
Message-ID: 95D506BA-D1E8-45BB-8167-BD61ED3D9B9C@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi,

We upgraded to 13 a couple of months ago on production but are still having an issue with bloated partial indexes which have an impact on our partial queries especially towards the end of a quarter when our quarterly-partitioned tables are getting big. I have built 14 (on macOS catalina, 14 beta 2) and run a test but the index grows fairly large (even though vacuums are running as the table is still relatively small - I put in 2 million inserts, each having one update of the column that makes up the partial index). The table is:

Table "public.buyer"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
---------------+------------------------+-----------+----------+---------+----------+-------------+--------------+-------------
buyer_id | integer | | not null | | plain | | |
first_name | character varying(35) | | not null | | extended | | |
last_name | character varying(35) | | not null | | extended | | |
email_address | character varying(50) | | | | extended | | |
status | character varying(256) | | not null | | extended | | |
Indexes:
"buyer_pkey" PRIMARY KEY, btree (buyer_id)
"idex_buyer_inactive" btree (first_name) WHERE status::text = 'IN_PROGRESS'::text
Access method: heap

I run a loop to insert, commit, update, commit one row at a time as this is an emulation of what a similar table would experience in production. The index never has many rows with status=‘IN_PROGRESS’ as each row is set to CANCEL in the update. If the index is reindexed it takes 1 page as expected but without the reindexing it keeps growing, currently reaching 3MB - this is with 2 million inserts and updates but our production will have about 300 million inserts and > 300 million updates on the partial index in the quarter. Should we have seen more of an improvement in 14? Is it valid to look at the size of the index (\di+) as a measure of whether this latest change to bottom up index deleting has helped?

Thanks,

Tom

> On 18 Mar 2021, at 16:30, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Thu, Mar 18, 2021 at 6:51 AM Tom Dearman <tom(dot)dearman(at)gmail(dot)com> wrote:
>> Is this a known issue, are they any ways around it, and if it is an
>> issue is there a plan to fix it if a fix is possible?
>
> It's not exactly a known issue per se, but I think the problem here is
> related to the fact that you have lots of duplicates, which did
> perform rather badly prior to Postgres 12. I bet that you'd benefit
> from upgrading to Postgres 12, or especially to Postgres 13. The
> B-Tree space management is a lot better now. (Actually, it'll be
> better again in Postgres 14.)
>
> --
> Peter Geoghegan

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Michael Lewis 2021-07-16 14:49:27 Re: Frequetly updated partial index leads to bloat on index for Postresql 11
Previous Message Markhof, Ingolf 2021-07-16 12:42:53 dealing with dependencies