Re: Index bloat and REINDEX/VACUUM optimization for partial index

From: Maxim Boguk <maxim(dot)boguk(at)gmail(dot)com>
To: jayaprabhakar k <jayaprabhakar(at)gmail(dot)com>
Cc: psql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Index bloat and REINDEX/VACUUM optimization for partial index
Date: 2023-08-31 15:05:41
Message-ID: CAK-MWwQsCoB2P=JO0G8_R4K=MOYz-5PcrMjLcGKit0VzM8wcNQ@mail.gmail.com
Lists: pgsql-performance

> At any moment, there are around 1000-1500 tasks in pending statuses
> (Init + InProgress) out of around 500 million tasks.
>
> Now, we have a task monitoring query that will look for all pending tasks
> that have not received any update in the last n minutes.
>
> ```
> SELECT [columns list]
> FROM tasks
> WHERE status NOT IN (3,4,5) AND created > NOW() - INTERVAL '30 days' AND
> updated < NOW() - interval '30 minutes'
> ```
>
> Since we are only interested in the pending tasks, I created a partial
> index
> `*"tasks_pending_status_created_type_idx" btree (status, created,
> task_type) WHERE status <> ALL (ARRAY[3, 4, 5])*`.
>
> This worked great initially; however, it started to get bloated very
> quickly because every task starts in a pending state, gets multiple
> updates (many of which are not HOT updates; working on optimizing the
> fill factor now), and eventually gets deleted from the index (as status
> changes to success).
>

From my experience, I suspect the problem lies in the "around 500
million tasks."
Autovacuum does clean old dead index entries, but how many such dead
index entries will accumulate on a 500M-row table before autovacuum kicks
in?

With the default autovacuum_vacuum_scale_factor of 0.2 (20% of table size),
the index will accumulate on the order of 100M outdated/dead index entries
before autovacuum kicks in and cleans them all (in the worst case), and of
course that leads to huge index bloat and awful performance.
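
For reference, autovacuum fires on a table once dead tuples exceed
autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples.
A rough sketch of that arithmetic for a 500M-row table (the default base
threshold of 50 is negligible at this scale):

```
-- worst-case dead index entries accumulated before autovacuum triggers
SELECT 50 + 0.2  * 500000000 AS trigger_at_default,      -- ~100M dead entries
       50 + 0.01 * 500000000 AS trigger_at_scaled_down;  -- ~5M dead entries
```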

Even if you scale autovacuum_vacuum_scale_factor down to an unreasonably
low value like 0.01, the index still bloats to around 5M dead entries before
an autovacuum run, and constant vacuuming of a huge 500M-row table will put
a heavy load on the database server.
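
If you do go down that road, at least lower the scale factor for this one
table rather than globally; a minimal sketch using the standard per-table
storage parameters (the threshold value here is an illustrative guess, not
a recommendation from the thread):

```
ALTER TABLE tasks SET (
    autovacuum_vacuum_scale_factor = 0.01,
    autovacuum_vacuum_threshold    = 10000
);
```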

Unfortunately, there is no easy way out of this situation on the database
side. In general, I recommend not implementing a fast-paced, queue-like
workload inside a huge and constantly growing table; it never works well,
because you cannot keep efficient partial indexes over the queue in a
clean/non-bloated state.

In my opinion, the best solution is to keep the list of entries to process
("around 1000-1500 tasks in pending statuses") duplicated in a separate tiny
table (via triggers, or implemented at the application level, as sketched
below); in that case autovacuum will be able to clean dead entries from the
index quickly.
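
A minimal sketch of the trigger-based variant, assuming a tasks(id, status,
updated, ...) schema; the side-table and trigger names here are hypothetical,
not from the original thread:

```
-- tiny side table holding only the ids of pending tasks
CREATE TABLE tasks_pending (
    task_id bigint PRIMARY KEY
);

CREATE OR REPLACE FUNCTION tasks_pending_sync() RETURNS trigger AS $$
BEGIN
    IF NEW.status NOT IN (3, 4, 5) THEN
        -- task entered (or stayed in) a pending status
        INSERT INTO tasks_pending (task_id)
        VALUES (NEW.id)
        ON CONFLICT (task_id) DO NOTHING;
    ELSE
        -- task completed: remove it from the side table
        DELETE FROM tasks_pending WHERE task_id = NEW.id;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER tasks_pending_sync_trg
AFTER INSERT OR UPDATE OF status ON tasks
FOR EACH ROW EXECUTE FUNCTION tasks_pending_sync();
```

The monitoring query then joins through the side table, so only
tasks_pending (a thousand-odd rows) churns, and autovacuum can keep it
clean:

```
SELECT [columns list]
FROM tasks_pending p
JOIN tasks t ON t.id = p.task_id
WHERE t.updated < NOW() - INTERVAL '30 minutes';
```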

Kind Regards,
Maxim

--
Maxim Boguk
Senior Postgresql DBA

Phone UA: +380 99 143 0000
Phone AU: +61 45 218 5678
