Use of inefficient index in the presence of dead tuples

From: Alexander Staubo <alex(at)purefiction(dot)net>
To: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Use of inefficient index in the presence of dead tuples
Date: 2024-05-28 08:00:22
Message-ID: DC43B9C3-7BCB-4671-A69E-B0061C710241@purefiction.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I am encountering an odd problem where Postgres will use the wrong index, particularly if the table
has some dead tuples. The database affected is running 12.6, but I can also reproduce with 16.3.

To reproduce:

(1) Disable autovacuum. This is just so we can induce a scenario where there are lots of dead tuples.

(2) Set up schema. It's important to create the index before insertion, in order to provoke a
situation where the indexes have dead tuples:

CREATE TABLE outbox_batches (
id text NOT NULL,
receiver text NOT NULL,
created_at timestamp without time zone DEFAULT now() NOT NULL,
PRIMARY KEY (receiver, id)
);
CREATE INDEX outbox_batches_on_receiver_and_created_at
ON outbox_batches (receiver, created_at DESC);

(3) Insert 5M rows of dummy data. Note that we are using UUIDs here for the purposes of testing; in
my real database, I use much shorter unique IDs.

INSERT INTO outbox_batches (receiver, id)
SELECT 'dummy', uuid_generate_v4()
FROM (SELECT * FROM generate_series(1, 5000000, 1)) AS foo;

(4) Then ensure all tuples are dead except one:

DELETE FROM outbox_batches;
INSERT INTO outbox_batches (receiver, id) VALUES ('dummy', 'test');

(5) Analyze:

ANALYZE outbox_batches;

(6) You should now have 5m dead rows and 1 live row:

SELECT n_live_tup, n_dead_tup FROM pg_stat_user_tables WHERE relname = 'outbox_batches';
┌────────────┬────────────┐
│ n_live_tup │ n_dead_tup │
├────────────┼────────────┤
│ 1 │ 5000000 │
└────────────┴────────────┘

We also observe that the outbox_batches_pkey index is 454 MB, and the
outbox_batches_on_receiver_and_created_at is 31 MB.

(7) Try the following query:

EXPLAIN (ANALYZE, VERBOSE, BUFFERS, COSTS, TIMING, SETTINGS, SUMMARY)
SELECT id FROM outbox_batches
WHERE receiver = 'dummy'
AND id = 'test';

Here's the query plan:

Index Scan using outbox_batches_on_receiver_and_created_at on public.outbox_batches (cost=0.38..8.39 rows=1 width=5) (actual time=0.426..984.038 rows=1 loops=1)
Output: id
Index Cond: (outbox_batches.receiver = 'dummy'::text)
Filter: (outbox_batches.id = 'test'::text)
Buffers: shared hit=3948 read=60742 dirtied=60741 written=30209
Settings: work_mem = '32MB'
Query Identifier: -2232653838283363139
Planning:
Buffers: shared hit=18 read=3
Planning Time: 1.599 ms
Execution Time: 984.082 ms

This query is reading 60K buffers even though it only needs to read a single row. Notice in particular the
use of the index outbox_batches_on_receiver_and_created_at, even though outbox_batches_pkey would be
a much better choice. We know this because if we drop the first index:

Index Only Scan using outbox_batches_pkey on public.outbox_batches (cost=0.50..8.52 rows=1 width=5) (actual time=2.067..2.070 rows=1 loops=1)
Output: id
Index Cond: ((outbox_batches.receiver = 'dummy'::text) AND (outbox_batches.id = 'test'::text))
Heap Fetches: 1
Buffers: shared hit=1 read=4
Settings: work_mem = '32MB'
Query Identifier: -2232653838283363139
Planning:
Buffers: shared hit=5 dirtied=1
Planning Time: 0.354 ms
Execution Time: 2.115 ms

This is also the index that's used in the normal case when there are no dead tuples at all.

Interestingly, the cost of an index only scan on outbox_batches_pkey is 8.52, whereas the other is
8.39. Is this because it considers the number of index pages? I've tried adjusting the various cost
and memory settings, but they have no effect.

In this test, we created 5M dead tuples. However, for me it also reproduces with just 1,000 rows.
For such a small table, the performance degradation is minimal, but it increases as more and more
tuples are deleted.

In a production environment, we have rows being constantly deleted at a high rate, leaving a table
that often has very few live tuples, and often 500K+ dead tuples before autovacuum can kick in. Here
I am consistently seeing the wrong index used, leading to poor performance.

The autovacuum settings ar aggressive, but for whatever reason it is not keeping up. We also have
long-running transactions that sometimes cause the xmin to hang back for a while, preventing
vacuums from helping.

All of that said, I would rather Postgres choose the right index than spend a lot of time optimizing
vacuums.

Here's my full server config: https://gist.github.com/atombender/54207d473e415fab26fc59751a22feca.

Responses

Browse pgsql-general by date

  From Date Subject
Next Message vijay patil 2024-05-28 08:31:51 Pgpool with high availability
Previous Message Andreas Joseph Krogh 2024-05-28 06:07:38 Re: prevent users from SELECT-ing from pg_roles/pg_database