Re: Understanding partial index selection

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Owen Nelson <onelson(at)gmail(dot)com>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Understanding partial index selection
Date: 2023-11-28 23:08:27
Message-ID: CAApHDvrV6Nx0CVzwdyCW6K60m9O_LLmyANc=NFM+4RucjhniGw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, 29 Nov 2023 at 11:23, Owen Nelson <onelson(at)gmail(dot)com> wrote:
> "message_payload_not_null_pidx" btree (expiration) WHERE payload IS NOT NULL

> I periodically run a query like this:
> ```
> UPDATE message SET payload = NULL WHERE id IN (

> Update on message (cost=1773.41..44611.36 rows=5000 width=283) (actual time=20913.192..20913.194 rows=0 loops=1)
> -> Nested Loop (cost=1773.41..44611.36 rows=5000 width=283) (actual time=20881.320..20886.541 rows=51 loops=1)
> -> HashAggregate (cost=1772.85..1822.85 rows=5000 width=88) (actual time=20881.286..20882.052 rows=51 loops=1)
> Group Key: ("ANY_subquery".id)::text
> -> Subquery Scan on "ANY_subquery" (cost=0.00..1760.35 rows=5000 width=88) (actual time=8425.022..20881.244 rows=51 loops=1)
> -> Limit (cost=0.00..1710.35 rows=5000 width=38) (actual time=8425.017..20881.219 rows=51 loops=1)
> -> LockRows (cost=0.00..2112304.92 rows=6175068 width=38) (actual time=8425.016..20881.212 rows=51 loops=1)
> -> Seq Scan on message message_1 (cost=0.00..2050554.24 rows=6175068 width=38) (actual time=8424.977..20880.945 rows=65 loops=1)
> Filter: ((payload IS NOT NULL) AND (expiration <= now()))
> Rows Removed by Filter: 37772897
> -> Index Scan using pk_message on message (cost=0.56..8.56 rows=1 width=191) (actual time=0.073..0.073 rows=1 loops=51)
> Index Cond: ((id)::text = ("ANY_subquery".id)::text)
> Planning Time: 0.237 ms
> Execution Time: 20913.310 ms

I think the most likely cause is that the index has just become
bloated from all the updates. If you run the query after running SET
enable_seqscan TO off; then, proving the planner opts to use the
message_payload_not_null_pidx, you'll see what the planner's estimated
cost of that scan is. If you see the index being used, then that'll at
least confirm the index was not picked due to costs.

If the index gets used, then I'd check the size of the
message_payload_not_null_pidx index. You could also consider using
pgstatindex() [1] to check the state of the index and if it's bloated,
reindex it.

David

[1] https://www.postgresql.org/docs/current/pgstattuple.html

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Owen Nelson 2023-11-29 01:12:51 Re: Understanding partial index selection
Previous Message Tom Lane 2023-11-28 23:07:57 Re: Understanding partial index selection