Re: Trying to understand why a query is filtering when there is a composite index

From: Shiv Iyer <shiv(at)minervadb(dot)com>
To: "Stephen Samuel (Sam)" <sam(at)sksamuel(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Trying to understand why a query is filtering when there is a composite index
Date: 2024-08-19 08:05:40
Message-ID: CAALLqt-xk0ckt9s_9OvsNfcFZOxgo4QdpkxoWz6xne7Y7OGDdg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hello,

The query's behavior is expected due to how PostgreSQL handles composite
indexes and MVCC. The index on `(a, b)` is used efficiently for the `a`
condition, but the `b IN (<ids>)` filter is more complex, leading to
additional filtering rather than direct index usage. Although the
index-only scan is utilized, heap fetches still occur to verify tuple
visibility, a necessary step when the visibility map doesn’t confirm
visibility or to apply the `b` filter accurately. This is standard in
PostgreSQL’s handling of such queries, ensuring data consistency and
accuracy. Performance remains good, but these heap fetches could be
optimized if needed by reconsidering the index structure or query design.
Thank you!

On Mon, Aug 19, 2024 at 7:26 AM Stephen Samuel (Sam) <sam(at)sksamuel(dot)com>
wrote:

> Hi folks.
>
> I have a table with 4.5m rows per partition (16 partitions) (I know, very
> small, probably didn't need to be partitioned).
>
> The table has two columns, a bigint and b text.
> There is a unique index on (a,b)
> The query is:
>
> SELECT b
> FROM table
> WHERE a = <id>
> AND b IN (<ids>)
>
>
> The visibility map is almost exclusively true.
> This table gets few updates.
>
> The planner says index only scan, but is filtering on b.
>
> Index Only Scan using pkey on table (cost=0.46..29.09 rows=1 width=19) (actual time=0.033..0.053 rows=10 loops=1)
> Index Cond: (a = 662028765)
> " Filter: (b = ANY ('{634579987:662028765,561730945:662028765,505555183:662028765,472806302:662028765,401361055:662028765,363587258:662028765,346093772:662028765,314369897:662028765,289498328:662028765,217993946:662028765}'::text[]))"
> Rows Removed by Filter: 1
> Heap Fetches: 11
> Planning Time: 0.095 ms
> Execution Time: 0.070 ms
>
> My question is, why isn't it using the index for column b? Is this expected? And why is it doing heap lookups for every row,.
>
> Performance is still good, but I am curious.
>
> Thanks in advance!
>
>

--

*Best Regards *
*Shiv Iyer *

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Peter Geoghegan 2024-08-19 15:21:13 Re: Trying to understand why a query is filtering when there is a composite index
Previous Message Stephen Samuel (Sam) 2024-08-19 04:16:45 Re: Trying to understand why a query is filtering when there is a composite index