Re: Differents execution times with gin index, prepared statement and literals.

From: Pierrick Chovelon <pierrick(dot)chovelon(at)dalibo(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Differents execution times with gin index, prepared statement and literals.
Date: 2024-07-17 08:15:26
Message-ID: ece8c222-d0d5-4c12-af42-6f5a1b708ab5@dalibo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello,

Thanks a lot for your clear answer.

On 16/07/2024 19:54, Tomas Vondra wrote:
> On 7/16/24 17:43, Pierrick Chovelon wrote:
>> ...
>>
>> Quite fast as well...
>>
>> Have you got an idea on the initial issue ? Why when using a prepared
>> statement and a gin index the execution time "explode" ?
>> Something to do with the planner ? optimizer ?
>>
>> (We executed the same test with a btree index and execution times are
>> the same in both cases).
>>
> The reason why the two queries end up with different plans is pretty
> simple - the condition ends up matching different operators, because of
> data type difference. In case of the prepared query, the (x <= 950000)
> matches <=(bigint,bitint) operator, and thus it matches the index. But
> that happens because the query is prepared with bigint parameter. For
> the regular query, the 950000 literal gets treated as int, the condition
> matches to <=(bigint,int) and that does not match the index - hence it's
> treated as a filter, not an index condition.
>
> If you cast the literal to bigint (by doing ::bigint) in the regular
> query, we end it'll use the same same plan as the prepared query - but
> that's the slower one, unfortunately :-(
I try the following thing :

postgres=# prepare stmt(int, text, int) as delete from  tmp_tk_test_index where sync_id <= $1 and line_id = $2 and chk_upgrade_index = $3;
PREPARE
postgres=# begin ;
BEGIN
postgres=*# explain (analyse) execute stmt(950000, 'the-test-value-fa529a621a15', 0);

                                                              QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------

 Delete on tmp_tk_test_index  (cost=21.36..25.38 rows=0 width=0) (actual time=0.148..0.149 rows=0 loops=1)
   ->  Bitmap Heap Scan on tmp_tk_test_index  (cost=21.36..25.38 rows=1 width=6) (actual time=0.146..0.147 rows=0 loops=1)
         Recheck Cond: ((line_id)::text = 'the-test-value-fa529a621a15'::text)
         Filter: ((sync_id <= 950000) AND (chk_upgrade_index = 0))
         Heap Blocks: exact=1
         ->  Bitmap Index Scan on idx_tmp_tk_test_index_1  (cost=0.00..21.36 rows=1 width=0) (actual time=0.099..0.099 rows=1 loops=1)
               Index Cond: ((line_id)::text = 'the-test-value-fa529a621a15'::text)
 Planning Time: 9.412 ms
 Execution Time: 1.570 ms
(9 rows)
postgres=*# rollback ;
ROLLBACK

So preparing a query with a data type different from the column (int
(prepared statement) vs bigint (table)) is faster in our case :/
It doesn't sound obvious to me :)

Thanks again for your answer Tomas.

> Which gets us to why that plan is slower, compared to the plan using
> fewer conditions. I think the problem is that <= 950000 matches most of
> the table, which means the GIN index will have to load and process a
> pretty long TID list, which is clearly not cheap.
>
> I don't think there's much you can do do - we don't consider this when
> matching conditions to the index, we simply match as many conditions as
> possible. And the GIN code is not smart enough to make judgements about
> which columns to process first - it just goes column by column and
> builds the bitmap, and building a bitmap on 95% of the table is costly.
>
> If this is a systemic problem for most/all queries (i.e. it's enough to
> have a condition on line_id), I believe the +0 trick is a good way to
> make sure the condition is treated as a filter.
>
>
> regards
>
--
Pierrick Chovelon
Consultant DBA PostgreSQL - Dalibo

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Laurenz Albe 2024-07-17 08:30:46 Re: Restrict EXPLAIN (ANALYZE) for RLS and security_barrier views
Previous Message Peter Smith 2024-07-17 07:53:28 Re: Slow catchup of 2PC (twophase) transactions on replica in LR