From: Pierrick Chovelon <pierrick(dot)chovelon(at)dalibo(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Different execution times with gin index, prepared statement and literals.
Date: 2024-07-17 08:15:26
Message-ID: ece8c222-d0d5-4c12-af42-6f5a1b708ab5@dalibo.com
Lists: pgsql-hackers
Hello,
Thanks a lot for your clear answer.
On 16/07/2024 19:54, Tomas Vondra wrote:
> On 7/16/24 17:43, Pierrick Chovelon wrote:
>> ...
>>
>> Quite fast as well...
>>
>> Have you got an idea about the initial issue? Why does the execution
>> time "explode" when using a prepared statement and a GIN index?
>> Something to do with the planner, or the optimizer?
>>
>> (We executed the same test with a btree index and execution times are
>> the same in both cases).
>>
> The reason why the two queries end up with different plans is pretty
> simple - the condition ends up matching different operators, because of
> data type difference. In case of the prepared query, the (x <= 950000)
> matches the <=(bigint,bigint) operator, and thus it matches the index. But
> that happens because the query is prepared with bigint parameter. For
> the regular query, the 950000 literal gets treated as int, the condition
> matches to <=(bigint,int) and that does not match the index - hence it's
> treated as a filter, not an index condition.
>
> If you cast the literal to bigint (by adding ::bigint) in the regular
> query, it'll use the same plan as the prepared query - but
> that's the slower one, unfortunately :-(
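To make that concrete, here are the two variants of the literal query side by side (just a sketch against the table from this thread; per Tomas's explanation, only the type of the sync_id comparison changes):

-- literal typed as int: sync_id <= 950000 resolves to <=(bigint,int),
-- which does not match the index, so it is applied as a filter
explain delete from tmp_tk_test_index
  where sync_id <= 950000 and line_id = 'the-test-value-fa529a621a15';

-- literal cast to bigint: the condition resolves to <=(bigint,bigint)
-- and is matched to the index - the same (slower) plan as the prepared query
explain delete from tmp_tk_test_index
  where sync_id <= 950000::bigint and line_id = 'the-test-value-fa529a621a15';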
I tried the following:
postgres=# prepare stmt(int, text, int) as delete from tmp_tk_test_index where sync_id <= $1 and line_id = $2 and chk_upgrade_index = $3;
PREPARE
postgres=# begin ;
BEGIN
postgres=*# explain (analyse) execute stmt(950000, 'the-test-value-fa529a621a15', 0);
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Delete on tmp_tk_test_index (cost=21.36..25.38 rows=0 width=0) (actual time=0.148..0.149 rows=0 loops=1)
-> Bitmap Heap Scan on tmp_tk_test_index (cost=21.36..25.38 rows=1 width=6) (actual time=0.146..0.147 rows=0 loops=1)
Recheck Cond: ((line_id)::text = 'the-test-value-fa529a621a15'::text)
Filter: ((sync_id <= 950000) AND (chk_upgrade_index = 0))
Heap Blocks: exact=1
-> Bitmap Index Scan on idx_tmp_tk_test_index_1 (cost=0.00..21.36 rows=1 width=0) (actual time=0.099..0.099 rows=1 loops=1)
Index Cond: ((line_id)::text = 'the-test-value-fa529a621a15'::text)
Planning Time: 9.412 ms
Execution Time: 1.570 ms
(9 rows)
postgres=*# rollback ;
ROLLBACK
So preparing the query with a parameter type that differs from the column
type (int in the prepared statement vs bigint in the table) is faster in
our case :/
That doesn't seem obvious to me :)
Thanks again for your answer Tomas.
> Which gets us to why that plan is slower, compared to the plan using
> fewer conditions. I think the problem is that <= 950000 matches most of
> the table, which means the GIN index will have to load and process a
> pretty long TID list, which is clearly not cheap.
>
> I don't think there's much you can do - we don't consider this when
> matching conditions to the index, we simply match as many conditions as
> possible. And the GIN code is not smart enough to make judgements about
> which columns to process first - it just goes column by column and
> builds the bitmap, and building a bitmap on 95% of the table is costly.
>
> If this is a systemic problem for most/all queries (i.e. it's enough to
> have a condition on line_id), I believe the +0 trick is a good way to
> make sure the condition is treated as a filter.
>
>
> regards
>
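For the record, the "+0 trick" mentioned above would look something like this on the query from this thread (a sketch; stmt_plus0 is just an illustrative name - the point is that the expression sync_id + 0 no longer matches the indexed column, so the planner has to apply that condition as a filter):

prepare stmt_plus0(bigint, text, int) as
  delete from tmp_tk_test_index
  where sync_id + 0 <= $1 and line_id = $2 and chk_upgrade_index = $3;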
--
Pierrick Chovelon
Consultant DBA PostgreSQL - Dalibo