Re: Differents execution times with gin index, prepared statement and literals.

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Pierrick Chovelon <pierrick(dot)chovelon(at)dalibo(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Differents execution times with gin index, prepared statement and literals.
Date: 2024-07-16 17:54:32
Message-ID: 95a459da-d3f3-4d9e-ad02-29ba39c5cc5f@enterprisedb.com
Lists: pgsql-hackers

On 7/16/24 17:43, Pierrick Chovelon wrote:
> ...
>
> Quite fast as well...
>
> Have you got an idea on the initial issue ? Why when using a prepared
> statement and a gin index the execution time "explode" ?
> Something to do with the planner ? optimizer ?
>
> (We executed the same test with a btree index and execution times are
> the same in both cases).
>

The reason why the two queries end up with different plans is pretty
simple - the condition ends up matching different operators, because of
a data type difference. In the case of the prepared query, the
(x <= 950000) condition matches the <=(bigint,bigint) operator, and thus
it matches the index. That happens because the query is prepared with a
bigint parameter. For the regular query, the 950000 literal gets treated
as int, the condition matches <=(bigint,int), and that does not match
the index - hence it's treated as a filter, not an index condition.
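A minimal sketch of the difference (table and column names here are
hypothetical, assuming a bigint column x with a matching index):

```sql
-- Prepared query: the parameter is declared bigint, so x <= $1
-- matches <=(bigint,bigint) and can be used as an index condition.
PREPARE q(bigint) AS SELECT * FROM t WHERE x <= $1;
EXPLAIN (ANALYZE) EXECUTE q(950000);

-- Regular query: the bare literal is typed as int, so x <= 950000
-- matches <=(bigint,int) and is applied as a filter instead.
EXPLAIN (ANALYZE) SELECT * FROM t WHERE x <= 950000;
```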

If you cast the literal to bigint (by adding ::bigint) in the regular
query, it'll use the same plan as the prepared query - but that's the
slower one, unfortunately :-(
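For example (again with hypothetical names), the explicit cast makes the
literal match the bigint operator:

```sql
-- The cast makes the condition match <=(bigint,bigint), so the planner
-- picks the same (slower) index plan as the prepared query.
EXPLAIN (ANALYZE) SELECT * FROM t WHERE x <= 950000::bigint;
```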

Which gets us to why that plan is slower, compared to the plan using
fewer conditions. I think the problem is that <= 950000 matches most of
the table, which means the GIN index will have to load and process a
pretty long TID list, which is clearly not cheap.

I don't think there's much you can do - we don't consider this when
matching conditions to the index, we simply match as many conditions as
possible. And the GIN code is not smart enough to make judgements about
which columns to process first - it just goes column by column and
builds the bitmap, and building a bitmap on 95% of the table is costly.

If this is a systemic problem for most/all queries (i.e. it's enough to
have a condition on line_id), I believe the +0 trick is a good way to
make sure the condition is treated as a filter.
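A sketch of the +0 trick (hypothetical table, column, and condition -
only line_id comes from your report): wrapping the column in an
expression means it no longer matches the index, so the planner applies
it as a filter on the rows found via the remaining index conditions:

```sql
SELECT *
FROM t
WHERE line_id + 0 <= 950000   -- expression, no longer matches the GIN index
  AND tags @> ARRAY[42];      -- stays as the (selective) index condition
```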

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
