Re: Querying one partition in a function takes locks on all partitions

From: Renan Alves Fonseca <renanfonseca(at)gmail(dot)com>
To: Evgeny Morozov <postgresql4(at)realityexists(dot)net>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Querying one partition in a function takes locks on all partitions
Date: 2025-03-28 21:30:08
Message-ID: CAN_p2QghEHKeyTvb6Gc8F3y6wNm+dJXaCdrAnrV7TEq15Oh-pg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I've investigated further and found out that the code that processes
SQL functions is completely different from the code that processes SQL
statements. The latter is more efficient, and there is ongoing work to
merge both.

Currently, in the SQL function path the plan is always generic. The
planner ignores the function arguments. The plan_cache_mode setting
has no effect in this path.

I agree that the docs should be more explicit about this. There is a
high penalty for using generic plans in complex functions.

If you can use prepared statements
(https://www.postgresql.org/docs/current/sql-prepare.html) they run
on the optimized path and respect plan_cache_mode.

Regards,
Renan

On Fri, Mar 28, 2025 at 5:38 PM Evgeny Morozov
<postgresql4(at)realityexists(dot)net> wrote:
>
> Thank you for doing the extra investigation! I realised only now you didn't send this to the mailing list, only to me. As you say, force_custom_plan doesn't seem to help with a SQL function - just tested that.
>
Thanks for noting. I've seen just now.

> Regards,
> Evgeny Morozov
>
> On 23/03/2025 12:08 am, Renan Alves Fonseca wrote:
>
> It seems that when we create a function using pure sql, the query planner uses a generic plan.
> We can mimic this behavior using prepared statements and plan_cache_mode:
>
> # prepare read1(int) as select count(*) from entity where part_id=$1;
> # set plan_cache_mode = force_generic_plan ;
> # explain (costs off) execute read1(1);
> QUERY PLAN
> --------------------------------------
> Aggregate
> -> Append
> Subplans Removed: 1
> -> Seq Scan on entity_1
> Filter: (part_id = $1)
> Note "Subplans Removed". This plan causes lock.
>
> # set plan_cache_mode = force_custom_plan ;
> # explain (costs off) execute read1(1);
> QUERY PLAN
> -----------------------------------
> Aggregate
> -> Seq Scan on entity_1 entity
> Filter: (part_id = 1)
>
> No lock in this case.
>
> However, I didn't find a solution to force a custom plan in the stored procedure (written in pure sql). I don't know if it is not supported or if I'm missing some parameter. Anyway, it would be nice to have custom plans in sql stored procedures. I've run into other troubles in the past due to the generic plan.
>
> Regards,
> Renan Fonseca
>
> On Fri, Mar 21, 2025 at 5:27 PM Evgeny Morozov <postgresql4(at)realityexists(dot)net> wrote:
>>
>> I have a list-partitioned table. When I query the base table but filter
>> by the partition column in a regular SQL query this takes a lock only on
>> the one partition being queried, as I expect. However, when the exact
>> same SQL query is run fom a DB function, with the partition ID passed in
>> as argument, it takes (shared) locks on ALL partitions - which blocks
>> any other process that wants an exclusive lock on another partition (and
>> vice-versa).
>>
>> Originally found on PG 15.12, but happens on 17.4 as well. Easily
>> reproducible:
>>
>> -- One-time setup
>>
>> create table entity
>> (
>> part_id integer not null
>> ) partition by list (part_id);
>>
>> create table entity_1 partition of entity for values in (1);
>> create table entity_2 partition of entity for values in (2);
>>
>> create function read_partition(which_part int) returns bigint as
>> 'select count(*) from entity where part_id = which_part;'
>> language sql stable;
>>
>> -- Then try this, keeping the connection open (so the transaction is
>> pending):
>>
>> begin;
>> select read_partition(1); -- This takes shared locks on entity_1 AND
>> entity_2
>>
>> -- select count(*) from entity where part_id = 1; -- but this would only
>> take a shared lock only on entity_1
>>
>> If another session tries something that takes an exclusive lock on
>> another partition, like
>>
>> alter table entity_2 add column new_column text;
>>
>> I would expect that to be able to run concurrently, but it blocks due to
>> the shared lock on entity_2. (The way I originally found the problem was
>> the opposite: once one client took an exclusive lock on a partition many
>> others were blocked from reading from ANY partition.)
>>
>> This seems like quite the "gotcha", especially when the query plan for
>> the function call (logged via autoexplain) shows it only accessing one
>> partition (entity_1). Is this expected behavior? If so, is it documented
>> somewhere?
>>
>>
>>
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Kevin Stephenson 2025-03-29 07:28:46 Re: BTREE index: field ordering
Previous Message Adrian Klaver 2025-03-28 20:57:36 Re: Concurrent delete and insert on same key, potentially a bug