From: | Julius Tuskenis <julius(dot)tuskenis(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: performance of sql and plpgsql functions |
Date: | 2024-06-18 13:03:17 |
Message-ID: | 09c67567-f351-415f-af2d-76f7e8661d12@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On 2024-06-17 17:24, Tom Lane wrote:
> Julius Tuskenis<julius(dot)tuskenis(at)gmail(dot)com> writes:
>> EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
>> SELECT
>> COALESCE(sum(mok_nepadengta), 0)
>> FROM
>> public.b_pardavimai
>> JOIN public.b_mokejimai ON (mok_pardavimas = pard_id)
>> WHERE
>> (pard_tipas = ANY('{1, 2, 6, 7}'))
>> AND (mok_saskaita = 7141968)
> I believe that the SQL-language function executor always uses generic
> plans for parameterized queries (which is bad, but nobody's gotten
> round to improving it). So the above is a poor way of investigating
> what will happen, because it corresponds to a custom plan for the
> value 7141968. You should try something like
>
> PREPARE p(integer) AS
> SELECT COALESCE ...
> ... AND (mok_saskaita = $1);
>
> SET plan_cache_mode TO force_generic_plan;
>
> EXPLAIN ANALYZE EXECUTE p(7141968);
>
> What I suspect is that the statistics for mok_saskaita are
> highly skewed and so with a generic plan the planner will
> not risk using a plan that depends on the parameter value
> being infrequent, as the one you're showing does.
>
> regards, tom lane
Thank you Tom Lane, for pointing the problem.
In deed, after setting plan_cache_mode to force_generic_plan I see very
different plan:
```
"Finalize Aggregate (cost=6901.01..6901.02 rows=1 width=32) (actual
time=50.258..56.004 rows=1 loops=1)"
" Output: COALESCE(sum((b_mokejimai.mok_nepadengta)::numeric),
'0'::numeric)"
" Buffers: shared hit=4"
" -> Gather (cost=6900.89..6901.00 rows=1 width=32) (actual
time=0.809..55.993 rows=2 loops=1)"
" Output: (PARTIAL sum((b_mokejimai.mok_nepadengta)::numeric))"
" Workers Planned: 1"
" Workers Launched: 1"
" Buffers: shared hit=4"
" -> Partial Aggregate (cost=5900.89..5900.90 rows=1 width=32)
(actual time=0.077..0.079 rows=1 loops=2)"
" Output: PARTIAL sum((b_mokejimai.mok_nepadengta)::numeric)"
" Buffers: shared hit=4"
" Worker 0: actual time=0.052..0.053 rows=1 loops=1"
" -> Nested Loop (cost=25.92..5897.69 rows=1280 width=3)
(actual time=0.070..0.072 rows=0 loops=2)"
" Output: b_mokejimai.mok_nepadengta"
" Inner Unique: true"
" Buffers: shared hit=4"
" Worker 0: actual time=0.043..0.043 rows=0 loops=1"
" -> Parallel Bitmap Heap Scan on
public.b_mokejimai (cost=25.48..2455.36 rows=1307 width=7) (actual
time=0.069..0.070 rows=0 loops=2)"
" Output: b_mokejimai.mok_id,
b_mokejimai.mok_moketojas, b_mokejimai.mok_pardavimas,
b_mokejimai.mok_laikas, b_mokejimai.mok_suma, b_mokejimai.mok_budas,
b_mokejimai.mok_terminas, b_mokejimai.mok_cekis,
b_mokejimai.mok_saskaita, b_mokejimai.mok_suma_bazine,
b_mokejimai.mok_nepadengta, b_mokejimai.mok_padengta,
b_mokejimai.mok_laiko_diena"
" Recheck Cond: (b_mokejimai.mok_saskaita = $1)"
" Buffers: shared hit=4"
" Worker 0: actual time=0.042..0.042 rows=0
loops=1"
" -> Bitmap Index Scan on idx_saskaita
(cost=0.00..24.93 rows=2222 width=0) (actual time=0.023..0.023 rows=0
loops=1)"
" Index Cond: (b_mokejimai.mok_saskaita =
$1)"
" Buffers: shared hit=4"
" -> Index Scan using pk_b_pardavimai_id on
public.b_pardavimai (cost=0.44..2.63 rows=1 width=4) (never executed)"
" Output: b_pardavimai.pard_id,
b_pardavimai.pard_preke, b_pardavimai.pard_kaina,
b_pardavimai.pard_nuolaida, b_pardavimai.pard_kiekis,
b_pardavimai.pard_kasos_nr, b_pardavimai.pard_laikas,
b_pardavimai.pard_prekes_id, b_pardavimai.pard_pirkejo_id,
b_pardavimai.pard_pardavejas, b_pardavimai.pard_spausdinta,
b_pardavimai.pard_reikia_grazinti, b_pardavimai.pard_kam_naudoti,
b_pardavimai.pard_susieta, b_pardavimai.pard_galima_anuliuoti,
b_pardavimai.pard_tipas, b_pardavimai.pard_pvm,
b_pardavimai.pard_apsilankymas, b_pardavimai.pard_fk,
b_pardavimai.pard_kelintas, b_pardavimai.pard_precekis,
b_pardavimai.pard_imone, b_pardavimai.pard_grazintas,
b_pardavimai.pard_debeto_sutartis, b_pardavimai.pard_kaina_be_nld,
b_pardavimai.pard_uzsakymas_pos, b_pardavimai.pard_pvm_suma,
b_pardavimai.pard_uzsakymo_nr, b_pardavimai.pard_nuolaidos_id,
b_pardavimai.pard_nuolaida_taikyti,
b_pardavimai.pard_pirkeja_keisti_galima,
b_pardavimai.pard_suma_keisti_galima"
" Index Cond: (b_pardavimai.pard_id =
b_mokejimai.mok_pardavimas)"
" Filter: (b_pardavimai.pard_tipas = ANY
('{1,2,6,7}'::integer[]))"
"Planning Time: 0.016 ms"
"Execution Time: 56.097 ms"
```
If I understand the plan correctly, the problem is the planner expects
to find 2222 records for a provide value of `mok_saskaita`. I've tried
running analyze on `b_mokejimai`, but the plan remains the same - must
be because some values of `mok_saskaita` do really return tens of
thousands of records.
I don't know how the planner comes up with value 2222, because on
average there are 15 b_mokejimai records for a single mok_saskaita (if
NULL in mok_saskata is ignored), and 628 records if not.
Anyway...
Do you think rewriting a function in plpgsql is a way to go in such
case? In pg documentation
(https://www.postgresql.org/docs/12/plpgsql-implementation.html#PLPGSQL-PLAN-CACHING)
I read that the plan for the plpgsql function is calculated the first
time the function is executed (for a connection). I'm concerned, that
the function execution is not replanned: I will be stuck with a plan
that corresponds to the `mok_saskaita` parameter value passed on the
first execution. Or am I wrong?
Is there a way to make PostgreSQL recalculate the plan on each execution
of the function? The observed planning times are acceptable for my
application.
Regards,
Julius Tuskenis
From | Date | Subject | |
---|---|---|---|
Next Message | Thomas Simpson | 2024-06-24 22:28:13 | Row level security |
Previous Message | Tom Lane | 2024-06-17 14:24:07 | Re: performance of sql and plpgsql functions |