Re: Function's execute overhead reducing

From: Игорь Выскорко <vyskorko(dot)igor(at)yandex(dot)ru>
To: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Function's execute overhead reducing
Date: 2019-11-18 05:23:23
Message-ID: 116440101574054603@myt2-f25485a938c5.qloud-c.yandex.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

> My suspicion is that the query in the function is hidden from the
> planner and so it ends up running two separate SELECT queries without
> reference to each other. A quick test and possible solution:
>
> My results for the above on old machine.:
>
> Straight query:
>                                                              QUERY PLAN
>
> ----------------------------------------------------------------------------------------------------------------------------------
>   Seq Scan on tst (cost=0.00..1644807.00 rows=500000 width=116) (actual
> time=0.033..2808.596 rows=500000 loops=1)
>     SubPlan 1
>       -> Aggregate (cost=3.25..3.26 rows=1 width=8) (actual
> time=0.005..0.005 rows=1 loops=500000)
>             -> Function Scan on jsonb_each _a (cost=0.00..1.00
> rows=100 width=64) (actual time=0.003..0.003 rows=3 loops=500000)
>   Planning Time: 16.162 ms
>   Execution Time: 2846.815 ms
>
> Function in query
>                                                     QUERY PLAN
>
> -----------------------------------------------------------------------------------------------------------------
>   Seq Scan on tst (cost=0.00..138557.00 rows=500000 width=116) (actual
> time=0.119..7048.285 rows=500000 loops=1)
>   Planning Time: 0.105 ms
>   Execution Time: 7098.057 ms
>
> I changed the function to:
>
> CREATE OR REPLACE FUNCTION public.tst_func(i integer, a jsonb, b jsonb)
>   RETURNS TABLE(id integer, val bigint)
>   LANGUAGE sql
>   STABLE
> AS $function$
>    select
>      i,
>      sum(
>        ((_a.value::text)::int - (coalesce(b->>_a.key, '0'))::int)::int
>      )
>    from
>      jsonb_each(a) _a
> $function$
>
> Using 'table' function:
>
> test=# explain analyze select
>    tst.id,
>    j1,
>    j2
> from
>    tst
> join
>    tst_func(id, j1, j2) as f
> on
>   tst.id = f.id;
>                                                               QUERY PLAN
>
> -------------------------------------------------------------------------------------------------------------------------------------
>   Nested Loop (cost=1.00..531057.00 rows=1 width=108) (actual
> time=0.042..2002.258 rows=500000 loops=1)
>     -> Seq Scan on tst (cost=0.00..13557.00 rows=500000 width=108)
> (actual time=0.014..70.936 rows=500000 loops=1)
>     -> Subquery Scan on f (cost=1.00..1.02 rows=1 width=4) (actual
> time=0.003..0.003 rows=1 loops=500000)
>           Filter: (tst.id = f.id)
>           -> Aggregate (cost=1.00..1.01 rows=1 width=12) (actual
> time=0.003..0.003 rows=1 loops=500000)
>                 -> Function Scan on jsonb_each _a (cost=0.00..1.00
> rows=100 width=0) (actual time=0.003..0.003 rows=3 loops=500000)
>   Planning Time: 0.297 ms
>   Execution Time: 2037.601 ms
>
>>  PS current situation maybe solved by denormalization with precalculations of function and storing results along with data, but it's not the way i would like to use, because it leads to more issues to solve (invalidation, for instance)
>
> --
> Adrian Klaver
> adrian(dot)klaver(at)aklaver(dot)com

Thanks a lot! It works even better than subquery

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Michael Paquier 2019-11-18 06:42:10 Re: REINDEX VERBOSE unknown option
Previous Message Andrew Gierth 2019-11-18 01:56:14 Re: here does postgres take its timezone information from?