Re: range_agg extremely slow compared to naive implementation in obscure circumstances

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Duncan Sands <duncan(dot)sands(at)deepbluecap(dot)com>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: range_agg extremely slow compared to naive implementation in obscure circumstances
Date: 2023-01-30 10:42:45
Message-ID: CAFj8pRA8udegzsc0BhY3tU7YgStuJ9N8V-wyNKnzi71fY=M1wg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

po 30. 1. 2023 v 11:24 odesílatel Duncan Sands <duncan(dot)sands(at)deepbluecap(dot)com>
napsal:

> In general range_agg is faster than the naive version
>
> CREATE AGGREGATE public.naive_range_agg(anymultirange) (
> SFUNC = multirange_union,
> STYPE = anymultirange
> );
>
> however here is an example where using it is over 6000 times slower. I'm
> not
> sure exactly what feature of the example triggers this - I failed to
> create a
> synthetic testcase using generate_series, thus the attached table data.
>
> How to reproduce (Ubuntu 22.10, x86_64, postgresql 15.1-1.pgdg22.10+1):
>
> $ cp data.txt.gz /tmp/
> $ gunzip /tmp/data.txt.gz
> $ psql
> Pager usage is off.
> psql (15.1 (Ubuntu 15.1-1.pgdg22.10+1))
> Type "help" for help.
>
> duncan=> CREATE TEMP TABLE wacky(priority bigint, valid tstzrange);
> CREATE TABLE
> duncan=> \COPY wacky FROM /tmp/data.txt
> COPY 98094
> duncan=> CREATE AGGREGATE public.naive_range_agg(anymultirange) (
> SFUNC = multirange_union,
> STYPE = anymultirange
> );
> CREATE AGGREGATE
> duncan=> \timing on
> Timing is on.
> duncan=> EXPLAIN (ANALYZE) SELECT FROM (SELECT valid,
> naive_range_agg(valid::tstzmultirange) OVER (ORDER BY priority DESC GROUPS
> BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS more_recent FROM wacky)
> foo
> WHERE valid <@ more_recent;
> QUERY PLAN
>
>
> -----------------------------------------------------------------------------------------------------------------------------
> Subquery Scan on foo (cost=11063.57..13879.37 rows=433 width=0) (actual
> time=88.086..88.087 rows=0 loops=1)
> Filter: (foo.valid <@ foo.more_recent)
> Rows Removed by Filter: 98094
> -> WindowAgg (cost=11063.57..12796.37 rows=86640 width=72) (actual
> time=16.102..84.242 rows=98094 loops=1)
> -> Sort (cost=11063.57..11280.17 rows=86640 width=40) (actual
> time=16.096..20.205 rows=98094 loops=1)
> Sort Key: wacky.priority DESC
> Sort Method: external merge Disk: 3848kB
> -> Seq Scan on wacky (cost=0.00..1588.40 rows=86640
> width=40)
> (actual time=0.021..5.479 rows=98094 loops=1)
> Planning Time: 0.277 ms
> Execution Time: 88.631 ms
> (10 rows)
>
> Time: 98.261 ms
> duncan=> EXPLAIN (ANALYZE) SELECT FROM (SELECT valid,
> range_agg(valid::tstzmultirange) OVER (ORDER BY priority DESC GROUPS
> BETWEEN
> UNBOUNDED PRECEDING AND 1 PRECEDING) AS more_recent FROM wacky) foo WHERE
> valid
> <@ more_recent;
> QUERY PLAN
>
>
> -----------------------------------------------------------------------------------------------------------------------------
> Subquery Scan on foo (cost=11063.57..13879.37 rows=433 width=0) (actual
> time=566009.972..566009.973 rows=0 loops=1)
> Filter: (foo.valid <@ foo.more_recent)
> Rows Removed by Filter: 98094
> -> WindowAgg (cost=11063.57..12796.37 rows=86640 width=72) (actual
> time=21.996..565998.800 rows=98094 loops=1)
> -> Sort (cost=11063.57..11280.17 rows=86640 width=40) (actual
> time=21.988..26.154 rows=98094 loops=1)
> Sort Key: wacky.priority DESC
> Sort Method: external merge Disk: 3848kB
> -> Seq Scan on wacky (cost=0.00..1588.40 rows=86640
> width=40)
> (actual time=0.014..6.868 rows=98094 loops=1)
> Planning Time: 0.178 ms
> Execution Time: 566010.770 ms
> (10 rows)
>
>
Maybe there is some problem in range_deserialize function

40,53% postgres [.]
range_deserialize ◆
8,28% postgres [.]
FunctionCall2Coll ▒
7,14% postgres [.] range_compare

4,95% postgres [.] qsort_arg

4,51% postgres [.]
range_cmp_bounds ▒
2,49% postgres [.] timestamp_cmp

1,73% postgres [.]
range_serialize ▒
0,91% postgres [.] AllocSetAlloc

Regards

Pavel

Time: 566018.613 ms (09:26.019)
>

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message David Rowley 2023-01-30 12:36:21 Re: range_agg extremely slow compared to naive implementation in obscure circumstances
Previous Message Duncan Sands 2023-01-30 09:59:37 range_agg extremely slow compared to naive implementation in obscure circumstances