From: | Duncan Sands <duncan(dot)sands(at)deepbluecap(dot)com> |
---|---|
To: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | range_agg extremely slow compared to naive implementation in obscure circumstances |
Date: | 2023-01-30 09:59:37 |
Message-ID: | dff7c0a3-6be9-e175-80d4-1ecf3b800ca9@deepbluecap.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
In general range_agg is faster than the naive version
CREATE AGGREGATE public.naive_range_agg(anymultirange) (
SFUNC = multirange_union,
STYPE = anymultirange
);
however here is an example where using it is over 6000 times slower. I'm not
sure exactly what feature of the example triggers this - I failed to create a
synthetic testcase using generate_series, thus the attached table data.
How to reproduce (Ubuntu 22.10, x86_64, postgresql 15.1-1.pgdg22.10+1):
$ cp data.txt.gz /tmp/
$ gunzip /tmp/data.txt.gz
$ psql
Pager usage is off.
psql (15.1 (Ubuntu 15.1-1.pgdg22.10+1))
Type "help" for help.
duncan=> CREATE TEMP TABLE wacky(priority bigint, valid tstzrange);
CREATE TABLE
duncan=> \COPY wacky FROM /tmp/data.txt
COPY 98094
duncan=> CREATE AGGREGATE public.naive_range_agg(anymultirange) (
SFUNC = multirange_union,
STYPE = anymultirange
);
CREATE AGGREGATE
duncan=> \timing on
Timing is on.
duncan=> EXPLAIN (ANALYZE) SELECT FROM (SELECT valid,
naive_range_agg(valid::tstzmultirange) OVER (ORDER BY priority DESC GROUPS
BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS more_recent FROM wacky) foo
WHERE valid <@ more_recent;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Subquery Scan on foo (cost=11063.57..13879.37 rows=433 width=0) (actual
time=88.086..88.087 rows=0 loops=1)
Filter: (foo.valid <@ foo.more_recent)
Rows Removed by Filter: 98094
-> WindowAgg (cost=11063.57..12796.37 rows=86640 width=72) (actual
time=16.102..84.242 rows=98094 loops=1)
-> Sort (cost=11063.57..11280.17 rows=86640 width=40) (actual
time=16.096..20.205 rows=98094 loops=1)
Sort Key: wacky.priority DESC
Sort Method: external merge Disk: 3848kB
-> Seq Scan on wacky (cost=0.00..1588.40 rows=86640 width=40)
(actual time=0.021..5.479 rows=98094 loops=1)
Planning Time: 0.277 ms
Execution Time: 88.631 ms
(10 rows)
Time: 98.261 ms
duncan=> EXPLAIN (ANALYZE) SELECT FROM (SELECT valid,
range_agg(valid::tstzmultirange) OVER (ORDER BY priority DESC GROUPS BETWEEN
UNBOUNDED PRECEDING AND 1 PRECEDING) AS more_recent FROM wacky) foo WHERE valid
<@ more_recent;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Subquery Scan on foo (cost=11063.57..13879.37 rows=433 width=0) (actual
time=566009.972..566009.973 rows=0 loops=1)
Filter: (foo.valid <@ foo.more_recent)
Rows Removed by Filter: 98094
-> WindowAgg (cost=11063.57..12796.37 rows=86640 width=72) (actual
time=21.996..565998.800 rows=98094 loops=1)
-> Sort (cost=11063.57..11280.17 rows=86640 width=40) (actual
time=21.988..26.154 rows=98094 loops=1)
Sort Key: wacky.priority DESC
Sort Method: external merge Disk: 3848kB
-> Seq Scan on wacky (cost=0.00..1588.40 rows=86640 width=40)
(actual time=0.014..6.868 rows=98094 loops=1)
Planning Time: 0.178 ms
Execution Time: 566010.770 ms
(10 rows)
Time: 566018.613 ms (09:26.019)
Attachment | Content-Type | Size |
---|---|---|
data.txt.gz | application/gzip | 1004.0 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Pavel Stehule | 2023-01-30 10:42:45 | Re: range_agg extremely slow compared to naive implementation in obscure circumstances |
Previous Message | Kosio Dimitrov | 2023-01-30 08:11:26 | Re: BUG #17761: Questionable regular expression behavior |