From: Thierry Henrio <thierry(dot)henrio(at)gmail(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: effect on planner of turning a subquery to a table, sql function returning table
Date: 2024-04-12 10:33:06
Message-ID: CAMPYKo0CNfgU64VHtFe0pCP4_H9-bTXQGYKpdZLh8CcdWyGbyw@mail.gmail.com
Lists: pgsql-general
Hi there,
I work on a booking system.
Below is a query showing reservations of devices that overlap a campaign.
A campaign has a duration and time intervals by day of week.
Here is a query (A):
select
device_id,
t.date,
timerange(t.start_time, t.end_time) * g.times as times
from device_timeslots t
join (
select date, dow, timerange(start, stop) as times
from campaigns c
cross join generate_series(c.start_date, c.end_date, '1 day') d(date)
join unnest_timeslots(c.timeslots) t(dow, start, stop) on
t.dow=extract(dow from date)
where id=11870
) g on g.date=t.date
where t.date between '2024-04-26' and '2024-04-26'
group by device_id, t.date, t.start_time, t.end_time, g.times;
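(For readers unfamiliar with the schema: timerange is not a built-in type. A plausible definition, purely an assumption on my part, is a custom range type over time, which would make the `*` above mean range intersection:)

```sql
-- Hypothetical: a user-defined range type over time without time zone,
-- so that r1 * r2 yields the intersection of two time ranges.
create type timerange as range (subtype = time);
```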
The time intervals of the campaign are stored as jsonb (c.timeslots) and
expanded by unnest_timeslots, an IMMUTABLE SQL function returning a table.
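(A hypothetical reconstruction of that function, not the real one: judging from the jsonb_each and jsonb_array_elements scans in plan (A), timeslots appears to map a day-of-week key to an array of [start, stop] pairs, so the function could look roughly like this:)

```sql
-- Sketch only; the actual unnest_timeslots may differ.
-- Assumes timeslots like: {"1": [["09:00","12:00"], ["14:00","18:00"]], ...}
create or replace function unnest_timeslots(slots jsonb)
returns table (dow int, start time, stop time)
language sql immutable as $$
  select (j.key)::int,          -- day of week
         (e.value ->> 0)::time, -- slot start
         (e.value ->> 1)::time  -- slot stop
  from jsonb_each(slots) j
  cross join jsonb_array_elements(j.value) e
$$;
```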
The query (A) runs in 440.497 ms.
When I replace the subquery with a temp table (B):
create temp table z11870 as (
select date, dow, timerange(start, stop) as times
from campaigns c
cross join generate_series(c.start_date, c.end_date, '1 day') d(date)
join unnest_timeslots(c.timeslots) t(dow, start, stop) on
t.dow=extract(dow from date)
where id=11870
);
select
device_id,
t.date,
timerange(t.start_time, t.end_time) * z.times as times
from device_timeslots t
join z11870 z on z.date=t.date
where t.date between '2024-04-26' and '2024-04-26'
group by device_id, t.date, t.start_time, t.end_time, z.times;
The query (B) runs in 48.160 ms.
Here is (B) execution plan:
GroupAggregate  (cost=70121.37..71282.14 rows=33165 width=124)
  Group Key: t.device_id, t.date, t.start_time, t.end_time, z.times
  ->  Sort  (cost=70121.37..70204.28 rows=33165 width=64)
        Sort Key: t.device_id, t.date, t.start_time, t.end_time, z.times, t.rank
        ->  Merge Join  (cost=67127.99..67631.11 rows=33165 width=64)
              Merge Cond: (z.date = t.date)
              ->  Sort  (cost=78.60..81.43 rows=1130 width=40)
                    Sort Key: z.date
                    ->  Seq Scan on z11870 z  (cost=0.00..21.30 rows=1130 width=40)
              ->  Sort  (cost=67049.39..67109.04 rows=23861 width=32)
                    Sort Key: t.date
                    ->  Bitmap Heap Scan on device_timeslots t  (cost=329.01..65314.41 rows=23861 width=32)
                          Recheck Cond: ((date >= '2024-04-26'::date) AND (date <= '2024-04-26'::date))
                          ->  Bitmap Index Scan on device_timeslots_date_index  (cost=0.00..323.05 rows=23861 width=0)
                                Index Cond: ((date >= '2024-04-26'::date) AND (date <= '2024-04-26'::date))
whereas the plan of (A) is:
GroupAggregate  (cost=401037.82..503755.82 rows=1467400 width=124)
  Group Key: t.device_id, t.date, t.start_time, t.end_time, (timerange(((t_1.value ->> 0))::time without time zone, ((t_1.value ->> 1))::time without time zone))
  ->  Sort  (cost=401037.82..404706.32 rows=1467400 width=96)
        Sort Key: t.device_id, t.date, t.start_time, t.end_time, (timerange(((t_1.value ->> 0))::time without time zone, ((t_1.value ->> 1))::time without time zone)), t.rank
        ->  Nested Loop  (cost=2.99..100268.62 rows=1467400 width=96)
              ->  Nested Loop  (cost=2.98..55962.20 rows=14674 width=64)
                    ->  Nested Loop  (cost=2.54..39.31 rows=500 width=40)
                          ->  Index Scan using campaigns_pkey on campaigns c  (cost=0.28..8.30 rows=1 width=355)
                                Index Cond: (id = 11870)
                          ->  Hash Join  (cost=2.26..26.01 rows=500 width=40)
                                Hash Cond: (EXTRACT(dow FROM d.date) = ((j.dow)::integer)::numeric)
                                ->  Function Scan on generate_series d  (cost=0.01..10.01 rows=1000 width=8)
                                ->  Hash  (cost=1.00..1.00 rows=100 width=64)
                                      ->  Function Scan on jsonb_each j  (cost=0.00..1.00 rows=100 width=64)
                    ->  Index Scan using device_timeslots_date_index on device_timeslots t  (cost=0.43..111.56 rows=29 width=32)
                          Index Cond: ((date = d.date) AND (date >= '2024-04-26'::date) AND (date <= '2024-04-26'::date))
              ->  Memoize  (cost=0.01..1.01 rows=100 width=32)
                    Cache Key: j.times
                    Cache Mode: binary
                    ->  Function Scan on jsonb_array_elements t_1  (cost=0.00..1.00 rows=100 width=32)
The Merge Join of (B) provides better timing than the Nested Loop of (A)...
The options I can think of:
O1) change the design: add a permanent table much like z11870
O2) hint the planner somehow to materialize the subquery — is there a way?
O3) something else?
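(On O2, one avenue I have not benchmarked: since PostgreSQL 12, a CTE declared AS MATERIALIZED acts as an optimization fence, which should keep the planner from inlining the subquery, much like the temp table but without the extra DDL. A sketch against the same schema:)

```sql
-- Untested sketch: force materialization of the campaign expansion
-- via an explicitly MATERIALIZED CTE (PostgreSQL 12+).
with z as materialized (
  select date, dow, timerange(start, stop) as times
  from campaigns c
  cross join generate_series(c.start_date, c.end_date, '1 day') d(date)
  join unnest_timeslots(c.timeslots) t(dow, start, stop)
    on t.dow = extract(dow from date)
  where id = 11870
)
select
  device_id,
  t.date,
  timerange(t.start_time, t.end_time) * z.times as times
from device_timeslots t
join z on z.date = t.date
where t.date between '2024-04-26' and '2024-04-26'
group by device_id, t.date, t.start_time, t.end_time, z.times;
```

One caveat: unlike an analyzed temp table, a materialized CTE carries no statistics, so the planner still has to guess its row count; the timing may land somewhere between (A) and (B).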
Cheers!
, Thierry