From: | Paul Pierce <paulpierce70(at)yahoo(dot)com> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Issue with partition elimination |
Date: | 2011-04-22 01:26:18 |
Message-ID: | 505432.93484.qm@web120209.mail.ne1.yahoo.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
This is a good one :)
Here is a "brief" description of our issue(Postgres 9.0):
Tables:
main fact table:
Table "public.parent_fact"
Column | Type |
----------------------+-----------------------------+-----------
etime | date | not null
pcamp_id | integer |
location_id | integer |
impressions | bigint |
clicks | int
this table partitioned by etime.
We are trying to build a report, which has last week numbers alongside with this
week numbers. For example: if today is Wednesday, I want to compare daily
numbers from last week 3 days (mon through wed) with this week 3 days(mon
through wed).
To accomplish that, we've decided to build a transformation table, which has two
columns:
Table "public.trans_last_week"
Column | Type | Modifiers
----------+-----------------------------+-----------
etime | date |
lw_etime | date |
So for each date(etime), we have lw_etime, which is essentially etime-7 days.
Here is the first query, which performs fine:
select a11.location_id AS location_id,
a11.pcamp_id AS pcamp_id,
sum(a11.clicks)
from parent_fact a11
where a11.etime between '2011-14-18' and '2011-04-20'
group by a11.location_id,
a11.pcamp_id
everything is good there - it calculates numbers from the current week and goes
to only 3 partitions to aggregate numbers.
Here is the second query:
select a11.location_id AS location_id,
a11.pcamp_id AS pcamp_id,
sum(a11.clicks)
from parent_fact a11
join trans_last_week a12
on (a11.etime = a12.lw_etime)
where a12.etime between '2011-14-18' and '2011-04-20'
group by a11.location_id,
a11.pcamp_id
Here it scans through all partitions in the parent_fact table and runs 3-4 times
slower.
What was noticed, that the only case when Postgres is actually going to execute
the query against the right partitions is query #1.
Is that by design? Second query join, will also result in 3 days(3 partitions)
This query (#3) also scans all partitions:
select a11.location_id AS location_id,
a11.pcamp_id AS pcamp_id,
sum(a11.clicks)
from parent_fact a11
where a11.etime in (select a12.etime from trans_last_week a12
where a11.etime = a12.lw_etime)
group by a11.location_id,
a11.pcamp_id
Thank you!
From | Date | Subject | |
---|---|---|---|
Next Message | drvillo | 2011-04-22 09:21:35 | Checkpoint execution overrun impact? |
Previous Message | Merlin Moncure | 2011-04-21 20:12:37 | Re: oom_killer |