From: Richard Huxton <dev(at)archonet(dot)com>
To: Scara Maccai <m_lists(at)yahoo(dot)it>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: totally different plan when using partitions
Date: 2009-08-13 09:22:24
Message-ID: 4A83DB50.9040607@archonet.com
Lists: pgsql-general
Scara Maccai wrote:
> Thank you for your reply. This makes partitions unusable for me...
> hope someone explains why this happens... this still looks like a bug
> to me... BTW the problem arises when adding the second "left outer
> join": when using only 1 partitioned table (that is, only 1 "left
> outer join") the 2 plans are identical.
>
> My answers follow.
>
>> Well, the first version was expecting about 400 rows, the second
>> 15000, so it's not surprising that they have different plans.
>
> Well, they're using exactly the same tables; I don't understand why
> one expects 400 rows and the other 15000....
Well, let's see. It's difficult to be certain without the table
definitions and sizes, but...
Both plans sequentially scan "cell_bsc_60_0610" because it will return
< 300 rows. That seems sensible, and the row estimate is right.
The fast one then uses the index "teststscell73_0610_1_pkey" to probe
for matches, and then probes again via "teststscell13_0610_1_pkey". Two
nested loops only make sense when a small number of rows match;
otherwise it gets expensive going back and forth between index and
table all the time. In this case we end up matching more rows than
expected (rows=60 loops=285 gives us ~17100 rows to check against the
second index), but not enough to slow us down.
The first plan estimated a cost of 33391 for its 408 rows, and the
second 70402 for 15982 rows. So the second plan's estimated cost is
double the first's, yet the second query takes four times longer to
complete. If the first query had estimated its row count correctly, it
would have *looked* more expensive than the second. So - with your
particular setup, PostgreSQL thinks index searches take longer than
they really do (or perhaps it thinks sorting is quicker, or both).
You might want to adjust your cost estimates (see ch 18.6 of the
manuals). Start by checking effective_cache_size. Then maybe reduce
random_page_cost. Small steps though - these settings will affect all
your other queries too.
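As a rough sketch (these are real settings, but the values below are
only illustrative - tune them against your own hardware and workload):

    -- Check what you're currently running with
    SHOW effective_cache_size;
    SHOW random_page_cost;

    -- Try new values per-session first, so nothing else is affected
    SET effective_cache_size = '2GB';  -- roughly shared_buffers + OS cache
    SET random_page_cost = 3.0;        -- small step down from the default 4.0

Once a setting proves itself, make it permanent in postgresql.conf and
reload.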
Also, if you have the time, try issuing "SET enable_mergejoin = off" and
re-running query #2. That will force the planner to choose another plan.
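Something like this, all in one session ("<query #2>" stands in for
your actual query text):

    SET enable_mergejoin = off;
    EXPLAIN ANALYZE <query #2>;
    RESET enable_mergejoin;  -- restore the default afterwards

If the forced plan turns out faster, that's further evidence the cost
estimates are off rather than the planner being unable to find the plan.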
Oh - if the partitions hold historical (unchanging) data it might be
worth CLUSTERing them on the pkey index too.
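For example (a sketch only - it borrows the partition and index names
from your plans, and uses the pre-8.4 CLUSTER syntax):

    -- Rewrite the partition in pkey order; takes a lock, so do it off-peak
    CLUSTER teststscell73_0610_1_pkey ON teststscell73_0610_1;
    ANALYZE teststscell73_0610_1;  -- refresh statistics afterwards

Since historical data never changes, the clustering never decays, so
it's a one-off cost.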
>> I'm not sure whether the planner is smart enough to cope with the
>> multiple tests on time vs the partitioning and realise it can use
>> your index on the partition.
>
> Sorry, didn't understand that...
The partitioning relies on proving that only partition P needs to be
considered; the more complicated the situation, the harder that proof
becomes. In this case I'm not sure that's what's happening, though - it
seems to narrow the partition down well enough.
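You can at least confirm the planner is allowed to do that pruning
(constraint_exclusion defaulted to off before 8.4):

    SHOW constraint_exclusion;
    SET constraint_exclusion = on;  -- 8.4 also accepts 'partition'
    -- then EXPLAIN the query and check only one child table is scanned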
>> I'm assuming the partition is defined as being "between '2006-10-01
>> 00:00:00' and '2006-10-06 00:00:00'" - strikes me as an odd
>> interval, and you don't say anywhere.
>
> Data is partitioned on a "4 tables per month" basis
But that's from the start of the 1st to the start of the 6th - five full
days rather than the 7+ you'd expect from four partitions per month. It
also *includes* the start of the 6th (<= not <), which suggests the
partitions overlap. What does the definition of your partition say
*exactly*?
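For comparison, a non-overlapping definition would use a half-open
range - something like this (the column name "ts" is a placeholder;
substitute your real column and boundary dates):

    -- Half-open interval [start, end): the end bound is excluded, so the
    -- next partition can safely start at exactly '2006-10-06 00:00:00'
    ALTER TABLE teststscell73_0610_1 ADD CONSTRAINT ts_range_check
      CHECK (ts >= '2006-10-01 00:00:00' AND ts < '2006-10-06 00:00:00');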
--
Richard Huxton
Archonet Ltd