Re: Query performance problems with partitioned tables

From: Fei Liu <fei(dot)liu(at)aepnetworks(dot)com>
To: Scott Marlowe <smarlowe(at)g2switchworks(dot)com>
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>, Andreas Haumer <andreas(at)xss(dot)co(dot)at>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Query performance problems with partitioned tables
Date: 2007-05-08 18:41:38
Message-ID: 4640C462.2040909@aepnetworks.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Scott Marlowe wrote:
> On Thu, 2007-05-03 at 21:37, Merlin Moncure wrote:
>
>> On 5/3/07, Fei Liu <fei(dot)liu(at)aepnetworks(dot)com> wrote:
>>
>>> Hello, Andreas, I too am having exactly the same issue as you do.
>>> Comparing my partitioned and plain table performance, I've found that
>>> the plain tables perform about 25% faster than partitioned table. Using
>>> 'explain select ...', I see that constraints are being used so in
>>> partitioned tables fewer rows are examined. But still partitioned tables
>>> are 25% slower, what a let down.
>>>
>> That's a little bit harsh. The main use of partitioning is not to
>> make the table faster but to make the maintenance easier. When
>> constraint exclusion works well for a particular query you can get a
>> small boost but many queries will break down in a really negative way.
>> So, you are sacrificing flexibility for easier maintenance. You have
>> to really be careful how you use it.
>>
>> The best case for partitioning is when you can logically divide up
>> your data so that you really only have to deal with one sliver of it
>> at a time...for joins and such. If the OP could force the constraint
>> exclusion (maybe by hashing the timestamp down to a period and using
>> that for where clause), his query would be fine. The problem is it's
>> not always easy to do that.
>>
>
> Agree++
>
> I've been testing partitioning for a zip code lookup thing that was
> posted here earlier, and I partitioned a 10,000,000 row set into about
> 400 partitions. I found that selecting a range of areas defined by x/y
> coordinates was faster without any indexes. The same selection with one
> big table and one big (x,y) index took 3 to 10 seconds typically, same
> select against the partitions with no indexes took 0.2 to 0.5 seconds.
>
> For that particular application, the only way to scale it was with
> partitioning.
>
In my particular case, I have 2 million records uniformly split up in 40
partitions. It's ranged data varying with time, each partition has one
month of data. Do you think this is a good candidate to seek performance
boost with partitioned tables?

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Scott Marlowe 2007-05-08 18:58:44 Re: Query performance problems with partitioned tables
Previous Message Pomarede Nicolas 2007-05-08 17:31:24 Re: truncate a table instead of vaccum full when count(*) is 0