Re: Thousands of tables versus on table?

From: Scott Marlowe <smarlowe(at)g2switchworks(dot)com>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: david(at)lang(dot)hm, Thomas Andrews <tandrews(at)soliantconsulting(dot)com>, Mark Lewis <mark(dot)lewis(at)mir3(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Thousands of tables versus on table?
Date: 2007-06-05 19:34:08
Message-ID: 4665BAB0.2020605@g2switchworks.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Gregory Stark wrote:
> "Scott Marlowe" <smarlowe(at)g2switchworks(dot)com> writes:
>
>
>> Sorry, I think I initially read your response as "Postgres doesn't really get
>> any faster by breaking the tables up" without the "like that" part.
>>
>
> Well breaking up the tables like that or partitioning, either way should be
> about equivalent really. Breaking up the tables and doing it in the
> application should perform even better but it does make the schema less
> flexible and harder to do non-partition based queries and so on.
>
True, but we can break it up by something other than the company name on
the survey, in this instance, and might find it far easier to manage by,
say, date range, company ID range, etc...
Plus with a few hand rolled bash or perl scripts we can maintain our
database and keep all the logic of partitioning out of our app. Which
would allow developers not wholly conversant in our partitioning scheme
to participate in development without the fear of them putting data in
the wrong place.
> Where the win in partitioning comes in is in being able to disappear some of
> the data entirely. By making part of the index key implicit in the choice of
> partition you get away with a key that's half as large. And in some cases you
> can get away with using a different key entirely which wouldn't otherwise have
> been feasible to index. In some cases you can even do sequential scans whereas
> in an unpartitioned table you would have to use an index (or scan the entire
> table).
>
Yeah, I found that out recently while I benchmarking a 12,000,000 row
geometric data set. Breaking it into 400 or so partitions resulted in
no need for indexes and response times of 0.2 or so seconds, where
before that I'd been in the 1.5 to 3 second range.

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Chander Ganesan 2007-06-05 20:11:51 Re: Append table
Previous Message Gregory Stark 2007-06-05 18:55:44 Re: Thousands of tables versus on table?