From: | Jeremy Harris <jgh(at)wizmail(dot)org> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: database size growing continously |
Date: | 2009-10-30 20:18:45 |
Message-ID: | 4AEB4A25.7080701@wizmail.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On 10/30/2009 08:01 PM, Greg Stark wrote:
> On Fri, Oct 30, 2009 at 12:53 PM, Anj Adu<fotographs(at)gmail(dot)com> wrote:
>> Any relational database worth its salt has partitioning for a reason.
>>
>> 1. Maintenance. You will need to delete data at some
>> point.(cleanup)...Partitions are the only way to do it effectively.
>
> This is true and it's unavoidably a manual process. The database will
> not know what segments of the data you intend to load and unload en
> masse.
>
>> 2. Performance. Partitioning offer a way to query smaller slices of
>> data automatically (i.e the query optimizer will choose the partition
>> for you) ...very large tables are a no-no in any relational
>> database.(sheer size has limitations)
>
> This I dispute. Databases are designed to be scalable and very large
> tables should perform just as well as smaller tables.
>
> Where partitions win for performance is when you know something about
> how your data is accessed and you can optimize the access by
> partitioning along the same keys. For example if you're doing a
> sequential scan of just one partition or doing a merge join of two
> equivalently partitioned tables and the partitions can be sorted in
> memory.
>
> However in these cases it is possible the database will become more
> intelligent and be able to achieve the same performance gains
> automatically. Bitmap index scans should perform comparably to the
> sequential scan of individual partitions for example.
>
So, on the becoming more intelligent front: PostgreSQL already does
some operations as background maintenance (autovacuum). Extending
this to de-bloat indices does not seem conceptually impossible, nor for
the collection of table-data statistics for planner guidance (also, why
could a full-table-scan not collect stats as a side-effect?). Further out,
how about the gathering of statistics on queries to guide the automatic
creation of indices? Or to set up a partitioning scheme on a previously
monolithic table?
- Jeremy
From | Date | Subject | |
---|---|---|---|
Next Message | Greg Stark | 2009-10-30 22:16:44 | Re: database size growing continously |
Previous Message | Anj Adu | 2009-10-30 20:11:06 | Re: database size growing continously |