Quick Links

Re: Thousands of tables versus on table?

From:	Thomas Andrews <tandrews(at)soliantconsulting(dot)com>
To:	Y Sidhu <ysidhu(at)gmail(dot)com>
Cc:	Gregory Stark <stark(at)enterprisedb(dot)com>, Mark Lewis <mark(dot)lewis(at)mir3(dot)com>, <pgsql-performance(at)postgresql(dot)org>
Subject:	Re: Thousands of tables versus on table?
Date:	2007-06-04 20:14:14
Message-ID:	C289EAD6.A79%tandrews@soliantconsulting.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

Linux 2.4.9, if I¹m reading this right.

=thomas

On 6/4/07 4:08 PM, "Y Sidhu" <ysidhu(at)gmail(dot)com> wrote:

> On 6/4/07, Thomas Andrews <tandrews(at)soliantconsulting(dot)com> wrote:
>>
>>
>>
>> On 6/4/07 3:43 PM, "Gregory Stark" <stark(at)enterprisedb(dot)com> wrote:
>>
>>> >
>>> > "Thomas Andrews" < tandrews(at)soliantconsulting(dot)com
>>> <mailto:tandrews(at)soliantconsulting(dot)com> > writes:
>>> >
>>>> >> I guess my real question is, does it ever make sense to create thousands
of
>>>> >> tables like this?
>>> >
>>> > Sometimes. But usually it's not a good idea.
>>> >
>>> > What you're proposing is basically partitioning, though you may not
>>> actually
>>> > need to put all the partitions together for your purposes. Partitioning's
>>> main
>>> > benefit is in the management of the data. You can drop and load partitions
>>> in
>>> > chunks rather than have to perform large operations on millions of
>>> records.
>>> >
>>> > Postgres doesn't really get any faster by breaking the tables up like
>>> that. In
>>> > fact it probably gets slower as it has to look up which of the thousands
>>> of
>>> > tables you want to work with.
>>> >
>>> > How often do you update or delete records and how many do you update or
>>> > delete? Once per day is a very low frequency for vacuuming a busy table,
>>> you
>>> > may be suffering from table bloat. But if you never delete or update
>>> records
>>> > then that's irrelevant.
>>
>> It looks like the most inserts that have occurred in a day is about 2000.
>> The responders table has 1.3 million records, the responses table has 50
>> million records. Most of the inserts are in the responses table.
>>
>>> >
>>> > Does reindexing or clustering the table make a marked difference?
>>> >
>>
>> Clustering sounds like it might be a really good solution. How long does a
>> cluster command usually take on a table with 50,000,000 records? Is it
>> something that can be run daily/weekly?
>>
>> I'd rather not post the schema because it's not mine - I'm a consultant. I
>> can tell you our vacuum every night is taking 2 hours and that disk IO is
>> the real killer - the CPU rarely gets higher than 20% or so.
>>
>> =thomas
>>
>>
>> ---------------------------(end of broadcast)---------------------------
>> TIP 5: don't forget to increase your free space map settings
>
>
> What OS are you running on?
>

In response to

Re: Thousands of tables versus on table? at 2007-06-04 20:08:38 from Y Sidhu

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Gregory Stark	2007-06-04 20:18:43	Re: Thousands of tables versus on table?
Previous Message	Y Sidhu	2007-06-04 20:08:38	Re: Thousands of tables versus on table?