Re: Smaller multiple tables or one large table?

From: Benedict Holland <benedict(dot)m(dot)holland(at)gmail(dot)com>
To: John R Pierce <pierce(at)hogranch(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Smaller multiple tables or one large table?
Date: 2012-06-15 18:58:32
Message-ID: CAD+mzoxuwowWpUHb+HOUGVjCzdXJMxXETeNMB9-8pKMtGvEkhg@mail.gmail.com
Lists: pgsql-general

Will the processes know that I have n tables whose definitions constrain their
primary keys? I am thinking of a table constraint specifying that the primary
key on each table falls within some boundary. That way a single process could
spawn one thread per table and leave the thread management to the OS. Assuming
it is well behaved, this should use every ounce of resource I throw at it:
instead of sequentially going through one large table, it would sequentially go
through 1 of n short tables in parallel with the k others. The results would
have to be aggregated, but with a large enough table the aggregation would pale
in comparison to the run time of a query split across several smaller tables.
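
For concreteness, here is a minimal sketch of what I have in mind, using the
inheritance-based partitioning that 9.0 already supports; all table and column
names below are made up for illustration:

    -- parent table; a SELECT against it scans all children
    CREATE TABLE measurements (
        id      bigint PRIMARY KEY,
        payload text
    );

    -- child tables, each constrained to a disjoint primary-key range
    CREATE TABLE measurements_p0 (
        CHECK (id >= 0 AND id < 1000000)
    ) INHERITS (measurements);

    CREATE TABLE measurements_p1 (
        CHECK (id >= 1000000 AND id < 2000000)
    ) INHERITS (measurements);

    -- indexes do not inherit, so each child gets its own
    CREATE INDEX measurements_p0_id_idx ON measurements_p0 (id);
    CREATE INDEX measurements_p1_id_idx ON measurements_p1 (id);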

The tables would have to be specified with a table-level constraint keeping the
primary key between two bounds. A view would then be created to manage all of
the small tables, with triggers handling insert and update operations. Selects
would have to go through the view, but that is really cheap compared to
updates, and this scheme should have the additional benefit that an update only
hits the specific table(s) involved.
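
One wrinkle: on 9.0 a view cannot have row-level triggers (INSTEAD OF triggers
on views only arrive in 9.1), so the usual pattern is to let the inheritance
parent play the role of the view and put the redirect trigger there. A
hypothetical sketch, continuing the made-up names above:

    -- route inserts on the parent into the matching child
    CREATE OR REPLACE FUNCTION measurements_insert() RETURNS trigger AS $$
    BEGIN
        IF NEW.id < 1000000 THEN
            INSERT INTO measurements_p0 VALUES (NEW.*);
        ELSE
            INSERT INTO measurements_p1 VALUES (NEW.*);
        END IF;
        RETURN NULL;  -- the row is already stored in a child; skip the parent
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER measurements_insert_trg
        BEFORE INSERT ON measurements
        FOR EACH ROW EXECUTE PROCEDURE measurements_insert();

    -- lets the planner skip children whose CHECK contradicts the WHERE
    -- clause ('partition' has been the default since 8.4, but worth checking)
    SET constraint_exclusion = partition;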

Basically, I don't see how this particular configuration breaks, and I wonder
whether PostgreSQL already has the ability to do this, as it seems very useful
for managing very large data sets.

Thanks,
~Ben

On Fri, Jun 15, 2012 at 2:42 PM, John R Pierce <pierce(at)hogranch(dot)com> wrote:

> On 06/15/12 11:34 AM, Benedict Holland wrote:
>
>> I am on postgres 9.0. I don't know the answer to what should be a fairly
>> straightforward question. I have several static tables which are very
>> large (on the order of 14 million rows and about 10GB). They are all
>> linked together through foreign keys and indexed on the columns which are
>> queried most often. While they are more or less static, update operations
>> do occur. This is not on a super fast computer: it has 2 cores with 8GB of
>> RAM, so I am not expecting queries against them to be very fast, but I am
>> wondering, in a structural sense, whether I should be dividing the tables
>> up into 1-million-row tables through constraints and a view. The potential
>> speedup could be quite large if postgresql split the queries into n table
>> chunks running on k cores and then aggregated all of the data for display
>> or further operation. Is there any documentation on making postgresql do
>> this, and is it worth it?
>>
>
> postgres won't do that; one query is one process. your application could
> conceivably run multiple threads, each with a separate postgres connection,
> and execute multiple queries in parallel, but it would have to do any
> aggregation of the results itself.
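>
> for instance (hypothetical child tables big_p0 and big_p1), each worker
> connection runs its own partial aggregate and the application combines
> the pieces:
>
>     -- connection 1
>     SELECT count(*), sum(amount) FROM ONLY big_p0 WHERE status = 'open';
>     -- connection 2
>     SELECT count(*), sum(amount) FROM ONLY big_p1 WHERE status = 'open';
>
> the application then adds the two counts and the two sums together itself.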
>
>
>
>> Also, is there a benefit to having one large table or many small tables
>> as far as indexes go?
>>
>
> small tables only help if you can query the specific table you 'know' has
> your data. for instance, if you have time-based data and put a month in
> each table, and you know a given query only needs to look at the current
> month, you can just query that one month's table.
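>
> as a sketch (table names made up): with monthly children events_2012_05,
> events_2012_06, and so on, a query for the current month hits one small
> table:
>
>     SELECT * FROM events_2012_06 WHERE user_id = 42;
>
> whereas a query against the parent has to consider every child unless
> constraint exclusion can prune them from the plan.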
>
>
>
> --
> john r pierce N 37, W 122
> santa cruz ca mid-left coast
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>
