From: Steve Crawford <scrawford(at)pinpointresearch(dot)com>
To: "Girish Bajaj" <gbajaj(at)tietronix(dot)com>, <pgsql-sql(at)postgresql(dot)org>
Subject: Re: Table Partitioning and Rules
Date: 2003-07-17 19:39:51
Message-ID: 200307171239.51837.scrawford@pinpointresearch.com
Lists: pgsql-sql
OK, so basically you are trying to keep a hundred-some-odd attributes on
everyone in the US. It's possible that a 150-column table is properly
normalized (I have a similar situation), but it is rare.
Suppose it really is properly normalized. You can still benefit from indexes
on just some of the columns by choosing those most commonly used in queries.
You may also want to research partial indexes (CREATE INDEX foo ... WHERE
bar = baz), which can under certain circumstances be far smaller and faster
than full indexes.
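For example, something along these lines (table and column names here are
illustrative, not from your actual schema):

    -- An ordinary index on a commonly queried column:
    CREATE INDEX contact_lastname_idx ON contact (lastname);

    -- A partial index that only covers rows matching its predicate;
    -- the planner can use it when a query's WHERE clause implies
    -- the same condition:
    CREATE INDEX contact_active_lastname_idx ON contact (lastname)
        WHERE status = 'active';

A query like SELECT ... FROM contact WHERE status = 'active' AND
lastname = 'Smith' can then use the much smaller partial index instead of
a full-table index.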
Breaking the table up won't avoid full table scans. In fact, as each
sub-table's file grows, it will probably fragment on disk much worse than a
single growing file would, and that, along with all the overhead of joining
the tables back together, will make things worse.
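To make that tradeoff concrete, the scheme under discussion amounts to
roughly the following (a minimal sketch with made-up names and abbreviated
columns):

    -- Three "partitions" holding the same columns:
    CREATE TABLE contact_a_g (id int, lastname text /* ...more cols... */);
    CREATE TABLE contact_h_n (id int, lastname text /* ... */);
    CREATE TABLE contact_o_z (id int, lastname text /* ... */);

    -- A view gluing them back together:
    CREATE VIEW contact_all AS
        SELECT * FROM contact_a_g
        UNION ALL
        SELECT * FROM contact_h_n
        UNION ALL
        SELECT * FROM contact_o_z;

Any query against the view still has to read all three tables, so you pay
three scans plus the UNION overhead instead of one scan (and a plain UNION,
as originally described, forces a duplicate-eliminating sort on top of that).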
Review your structure carefully. Plan on $$$ for the hardware.
Cheers,
Steve
On Thursday 17 July 2003 12:22 pm, Girish Bajaj wrote:
> The problem is that I'm worried about sequential scans. This particular
> table can have up to 150 cols and 250 million records. Now we have a
> reporting requirement that someone could select on ANY col and filter on
> any col as well. Meaning someone could do a SELECT on col number
> 1,2,310,1000 from contact where col num 75='X' and col num 139 = 'Y'.
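> In SQL terms, that query shape would be something like (column names
> purely illustrative):
>
>     SELECT col1, col2, col310, col1000
>     FROM contact
>     WHERE col75 = 'X' AND col139 = 'Y';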
>
> I can't possibly index all the cols in the table. So I thought I'd best
> manage the data by splitting up the table into multiple partitions and
> eventually, depending on application logic, only scan those tables that are
> necessary to scan sequentially instead of the whole big table.
>
> I'm getting a little confused here because eventually I would want to join
> against this gigantic 250 million row table as well, and that would be a
> real big problem causing loads of sequential scans, wouldn't it?
>
> Thanks,
> Girish
>
>
> -----Original Message-----
> From: Josh Berkus [mailto:josh(at)agliodbs(dot)com]
> Sent: Thursday, July 17, 2003 2:03 PM
> To: Richard Huxton; Girish Bajaj; pgsql-sql(at)postgresql(dot)org
> Subject: Re: [SQL] Table Partitioning and Rules
>
> Girish,
>
> > > Essentially I'm trying to store a person's information in a table in
> > > the database. Since we could have millions of people, with duplicates,
> > > I've decided we need to partition the table into segments where all
> > > people with the LastName starting from A to G will be in one table,
> > > H-N will be in another table, and O-Z in the third. I've created a
> > > VIEW that does a UNION on all the tables.
>
> This sounds hideously inefficient and a management headache besides. I
> think PostgreSQL will accept up to 2 billion rows in any one table, and
> splitting stuff into 3 tables will not improve your performance ... quite
> the opposite.
>
> Change your database design.