Re: Selecting across Multiple Tables

From: will trillich <will(at)serensoft(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Selecting across Multiple Tables
Date: 2002-01-10 17:41:20
Message-ID: 20020110114120.A28002@serensoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, Dec 05, 2001 at 02:43:07PM +0100, Antonio Fiol Bonnín wrote:
> I'm afraid that performance shoud be substantially better if you had a
> single table with all the attributes. Except for situations like:
>
> Contact_name, Company_name, Phone_number, Billing_address
>
> In this case, billing address is something "company-specific", and not
> "contact-specific". So I'd see something like:
>
> Table1: Contact_name, Phone_number, id_company
> Table2: id_company, Company_name, Billing_address
>
> If and only if you intend to have multiple contacts for each company. If
> you only intend to have one contact for each company, performance (and
> file size) will be better if you stick everything in the same table.
>
> Another situation that could lead to good results by splitting tables is
> the following:
>
> Original Table: Report_number, magnitudeA, magnitudeB, magnitudeC
>
> If reports usually only contain one magnitude, either A, B, or C, you
> may think of splitting that into three tables:
>
> Table A: Report_number, magnitudeA
> Table B: Report_number, magnitudeB
> Table C: Report_number, magnitudeC
>
> You may substantially improve disk usage, but it depends mostly on how
> your data is organised. However, performance does not seem to me that it
> may get improved by splitting tables, in general, as you will need (in
> general, again) to re-join the information to access it.
>
> Each case is different, and it depends much on your numbers.
>
> As a general advice:
>
> EXPLAIN SELECT ...
> And then analyze precisely what is happening. A long (high cost) seq
> scan is bad. But an index scan can also be bad, if it is not on the best
> possible index. I have recently optimised a request that used to take
> about 50 sec, and now it is sub-second, just by finding (and creating)
> the missing index.
>
> In your case, I suppose I should not need to say that if you have the
> split tables you should index all of them on the ID you use to access
> the data.

very helpful nudgings... nice general advice! we need more like
this. nice work, Antonio!

> I hope that helps, but I could be completely wrong.

:)

--
DEBIAN NEWBIE TIP #99 from Greg Wiley <greg(at)orthogony(dot)com>
:
Wondering WHERE TO SPECIFY BASH DEFAULTS? In case you are in X
and launching a term, .bashrc is automatically loaded (instead
of .bash_profile, .profile, .login) when bash is not a login
shell.

Also see http://newbieDoc.sourceForge.net/ ...

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2002-01-10 18:22:07 Re: Performance tips
Previous Message Bruce Momjian 2002-01-10 17:38:07 Re: Q about function