Re: Data warehousing requirements

From:	Josh Berkus <josh(at)agliodbs(dot)com>
To:	pgsql-performance(at)postgresql(dot)org
Cc:	Gabriele Bartolini <angusgb(at)tin(dot)it>, "Aaron Werman" <awerman2(at)hotmail(dot)com>
Subject:	Re: Data warehousing requirements
Date:	2004-10-07 22:50:20
Message-ID:	200410071550.20665.josh@agliodbs.com
Lists:	pgsql-performance

Gabriele,

> That's another interesting argument. Again, I had in mind the space
> efficiency principle and I decided to use null IDs for dimension tables if
> I don't have the information. I noticed though that in those cases I can't
> use any index and performances result very poor.

For one thing, this is a false optimization. In PostgreSQL a NULL in an INT
or BIGINT column saves at most the 4 or 8 bytes of that one value, and the
first NULL in a row can add a null bitmap to the row header. If you're going
to start counting bytes, make sure it's an informed count.

More importantly, you should never, ever allow null FKs on a star-topology
database. Null FKs force you to use LEFT OUTER JOINs to keep every fact row,
and outer joins are usually far less efficient than INNER JOINs because they
limit the join orders the planner can consider. The difference between having
20 outer joins for your data view, vs 20 regular joins, can easily be a
difference of 100x in execution time.
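
One common way to avoid null FKs is to give each dimension table a
placeholder row for "not known" and point the fact rows at it. The sketch
below is only an illustration: it uses Python's sqlite3 module rather than
PostgreSQL so that it runs anywhere, and the table names (dim_region,
fact_sales) and the choice of id 0 for the placeholder are assumptions, not
anything from your schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked

conn.executescript("""
CREATE TABLE dim_region (
    region_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
-- Row 0 is a placeholder meaning "region not known" (hypothetical convention).
INSERT INTO dim_region VALUES (0, 'Unknown'), (1, 'Europe'), (2, 'Americas');

CREATE TABLE fact_sales (
    sale_id   INTEGER PRIMARY KEY,
    -- NOT NULL with a default pointing at the placeholder row, so the FK
    -- column always matches exactly one dimension row.
    region_id INTEGER NOT NULL DEFAULT 0 REFERENCES dim_region(region_id),
    amount    INTEGER NOT NULL
);
INSERT INTO fact_sales (sale_id, region_id, amount) VALUES (1, 1, 100), (2, 2, 250);
-- Region not supplied: the default sends this row to 'Unknown'.
INSERT INTO fact_sales (sale_id, amount) VALUES (3, 40);
""")

# A plain inner join keeps every fact row; no LEFT OUTER JOIN is needed.
rows = conn.execute("""
    SELECT f.sale_id, r.name, f.amount
    FROM fact_sales AS f
    JOIN dim_region AS r ON r.region_id = f.region_id
    ORDER BY f.sale_id
""").fetchall()

for row in rows:
    print(row)
```

The same DDL pattern (NOT NULL, DEFAULT 0, REFERENCES) carries over to
PostgreSQL; whether 0 or some other key is a sensible placeholder depends on
how your dimension keys are generated.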

--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

In response to

Re: Data warehousing requirements at 2004-10-07 17:07:04 from Gabriele Bartolini

Responses

Re: Data warehousing requirements at 2004-10-08 02:43:33 from Tom Lane
