Re: Schema boggle...

From: Greg Stark <gsstark(at)mit(dot)edu>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Schema boggle...
Date: 2003-11-06 05:34:04
Message-ID: 87ad7a12nn.fsf@stark.dyndns.tv
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-sql


"Marc G. Fournier" <scrappy(at)postgresql(dot)org> writes:

> Actually, the use of schema's was my idea, to speed up some dreadfully
> slow queries dealing with traffic stats from a table that was growing
> painfully monolithic ... the idea is/was that it would be easier to
> backup/remove all data pertaining to a specific client if they decided to
> close their account ...

I would add an <aol> me too </aol> to the comment that this is a horrible idea
and will be an enormous pain. You're denormalizing your data in a way that
will cause you great pain.

What you're trying to implement is called "partitioned tables". And you're
right, bulk loading and deleting is one of the big advantages of partitioned
tables. But Postgres doesn't support partitioned tables, so trying to roll
your own is hard.

Using schemas seems like a bad impedance match here too, as you've found out
with the search_path. Inherited tables is a closer match, still a lot of
things won't be automatic, and primary keys may be a problem.

But what most people use as the closest approximation to partitioned tables in
Postgres is partial indexes. It lets you keep your indexes to a reasonable
size but still accelerate the poor-selectivity client_id column in parallel.

But you still would be storing all the records in a single table and would
have to do some big vacuuming whenever you delete a whole client. vacuum full
may be your friend.

I don't understand why the queries should be any slower dealing with the
normalized data in a single table versus the dernormalized tables in multiple
tables. The one big exception is any query doing "where client_id = ?" where
the latter allows the use of a sequential scan instead of an index scan.

--
greg

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Devrim GUNDUZ 2003-11-06 06:23:24 New developer site (Was: Re: [HACKERS] Changes to Contributor List)
Previous Message Tom Lane 2003-11-06 05:28:38 Re: equal() perf tweak

Browse pgsql-sql by date

  From Date Subject
Next Message Jerome Alet 2003-11-06 06:18:07 Re: avoid circular references
Previous Message Josh Berkus 2003-11-06 05:15:39 Re: avoid circular references