| From: | Stephan Szabo <sszabo(at)megazone23(dot)bigpanda(dot)com> | 
|---|---|
| To: | Alvaro Herrera <alvherre(at)dcc(dot)uchile(dot)cl> | 
| Cc: | johnnnnnn <john(at)phaedrusdeinus(dot)org>, <pgsql-performance(at)postgresql(dot)org>, <pgsql-general(at)postgresql(dot)org> | 
| Subject: | Re: [PERFORM] CLUSTER command | 
| Date: | 2002-12-13 02:11:50 | 
| Message-ID: | 20021212175208.B15052-100000@megazone23.bigpanda.com | 
| Lists: | pgsql-general pgsql-interfaces pgsql-performance | 
On Thu, 12 Dec 2002, Alvaro Herrera wrote:
> On Thu, Dec 12, 2002 at 04:03:47PM -0800, Stephan Szabo wrote:
> > On Thu, 12 Dec 2002, johnnnnnn wrote:
> >
> > > I think the code changes would be complicated. Just at a 30-second
> > > consideration, this would need to touch:
> > > - all sql (selects, inserts, updates, deletes)
> > > - vacuuming
> > > - indexing
> > > - statistics gathering
> > > - existing clustering
> >
> > I think his idea was to treat it similarly to the way that the
> > system treats tables >2G with .N files.  The only thing is that
> > I don't believe the code that deals with that would be particularly
> > easy to change to do it, though I've only taken a cursory look at
> > what I think is the place that does that (storage/smgr/md.c). Some sort of
> > good partitioning system would be nice though.
>
> I don't think this is doable without a huge amount of work.  The storage
> manager doesn't know anything about what is in a page, let alone a
> tuple.  And it shouldn't, IMHO.  Upper levels don't know how pages are
> organized on disk; they don't know about .1 segments and so on, and they
> shouldn't.

Which is part of why I said it wouldn't be easy to change to do that:
there's no good way to communicate that information to the storage
manager.  Like I said, I didn't look deeply, but I had to look, because
you can never tell whether there are bits of old university code floating
around that do mostly what you want but haven't been exercised in years.

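
To make the layering point concrete, here is a minimal sketch of the kind of mapping a segmented storage layer performs. It is not the actual md.c logic; the function name `locate_block`, the 8 kB block size, and the 1 GB segment size are all assumptions chosen for illustration. What matters is that the only input is a block number: nothing about the tuples stored in the block is available at this level.

```python
from typing import Tuple

BLOCK_SIZE = 8192          # assumed page size in bytes (illustrative)
SEGMENT_BYTES = 1 << 30    # assumed segment file size (illustrative)
BLOCKS_PER_SEGMENT = SEGMENT_BYTES // BLOCK_SIZE


def locate_block(relfile: str, block_number: int) -> Tuple[str, int]:
    """Map a block number to (segment file name, byte offset in that file).

    Hypothetical helper: the first segment keeps the base file name and
    later segments get a ".N" suffix.
    """
    segment, block_in_segment = divmod(block_number, BLOCKS_PER_SEGMENT)
    name = relfile if segment == 0 else f"{relfile}.{segment}"
    return name, block_in_segment * BLOCK_SIZE
```

Choosing a segment by column value instead of by block number would require passing tuple-level information down into a function like this, which is exactly the information the storage manager does not have.
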
> I think this kind of partition doesn't buy too much.  I would really
> like to have some kind of auto-clustering, but it should be implemented
> in some upper level; e.g., by leaving some empty space in pages for
> future tuples, and arranging the whole heap again when it runs out of
> free space somewhere.  Note that this is very far from the storage
> manager.

Auto clustering would be nice.

I think Jean-Luc's suggested partitioning mechanism is a win for certain
usage patterns and not for most others.  Since the usage pattern I can
think of (a very large table with a small number of breakdowns, where your
conditions are primarily on those breakdowns) isn't even remotely in the
domain of things I've worked with, I can't say whether avoiding the index
reads for the table would really end up being a win.
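
For readers unfamiliar with the usage pattern in question, the following toy sketch shows the idea. It is an in-memory model only, not how PostgreSQL stores anything, and `PartitionedTable`, `insert`, and `scan` are hypothetical names. Rows are grouped by a low-cardinality "breakdown" column, so a query that filters on that column reads a single group and needs no index lookup.

```python
from collections import defaultdict


class PartitionedTable:
    """Toy table that keeps rows in one list per breakdown value."""

    def __init__(self, key):
        self.key = key                       # name of the breakdown column
        self.partitions = defaultdict(list)  # breakdown value -> rows

    def insert(self, row):
        # Route the row to the partition for its breakdown value.
        self.partitions[row[self.key]].append(row)

    def scan(self, key_value=None):
        # With a condition on the breakdown column, read one partition only.
        if key_value is not None:
            return list(self.partitions.get(key_value, []))
        # Without one, every partition has to be read.
        return [row for part in self.partitions.values() for row in part]
```

The sketch also shows the cost side: a query that does not filter on the breakdown column gains nothing and still visits every partition.
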