From: | Jeff Davis <pgsql(at)j-davis(dot)com> |
---|---|
To: | Gregory Stark <stark(at)enterprisedb(dot)com> |
Cc: | Zeugswetter Andreas ADI SD <Andreas(dot)Zeugswetter(at)s-itsolutions(dot)at>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Neil Conway <neilc(at)samurai(dot)com>, "Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>, pgsql-hackers(at)postgreSQL(dot)org |
Subject: | Re: [PATCHES] Proposed patch: synchronized_scanning GUCvariable |
Date: | 2008-01-29 17:48:15 |
Message-ID: | 1201628895.10057.677.camel@dogma.ljc.laika.com |
Lists: | pgsql-hackers pgsql-patches |

On Tue, 2008-01-29 at 10:55 +0000, Gregory Stark wrote:
> "Zeugswetter Andreas ADI SD" <Andreas(dot)Zeugswetter(at)s-itsolutions(dot)at> writes:
>
> > Sorry, but I don't grok this at all. Why the heck would we care if we have 2
> > parts of the table perfectly clustered, because we started in the middle ?
> > Surely our stats collector should recognize such a table as perfectly
> > clustered. Does it not ? We are talking about one breakage in the readahead
> > logic here, this should only bring the clustered property from 100% to some
> > 99.99% depending on table size vs readahead window.
>
> Well clusteredness is used or could be used for a few different heuristics,
> not all of which this would be quite as well satisfied as readahead. But for

Can you give an example? Treating a file as a circular structure does
not impose any significant cost that I can see.

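
To make the "circular structure" point concrete, here is a minimal sketch of
the block order a scan would follow if it starts mid-table and wraps around.
The function names and the 8-block table are made up for illustration; this
is not the PostgreSQL implementation. It also counts how many steps in that
order are not "next block", which is the single break in readahead mentioned
in the quoted text.

```python
def circular_scan_order(num_blocks, start_block):
    """Yield block numbers start_block..num_blocks-1, then 0..start_block-1.

    Hypothetical helper: models a table file as a ring of blocks.
    """
    for offset in range(num_blocks):
        yield (start_block + offset) % num_blocks


def count_sequential_breaks(order):
    """Count steps that do not move to the immediately following block."""
    order = list(order)
    return sum(1 for prev, cur in zip(order, order[1:]) if cur != prev + 1)


if __name__ == "__main__":
    order = list(circular_scan_order(8, 5))
    print(order)                           # block visit order
    print(count_sequential_breaks(order))  # breaks in sequential access
```

Under these assumptions, a scan that starts anywhere but block 0 has exactly
one non-sequential step, at the wraparound, regardless of table size.
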
> It would be great if Postgres picked up a serious statistics geek who could
> pipe up in discussions like this with "how about using the Euler-Jacobian
> Centroid" or some such thing. If you have any suggestions of what metric to
> use and how to calculate the info we need from it that would be great.

Agreed.

> One suggestion from a long way back was scanning the index and counting how
> many times the item pointer moves backward to an earlier block. That would

An interesting metric. As you say, we really need a statistician to
definitively say what the correct metrics are, and what kind of sampling
we need to make good estimates.

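
As a rough illustration of the metric suggested in the quoted text, the
sketch below counts how often the heap block number moves backward while
walking item pointers in index order. It assumes the block numbers have
already been extracted into a plain sequence; the function name and the
sample inputs are hypothetical, and nothing here says how the count should
be turned into a cost estimate.

```python
def count_backward_block_moves(block_numbers):
    """Count transitions to an earlier heap block.

    block_numbers: heap block number of each item pointer, in index order
    (an assumed input format, not a PostgreSQL API).
    """
    backward = 0
    prev = None
    for blk in block_numbers:
        if prev is not None and blk < prev:
            backward += 1
        prev = blk
    return backward


if __name__ == "__main__":
    print(count_backward_block_moves([0, 0, 1, 2, 2, 3]))  # fully ordered heap
    print(count_backward_block_moves([5, 6, 7, 0, 1, 2]))  # ordered, but wrapped
    print(count_backward_block_moves([3, 1, 2, 0]))        # scattered
```

With this definition, a table that is perfectly ordered except for one
wraparound point registers a single backward move.
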
> still require a full index scan though. And it doesn't help for columns which
> aren't indexed though I'm not sure we need this info for columns which aren't
> indexed. It's also not clear how to interpolate from that the amount of random
> access a given query would perform.

I don't think "clusteredness" has any meaning at all in postgres for an
unindexed column. I suppose a table could be clustered without an index,
but currently there's no way to do that in postgresql.

Regards,
Jeff Davis