From: | "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: slower merge join on sorted data chosen over nested loop |
Date: | 2005-10-17 19:17:20 |
Message-ID: | 20051017191720.GI86144@pervasive.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Oct 10, 2005 at 09:10:38PM -0400, Tom Lane wrote:
> I recall thinking about changing the formula to more accurately count
> the number of pages touched; but I desisted when I realized that it
> would drastically increase the cost estimates for index searches, and
> that's surely the wrong direction to be going in. We really can't do
> that until we have some usable infrastructure to allow estimating the
> probability that those pages are already in cache. In the meantime,
> the tweaks under discussion here amount to assuming that the metapage
> and all upper pages are always in cache.
>
> The current cost estimate to fetch a single tuple via indexscan is
> basically random_page_cost + 2, plus some near-negligible cpu costs.
> Not counting the metapage would take that down to random_page_cost + 1.
> This would definitely move the goalposts, particularly for people
> who run with smaller-than-default random_page_cost, but I'm not sure
> if it's enough to solve the problem.
>
> Also, all this is really just a sideshow; I think the main problem for
> join estimation is that because we cost an inner-indexscan nestloop as
> taking N times the cost of one execution of the inner scan, we fail to
> account for cacheing effects in the table itself as well as the index.
> That would cut into the random_page_cost part of the cost estimate as
> well as the index cost. For all the reasons I've cited, it's pretty
> hard to justify reducing the estimate for an indexscan standing on its
> own --- but in the context of a nestloop join, it's easier to make a
> case.
One thing I noticed the last time I looked at all of this was that index
correlation seems to be severely mis-weighted in scan calculations.
http://archives.postgresql.org/pgsql-performance/2005-04/msg00669.php
has more info on this.
I suspect that until that issue is addressed other changes to the cost
estimates won't make any useful difference.
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
From | Date | Subject | |
---|---|---|---|
Next Message | Jim C. Nasby | 2005-10-17 19:55:03 | Re: slower merge join on sorted data chosen over |
Previous Message | Jonah H. Harris | 2005-10-17 19:10:56 | Re: PostgreSQL roadmap for 8.2 and beyond. |