Quick Links

Re: 7.4beta2 vs 7.3.3

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Gaetano Mendola <mendola(at)bigfoot(dot)com>
Cc:	"pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: 7.4beta2 vs 7.3.3
Date:	2003-09-20 15:12:32
Message-ID:	23810.1064070752@sss.pgh.pa.us
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Gaetano Mendola <mendola(at)bigfoot(dot)com> writes:
> What about the wrong row expected ?

After I looked more closely, I realized that the planner hasn't any hope
of getting a really correct answer on that. You've got

WHERE ... ud.id_class = cd.id_class AND
cd.id_provider = 39;

Now the planner doesn't have any problem figuring out that this will
select one "cd" row, but the number of "ud" rows matched varies wildly
depending on which one cd.id_class value happens to be involved.
Without actually pre-executing the query it has no way to know which
value will be involved, and so it has to fall back to a default
estimate, which is IIRC (number of rows in ud)/(number of distinct values).
Then there's a similar problem with estimating the number of rows
retrieved from ul.

> Anyway if the rows expected are 400 ( instead of 43 ) why not an index
> scan, with 400 rows on 1500000 seems a good choise do an index scan,
> isn't it ?

The trouble here is that because of the very skewed data statistics (in
both ud and ul), the planner can't really be sure that this query will
retrieve only a few rows from either table. There are other values in
both tables that would have retrieved vastly more data. The hash join
may be slower for this particular id_provider value, but it won't get
very much worse with other id_provider values --- a nestloop plan will.

Your idea of reducing id_provider to id_class using a separate query
seems like a good one to me --- that will allow the planner to generate
different plans depending on which id_class value is involved.

regards, tom lane

In response to

Re: 7.4beta2 vs 7.3.3 at 2003-09-20 10:27:20 from Gaetano Mendola

Responses

Re: 7.4beta2 vs 7.3.3 at 2003-09-20 19:48:37 from Gaetano Mendola

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Neil Conway	2003-09-20 15:34:24	Re: why postgresql is so slow?
Previous Message	Tom Lane	2003-09-20 14:42:05	Re: semtimedop instead of setitimer/semop/setitimer