From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Karl Wright <kwright(at)metacarta(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Performance query about large tables, lots of concurrent access |
Date: | 2007-06-19 14:56:15 |
Message-ID: | 28036.1182264975@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Karl Wright <kwright(at)metacarta(dot)com> writes:
> [2007-06-18 09:39:49,797]ERROR Plan: Index Scan using i1181764142395 on
> intrinsiclink (cost=0.00..14177.29 rows=5 width=253)
> [2007-06-18 09:39:49,797]ERROR Plan: Index Cond: ((jobid = $2) AND
> ((childidhash)::text = ($3)::text))
> [2007-06-18 09:39:49,797]ERROR Plan: Filter: ((childid = ($4)::text)
> AND ((isnew = ($5)::bpchar) OR (isnew = ($6)::bpchar)))
>> In this case it looks like the planner is afraid that that's exactly
>> what will happen --- a cost of 14177 suggests that several thousand row
>> fetches are expected to happen, and yet it's only predicting 5 rows out
>> after the filter.
> Well, that's odd, because the hash in question that it is using is the
> SHA-1 hash of a URL. There's essentially one row per URL in this table.
What about isnew?
Also, how many rows do *you* expect out of the query? The planner is
not going to be aware of the hashed relationship between childidhash
and childid --- it'll think those are independent conditions which they
evidently aren't. So it may be that the query really does retrieve
thousands of rows, and the rows=5 estimate is bogus because it's
double-counting the selectivity of the childid condition.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Chris Browne | 2007-06-19 15:11:19 | Re: Maintenance question / DB size anomaly... |
Previous Message | Joshua_Kramer | 2007-06-19 14:54:06 | Re: [pgsql-advocacy] [PERFORM] Postgres VS Oracle |