From: | Marti Raudsepp <marti(at)juffo(dot)org> |
---|---|
To: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Bogus nestloop rows estimate in 8.4.7 |
Date: | 2012-05-28 07:45:06 |
Message-ID: | CABRT9RARmcPq-2_Q84xXcp7JCND0SdkGBwRB4t1oX5rdXTZNKA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi list,
This bug isn't causing me any immediate problems -- the plan works out
well regardless -- but PostgreSQL 8.4.7 is somehow overestimating the
number of rows coming from a nestloop join, when joining 2 large
partitioned tables. Maybe it's been fixed in more recent versions,
sadly it's an EOL Linux distro and I have no immediate plans to
upgrade.
It's estimating to join 135957 x 281 rows, but the product is somehow
2946151270877
In reality, it's joining 132577 x ~1 rows to get 133116 results
QUERY PLAN
GroupAggregate (cost=852067259163.57..977278688175.85
rows=2946151270877 width=36)
-> Sort (cost=852067259163.57..859432637340.77 rows=2946151270877 width=36)
Sort Key: b.banner_id, b.client_body_id,
(COALESCE(b.partner_body_id, a.partner_body_id)), b.space_id,
b.campaign_id, a.evt_type_id
-> Nested Loop (cost=0.00..213859871.55 rows=2946151270877 width=36)
Join Filter: (a.request_id = b.request_id)
-> Append (cost=0.00..5905.69 rows=135957 width=20)
-> Index Scan using "XIF01request" on request a
(cost=0.00..8.27 rows=1 width=20)
Index Cond: ((request_time >= '2012-05-28
09:00:00'::timestamp without time zone) AND (request_time <
'2012-05-28 10:00:00'::timestamp without time zone))
-> Index Scan using "XIF01request_1222" on
request_1222 a (cost=0.00..5897.42 rows=135956 width=20)
Index Cond: ((request_time >= '2012-05-28
09:00:00'::timestamp without time zone) AND (request_time <
'2012-05-28 10:00:00'::timestamp without time zone))
-> Append (cost=0.00..1569.44 rows=281 width=32)
-> Seq Scan on request_data b (cost=0.00..11.30
rows=130 width=32)
-> Index Scan using
"IX_relationship64_request_d_c_1150" on request_d_c_1150 b
(cost=0.00..9.56 rows=2 width=32)
Index Cond: (b.request_id = a.request_id)
*snip lots of partition index scans*
Query:
SELECT '2012-05-28T09:00:00', count(*),
uniq(sort(array_agg(visitor_id))), banner_id, client_body_id,
partner_body_id, space_id, campaign_id, evt_type_id FROM stats_request
WHERE stats_request.request_time >= '2012-05-28T09:00:00' AND
stats_request.request_time < (timestamp '2012-05-28T09:00:00' +
interval E'1 hour')::timestamp
GROUP BY banner_id, client_body_id, partner_body_id, space_id,
campaign_id, evt_type_id ORDER BY banner_id, client_body_id,
partner_body_id, space_id, campaign_id, evt_type_id;
Full EXPLAIN ANALYZE is attached.
Regards,
Marti
Attachment | Content-Type | Size |
---|---|---|
explain-analyze.txt | text/plain | 36.1 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Marti Raudsepp | 2012-05-28 08:18:22 | Re: Bogus nestloop rows estimate in 8.4.7 |
Previous Message | Sandro Santilli | 2012-05-28 06:48:21 | Re: Interrupting long external library calls |