Weird behaviour in planner (PostgreSQL v 9.2.14)

From: Maxim Boguk <maxim(dot)boguk(at)gmail(dot)com>
To: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Weird behaviour in planner (PostgreSQL v 9.2.14)
Date: 2015-12-11 14:57:23
Message-ID: CAK-MWwQzUMCR4D7vUWgK47d__dfFpzk9D-F1LqxAxXgTNEv=7Q@mail.gmail.com
Lists: pgsql-general

Hi,

I found a very weird behaviour on the planner side: an estimation error of
roughly 700,000,000 rows.

Situation (with explain analyze):

EXPLAIN ANALYZE
SELECT * FROM person2obj
WHERE p2o_id IN (
    SELECT p2o_id::bigint FROM (
        SELECT * FROM (
            SELECT column1 AS p2o_id FROM (
                VALUES ('2056892'), ...    -- up to 199 values total
            ) AS __CDP_VALUES__
        ) AS __CDP_DATA__
    ) AS __TARGET__
);

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=2.99..16316.34 rows=199 width=58) (actual time=0.196..1.202 rows=198 loops=1)
   ->  HashAggregate  (cost=2.99..4.98 rows=199 width=32) (actual time=0.160..0.205 rows=199 loops=1)
         ->  Values Scan on "*VALUES*"  (cost=0.00..2.49 rows=199 width=32) (actual time=0.003..0.088 rows=199 loops=1)
   ->  Index Scan using pk_person2obj on person2obj  (cost=0.00..81.96 rows=1 width=58) (actual time=0.004..0.004 rows=1 loops=199)
         Index Cond: (p2o_id = ("*VALUES*".column1)::bigint)

The estimate looks pretty reasonable.

However, once the value list reaches 200 entries (or more), the planner
switches to a completely different (and very weird) estimate of roughly
700,000,000 rows:

QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=3.00..16398.33 rows=714143698 width=58) (actual time=0.200..1.239 rows=200 loops=1)
   ->  HashAggregate  (cost=3.00..5.00 rows=200 width=32) (actual time=0.165..0.201 rows=200 loops=1)
         ->  Values Scan on "*VALUES*"  (cost=0.00..2.50 rows=200 width=32) (actual time=0.004..0.090 rows=200 loops=1)
   ->  Index Scan using pk_person2obj on person2obj  (cost=0.00..81.96 rows=1 width=58) (actual time=0.004..0.004 rows=1 loops=200)
         Index Cond: (p2o_id = ("*VALUES*".column1)::bigint)

All the estimates look OK until the final Nested Loop's row estimate of
roughly 700,000,000.

PS: the person2obj table contains ~1.4 billion tuples; p2o_id is its primary key.
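
For what it's worth, 714,143,698 is almost exactly half of that ~1.4 billion,
which looks like the planner falling back to a default 50% semi-join
selectivity once the VALUES list reaches 200 rows (the same number as the
planner's DEFAULT_NUM_DISTINCT), though I have not traced the source to
confirm it. A quick way to check that reading (only a sketch; it assumes
pg_class.reltuples is reasonably fresh):

-- If this comes out near 0.5, the 714M figure is just "half the table",
-- i.e. a default selectivity rather than anything derived from statistics.
SELECT 714143698::float8 / reltuples AS implied_selectivity
FROM pg_class
WHERE relname = 'person2obj';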

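A possible workaround (a sketch only, not tested against this schema) is to
hand the planner something it can size directly: either pass the ids as a
bigint[] and filter with = ANY(), or stage them in a temp table and ANALYZE
it so the join sees real n_distinct statistics:

-- Sketch 1: constant array instead of a 200-row VALUES list
-- (the id list below is abbreviated).
EXPLAIN ANALYZE
SELECT * FROM person2obj
WHERE p2o_id = ANY (ARRAY[2056892, 2056893 /* ... rest of the ids ... */]::bigint[]);

-- Sketch 2: analyzed temp table joined against the big table.
CREATE TEMP TABLE target_ids (p2o_id bigint PRIMARY KEY);
INSERT INTO target_ids VALUES (2056892) /* , ... rest of the ids ... */;
ANALYZE target_ids;

EXPLAIN ANALYZE
SELECT p.* FROM person2obj p JOIN target_ids t USING (p2o_id);
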
--
Maxim Boguk
Senior Postgresql DBA
http://www.postgresql-consulting.ru/ <http://www.postgresql-consulting.com/>

Phone RU: +7 910 405 4718
Phone AU: +61 45 218 5678

LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk
Jabber: maxim(dot)boguk(at)gmail(dot)com

"People problems are solved with people.
If people cannot solve the problem, try technology.
People will then wish they'd listened at the first stage."
