Re: Row estimates off by two orders of magnitude with hstore

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Patrick Krecker <patrick(at)judicata(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Row estimates off by two orders of magnitude with hstore
Date: 2015-06-10 22:08:10
Message-ID: CAHyXU0ybx+aCNs3ZceRkRt-KF_tfAVeZ8iWpF1+0HhK_D5FT_Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Wed, Jun 10, 2015 at 4:37 PM, Patrick Krecker <patrick(at)judicata(dot)com> wrote:
> On Wed, Jun 10, 2015 at 2:08 PM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>> On Wed, Jun 10, 2015 at 3:55 PM, Patrick Krecker <patrick(at)judicata(dot)com> wrote:
>>> OK. Well, fortunately for us, we have a lot of possible solutions this
>>> problem, and it sounds like actually getting statistics for attributes
>>> ? 'reference' is not realistic. I just wanted to make sure it wasn't
>>> some configuration error on our part.
>>>
>>> Can anyone explain where exactly the estimate for that clause comes from?
>>>
>>> I tried adding an index and I don't think it improved the estimation,
>>> the planner still thinks there will be 9k rows as a result of type_id
>>> = 23 and attributes ? 'reference'. [1]. It might make the pathological
>>> plan less likely though. It's not clear to me that it reduces the risk
>>> of a pathological plan to zero.
>>
>> no, but done in conjunction with disabling managing out nestloops and
>> materliaze query plans, nestloops (say, via SET LOCAL) it will
>> probably be fast and future proof..
>>
>> merlin
>
> Wouldn't wrapping it in an optimization fence (e.g. SELECT * FROM
> (...) AS t WHERE t.attributes ? 'reference') have the same effect as
> disabling materialize, but allow the planner to optimize the inner
> query however it wants?

yes, but
select * from (query) q;

is not an optimization fence. the server is smarter than you and I and
will immediately flatten that back out :-). however,

select * from (query ... OFFSET 0) q;
and the more portable
with data as (query) select ... from query;

can fix up the estimates. they are both materialization fences
essentially though.

merlin

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message ben.play 2015-06-11 08:56:21 Re: How to reduce writing on disk ? (90 gb on pgsql_tmp)
Previous Message Patrick Krecker 2015-06-10 21:37:00 Re: Row estimates off by two orders of magnitude with hstore