Re: Query run in 27s with 15.2 vs 37ms with 14.6

From: Charles <peacech(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: Query run in 27s with 15.2 vs 37ms with 14.6
Date: 2023-02-21 02:07:31
Message-ID: CABthHP-eDmpBgnP5nnw9=TKpRPbTjgVyJx=y6E7pq530=kJ-qw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Tue, Feb 21, 2023 at 9:03 AM Stephen Frost <sfrost(at)snowman(dot)net> wrote:

> Greetings,
>
> On Mon, Feb 20, 2023 at 20:54 Charles <peacech(at)gmail(dot)com> wrote:
>
>> On Tue, Feb 21, 2023 at 3:58 AM Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>>
>>> Greetings,
>>>
>>> * Charles (peacech(at)gmail(dot)com) wrote:
>>> > Wrapping the query with a select * from (...) t where length(code) = 4
>>> puts
>>> > the execution time back to 27 seconds.
>>> >
>>> > This is a bit unexpected since I expect that the result from the inner
>>> > query to be executed first and then filtered.
>>>
>>> It's really not- PG will (correctly) attempt to pull in such subselects
>>> into the overall optimization, which is generally better for everyone.
>>> If you want to force it, you can use a WITH MATERIALIZED CTE, or throw
>>> in an 'OFFSET 0' as a hack into your sub-select, but really it's a much
>>> better idea to generate extended stats on what you're filtering as has
>>> been suggested, or come up with a better data representation where
>>> you're not doing a search on a 'length()' as you are.
>>>
>>>
>>>
>> Generating extended statistics on the expression (length(code)) causes
>> the planner to generate has join which runs in 183ms (still slower than
>> 37ms on 14.6). Using materialized cte (no extended stats) results in nested
>> loop that runs in 229ms. I think I'll revert back to postgresql 14 for now.
>> Thank you for the suggestions.
>>
>
> As pointed out elsewhere, the query costs seem pretty close and so you’re
> likely going to get with the bad plan at some point on 14 also. Have you
> tried tweaking the statistics target? Might help. Though really, as was
> said before, modeling the data better likely would help a lot.
>
>
I have used 14.6 for probably a year running this query multiple times a
day (>50x) and have never experienced a slow response time.

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message David Rowley 2023-02-21 02:22:03 Re: Query run in 27s with 15.2 vs 37ms with 14.6
Previous Message Stephen Frost 2023-02-21 02:02:50 Re: Query run in 27s with 15.2 vs 37ms with 14.6