From: | Charles <peacech(at)gmail(dot)com> |
---|---|
To: | Stephen Frost <sfrost(at)snowman(dot)net> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: Query run in 27s with 15.2 vs 37ms with 14.6 |
Date: | 2023-02-21 02:07:31 |
Message-ID: | CABthHP-eDmpBgnP5nnw9=TKpRPbTjgVyJx=y6E7pq530=kJ-qw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
On Tue, Feb 21, 2023 at 9:03 AM Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> Greetings,
>
> On Mon, Feb 20, 2023 at 20:54 Charles <peacech(at)gmail(dot)com> wrote:
>
>> On Tue, Feb 21, 2023 at 3:58 AM Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>>
>>> Greetings,
>>>
>>> * Charles (peacech(at)gmail(dot)com) wrote:
>>> > Wrapping the query with a select * from (...) t where length(code) = 4
>>> puts
>>> > the execution time back to 27 seconds.
>>> >
>>> > This is a bit unexpected since I expect that the result from the inner
>>> > query to be executed first and then filtered.
>>>
>>> It's really not- PG will (correctly) attempt to pull in such subselects
>>> into the overall optimization, which is generally better for everyone.
>>> If you want to force it, you can use a WITH MATERIALIZED CTE, or throw
>>> in an 'OFFSET 0' as a hack into your sub-select, but really it's a much
>>> better idea to generate extended stats on what you're filtering as has
>>> been suggested, or come up with a better data representation where
>>> you're not doing a search on a 'length()' as you are.
>>>
>>>
>>>
>> Generating extended statistics on the expression (length(code)) causes
>> the planner to generate has join which runs in 183ms (still slower than
>> 37ms on 14.6). Using materialized cte (no extended stats) results in nested
>> loop that runs in 229ms. I think I'll revert back to postgresql 14 for now.
>> Thank you for the suggestions.
>>
>
> As pointed out elsewhere, the query costs seem pretty close and so you’re
> likely going to get with the bad plan at some point on 14 also. Have you
> tried tweaking the statistics target? Might help. Though really, as was
> said before, modeling the data better likely would help a lot.
>
>
I have used 14.6 for probably a year running this query multiple times a
day (>50x) and have never experienced a slow response time.
From | Date | Subject | |
---|---|---|---|
Next Message | David Rowley | 2023-02-21 02:22:03 | Re: Query run in 27s with 15.2 vs 37ms with 14.6 |
Previous Message | Stephen Frost | 2023-02-21 02:02:50 | Re: Query run in 27s with 15.2 vs 37ms with 14.6 |