| From: | Charles <peacech(at)gmail(dot)com> |
|---|---|
| To: | Stephen Frost <sfrost(at)snowman(dot)net> |
| Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-bugs(at)lists(dot)postgresql(dot)org |
| Subject: | Re: Query run in 27s with 15.2 vs 37ms with 14.6 |
| Date: | 2023-02-21 02:07:31 |
| Message-ID: | CABthHP-eDmpBgnP5nnw9=TKpRPbTjgVyJx=y6E7pq530=kJ-qw@mail.gmail.com |
| Lists: | pgsql-bugs |
On Tue, Feb 21, 2023 at 9:03 AM Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> Greetings,
>
> On Mon, Feb 20, 2023 at 20:54 Charles <peacech(at)gmail(dot)com> wrote:
>
>> On Tue, Feb 21, 2023 at 3:58 AM Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>>
>>> Greetings,
>>>
>>> * Charles (peacech(at)gmail(dot)com) wrote:
>>> > Wrapping the query with a select * from (...) t where length(code) = 4
>>> puts
>>> > the execution time back to 27 seconds.
>>> >
>>> > This is a bit unexpected since I expect that the result from the inner
>>> > query to be executed first and then filtered.
>>>
>>> It's really not - PG will (correctly) attempt to pull such subselects
>>> up into the overall optimization, which is generally better for everyone.
>>> If you want to force it, you can use a WITH ... AS MATERIALIZED CTE, or throw
>>> in an 'OFFSET 0' as a hack into your sub-select, but really it's a much
>>> better idea to generate extended stats on what you're filtering, as has
>>> been suggested, or come up with a better data representation where
>>> you're not doing a search on a 'length()' as you are.
>>>
>>>
>>>
>> Generating extended statistics on the expression (length(code)) causes
>> the planner to generate a hash join, which runs in 183ms (still slower than
>> 37ms on 14.6). Using a materialized CTE (no extended stats) results in a nested
>> loop that runs in 229ms. I think I'll revert to PostgreSQL 14 for now.
>> Thank you for the suggestions.
>>
>
> As pointed out elsewhere, the query costs seem pretty close, so you're
> likely going to end up with the bad plan at some point on 14 also. Have you
> tried tweaking the statistics target? Might help. Though really, as was
> said before, modeling the data better likely would help a lot.
>
>
I have used 14.6 for probably a year running this query multiple times a
day (>50x) and have never experienced a slow response time.