From: | "Henrik Peinar (nodeSWAT(dot)com)" <henrik(at)nodeswat(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-general(at)lists(dot)postgresql(dot)org |
Subject: | Re: Help needed to understand query planner regression with incremental sort |
Date: | 2023-05-31 20:45:20 |
Message-ID: | CAEwi-F_HrFf5ug_FyWLB=Vtb7L_j1QiuXjY65Dw+0oRwoWR2gA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Thank you for taking the time to look this through.
After reading your answer it obviously makes perfect sense, I was just
thrown off by a fact that when it started happening, it happened on every
query execution (for the same test query that I used). But I failed to
think about trying different search keywords to see if that made any
difference in the query plan / estimations.
I'll investigate this further and will try to clean the indexes up and make
sure the filter has correct index to use.
And yes, the schema attachment seems a bit off, the DDL copy didn't include
the ordering for the fields inside composite indexes I think, so that's why
they appear the same.
Very much appreciated of your time.
With best,
Henrik.
On Wed, May 31, 2023 at 6:31 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Henrik Peinar (nodeSWAT.com)" <henrik(at)nodeswat(dot)com> writes:
> > *Quick description: *After upgrading our Aurora PostgreSQL engine from
> v11
> > to v15, one of our often run SQL's started taking tens of seconds instead
> > of running sub 100ms. Explain analyze showed that the query planner had
> > switched to using incremental sort. Running ANALYZE on the table fixed
> the
> > issue temporarily, but it came back after few days.
>
> Hmm. I think it's quite accidental that you get one plan over the other,
> because it looks like the key difference is something the planner doesn't
> account for. In the fast case you have
>
> -> Index Scan using products_pinned_at_listed_at_ix on products
> (cost=0.43..3929423.12 rows=26523 width=1463) (actual time=2.325..32.872
> rows=61 loops=1)
> Filter: ((desc_tsv @@ '''nike'':*'::tsquery) AND (status =
> 'published'::prod_status))
> Rows Removed by Filter: 3376
>
> The index is only being used to produce ordered output here: the filter
> condition isn't related to the index. And what we see is that the
> query is fast because the desired rows are found in the first 3376+61
> rows visited in this direction. Meanwhile in the slow case you have
>
> -> Index Scan Backward using products_pinned_at_ix on
> products (cost=0.43..1172249.47 rows=26139 width=1460) (actual
> time=5.263..5203.180 rows=18411 loops=1)
> Filter: ((desc_tsv @@ '''nike'':*'::tsquery) AND
> (status = 'published'::prod_status))
> Rows Removed by Filter: 5415895
>
> Again, the filter condition isn't exploiting the index, we're just using
> the index to (partially) satisfy the ORDER BY. This one takes a long time
> because it has to trawl through 5415895+61 rows before reaching the LIMIT.
>
> So AFAICS, the runtime differential is mostly/entirely because the rows
> satisfying the filter condition are located near one end of the range of
> pinned_at. That is not a correlation that the planner knows anything
> about, so it's unable to see that these two ways of scanning the table
> will have significantly different costs. Moreover, I think you'd be
> mistaken to draw any great conclusions from this specific example,
> because with some other search term(s) the results might be totally
> different due to the required rows not falling in the same place.
>
> What I'd think about if this type of query is important is to set up
> an index that can actually be used to satisfy the filter condition,
> so that you're not forcing it into "scan the whole table till you
> find the rows you want". It looks like you already have such an
> index, ie a GIN index on the desc_tsv column, although I don't trust
> that your schema attachment is actually accurate because if it is
> then you have a bunch of duplicative indexes. You might try
> dropping the other indexes to see if you can coerce the planner
> into using that one, and then seeing what the cost estimate is.
>
> regards, tom lane
>
From | Date | Subject | |
---|---|---|---|
Next Message | Lian Jiang | 2023-05-31 20:57:39 | Re: speed up full table scan using psql |
Previous Message | Mayer, Nicholas J | 2023-05-31 20:41:09 | RE: EXTERNAL: Re: Question - Does PostgreSQL have an Evaluation Assurance Level? |