From: | Cosmin Prund <cprund(at)gmail(dot)com> |
---|---|
To: | Pgsql Performance <pgsql-performance(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Bad query plan decision when using multiple column index - postgresql uses only first column then filters |
Date: | 2020-01-16 18:15:09 |
Message-ID: | CAGU4dz-jq678=qccivjxJsrfG+jb0H-tLuHnPoJ=6BszDKJikQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Hello Michael and hello again Tom, sorry for mailing you directly. I just
hit Reply in gmail - I expected the emails to have a reply-to=Pgsql.
Apparently they do not.
Running the same query with a different "Ver" produces a proper plan.
Here's a non-redacted example (Ver=91):
EXPLAIN (ANALYZE, BUFFERS) select "IdRand", "IdColoana", "Valoare" from
"LucrareBugetDate" where ("LucrareBugetVersiuneId" = 91) and
("LucrareBugetDateId" in (10,11));
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using
"IX_LucrareBugetDate_LucrareBugetVersiuneId_LucrareBugetDateId" on
"LucrareBugetDate" (cost=0.56..4.95 rows=2 width=13) (actual
time=3.617..3.631 rows=2 loops=1)
Index Cond: (("LucrareBugetVersiuneId" = 91) AND ("LucrareBugetDateId" =
ANY ('{10,11}'::integer[])))
Buffers: shared hit=9 read=3
Planning time: 0.223 ms
Execution time: 3.663 ms
(5 rows)
I have reindex everything, not just this INDEX.
"reltuples" for this table is 41712436.
> I'd be curious of the fraction in the MCVs frequency list in stats
indicates that rows with Ver = 92 are rare and therefore the index on
only Ver column is sufficient to find the rows quickly.
There are 25 valid values for "Ver" in this database. I ran the query for
all of them. The only one miss-behaving is "92". I ran the query with
random values for Ver (invalid values), the query plan always attempts to
use the index using both values.
I looked into "most_common_values" in pg_stats, this value (92) is not in
that list.
Finally I ran "ANALYZE" again and now the problem went away. Running the
query with Ver=92 uses the proper plan. I'm not happy with this - I know I
haven't solved the problem (I've ran ANALYZE multiple times before).
On Thu, 16 Jan 2020 at 19:00, Michael Lewis <mlewis(at)entrata(dot)com> wrote:
> Does the behavior change with different values of Ver column? I'd be
> curious of the fraction in the MCVs frequency list in stats indicates that
> rows with Ver = 92 are rare and therefore the index on only Ver column is
> sufficient to find the rows quickly. What is reltuples for this table by
> the way?
>
> I also wonder if the situation may be helped by re-indexing the "index on
> both columns" to remove any chance of issues on bloat in the index. Which
> order are the columns by the way? If Ver is first, is there also an index
> on only id column?. Since you aren't on v12, you don't get to re-index
> concurrently but I assume you know the work around of create concurrently
> (different name), drop concurrently (old one), and finally rename new index.
>
From | Date | Subject | |
---|---|---|---|
Next Message | Laurenz Albe | 2020-01-16 18:20:12 | Re: Bad query plan decision when using multiple column index - postgresql uses only first column then filters |
Previous Message | Tom Lane | 2020-01-16 17:27:39 | Re: Bad query plan decision when using multiple column index - postgresql uses only first column then filters |