Re: Re: Re: Re: Re: Different execution plan between PostgreSQL 8.4 and 12.11

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: gzh <gzhcoder(at)126(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Re: Re: Re: Re: Different execution plan between PostgreSQL 8.4 and 12.11
Date: 2022-10-11 08:21:42
Message-ID: CAFj8pRCqi4KdMMvk1yO9Mx8mPZUSctvPjLWeG5Epp9Ryhah-kQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

út 11. 10. 2022 v 10:01 odesílatel gzh <gzhcoder(at)126(dot)com> napsal:

> Thank you for providing the requested information.
>
>
> The WebSite has been used for many years, and this upgrade is only a
> version upgrade of the PostgreSQL database.
> My customer does not want to modify the SQL because it will increase the
> cost of the project(All SQL that contains a LIMIT clause needs to be
> analyzed and checked).
>
> Is there no other way to solve the problem?
>

I don't know about any alternative

Regards

Pavel

>
>
>
>
> At 2022-10-11 13:24:12, "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com> wrote:
>
>
>
> út 11. 10. 2022 v 7:08 odesílatel gzh <gzhcoder(at)126(dot)com> napsal:
>
>> Hi, Pavel
>>
>>
>> > The LIMIT clause changes total cost. This is a very aggressive clause.
>> And
>>
>> > although it is absolutely useless in this case, Postgres does not have
>> any
>>
>> > logic for removing it. Postgres doesn't try to fix developer's mistakes.
>>
>> Sorry,I didn't understand what you mean.
>>
>> Couldn't the LIMIT clause be used like the SQL statement below?
>>
>>
>> >> new=# explain analyze select 2 from analyze_word_reports where (cseid =
>>
>> >> 94) limit 1;
>>
>
> there was query
>
> SELECT aggregate() FROM xx LIMIT 1
>
>
>> This SQL statement is no problem under PostgreSQL 8.4, the index works
>> well.
>>
>>
>>
> The optimizer is under nonstop change. And you can expect from any new
> release
>
> 75% queries are without change, 20% queries are faster, and 5% queries are
> slower
>
> The optimization is based on statistics and estimations, and searching for
> the optimal solution in space of all solutions. In any version there are
> smaller or bigger changes of estimation methods, and between old 8.4 and 12
> there are big changes in possibilities of how the query can be executed. So
> there is a higher possibility to find some really fast queries, but there
> is a higher possibility to find some local optimum or slow query too.
> Usually the optimizer is smarter (what is the benefit), but more sensitive
> too (what is the cost). You cannot expect the same result, when the data
> and algorithm is changed in any version. Postgres doesn't garant the
> immutability of execution plans.
>
> The clause LIMIT with low LIMIT value can be problematic in more cases.
> The model in Postgres expects data are uniformly stored in the table
> (heap), but the reality can be different. The common trick in these cases
> is using OFFSET 0 clause like
>
> SELECT * FROM (SELECT * FROM foo LIMIT 1000 OFFSET 0) s LIMIT 10.
>
>
>
>
>
>
>
>>
>>
>>
>>
>> At 2022-10-11 12:13:47, "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com> wrote:
>>
>>
>>
>> út 11. 10. 2022 v 6:05 odesílatel gzh <gzhcoder(at)126(dot)com> napsal:
>>
>>>
>>> Hi, Pavel
>>>
>>> Thank you for your reply.
>>>
>>>
>>> > the LIMIT clause is in this case totally useless and messy, and maybe
>>> can
>>>
>>> > negative impacts optimizer
>>>
>>> Yes. After removing the LIMIT clause, the performance is improved.
>>>
>>> The execution plan shows that the index worked.
>>>
>>> We've noticed it, but I don't want to fix the problem by modifying the
>>> SQL until I find the cause.
>>>
>>
>> The LIMIT clause changes total cost. This is a very aggressive clause.
>> And although it is absolutely useless in this case, Postgres does not have
>> any logic for removing it. Postgres doesn't try to fix developer's mistakes.
>>
>>
>>
>>>
>>>
>>>
>>>
>>> At 2022-10-11 11:32:48, "Pavel Stehule" <pavel(dot)stehule(at)gmail(dot)com> wrote:
>>>
>>>
>>>
>>> út 11. 10. 2022 v 5:13 odesílatel gzh <gzhcoder(at)126(dot)com> napsal:
>>>
>>>> Hi, Tom
>>>> Thank you for your reply.
>>>>
>>>> > When you're asking for help, please don't give us vague statements
>>>>
>>>> > like "doesn't seem to work".
>>>>
>>>> I understand.
>>>>
>>>>
>>>> > Did the plan (including rowcount
>>>>
>>>> > estimates) change at all? To what? How far off is that rowcount
>>>>
>>>> > estimate, anyway --- that is, how many rows actually have cseid = 94?
>>>>
>>>> Please refer to the new execution plan (PostgreSQL 12.11) below.
>>>>
>>>>
>>>> new=# show enable_seqscan;
>>>>
>>>> enable_seqscan
>>>>
>>>> ----------------
>>>>
>>>> on
>>>>
>>>> (1 行)
>>>>
>>>>
>>>> new=# select count(*) from analyze_word_reports;
>>>>
>>>> count
>>>>
>>>> ----------
>>>>
>>>> 21331980
>>>>
>>>> (1 行)
>>>>
>>>>
>>>> new=# select count(*) from analyze_word_reports where (cseid = 94);
>>>>
>>>> count
>>>>
>>>> ---------
>>>>
>>>> 1287156
>>>>
>>>> (1 行)
>>>>
>>>>
>>>> new=# explain analyze select count(2) from analyze_word_reports where
>>>> (cseid = 94) limit 1;
>>>>
>>>>
>>>> QUERY PLAN
>>>>
>>>>
>>>>
>>>>
>>>
>>> the LIMIT clause is in this case totally useless and messy, and maybe
>>> can negative impacts optimizer
>>>
>>> Regards
>>>
>>> Pavel
>>>
>>>
>>>
>>>>
>>>> ----------------------------------------------------------------------------------------------------------------------------------
>>>>
>>>>
>>>> --------------------------------------------------------------------------
>>>>
>>>> Limit (cost=65184.06..65184.07 rows=1 width=8) (actual
>>>> time=123.713..133.035 rows=1 loops=1)
>>>>
>>>> -> Finalize Aggregate (cost=65184.06..65184.07 rows=1 width=8)
>>>> (actual time=123.712..133.033 rows=1 loops=1)
>>>>
>>>> -> Gather (cost=65183.85..65184.06 rows=2 width=8) (actual
>>>> time=123.548..133.024 rows=3 loops=1)
>>>>
>>>> Workers Planned: 2
>>>>
>>>> Workers Launched: 2
>>>>
>>>> -> Partial Aggregate (cost=64183.85..64183.86 rows=1
>>>> width=8) (actual time=119.495..119.496 rows=1 loops=3)
>>>>
>>>> -> Parallel Index Only Scan using
>>>> analyze_word_reports_index_cseid on analyze_word_reports (cost=0.56..6290
>>>>
>>>> 9.34 rows=509805 width=0) (actual time=0.031..98.706 rows=429052
>>>> loops=3)
>>>>
>>>> Index Cond: (cseid = 94)
>>>>
>>>> Heap Fetches: 1287156 Planning Time: 0.122
>>>> ms Execution Time: 133.069 ms
>>>>
>>>> (11 行)
>>>>
>>>>
>>>> new=# explain analyze select 2 from analyze_word_reports where (cseid =
>>>> 94) limit 1;
>>>>
>>>> QUERY
>>>> PLAN
>>>>
>>>>
>>>>
>>>>
>>>> ----------------------------------------------------------------------------------------------------------------------------------
>>>>
>>>> ---
>>>>
>>>> Limit (cost=0.00..0.43 rows=1 width=4) (actual
>>>> time=2156.964..2156.966 rows=1 loops=1)
>>>>
>>>> -> Seq Scan on analyze_word_reports (cost=0.00..528550.75
>>>> rows=1223533 width=4) (actual time=2156.962..2156.964 rows=1 loops=
>>>>
>>>> 1)
>>>>
>>>> Filter: (cseid = 94)
>>>>
>>>> Rows Removed by Filter: 18320180 Planning Time: 0.086 ms
>>>> Execution Time: 2156.985 ms
>>>>
>>>> (6 行)
>>>>
>>>>
>>>>
>>>> > If the estimate is far off, then increasing the table's statistics
>>>>
>>>> > target might help.
>>>>
>>>> Thank you for your advice.
>>>>
>>>> Please tell me how to set the table's statistics up to improve
>>>> performance.
>>>>
>>>>
>>>> new=# select oid from pg_class where relname = 'analyze_word_reports';
>>>>
>>>> oid
>>>>
>>>> -------
>>>>
>>>> 16429
>>>>
>>>> (1 行)
>>>>
>>>>
>>>> new=# select attrelid,attname,attstattarget from pg_attribute where
>>>> attrelid=16429 and attname='cseid';
>>>>
>>>> attrelid | attname | attstattarget
>>>>
>>>> ----------+---------+---------------
>>>>
>>>> 16429 | cseid | -1
>>>>
>>>> (1 行)
>>>>
>>>>
>>>> > Another thing that would be worth checking is whether
>>>>
>>>> > "set enable_seqscan = off" prods it to choose the plan you want.
>>>>
>>>> > If not, then there's something else going on besides poor estimates.
>>>>
>>>> "set enable_seqscan = off" works, and the performance is greatly
>>>> improved, which is almost the same as PostgreSQL 8.4.
>>>>
>>>> The enable_seqscan(PostgreSQL 8.4) is on, will this change have an
>>>> unknown effect on other queries?
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> At 2022-10-10 10:45:54, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>> >gzh <gzhcoder(at)126(dot)com> writes:
>>>> >> I've run analyze(not vacuum analyze), but it doesn't seem to work.
>>>> >
>>>> >When you're asking for help, please don't give us vague statements
>>>> >like "doesn't seem to work". Did the plan (including rowcount
>>>> >estimates) change at all? To what? How far off is that rowcount
>>>> >estimate, anyway --- that is, how many rows actually have cseid = 94?
>>>> >
>>>> >If the estimate is far off, then increasing the table's statistics
>>>> >target might help.
>>>> >
>>>> >Another thing that would be worth checking is whether
>>>> >"set enable_seqscan = off" prods it to choose the plan you want.
>>>> >If not, then there's something else going on besides poor estimates.
>>>> >
>>>> > regards, tom lane
>>>>
>>>>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message gzh 2022-10-11 09:17:22 Re: Different execution plan between PostgreSQL 8.4 and 12.11
Previous Message gzh 2022-10-11 08:01:41 Re:Re: Re: Re: Re: Different execution plan between PostgreSQL 8.4 and 12.11