Re: Postgres using nested loops despite setting enable_nestloop to false

From: Frits Jalvingh <jal(at)etc(dot)to>
To: "pgsql-performa(dot)" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Postgres using nested loops despite setting enable_nestloop to false
Date: 2020-11-19 11:52:42
Message-ID: CAKhTGFX1-8Osh=UXhQDLXBRAJFokxM7TVgK4Hru=Dk+1ovV6iQ@mail.gmail.com
Lists: pgsql-performance

I found out that setting:
set join_collapse_limit = 14;
set from_collapse_limit = 14;
in addition to disabling nested loops does produce a viable plan, with
only the nested loop needed for the "tijd" cross join remaining as a basic
part down low... The original values for both were 12. It does seem risky
to raise them, though, as the planner could then have on the order of 14!
join orders to consider...
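For what it's worth, the raised limits can be scoped to a single transaction with SET LOCAL, so the rest of the workload keeps the cheaper default planning. A minimal sketch (the problematic query itself is elided):

```sql
BEGIN;
-- SET LOCAL reverts automatically at COMMIT/ROLLBACK,
-- so only this one query pays the larger planning cost.
SET LOCAL join_collapse_limit = 14;
SET LOCAL from_collapse_limit = 14;
SET LOCAL enable_nestloop = off;
-- ... run the problematic query here ...
COMMIT;
```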

It does feel a bit like throwing dice...

I assume the bad plan is being made by the geqo (genetic query optimizer)
planner. Is there a way to discourage it from using those nested loops?
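If GEQO is indeed the culprit, one thing worth trying (an untested sketch) is to disable it for the session so the exhaustive planner is always used; note that planning time grows quickly with the number of relations:

```sql
-- GEQO kicks in once a query joins geqo_threshold or more relations
-- (default 12), and its randomized search can miss the good plan.
SHOW geqo_threshold;

-- Force exhaustive (deterministic) planning for this session:
SET geqo = off;
```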

Regards,

Frits

On Tue, Nov 17, 2020 at 5:42 PM Frits Jalvingh <jal(at)etc(dot)to> wrote:

> Hello Tom, thanks for your help!
>
> I understand that the "time" table cross join needs a nested loop. Indeed
> that nested loop is present in all plans generated.
> But it is the _second_ (topmost) nested loop that is the issue. Once the
> time table has been joined it should be possible to do something else for
> that second nested loop. This is proven by that query on 9.6 (which has
> only one nested loop for that exact same query, on almost the same database
> content as the Postgresql 13 one). Even on Postgresql 13 a correct plan is
> made in another database (exact same structure, different data); I have
> attached the plan that is made there too.
> All databases that make a plan without the second nested loop also finish
> the query within a reasonable time period (16 seconds on the 9.6 server).
> On the 13 server with the nested loops plan the process times out after 2
> hours.
>
> As far as the row counts go: yes, this database is not by far the biggest
> one, so the row counts are less. It also depends on what query we actually
> run (we can have hundreds of them on different tables, and not all tables
> are that big).
>
> I disabled nested_loops not just for fun, I disabled it because with it
> enabled many of the queries effectively hang, because their plan estimate
> expects only a few rows while in reality there are millions. Leaving
> nested loops enabled makes lots of the generated queries fail, even on
> smaller datasets.
>
> I have no idea of how to get rid of those inequality queries, except by
> not using SQL and doing them by hand in code.. That would prove to be
> disastrous for performance as I'd have to read all those datasets
> completely... Do you have an idea on how to do that better?
>
> Regards,
> Frits
>
>
> On Tue, Nov 17, 2020 at 5:21 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>> Frits Jalvingh <jal(at)etc(dot)to> writes:
>> > I have attached both plans, both made with set enable_nestloop = false
>> in
>> > the attachments.
>>
>> The reason why you're getting a nested loop is that the planner has no
>> other choice. The "tijd" table has no join conditions that would be
>> amenable to hash- or merge-joining it to something else, because both
>> of those join methods require a plain equality join condition. AFAICS
>> in a quick look, all of tijd's join conditions look more like
>>
>> Join Filter: ((di04238cluster.dv_start_dts <= tijd.einddatum) AND
>> (di04238cluster.dv_end_dts > tijd.einddatum))
>>
>> which is not amenable to anything except brute force cross-join-and-
>> test-the-condition.
>>
>> Given that, it's likely that "enable_nestloop = false" is making things
>> worse not better, by artificially distorting the plan shape.
>>
>> Seeing the large number of joins involved, I wonder what your
>> geqo_threshold, join_collapse_limit, and from_collapse_limit settings
>> are, and whether you can get a better plan by increasing them.
>>
>> The planner doesn't seem to think that any of these joins involve
>> a very large number of rows, so I doubt that your work_mem setting
>> is very relevant. However, are these rowcount estimates accurate?
>> You claimed upthread that you were dealing with hundreds of millions
>> of rows, but it's impossible to credit that cost estimates like
>>
>> -> Seq Scan on s_h_cluster_ssm di01905cluster (cost=0.00..155.05
>> rows=1072 width=24)
>> Filter: (soort = 'FIN'::text)
>>
>> correspond to scanning large tables.
>>
>> In the end, I fear that finding a way to get rid of those
>> inequality join conditions may be your only real answer.
>>
>> regards, tom lane
>>
>
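A possible follow-up to Tom's closing point: if the validity intervals were expressed as range types, the inequality pair becomes a containment test that a GiST index can serve. The join would still be a nested loop, but the inner side becomes a cheap parameterized index scan instead of a full scan per row. This is only a sketch; the table and column names are taken from the quoted plan fragments, and the timestamp types are assumptions:

```sql
-- Index the validity interval of each cluster row as a range.
-- tstzrange is inclusive-lower/exclusive-upper by default, which
-- matches the original "dv_start_dts <= x AND dv_end_dts > x" pair.
CREATE INDEX ON di04238cluster USING gist (
    tstzrange(dv_start_dts, dv_end_dts)
);

-- The original join filter
--   di04238cluster.dv_start_dts <= tijd.einddatum
--   AND di04238cluster.dv_end_dts > tijd.einddatum
-- becomes a range-containment condition the GiST index can answer:
SELECT c.*
FROM tijd t
JOIN di04238cluster c
  ON tstzrange(c.dv_start_dts, c.dv_end_dts) @> t.einddatum;
```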
