Re: Plan weirdness. A sort produces more rows than the node beneath it

From: Dane Foster <studdugie(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: psql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Plan weirdness. A sort produces more rows than the node beneath it
Date: 2023-08-04 16:41:28
Message-ID: CA+Wxin+8NEuMCwYzdjV=Eb0uA1bNfqZGASznGKf9Bzwp0CiP-w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

>
> Hmm. The planner should avoid using a merge join if it knows that
> to be true. Maybe analyze'ing that table would prompt it to use
> some other join method?

The planner has updated stats on the table and wants to use a nested loop:

https://explain.dalibo.com/plan/3814d5356cc82528

But the nested loop version is around 8 seconds slower so I forced the
issue. But thanks to this conversation I now understand what's happening
with the row count. This understanding helped make the nested loops' plan
easier to understand. Unfortunately, there doesn't seem to be any hope for
the merge join variant in terms of being easily understood. The uninitiated
sees a scan node and its parent sort node and their brain defaults to
thinking: the sort node will produce the same number of rows as the node
feeding it.

Cheers,

Dane

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message An, Hongguo (CORP) 2023-08-08 23:07:04 Function call very slow from JDBC/java but super fast from DBear
Previous Message Tom Lane 2023-08-04 15:31:57 Re: Plan weirdness. A sort produces more rows than the node beneath it