Re: BUG #16824: Planner chooses poor path on query with Merge Join and pagination

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: jacobmitash(at)gmail(dot)com
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #16824: Planner chooses poor path on query with Merge Join and pagination
Date: 2021-01-14 00:16:39
Message-ID: 1343668.1610583399@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

PG Bug reporting form <noreply(at)postgresql(dot)org> writes:
> My understanding of this ID pagination is that it should be very quick as it
> needs only find the ID in the index, and then scan the next 50 entries.

Well, that's what it's doing, so far as the "sub s" scan is concerned.
Your beef is with the subsequent join.

> 1. Why is the Merge Join performing so slowly?
> It seems to be because the planner doesn't recognize that it can apply the
> subscription_id index condition on the inner table. If I explicitly tell it:
> "AND si.subscription_id > '7ca1...'", then it applies an index condition and
> is almost instant.

We don't attempt to infer derived inequalities. Given "a = b AND b = c",
the planner will deduce "a = c". However, given "a = b AND b > c", we
do not deduce "a > c". This is an empirical decision based on the
frequency with which such a deduction would help versus the planner
cycles that would be spent looking for such cases.

> 2. Why does the planner believe that Merge Join (as-is) is optimal here?

The cost is estimated to be slightly lower. It might be right, so
far as the time to run the join to completion (without a LIMIT) is
concerned. Large nestloop joins tend to suck :-(. But the reason
that it then makes the wrong choice with the LIMIT applied,
fundamentally, is that the fraction of the total cost that will
actually be incurred with the LIMIT present is nonlinear, and it
doesn't know that. Doing better is a research problem.

In short, there's nothing here that I'd call a bug that we're likely
to fix anytime soon. In the meantime, if you can improve matters
by manually injecting the extra inequality, that seems like the
thing to do.

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Etsuro Fujita 2021-01-14 02:56:08 Re: BUG #16807: Assert failed in postgres_fdw/estimate_path_cost_size with an empty foreign table
Previous Message PG Bug reporting form 2021-01-13 22:13:55 BUG #16824: Planner chooses poor path on query with Merge Join and pagination