From: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Michał Kłeczek <michal(at)kleczek(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Draft LIMIT pushdown to Append and MergeAppend patch
Date: 2023-10-09 01:49:59
Message-ID: CAKU4AWpuMbw6-Uhxb_kd11iaX=44f5UXTkrm+7XsqbmgxR+dEA@mail.gmail.com
Lists: pgsql-hackers
On Mon, Oct 9, 2023 at 8:52 AM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> On Sun, 8 Oct 2023 at 18:32, Michał Kłeczek <michal(at)kleczek(dot)org> wrote:
> > On 8 Oct 2023, at 03:33, Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com> wrote:
> >> For the patches for performance improvement, it is better to provide
> >> an example to show how much benefits we can get. As for this case,
> >> I'm doubtful it can work as an improvement.
>
> > Could you elaborate a little why you think it won’t work as an
> improvement?
> > Is it because in practice LIMIT _is_ pushed down already during
> execution?
> > From what I understand postgres_fdw does indeed fetch on demand.
> > OTOH pushing down LIMIT is considered an improvement (as witnessed in
> the postgres_fdw code itself after d50d172e51)
>
> In my opinion, analysis of where we can push LIMIT node deeper into
> the plan is an interesting area for research and work.
>
Thanks for your feedback, which is always professional and patient,
just as you have helped me in so many areas over the last N years.
Yes, I think "analysis of where we can .." is the key point. SQL is
a complex area because it is ever-changing, so providing an
example is pretty helpful for communication. However,
"doubtful" might not have been a good word choice :(
>
> The Append / MergeAppend case is certainly one place where pushing
> LIMIT nodes down could be beneficial. Of course, if there was some
> filter or join or aggregation/distinct, etc that occurred after the
> Append/MergeAppend then you'd not be able to do this as the pushed
> limit might be met before we've got enough rows at the top level of
> the plan, and that could result in fewer rows being output than
> what was requested in the query (aka wrong results).
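To make the unsafe case concrete, here is a sketch (using the rp
setup David shows further down; the filter is my own addition):

```sql
-- A filter above the Append makes per-child LIMIT pushdown unsafe:
select * from rp where a % 10 = 0 order by a limit 10;

-- If we naively pushed LIMIT 10 into each child:
(select * from rp1 order by a limit 10) union all
(select * from rp2 order by a limit 10);
-- each child hands back its 10 smallest rows *before* the filter
-- runs, so only about one row per child survives the filter and
-- the query returns fewer rows than the 10 that were requested.
```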
I don't totally agree with this; my main idea comes from Tom's reply
at [1]. The best situation is that we know we should "plan for"
top-N rows without actually adding any execution overhead,
and to my current knowledge we have already achieved this in a
good number of cases. If there is an area that is missed and can be
shown with an example, I would be pretty happy to change my mind.
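For what it's worth, with btree indexes on the partition key the
planner already produces a fractional-cost plan of this shape, with
no explicit LIMIT node below the Append (a sketch using David's rp
setup below; the exact plan shape may vary):

```sql
create index on rp1 (a);
create index on rp2 (a);
explain (costs off) select * from rp order by a limit 10;
-- roughly:
--  Limit
--    ->  Merge Append
--          Sort Key: rp.a
--          ->  Index Only Scan using rp1_a_idx on rp1
--          ->  Index Only Scan using rp2_a_idx on rp2
-- each child scan stops after producing just enough rows.
```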
> Andy was working
> around this area recently (see [1] and corresponding thread). He had
> to add a bunch of code that checked for operations that might mean
> we'd need to read more than the tuple_fraction rows from the Append
> node. If we had nice way to know when building base rel paths if
> there were or were not upper-level operations that affect if LIMIT
> pushing can be done, then that might make such a patch cleaner. Andy
> in one of his proposed patches [2] added a field to PlannerInfo to
> mark this. That field wasn't well named or documented, so anything
> you did would have to be an improvement on that.
>
> Looking at your patch, I see you've solved this by delaying the
> pushing down until the upper planner and just checking if the lower
> planner (join search) produced an Append or MergeAppend path. I've not
> really developed an opinion yet what's the best method. I feel
> creating the correct paths up-front is likely more flexible and more
> true to the path costing code.
> It might also be worth taking a step backwards and seeing if there are
> any other cases where we could push down LIMITs and try to see if
> there's something more generic that could be built to do this in a way
> that's beneficial to more cases. I can't think of any off the top of
> my head, but I've not thought very hard about it.
>
> FWIW, it's trivial to mock up the possible benefits of pushing LIMIT
> nodes down with something like the following:
>
> create table rp (a int) partition by range (a);
> create table rp1 partition of rp for values from (0) to (1000000);
> create table rp2 partition of rp for values from (1000000) to (2000000);
> insert into rp select x from generate_series(0, 1999999)x;
>
> -- limit not pushed:
> explain analyze select * from rp order by a limit 10;
> Execution Time: 148.041 ms
>
> -- limit pushed (mocked):
> explain analyze (select * from rp1 order by a limit 10) union all
> (select * from rp2 order by a limit 10) limit 10;
> Execution Time: 49.263 ms
>
about 3x faster for this case.
>
> However, it may also be worth you reading over [3] and the ultimate
> reason I changed my mind on that being a good idea. Pushing LIMITs
> below an Append seems quite incomplete when we don't yet push sorts
> below Appends, which is what that patch did. I just was not
> comfortable proceeding with [3] as nodeSort.c holds onto the tuplesort
> until executor shutdown. That'll be done for rescan reasons, but it
> does mean if you pushed Sort below Append that we could have a very
> large number of sort nodes holding onto work_mem all at once. I find
> that a bit scary, especially so given the excessive partitioning cases
> I've seen and worked on recently. I did consider if we maybe could
> adjust nodeSort.c to do tuplesort_end() after the final row. We'd need
> to only do that if we could be somehow certain there were going to be
> no rescans. I don't have a plan on how that would be detected.
>
That patch looks interesting, and the example there should not be
uncommon in real user cases. I'll see if I can do anything useful.
[1] https://www.postgresql.org/message-id/11228.1118365833@sss.pgh.pa.us
--
Best Regards
Andy Fan