From: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
---|---|
To: | Surafel Temesgen <surafel3000(at)gmail(dot)com> |
Cc: | vik(dot)fearing(at)2ndquadrant(dot)com, Mark Dilger <hornschnorter(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, andrew(at)tao11(dot)riddles(dot)org(dot)uk |
Subject: | Re: FETCH FIRST clause PERCENT option |
Date: | 2019-01-04 14:27:38 |
Message-ID: | b6a3bdfa-d5c8-534c-c1e8-fdabee061b04@2ndquadrant.com |
Lists: | pgsql-hackers |
On 1/4/19 7:40 AM, Surafel Temesgen wrote:
>
>
> On Thu, Jan 3, 2019 at 4:51 PM Tomas Vondra
> <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>
>
> On 1/3/19 1:00 PM, Surafel Temesgen wrote:
> > Hi
> >
> > On Tue, Jan 1, 2019 at 10:08 PM Tomas Vondra
> > <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>
> > The execution part of the patch seems to be working correctly, but I
> > think there's an improvement - we don't need to execute the outer plan
> > to completion before emitting the first row. For example, let's say the
> > outer plan produces 10000 rows in total and we're supposed to return the
> > first 1% of those rows. We can emit the first row after fetching the
> > first 100 rows, we don't have to wait for fetching all 10k rows.
> >
> >
> > but total rows count is not given how can we determine safe to
> > return row
> >
>
> But you know how many rows were fetched from the outer plan, and this
> number only grows. So the number of rows returned by FETCH FIRST
> ... PERCENT also only grows. For example with 10% of rows, you know that
> once you reach 100 rows you should emit ~10 rows, with 200 rows you know
> you should emit ~20 rows, etc. So you may track how many rows we're
> supposed to return / returned so far, and emit them early.
>
>
>
>
> yes that is clear but i don't find it easy to put that in formula. may
> be someone with good mathematics will help
>
What formula? All the math remains exactly the same; you just need to
update the number of rows to return and track how many rows have already
been returned.
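
To make the arithmetic concrete, here is a minimal sketch of the running
count. The helper name rows_due() is hypothetical (it is not from the
patch), and rounding down is an assumption chosen to match the "100 rows
fetched, ~10 rows emitted" example above; the patch may round differently.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: number of rows that may be
 * returned once `fetched` rows have been read from the subplan.
 * Rounding down is an assumption made for this sketch.
 */
static int64_t
rows_due(double percent, int64_t fetched)
{
    assert(percent >= 0.0 && percent <= 100.0);
    assert(fetched >= 0);

    /* Non-decreasing in `fetched`, so rows emitted early stay valid. */
    return (int64_t) (percent * (double) fetched / 100.0);
}
```

Because the value never decreases as more rows are fetched, any row emitted
early is still part of the final result.
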
I haven't tried doing that, but AFAICS you'd need to tweak how/when
node->count is computed - instead of computing it only once it needs to
be updated after fetching each row from the subplan.
Furthermore, you'll need to stash the subplan rows somewhere (into a
tuplestore probably), and whenever the node->count value increments,
you'll need to grab a row from the tuplestore and return that (i.e.
tweak node->position and set node->subSlot).
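
A rough standalone sketch of that flow, outside the executor, might look
like the code below. Everything in it is an assumption for illustration:
the PercentLimit struct and the pl_* functions are hypothetical, a plain
int array stands in for the tuplestore, rows are ints rather than tuple
slots, and the count is rounded down. It only shows the bookkeeping
(recompute count after each fetched row, return a stashed row whenever
position falls behind count).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy stand-in for the Limit node state; all names are hypothetical. */
typedef struct PercentLimit
{
    double   percent;   /* requested percentage, 0..100 */
    int64_t  fetched;   /* rows read from the subplan so far */
    int64_t  count;     /* rows we are allowed to return so far */
    int64_t  position;  /* rows already returned */
    int     *store;     /* stand-in for a tuplestore, in arrival order */
    int64_t  capacity;  /* allocated slots in store */
} PercentLimit;

static void
pl_init(PercentLimit *pl, double percent)
{
    assert(percent >= 0.0 && percent <= 100.0);
    pl->percent = percent;
    pl->fetched = pl->count = pl->position = 0;
    pl->store = NULL;
    pl->capacity = 0;
}

/* Stash one subplan row, then recompute count (rounding down: assumption). */
static void
pl_put(PercentLimit *pl, int row)
{
    if (pl->fetched == pl->capacity)
    {
        pl->capacity = pl->capacity ? pl->capacity * 2 : 16;
        pl->store = realloc(pl->store, (size_t) pl->capacity * sizeof(int));
        assert(pl->store != NULL);
    }
    pl->store[pl->fetched++] = row;
    pl->count = (int64_t) (pl->percent * (double) pl->fetched / 100.0);
}

/* Return 1 and set *row if a stashed row is due; 0 means "fetch more". */
static int
pl_get(PercentLimit *pl, int *row)
{
    if (pl->position >= pl->count)
        return 0;
    *row = pl->store[pl->position++];
    return 1;
}
```

With 10 percent, the first call to pl_get() that succeeds comes right after
the tenth pl_put(). This toy version keeps every stashed row around; it
makes no attempt to discard rows that have already been returned.
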
I hope that makes sense. The one thing I'm not quite sure about is
whether tuplestore allows adding and getting rows at the same time.
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services