From: | Surafel Temesgen <surafel3000(at)gmail(dot)com> |
---|---|
To: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> |
Cc: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, vik(dot)fearing(at)2ndquadrant(dot)com, Mark Dilger <hornschnorter(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, andrew(at)tao11(dot)riddles(dot)org(dot)uk |
Subject: | Re: FETCH FIRST clause PERCENT option |
Date: | 2019-03-01 12:47:43 |
Message-ID: | CALAY4q950E1veZENqCB9akHqK8KbC0KkCjOXVWRZaa0-ku8R7w@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Mar 1, 2019 at 4:33 AM Kyotaro HORIGUCHI <
horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> Hello.
>
> At Thu, 28 Feb 2019 21:16:25 +0100, Tomas Vondra <
> tomas(dot)vondra(at)2ndquadrant(dot)com> wrote in <
> fbd08ad3-5dd8-3169-6cba-38d610d7be7f(at)2ndquadrant(dot)com>
> > > One biggest issue seems to be we don't know the total number of
>
> # One *of* the biggest *issues*?
>
> > > outer tuples before actually reading a null tuple. I doubt of
> > > general shortcut for that. It also seems preventing limit node
> > > from just using materialized outer.
> > >
> >
> > Sure, if you actually want all tuples, you'll have to execute the outer
> > plan till completion. But that's not what I'm talking about - what if we
> > only ever need to read one row from the limit?
>
> We have no choice than once reading all tuples just to find we
> are to return just one row, since estimator is not guaranteed to
> be exact as required for this purpose.
>
> > To give you a (admittedly, somewhat contrived and artificial example):
> >
> > SELECT * FROM t1 WHERE id IN (
> > SELECT id FROM t2 ORDER BY x FETCH FIRST 10 PERCENT ROWS ONLY
> > );
> >
> > Maybe this example is bogus and/or does not really matter in practice. I
> > don't know, but I've been unable to convince myself that's the case.
>
> I see such kind of idiom common. Even in the quite simple example
> above, *we* cannot tell how many tuples the inner should return
> unless we actually fetch all tuples in t2. This is the same
> problem with count(*).
>
> The query is equivalent to the folloing one.
>
> SELECT * FROM t1 WHERE id IN (
> SELECT id FROM t2 ORDER BY x
> FETCH FIRST (SELECT ceil(count(*) * 0.1) FROM t2) ROWS ONLY
> );
>
> This scans t2 twice, but this patch does only one full scan
> moving another partial scan to tuplestore. We would win if the
> outer is complex enough.
>
Okay here is the previous implementation with uptread review comment
included and it also consider OFFSET clause in percentage calculation
regards
Surafel
Attachment | Content-Type | Size |
---|---|---|
fetch-with-percent-v9.patch | text/x-patch | 40.7 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | David Rowley | 2019-03-01 12:53:03 | Re: NOT IN subquery optimization |
Previous Message | Etsuro Fujita | 2019-03-01 12:42:47 | Re: Problems with plan estimates in postgres_fdw |