From: | Dmitry Dolgov <9erthalion6(at)gmail(dot)com> |
---|---|
To: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
Cc: | Floris Van Nee <florisvannee(at)optiver(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "jesper(dot)pedersen(at)redhat(dot)com" <jesper(dot)pedersen(at)redhat(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, James Coleman <jtc331(at)gmail(dot)com>, Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Bhushan Uparkar <bhushan(dot)uparkar(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru> |
Subject: | Re: Index Skip Scan |
Date: | 2019-09-05 19:20:06 |
Message-ID: | CA+q6zcXO-XzM2Be7ZX8SZf7Xr-Hw5gMRXrtiMkUoivpF8J-9DA@mail.gmail.com |
Lists: | pgsql-hackers |
> On Mon, Sep 2, 2019 at 3:28 PM Dmitry Dolgov <9erthalion6(at)gmail(dot)com> wrote:
>
> > On Wed, Aug 28, 2019 at 9:32 PM Floris Van Nee <florisvannee(at)optiver(dot)com> wrote:
> >
> > I'm afraid I did manage to find another incorrect query result though
>
> Yes, it's an example of what I was mentioning before, that the current modified
> implementation of `_bt_readpage` wouldn't work well in case of going between
> pages. So far it seems that the only problem we can have is when previous and
> next items located on a different pages. I've checked how this issue can be
> avoided, I hope I will post a new version relatively soon.
Here is a version in which stepping between pages works better. It seems
sufficient to fix the case you mentioned, but it requires propagating the
keepPrev logic through `_bt_steppage` & `_bt_readnextpage`, and I can't say I
like this solution. I have an idea that it may be simpler to teach the code
that runs after index_skip not to call `_bt_next` immediately after a skip has
happened. That should immediately eliminate several hacks from index skip
itself, so I'll try to pursue this idea.
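To make the idea concrete, here is a minimal toy sketch in C. It is not taken from the patch and does not use any PostgreSQL API: `ToyScan`, `toy_first`, `toy_next`, `toy_skip`, `toy_getnext` and `toy_collect_distinct` are all hypothetical names, and a sorted int array stands in for the index. The only point it illustrates is that when a skip leaves the cursor on the item to return, the following fetch must not advance again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy cursor over a sorted array; NOT PostgreSQL code. */
typedef struct ToyScan
{
    const int  *keys;          /* sorted keys, stands in for the index */
    size_t      nkeys;
    size_t      pos;           /* index of the current item */
    bool        started;       /* has the scan been positioned yet? */
    bool        just_skipped;  /* was the last successful move a skip? */
} ToyScan;

/* Position on the first item (loosely analogous to _bt_first). */
static bool
toy_first(ToyScan *s)
{
    s->pos = 0;
    s->started = true;
    s->just_skipped = false;
    return s->nkeys > 0;
}

/* Advance by exactly one item (loosely analogous to _bt_next). */
static bool
toy_next(ToyScan *s)
{
    if (s->pos + 1 >= s->nkeys)
    {
        s->pos = s->nkeys;
        return false;
    }
    s->pos++;
    return true;
}

/* Jump to the first item whose key differs from the current one.
 * On success the cursor is left ON that item and the flag is set. */
static bool
toy_skip(ToyScan *s)
{
    assert(s->pos < s->nkeys);
    int     cur = s->keys[s->pos];
    size_t  i = s->pos;

    while (i < s->nkeys && s->keys[i] == cur)
        i++;
    s->pos = i;
    s->just_skipped = (i < s->nkeys);
    return s->just_skipped;
}

/* Fetch the next item. If a skip has just happened, the cursor is already
 * on the item to return, so toy_next must not be called again. */
static bool
toy_getnext(ToyScan *s, int *out)
{
    bool    ok;

    if (!s->started)
        ok = toy_first(s);
    else if (s->just_skipped)
    {
        s->just_skipped = false;
        ok = true;
    }
    else
        ok = toy_next(s);

    if (ok)
        *out = s->keys[s->pos];
    return ok;
}

/* Collect distinct keys by alternating fetch and skip. */
static size_t
toy_collect_distinct(const int *keys, size_t nkeys, int *out, size_t cap)
{
    ToyScan s = {keys, nkeys, 0, false, false};
    size_t  count = 0;

    while (count < cap && toy_getnext(&s, &out[count]))
    {
        count++;
        if (!toy_skip(&s))
            break;
    }
    return count;
}
```

If `toy_getnext` called `toy_next` unconditionally, a key that occurs only once would be stepped over after each skip. A real implementation would of course also have to deal with page boundaries, scan direction and locking, none of which this toy models.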
> On Wed, Sep 4, 2019 at 10:45 PM Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:
Thank you for checking it out!
> Surely it isn't right to add members prefixed with "ioss_" to
> struct IndexScanState.
Yeah, sorry. I originally added IndexScan support only to show that it's
possible (with some limitations), but then forgot to clean up. Those fields
are now renamed.
> I'm surprised about this "FirstTupleEmitted" business. Wouldn't it make
> more sense to implement index_skip() to return the first tuple if the
> scan is just starting? (I know little about executor, apologies if this
> is a stupid question.)
I'm not entirely sure which part exactly you mean. Currently the first tuple is
returned by `_bt_first`; how would it help if index_skip returned it instead?
> It would be good to get more knowledgeable people to review this patch.
> It's clearly something we want, yet it's been there for a very long
> time.
Sure, that would be nice.
Attachment | Content-Type | Size |
---|---|---|
v25-0001-Index-skip-scan.patch | application/octet-stream | 85.1 KB |