From: | Peter Geoghegan <pg(at)bowt(dot)ie> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Georgios <gkokolatos(at)protonmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Konstantin Knizhnik <knizhnik(at)garret(dot)ru>, Dilip Kumar <dilipbalaut(at)gmail(dot)com> |
Subject: | Re: index prefetching |
Date: | 2024-02-15 20:30:06 |
Message-ID: | CAH2-WzkToUXuqfktW3CPoKq3odtNChkFwQFWHARz=n-h_Zm2Kw@mail.gmail.com |
Lists: | pgsql-hackers |
On Thu, Feb 15, 2024 at 3:13 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > This is why I don't think that the tuples with lower page offset
> > numbers are in any way significant here. The significant part is
> > whether or not you'll actually need to visit more than one leaf page
> > in the first place (plus the penalty from not being able to reorder
> > the work across page boundaries in your initial v1 of prefetching).
>
> To me your phrasing just seems to reformulate the issue.
What I said to Tomas seems very obvious to me. I think that there
might have been some kind of miscommunication (not a real
disagreement). I was just trying to work through that.
> In practical terms you'll have to wait for the full IO latency when fetching
> the table tuple corresponding to the first tid on a leaf page. Of course
> that's also the moment you had to visit another leaf page. Whether the stall
> is due to visiting another leaf page or due to processing the first entry on such
> a leaf page is a distinction without a difference.
I don't think anybody said otherwise?
> > > That's certainly true / helpful, and it makes the "first entry" issue
> > > much less common. But the issue is still there. Of course, this says
> > > nothing about the importance of the issue - the impact may easily be so
> > > small it's not worth worrying about.
> >
> > Right. And I want to be clear: I'm really *not* sure how much it
> > matters. I just doubt that it's worth worrying about in v1 -- time
> > grows short. Although I agree that we should commit a v1 that leaves
> > the door open to improving matters in this area in v2.
>
> I somewhat doubt that it's realistic to aim for 17 at this point.
That's a fair point. Tomas?
> We seem to
> still be doing fairly fundamental architectural work. I think it might be the
> right thing even for 18 to go for the simpler only-a-single-leaf-page
> approach though.
I definitely think it's a good idea to have that as a fallback
option. And to not commit ourselves to having something better than
that for v1 (though we probably should commit to making that possible
in v2).
> I wonder if there are prerequisites that can be tackled for 17. One idea is to
> work on infrastructure to provide executor nodes with information about the
> number of tuples likely to be fetched - I suspect we'll trigger regressions
> without that in place.
I don't think that there'll be regressions if we just take the simpler
only-a-single-leaf-page approach. At least it seems much less likely.
> One way to *sometimes* process more than a single leaf page, without having to
> redesign kill_prior_tuple, would be to use the visibilitymap to check if the
> target pages are all-visible. If all the table pages on a leaf page are
> all-visible, we know that we don't need to kill index entries, and thus can
> move on to the next leaf page.
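The visibility map idea quoted above could be modelled roughly as follows. This is only an illustrative sketch in Python, not PostgreSQL code: `Tid`, `leaf_page_is_all_visible`, and `leaf_pages_safe_to_leave_behind` are hypothetical names, the visibility map is stood in for by a plain set of all-visible heap block numbers, and concurrent clearing of visibility map bits is ignored.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Tid:
    """Hypothetical stand-in for a heap TID stored in an index leaf page."""
    block: int   # heap (table) block number
    offset: int  # line pointer offset within that block


def leaf_page_is_all_visible(tids: Iterable[Tid],
                             all_visible_blocks: Set[int]) -> bool:
    """True if every heap block referenced from this leaf page is all-visible.

    `all_visible_blocks` is an assumed stand-in for the visibility map.
    """
    return all(tid.block in all_visible_blocks for tid in tids)


def leaf_pages_safe_to_leave_behind(leaf_pages: List[List[Tid]],
                                    all_visible_blocks: Set[int]) -> int:
    """Count the leading leaf pages that need no index-entry killing.

    The scan stops at the first leaf page that references a heap block
    which is not all-visible, since that page might still have entries
    to kill once its heap tuples have been examined.
    """
    count = 0
    for tids in leaf_pages:
        if not leaf_page_is_all_visible(tids, all_visible_blocks):
            break
        count += 1
    return count


leaf_pages = [
    [Tid(1, 1), Tid(2, 1)],  # only all-visible heap blocks
    [Tid(2, 3), Tid(5, 1)],  # block 5 is not all-visible
    [Tid(1, 2)],
]
print(leaf_pages_safe_to_leave_behind(leaf_pages, {1, 2}))
```

In this model only the first leaf page can be left behind without tracking it for later killing; the second one stops the read-ahead because one of its heap blocks is not all-visible.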
It's possible that we'll need a variety of different strategies.
nbtree already has two such strategies in _bt_killitems(), in a way.
Though its "Modified while not pinned means hinting is not safe" path
(LSN doesn't match canary value path) seems pretty naive. The
prefetching stuff might present us with a good opportunity to replace
that with something fundamentally better.
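
To make the two existing strategies concrete, here is a rough model of the decision as I understand it. It is an illustrative Python sketch rather than the actual C code: `ScanPosition` and `hinting_is_safe` are made-up names, and the page LSN is reduced to a plain integer.

```python
from dataclasses import dataclass


@dataclass
class ScanPosition:
    """Hypothetical stand-in for the scan's remembered leaf page state."""
    pin_held: bool      # was the buffer pin kept since the page was read?
    lsn_when_read: int  # page LSN saved when the page was read


def hinting_is_safe(pos: ScanPosition, current_page_lsn: int) -> bool:
    """Decide whether index entries on the leaf page may be marked dead.

    Strategy 1: the pin was held the whole time, so hinting is allowed.
    Strategy 2: the pin was dropped, so hinting is allowed only if the
    page LSN still equals the saved value. Any change to the page since
    it was read, related or not, makes this check fail.
    """
    if pos.pin_held:
        return True
    return current_page_lsn == pos.lsn_when_read


print(hinting_is_safe(ScanPosition(pin_held=True, lsn_when_read=100), 250))
print(hinting_is_safe(ScanPosition(pin_held=False, lsn_when_read=100), 100))
print(hinting_is_safe(ScanPosition(pin_held=False, lsn_when_read=100), 101))
```

The last case is the naive part: a single unrelated change to the page is enough to give up on hinting for every item on it.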
--
Peter Geoghegan