Re: Direct I/O

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Dagfinn Ilmari Mannsåker <ilmari(at)ilmari(dot)org>, Christoph Berg <myon(at)debian(dot)org>, mikael(dot)kjellstrom(at)gmail(dot)com, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Direct I/O
Date: 2023-08-22 20:48:50
Message-ID: CAH2-WznqRDgg8=cGBrnsB-ymOf-6qVeb7JYSu+dBsGu7AYThbQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Apr 19, 2023 at 10:43 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> I don't think the "evict on every buffer access" works quite that way - unless
> you have a completely even access pattern, buffer access frequency will
> increase usage count much more frequently on some buffers than others. And if
> you have a completely even access pattern, it's hard to come up with a good
> cache replacement algorithm...

My guess is that the most immediate problem in this area is the
problem of "correlated references" (to use the term from the LRU-K
paper). I gave an example of that here:

https://postgr.es/m/CAH2-Wzk7T9K3d9_NY+jEXp2qQGMYoP=gZMoR8q1Cv57SxAw1OA@mail.gmail.com

In other words, I think that the most immediate problem may in fact be
the tendency of usage_count to get incremented multiple times in
response to what is (for all intents and purposes) the same logical
page access. Even if it's not as important as I imagine, it still
seems likely that verifying that our input information isn't garbage
is the logical place to begin work in this general area. It's
difficult to be sure about that because it's so hard to look at just
one problem in isolation. I suspect that you were right to point out
that a larger shared_buffers tends to behave quite differently from a
smaller one. That same factor is going to complicate any analysis of
the specific problem that I've highlighted (to say nothing of the way
that contention complicates the picture).

There is an interesting paper that compared the hit rates seen for
TPC-C to TPC-E on relatively modern hardware:

https://www.cs.cmu.edu/~chensm/papers/TPCE-sigmod-record10.pdf

It concludes that, past a certain buffer pool size, the buffer misses
for each workload look rather similar: both workloads have cache
misses that seem essentially random. The access patterns may be very
different, but that doesn't necessarily have any visible effect on
buffer misses, at least provided you make certain modest assumptions
about buffer pool size relative to working set size.

The most sophisticated cache management algorithms (like ARC) work by
maintaining metadata about recently evicted buffers, which is used to
decide whether to favor recency over frequency. If you work backwards
from that, it follows that having cache misses that look completely
random is desirable, and perhaps even something to work towards. What
you really don't want is a situation where the same small minority of
pages keeps getting ping-ponged into and out of the buffer pool
without ever settling, even though the buffer cache is large enough
that settling is possible in principle. That pathological profile is
the furthest possible thing from random.

With smaller shared_buffers, it's perhaps inevitable that buffer cache
misses are random, and so I'd expect that managing the problem of
contention will tend to matter most. With larger shared_buffers it
isn't inevitable at all, so the quality of the cache eviction scheme
is likely to matter quite a bit more.

--
Peter Geoghegan
