Quick Links

Re: [HACKERS] Clock with Adaptive Replacement

From:	Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To:	Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
Cc:	Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [HACKERS] Clock with Adaptive Replacement
Date:	2018-04-23 21:41:16
Message-ID:	CAEepm=3p6TmW2oUFyp8zRnYVum1G72fwToDLRbv9kK_V1DU9xQ@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Fri, Feb 12, 2016 at 10:02 AM, Konstantin Knizhnik
<k(dot)knizhnik(at)postgrespro(dot)ru> wrote:
> Hi hackers,
>
> What do you think about improving cache replacement clock-sweep algorithm in
> PostgreSQL with adaptive version proposed in this article:
>
> http://www-cs.stanford.edu/~sbansal/pubs/fast04.pdf
>
> Are there some well known drawbacks of this approach or it will be
> interesting to adopt this algorithm to PostgreSQL and measure it impact om
> performance under different workloads?

I'm not currently planning to work in this area and have done no real
investigation, so please consider the following to be "water cooler
talk".

I also came across that paper while reading about buffering as general
background. Yeah, CAR[T] looks pretty interesting at first glance.
Automatic scan resistance seems like something we'd want, its approach
to frequency sounds smarter than GCLOCK's usage counter with arbitrary
parameter 5. Here's a nice slide deck from the authors:

http://www.cse.iitd.ernet.in/~sbansal/pubs/fast04ppt.pdf

There doesn't seem to be as much written about GCLOCK beyond the 1992
TPC-A paper[1], possibly because as the CAR paper says "[a]
fundamental disadvantage of GCLOCK is that it requires counter
increment on every page hit which makes it infeasible for virtual
memory", making it less interesting to the OS guys. That is, we
actually have slightly more information, because we have an
opportunity to bump the usage counter when pinning and the VM guys
don't, probably explaining why the paper compares with CLOCK and not
GCLOCK.

One question is whether SPC-1 looks anything like a typical database
workload. Say someone wrote a prototype CAR[T] patch for PostgreSQL,
how would you validate it?

I'm also curious about how the algorithms at different levels interact
when using buffered IO. While other databases typically do direct IO,
you can still find a trail of papers about this topic since disk
arrays and network-attached storage systems also have caches and page
replacement problems[2][3], and of course multi-level caching problems
apply to non-database applications too. I wonder to what extent
Linux's use of "a mash of a number of different algorithms with a
number of modifications for catching corner cases and various
optimisations"[4] affects the performance of different clock-based
algorithms operating higher up.

If we had a page replacement algorithm with CART's magical claimed
properties and it really does perform better than GCLOCK + our scan
resistance special cases, I wonder if it would be tempting to stick
SLRU contents into shared buffers. slru.c could gain a
smgr-compatible interface so that bufmgr.c could treat those buffers
like any other, using smgr_which to select the appropriate storage
manager. (I'm going to be proposing something similar for undo log
storage, more on that later.)

[1] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.452.9699&rep=rep1&type=pdf
[2] https://www.usenix.org/legacy/event/usenix01/full_papers/zhou/zhou.pdf
[3] https://www.usenix.org/legacy/event/usenix02/full_papers/wong/wong_html/
[4] http://www.spinics.net/lists/linux-mm/msg13385.html

--
Thomas Munro
http://www.enterprisedb.com

In response to

Clock with Adaptive Replacement at 2016-02-11 21:02:53 from Konstantin Knizhnik

Responses

Re: [HACKERS] Clock with Adaptive Replacement at 2018-04-24 05:04:22 from Andrey Borodin

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Jonathan S. Katz	2018-04-23 22:28:58	Re: Boolean partitions syntax
Previous Message	David Rowley	2018-04-23 21:23:57	Re: bms_prev_member won't work correctly if bitmapword is 64-bits