From: | Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com> |
---|---|
To: | Mitsuru IWASAKI <iwasaki(at)jp(dot)freebsd(dot)org> |
Cc: | pgsql-hackers(at)postgresql(dot)org, jeff(dot)janes(at)gmail(dot)com |
Subject: | Re: patch for new feature: Buffer Cache Hibernation |
Date: | 2011-05-05 11:35:52 |
Message-ID: | BANLkTikc81tQqKv_yuMsD+UnQxMKvuTUgA@mail.gmail.com |
Lists: | pgsql-hackers |
2011/5/5 Mitsuru IWASAKI <iwasaki(at)jp(dot)freebsd(dot)org>:
> Hi,
>
>> I think that PgFincore (http://pgfoundry.org/projects/pgfincore/)
>> provides similar functionality. Are you familiar with that? If so,
>> could you contrast your approach with that one?
>
> Sorry, I'm not familiar with PgFincore at all, but I got the source code
> and documents and read through them just now.
> # and I'm a novice on postgres actually...
> Both aim to reduce physical I/O, but their approaches and
> gains are different.
> My understanding is as follows:
>
> +---------------------+     +---------------------+
> | Postgres(backend)   |     | Postgres            |
> | +-----------------+ |     |                     |
> | | DB Buffer Cache | |     |                     |
> | | (shared buffers)| |     |                     |
> | |*my target       | |     |                     |
> | +-----------------+ |     |                     |
> |   ^      ^          |     |                     |
> |   |      |          |     |                     |
> |   v      v          |     |                     |
> | +-----------------+ |     | +-----------------+ |
> | | buffer manager  | |     | |    pgfincore    | |
> | +-----------------+ |     | +-----------------+ |
> +---^------^----------+     +----------^----------+
>     |      |smgrread()                 |posix_fadvise()
>     |read()|                           |                  userland
> ==================================================================
>     |      |                           |                  kernel
>     |      +-------------+-------------+
>     |                    |
>     |                    v
>     |       +------------------------+
>     |       | File System            |
>     |       |  +-----------------+   |
>     +------>|  | FS Buffer Cache |   |
>             |  |*PgFincore target|   |
>             |  +-----------------+   |
>             |    ^       ^           |
>             +----|-------|-----------+
>                  |       |
> ==================================================================
>                  |       |                                hardware
>        +---------|-------|----------------+
>        |         |       v  Physical Disk |
>        |         |  +------------------+  |
>        |         |  | base/16384/24598 |  |
>        |         v  +------------------+  |
>        | +------------------------------+ |
>        | |Buffer Cache Hibernation Files| |
>        | +------------------------------+ |
>        +----------------------------------+
>
A small detail: pgfincore stores its data per relation in a file, as you do.
I have rewritten that part a bit, and it will store its data directly in
PostgreSQL tables; it will also be able to restore the cache
from a raw bitstring.
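
A minimal sketch of what a per-relation bitstring could look like, assuming one bit per block with "1" meaning the block was in cache when the snapshot was taken. The function names and the text encoding are hypothetical and are not taken from pgfincore.

```python
def blocks_to_bitstring(cached_blocks, total_blocks):
    """Encode cached block numbers as a '0'/'1' string, one character per block."""
    cached = set(cached_blocks)
    return "".join("1" if blkno in cached else "0" for blkno in range(total_blocks))


def bitstring_to_blocks(bits):
    """Decode a '0'/'1' string back into the sorted list of cached block numbers."""
    return [blkno for blkno, bit in enumerate(bits) if bit == "1"]


# Hypothetical relation of 6 blocks with blocks 0, 2 and 3 cached.
snapshot = blocks_to_bitstring([0, 2, 3], total_blocks=6)
restored = bitstring_to_blocks(snapshot)
```

A string like this fits in an ordinary table column, which is what makes it possible to ship the snapshot to another server.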
> In summary, PgFincore's target is File System Buffer Cache, Buffer
> Cache Hibernation's target is DB Buffer Cache(shared buffers).
Correct. (By the way, I am very happy with your idea and that you found the time to do it.)
>
> PgFincore tries to preload database files with posix_fadvise() into the
> File System Buffer Cache, not into the DB Buffer Cache (shared buffers).
> On query execution, the buffer manager will get DB buffer blocks with
> smgrread() from the file system unless the necessary blocks exist in the
> DB Buffer Cache. At this point, physical reads may not happen because part
> of (or the entire) database file is already loaded into the FS Buffer Cache.
>
> The gain depends on the file system, especially the size of the File System
> Buffer Cache.
> In short, preloading a database file is equivalent to the following command.
> $ cat base/16384/24598 > /dev/null
Not exactly.
There are 2 calls:
* pgfadv_WILLNEED
* pgfadv_WILLNEED_snapshot
The former asks the kernel to load each segment of a relation, *but* the
kernel can decide not to do that, or to load only part of each segment (so
it is not as brutal as cat file > /dev/null).
The latter reads *exactly* the blocks required in each segment, not all
blocks, unless all of them were in cache when the snapshot was taken (this
one is part of the snapshot/restore combo).
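
The difference between the two calls can be sketched as follows. This is an illustration under assumptions, not pgfincore's code: the function names are hypothetical, the 8192-byte block size is PostgreSQL's default, and a plain positional read stands in for whatever the snapshot variant does per block.

```python
import os

BLOCK_SIZE = 8192  # assumption: PostgreSQL's default block size


def advise_willneed(path):
    """Hint that the whole file will be needed; the kernel is free to ignore it."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # posix_fadvise is not available on every platform, so guard the call.
        if hasattr(os, "posix_fadvise"):
            # offset 0 and length 0 mean "from the start to the end of the file".
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_WILLNEED)
    finally:
        os.close(fd)


def load_snapshot_blocks(path, block_numbers, block_size=BLOCK_SIZE):
    """Read exactly the listed blocks and return the number of bytes read."""
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        for blkno in block_numbers:
            total += len(os.pread(fd, block_size, blkno * block_size))
    finally:
        os.close(fd)
    return total
```

The first function leaves the decision to the kernel, while the second touches only the blocks named in the snapshot, which is why neither behaves like reading the whole file with cat.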
>
> I think PgFincore is good for data warehouse in applications.
Pgfincore with bitstring storage in a table allows streaming to
hot standbys and gives a better response in case of switch-over/fail-over,
by doing some housekeeping on the hot standby and keeping it really hot
;)
Even web applications have large databases today...
(There is more, but that is not the subject here.)
>
>
> Buffer Cache Hibernation, my approach, is simpler and more straightforward.
> It tries to save/load the contents of the DB Buffer Cache (shared buffers)
> using regular files (called Buffer Cache Hibernation Files).
> At startup, the buffer manager will load DB buffer blocks into the DB Buffer
> Cache from the Buffer Cache Hibernation Files which were saved at the last
> shutdown. Note that the database file will not be read, so it is not
> cached in the File System Buffer Cache at all. Only the contents of the DB
> Buffer Cache are filled. Therefore, the DB buffer cache miss penalty would
> be larger than PgFincore's.
>
> The gain depends on the size of shared buffers, and how often
> similar queries are executed before and after restarting.
>
> Buffer Cache Hibernation is good for OLTP in applications.
It is also very helpful for debugging and analysis purposes, IIUC.
I may prefer a per-relation approach (so you can snapshot and
restore only the interesting tables/indexes). Given what I read in your
patch, it looks easy to do, doesn't it?
I also prefer the idea of keeping a map of the Buffer Cache (yes, like
what I do with pgfincore) rather than storing the data directly and reading
it back directly. This latter part seems a bit dangerous to me, even if it
looks sane for a normal PostgreSQL stop/start process.
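
A toy sketch of the map idea, assuming the cache can be modelled as a dictionary keyed by (relation, block number). All names here are hypothetical. The snapshot keeps only the keys, optionally filtered per relation, and the restore re-reads every block from current storage, so a page image that went stale after the snapshot is never loaded. That is one possible reading of the concern, not a statement about the patch.

```python
def snapshot_map(buffer_cache, relations=None):
    """Record which (relation, block) pairs are cached, not their contents."""
    return sorted(
        (rel, blkno)
        for (rel, blkno) in buffer_cache
        if relations is None or rel in relations
    )


def restore_from_map(block_map, read_block):
    """Rebuild a cache by re-reading each listed block through read_block."""
    return {key: read_block(*key) for key in block_map}


# Hypothetical storage with two relations; the cache starts as a full copy.
storage = {("orders", 0): b"v1", ("orders", 1): b"v1", ("logs", 0): b"v1"}
cache = dict(storage)

saved = snapshot_map(cache, relations={"orders"})
storage[("orders", 1)] = b"v2"  # the block changes after the snapshot
restored = restore_from_map(saved, lambda rel, blkno: storage[(rel, blkno)])
```

The price of the map is that the restore has to read the blocks again from the relation files instead of loading them from one saved image.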
>
>
> I think that PgFincore and Buffer Cache Hibernation are not mutually
> exclusive; they can work together at different caching levels.
Yes.
>
>
>
> Sorry for my poor english skill, but I'm doing my best :)
Better than mine, and anyway your patch remains very easy to read in any case.
>
> Thanks
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
--
Cédric Villemain 2ndQuadrant
http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support