From: | Radosław Smogura <rsmogura(at)softperience(dot)eu> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <gsstark(at)mit(dot)edu>, Greg Smith <greg(at)2ndquadrant(dot)com>, Joshua Berkus <josh(at)agliodbs(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: MMAP Buffers |
Date: | 2011-04-17 21:32:18 |
Message-ID: | 201104172332.18381.rsmogura@softperience.eu |
Lists: | pgsql-hackers |
Robert Haas <robertmhaas(at)gmail(dot)com> Sunday 17 April 2011 22:01:55
> On Sun, Apr 17, 2011 at 11:48 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > Radosław Smogura <rsmogura(at)softperience(dot)eu> writes:
> >> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> Sunday 17 April 2011 01:35:45
> >>
> >>> ... Huh? Are you saying that you ask the kernel to map each individual
> >>> shared buffer separately? I can't believe that's going to scale to
> >>> realistic applications.
> >>
> >> No, I do
> >> mremap(mmap_buff_A, MAP_FIXED, tmp);
> >> mremap(shared_buff_Y, MAP_FIXED, mmap_buff_A),
> >> mremap(tmp, MAP_FIXED, mmap_buff_A).
> >
> > There's no mremap() in the Single Unix Spec, nor on my ancient HPUX box,
> > nor on my quite-up-to-date OS X box. The Linux man page for it says
> > "This call is Linux-specific, and should not be used in programs
> > intended to be portable." So if the patch is dependent on that call,
> > it's dead on arrival from a portability standpoint.
> >
> > But in any case, you didn't explain how use of mremap() avoids the
> > problem of the kernel having to maintain a separate page-mapping-table
> > entry for each individual buffer. (Per process, yet.) If that's what's
> > happening, it's going to be a significant performance penalty as well as
> > (I suspect) a serious constraint on how many buffers can be managed.
>
> I share your suspicions, although no harm in measuring it.
>
> But what I don't understand is how this approach avoids the problem of
> different processes seeing different buffer contents. If backend A
> has the buffer mmap'd and backend B wants to modify it (and changes
> the mapping), backend A is still looking at the old buffer contents,
> isn't it? And then things go boom.
Each process has a simple "mirror" of the shared descriptors.
I "believe" that modifications to buffer content may only be done while holding
an exclusive lock (with some simple exceptions) (+ MVCC). Actually, I saw only
two things that can change already loaded data and cause damage, both of which
you have described: setting hint bits during a scan, and vacuum. The first, I
think, can only cause two processes to ask for the same transaction statuses
(except for vacuum); the second is impossible, as vacuum requires an exclusive
pin. When a buffer tag is changed, the version of the buffer is bumped up, and
it is checked against the local version; this concerns reading a buffer.
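The version check described above could be sketched roughly as follows. This is only an illustration and not code from the patch: the names SharedBufDesc, LocalMirror, retag_buffer and check_mirror are hypothetical, the tag is reduced to a single integer, and the point where a real implementation would redo the mapping is only marked by a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shared descriptor; in a real system it lives in shared memory. */
typedef struct SharedBufDesc {
    uint32_t tag;      /* stands in for the buffer tag (relation + block) */
    uint32_t version;  /* bumped every time the tag changes */
} SharedBufDesc;

/* Hypothetical per-process "mirror" entry for one shared descriptor. */
typedef struct LocalMirror {
    uint32_t seen_version;  /* version this process last mapped */
} LocalMirror;

/* Give the buffer a new tag and bump its version. */
void retag_buffer(SharedBufDesc *desc, uint32_t new_tag)
{
    assert(desc != NULL);
    desc->tag = new_tag;
    desc->version++;
}

/*
 * Called before reading a buffer. Returns true if the local mirror was
 * stale and had to be refreshed, false if it was already current.
 */
bool check_mirror(const SharedBufDesc *desc, LocalMirror *mirror)
{
    assert(desc != NULL && mirror != NULL);
    if (mirror->seen_version == desc->version)
        return false;
    /* A real implementation would redo the local mapping here. */
    mirror->seen_version = desc->version;
    return true;
}
```

The sketch leaves out all synchronization; in practice the comparison would have to happen under whatever lock protects the descriptor.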
In other cases, after obtaining the lock, a check is done on whether the buffer
has an associated updatable buffer and whether the local "mirror" has it too;
then the swap should take place.
The logic for updatable buffers is similar to that for "shared buffers": each
updatable buffer has a pin count, and an updatable buffer can't be freed while
someone is using it. In contrast to "normal buffers", however, updatable
buffers don't have any support for locking etc. Updatable buffers exist only on
the free list or when associated with a buffer.
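A minimal sketch of that life cycle, under stated assumptions, might look like this. The names UpdatableBuf, UpdatablePool, updatable_attach, updatable_pin and updatable_unpin are hypothetical, the pool size is arbitrary, and returning the buffer to the free list when the last pin is dropped is an assumption made for the example, not something taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define N_UPDATABLE 4  /* hypothetical pool size */

typedef struct UpdatableBuf {
    int pin_count;                   /* number of current users */
    int owner;                       /* associated buffer id, or -1 if free */
    struct UpdatableBuf *next_free;  /* link used only while on the free list */
} UpdatableBuf;

typedef struct UpdatablePool {
    UpdatableBuf bufs[N_UPDATABLE];
    UpdatableBuf *free_list;
} UpdatablePool;

/* Put every updatable buffer on the free list. */
void pool_init(UpdatablePool *pool)
{
    pool->free_list = NULL;
    for (int i = N_UPDATABLE - 1; i >= 0; i--) {
        pool->bufs[i].pin_count = 0;
        pool->bufs[i].owner = -1;
        pool->bufs[i].next_free = pool->free_list;
        pool->free_list = &pool->bufs[i];
    }
}

/* Take one off the free list and associate it with buffer_id, pinned once. */
UpdatableBuf *updatable_attach(UpdatablePool *pool, int buffer_id)
{
    UpdatableBuf *ub = pool->free_list;
    if (ub == NULL)
        return NULL;  /* no free updatable buffer */
    pool->free_list = ub->next_free;
    ub->next_free = NULL;
    ub->owner = buffer_id;
    ub->pin_count = 1;
    return ub;
}

/* Register one more user of an already associated updatable buffer. */
void updatable_pin(UpdatableBuf *ub)
{
    assert(ub->owner >= 0);
    ub->pin_count++;
}

/* Drop one pin; returns 1 if the buffer went back to the free list. */
int updatable_unpin(UpdatablePool *pool, UpdatableBuf *ub)
{
    assert(ub->pin_count > 0);
    if (--ub->pin_count > 0)
        return 0;  /* still in use, so it cannot be freed */
    ub->owner = -1;
    ub->next_free = pool->free_list;
    pool->free_list = ub;
    return 1;
}
```

As in the description, there is no locking inside the structure itself; any concurrent use would need protection from outside.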
In the future, I will change the version to a shared segment id, something like
the relation's oid + block, but the ids will have continuous numbering
1, 2, 3, ..., so I will be able to bypass smgr/md during reads, as well as the
tag version check; this looks like a faster solution.
Regards,
Radek