Re: MMAP Buffers

From: Radosław Smogura <rsmogura(at)softperience(dot)eu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <gsstark(at)mit(dot)edu>, Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Smith <greg(at)2ndquadrant(dot)com>, Joshua Berkus <josh(at)agliodbs(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MMAP Buffers
Date: 2011-04-17 17:26:31
Message-ID: 201104171926.31900.rsmogura@softperience.eu
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> Sunday 17 April 2011 17:48:56
> Radosław Smogura <rsmogura(at)softperience(dot)eu> writes:
> > Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> Sunday 17 April 2011 01:35:45
> >
> >> ... Huh? Are you saying that you ask the kernel to map each individual
> >> shared buffer separately? I can't believe that's going to scale to
> >> realistic applications.
> >
> > No, I do
> > mremap(mmap_buff_A, MAP_FIXED, temp);
> > mremap(shared_buff_Y, MAP_FIXED, mmap_buff_A),
> > mremap(tmp, MAP_FIXED, mmap_buff_A).
>
> There's no mremap() in the Single Unix Spec, nor on my ancient HPUX box,
> nor on my quite-up-to-date OS X box. The Linux man page for it says
> "This call is Linux-specific, and should not be used in programs
> intended to be portable." So if the patch is dependent on that call,
> it's dead on arrival from a portability standpoint.
Good point. This comes from the initial concept; I did it to avoid leaving
gaps in the VM address space into which a library or something else could get
mmapped. Lately I have been thinking about using mmap to replace just one VM
page.
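
The "replace just one VM page" idea can be sketched with plain mmap() and
MAP_FIXED, without mremap(). This is only a minimal illustration, not code
from the patch: the function names are hypothetical, a temporary file stands
in for a relation file, and MAP_ANONYMOUS is assumed to be available (it is
widespread, but it was not part of the Single Unix Spec at the time).

```c
#define _DEFAULT_SOURCE           /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Reserve address space with no access rights, so that nothing else
 * (for example a shared library) can later be mapped into the range. */
static void *reserve_region(size_t len)
{
    void *p = mmap(NULL, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Replace exactly one page of the reserved range with a shared mapping of
 * fd at file_off. MAP_FIXED replaces whatever was mapped at that address,
 * so the range never has an unmapped gap. Returns 0 on success. */
static int replace_page(void *region, size_t page_index, int fd, off_t file_off)
{
    long pagesize = sysconf(_SC_PAGESIZE);
    char *target = (char *) region + page_index * (size_t) pagesize;
    void *p = mmap(target, (size_t) pagesize, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_FIXED, fd, file_off);
    return p == (void *) target ? 0 : -1;
}

/* Hypothetical demo: map one page of a temporary file over page 1 of a
 * four-page reservation, write through the mapping, read it back. */
int demo_replace_one_page(void)
{
    long pagesize = sysconf(_SC_PAGESIZE);
    char path[] = "/tmp/mmap_demo_XXXXXX";
    char *region;
    int fd, rc = -1;

    assert(pagesize > 0);
    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                          /* file lives until close() */
    if (ftruncate(fd, (off_t) pagesize) != 0) {
        close(fd);
        return -1;
    }
    region = reserve_region(4 * (size_t) pagesize);
    if (region == NULL) {
        close(fd);
        return -1;
    }
    if (replace_page(region, 1, fd, 0) == 0) {
        char c = 0;
        region[pagesize] = 'x';            /* first byte of page 1 */
        /* Flush the page so the read below is guaranteed to see it. */
        if (msync(region + pagesize, (size_t) pagesize, MS_SYNC) == 0 &&
            pread(fd, &c, 1, 0) == 1 && c == 'x')
            rc = 0;
    }
    munmap(region, 4 * (size_t) pagesize);
    close(fd);
    return rc;
}
```

Reserving the whole range up front with PROT_NONE is one way to get the "no
gaps" property that mremap() was originally used for.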

> But in any case, you didn't explain how use of mremap() avoids the
> problem of the kernel having to maintain a separate page-mapping-table
> entry for each individual buffer. (Per process, yet.) If that's what's
> happening, it's going to be a significant performance penalty as well as
> (I suspect) a serious constraint on how many buffers can be managed.
>
> regards, tom lane
The kernel merges vm_structs, so mappings are compacted. I'm not a kernel
specialist, but, memory consumption aside, for mappings that are not
compacted the kernel uses btrees for dealing with the TLB, so it should not
matter whether there are 100 vm_structs or 100000 vm_structs.
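
One rough way to check the merging claim on Linux is to count the lines of
/proc/self/maps before and after mapping many adjacent pages one at a time.
The sketch below is an assumption-laden illustration, not a measurement: the
function names are hypothetical, it relies on the Linux-specific /proc
interface and on MAP_ANONYMOUS, and it only exercises anonymous mappings with
identical attributes. It says nothing about file-backed buffers mapped at
non-contiguous offsets.

```c
#define _DEFAULT_SOURCE           /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Count the lines of /proc/self/maps; on Linux each line describes one
 * mapping of this process. Returns -1 where the file is unavailable. */
long count_mappings(void)
{
    FILE *f = fopen("/proc/self/maps", "r");
    long lines = 0;
    int c;

    if (f == NULL)
        return -1;
    while ((c = fgetc(f)) != EOF)
        if (c == '\n')
            lines++;
    fclose(f);
    return lines;
}

/* Reserve npages of address space, then map each page separately with
 * MAP_FIXED and the same protection and flags. Stores the mapping counts
 * taken before and after the page-wise mmap() calls.
 * Returns 0 on success, -1 on any failure. */
int pagewise_mmap_counts(size_t npages, long *before, long *after)
{
    long pagesize = sysconf(_SC_PAGESIZE);
    size_t len, i;
    char *base;
    int rc = 0;

    assert(npages > 0);
    if (pagesize <= 0)
        return -1;
    len = npages * (size_t) pagesize;
    base = mmap(NULL, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return -1;

    *before = count_mappings();
    for (i = 0; i < npages; i++) {
        void *p = mmap(base + i * (size_t) pagesize, (size_t) pagesize,
                       PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
        if (p == MAP_FAILED) {
            rc = -1;
            break;
        }
    }
    *after = count_mappings();

    munmap(base, len);
    if (*before < 0 || *after < 0)
        rc = -1;
    return rc;
}
```

If the kernel merges the adjacent mappings as described, the difference
between the two counts should stay far below npages.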

The swap isn't done everywhere. When a buffer is initially read
(privaterefcount == 1), any access to this buffer points directly to the
latest valid area. If it has a shmem area assigned, then that will be used. I
plan to add "readbuffer for update" to prevent swaps when it's almost certain
that the buffer will be used for an update.

I measured the performance of page modifications (with unpinning, the full
process in a standalone unit test): it takes 2x-3x the time of normal page
reads. This result may not be reliable, though, as I saw that memcpy to memory
above 2GB is slower than memcpy to the first 2GB (so it may be an idea to try
putting some shared structs below 2GB).

I know that this patch is a big question. Sometimes I'm optimistic, and
sometimes I'm pessimistic about the final result.

Regards,
Radek
