Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL

From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: knizhnik <knizhnik(at)garret(dot)ru>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, James Mansion <james(at)mansionfamily(dot)plus(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Fetter <david(at)fetter(dot)org>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [ANNOUNCE] IMCS: In Memory Columnar Store for PostgreSQL
Date: 2014-01-09 19:30:40
Message-ID: CAGTBQpZK2vjGj=Cju7vLXKt_jr6Q6n4h9eT5or2COUayOs0A8Q@mail.gmail.com
Lists: pgsql-announce pgsql-hackers

On Thu, Jan 9, 2014 at 4:24 PM, knizhnik <knizhnik(at)garret(dot)ru> wrote:
> On 01/09/2014 09:46 PM, Claudio Freire wrote:
>>
>> On Thu, Jan 9, 2014 at 2:22 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>>
>>> It would be nice to have better operating system support for this.
>>> For example, IIUC, 64-bit Linux has 128TB of address space available
>>> for user processes. When you clone(), it can either share the entire
>>> address space (i.e. it's a thread) or none of it (i.e. it's a
>>> process). There's no option to, say, share 64TB and not the other
>>> 64TB, which would be ideal for us. We could then map dynamic shared
>>> memory segments into the shared portion of the address space and do
>>> backend-private allocations in the unshared part. Of course, even if
>>> we had that, it wouldn't be portable, so who knows how much good it
>>> would do. But it would be awfully nice to have the option.
>>
>> You can map a segment at fork time, and unmap it after forking. That
>> doesn't really use RAM, since it's supposed to be lazily allocated (it
>> can be forced to be so, I believe, with PROT_NONE and MAP_NORESERVE,
>> but I don't think that's portable).
>>
>> That guarantees it's free.
>>
>> Next, you can map shared memory at explicit addresses (linux's mmap
>> has support for that, and I seem to recall Windows did too).
>>
>> All you have to do, is some book-keeping in shared memory (so all
>> processes can coordinate new mappings).
>
> As far as I understand, the main advantage of DSM is that a segment can be
> allocated at any time, not only at fork time.
> And it is not because of memory consumption: even without unmap, allocation
> of some memory region doesn't cause loss of physical memory. And there is
> usually no problem with exhaustion of virtual space on 64-bit architectures.
> Besides, using some combination of flags (such as MAP_NORESERVE), it is
> usually possible to completely eliminate the overhead of reserving an address
> range in virtual space. But mapping a dynamically created segment (not at
> fork time) to the same address really seems to be a big challenge.

I only wrote about reserving the address space at fork time. After
reserving it, all you have to do is implement an allocator that works
in shared memory (protected by an LWLock, of course).
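
To make the reservation step concrete, here is a minimal Linux-only sketch.
The function names are made up for illustration, the 64MB size is arbitrary,
and a tmpfile() stands in for a real shared memory object; none of this is
PostgreSQL code. It reserves a range with PROT_NONE and MAP_NORESERVE before
fork(), then has the child and the parent each map the same backing object
into that range with MAP_FIXED, after the fork.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

#define RESERVED_BYTES ((size_t) 64 << 20)  /* arbitrary size for the sketch */

/* Reserve address space without committing memory. Done once before
 * fork(), so every child inherits the same (still unusable) range. */
static void *reserve_address_space(size_t size)
{
    void *base = mmap(NULL, size, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return base == MAP_FAILED ? NULL : base;
}

/* Back part of the reservation with the shared object behind fd.
 * MAP_FIXED replaces the PROT_NONE placeholder pages in place. */
static void *map_into_reservation(void *base, size_t offset, size_t len, int fd)
{
    void *want = (char *) base + offset;
    void *got = mmap(want, len, PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_FIXED, fd, 0);
    if (got == MAP_FAILED)
        return NULL;
    assert(got == want);
    return got;
}

/* Returns 0 if a value written by the child through its own mapping is
 * visible to the parent at the same address, -1 otherwise. */
int demo_reserve_then_map(void)
{
    size_t len = (size_t) sysconf(_SC_PAGESIZE);
    void *base = reserve_address_space(RESERVED_BYTES);
    FILE *backing = tmpfile();      /* stand-in for a shared memory object */
    int status;

    if (base == NULL || backing == NULL)
        return -1;
    int fd = fileno(backing);
    if (ftruncate(fd, (off_t) len) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {                 /* child: map after fork, then write */
        int *p = map_into_reservation(base, 0, len, fd);
        if (p != NULL)
            *p = 42;
        _exit(p != NULL ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) < 0 || status != 0)
        return -1;
    int *q = map_into_reservation(base, 0, len, fd);  /* parent: same address */
    return (q != NULL && *q == 42) ? 0 : -1;
}
```

Calling demo_reserve_then_map() from a main() should return 0 on a typical
64-bit Linux box. The point is only that neither process has to negotiate an
address after the fork: the range was claimed before it.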

In essence, a hypothetical pg_dsm_alloc(region_name) would use regular
shared memory for coordination: it either returns an already mapped
region (at the same address in every backend, which is guaranteed to
work since we reserved that range), or allocates a new one within the
reserved address space.
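
A rough sketch of what that could look like, under loud assumptions: the
control table layout, dsm_init(), the /tmp backing files and the atomic_flag
spinlock (standing in for an LWLock) are all invented for illustration, and
only pg_dsm_alloc() is a name from this mail. The table lives in regular
shared memory created before fork(); regions are carved out of the reserved
range with a simple bump allocator and are never freed.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdatomic.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

#define RESERVED_BYTES ((size_t) 64 << 20)  /* arbitrary size for the sketch */
#define MAX_REGIONS 16
#define NAME_LEN 32
#define PATH_FMT "/tmp/pg_dsm_sketch.%s"    /* hypothetical backing files */

typedef struct { char name[NAME_LEN]; size_t offset, len; } RegionEntry;
typedef struct {
    atomic_flag lock;        /* spinlock standing in for an LWLock */
    size_t next_offset;      /* bump allocator over the reserved range */
    int nregions;
    RegionEntry regions[MAX_REGIONS];
} DsmControl;

static DsmControl *ctl;      /* regular shared memory, inherited via fork() */
static char *reserved_base;  /* identical in every process after fork() */

/* Call once in the parent, before any fork(). */
int dsm_init(void)
{
    void *c = mmap(NULL, sizeof(DsmControl), PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    void *r = mmap(NULL, RESERVED_BYTES, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (c == MAP_FAILED || r == MAP_FAILED) return -1;
    ctl = c;                 /* anonymous mappings start zero-filled */
    reserved_base = r;
    atomic_flag_clear(&ctl->lock);
    return 0;
}

/* Return the named region, creating it if needed; NULL on failure. */
void *pg_dsm_alloc(const char *name, size_t len)
{
    size_t page = (size_t) sysconf(_SC_PAGESIZE);
    RegionEntry e = {0};
    int ok = 0;
    char path[64 + NAME_LEN];

    if (len == 0 || strlen(name) >= NAME_LEN) return NULL;
    len = (len + page - 1) / page * page;        /* round up to whole pages */

    while (atomic_flag_test_and_set(&ctl->lock)) { /* spin */ }
    for (int i = 0; i < ctl->nregions; i++)      /* already registered? */
        if (strcmp(ctl->regions[i].name, name) == 0) { e = ctl->regions[i]; ok = 1; }
    if (!ok && ctl->nregions < MAX_REGIONS &&
        len <= RESERVED_BYTES - ctl->next_offset) {
        strcpy(e.name, name);                    /* carve out a new range */
        e.offset = ctl->next_offset;
        e.len = len;
        ctl->next_offset += len;
        ctl->regions[ctl->nregions++] = e;
        ok = 1;
    }
    atomic_flag_clear(&ctl->lock);
    if (!ok) return NULL;

    snprintf(path, sizeof path, PATH_FMT, e.name);
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0) return NULL;
    if (ftruncate(fd, (off_t) e.len) != 0) { close(fd); return NULL; }
    void *got = mmap(reserved_base + e.offset, e.len, PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_FIXED, fd, 0);
    close(fd);               /* the mapping stays valid after close */
    if (got == MAP_FAILED) return NULL;
    assert(got == (void *) (reserved_base + e.offset));
    return got;
}

/* Child creates a region and stores its own address in it; the parent
 * attaches by name and checks that it got the very same address. */
int demo_pg_dsm_alloc(void)
{
    char name[NAME_LEN], path[64 + NAME_LEN];
    int status;
    snprintf(name, sizeof name, "demo%d", (int) getpid());
    if (dsm_init() != 0) return -1;
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        void **p = pg_dsm_alloc(name, 4096);
        if (p != NULL) *p = (void *) p;
        _exit(p != NULL ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) < 0 || status != 0) return -1;
    void **q = pg_dsm_alloc(name, 4096);
    snprintf(path, sizeof path, PATH_FMT, name);
    unlink(path);            /* clean up the sketch's backing file */
    return (q != NULL && *q == (void *) q) ? 0 : -1;
}
```

Because every process computes the address as reserved_base plus the offset
stored in the table, pointers into a region remain meaningful across
backends. A real version would also need freeing, error cleanup and a
portable fallback, none of which this attempts.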
