Re: Keeping temporary tables in shared buffers

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Asim Praveen <apraveen(at)pivotal(dot)io>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, David Kimura <dkimura(at)pivotal(dot)io>
Subject: Re: Keeping temporary tables in shared buffers
Date: 2018-05-25 06:40:10
Message-ID: 4b94cc72-c505-9625-11e2-563dc5d39398@iki.fi
Lists: pgsql-hackers

On 25/05/18 09:25, Asim Praveen wrote:
> On Thu, May 24, 2018 at 8:19 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>
>> So then you have to think about how to transition smoothly between "rel
>> is in local buffers" and "rel is in shared buffers", bearing in mind that
>> ever having the same page in two different buffers would be disastrous.
>
> Local buffers would not be used at all if temp tables start residing in
> shared buffers. The transition mentioned above shouldn't be needed.

What is the performance difference between the local buffer manager and
the shared buffer manager? The local buffer manager avoids all the
locking overhead, which has to amount to something, but how big a
difference is it?

>> I think that would be a deal breaker right there, because of the
>> distributed overhead of making the tags bigger. However, I don't
>> actually understand why you would need to do that. Temp tables
>> have unique OIDs/relfilenodes anyway, don't they? Or if I'm
>> misremembering and they don't, couldn't we make them so?
>
> My parochial vision of the overhead is restricted to 4 * NBuffers of
> additional shared memory, as 4 bytes are being added to BufferTag. May I
> please get some enlightenment?

Any extra fields in BufferTag make computing the hash more expensive.
It's a very hot code path, so any cycles spent are significant.
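
To make the cost concrete, here is a minimal sketch of the tag layout as I understand it for this era of the code (three OIDs plus fork and block number, 20 bytes) next to a hypothetical widened tag carrying a backend ID. The names WideBufferTag, hash_bytes_fnv1a, buffer_tag_hash and wide_buffer_tag_hash are illustrative assumptions, and FNV-1a is only a stand-in for the real hash function; the point is just that the hash has to touch every byte of the key, so 4 more bytes means more work on every buffer lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Oid;
typedef uint32_t BlockNumber;
typedef int32_t  ForkNumber;  /* an enum in PostgreSQL; assumed to be 4 bytes */
typedef int32_t  BackendId;

typedef struct
{
    Oid spcNode;              /* tablespace */
    Oid dbNode;               /* database */
    Oid relNode;              /* relation file node */
} RelFileNode;

/* Tag layout as discussed in the thread: 12 + 4 + 4 = 20 bytes. */
typedef struct
{
    RelFileNode rnode;
    ForkNumber  forkNum;
    BlockNumber blockNum;
} BufferTag;

/* Hypothetical widened tag with a backend ID: 24 bytes. */
typedef struct
{
    RelFileNode rnode;
    BackendId   backend;
    ForkNumber  forkNum;
    BlockNumber blockNum;
} WideBufferTag;

static_assert(sizeof(BufferTag) == 20, "tag is expected to be 20 bytes");
static_assert(sizeof(WideBufferTag) == 24, "wide tag is expected to be 24 bytes");

/* FNV-1a over the raw key bytes; a stand-in, not the hash PostgreSQL uses. */
static uint32_t
hash_bytes_fnv1a(const void *key, size_t len)
{
    const unsigned char *p = key;
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < len; i++)
    {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* One loop iteration per key byte: 20 here ... */
static uint32_t
buffer_tag_hash(const BufferTag *tag)
{
    return hash_bytes_fnv1a(tag, sizeof(*tag));
}

/* ... and 24 here, on every lookup. */
static uint32_t
wide_buffer_tag_hash(const WideBufferTag *tag)
{
    return hash_bytes_fnv1a(tag, sizeof(*tag));
}
```

The shared memory cost of 4 * NBuffers is a one-time allocation, whereas the extra hashing work is paid on each lookup, which is why the two concerns are weighed differently.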

In relation to Andres' patches to rewrite the buffer manager with a
radix tree, there was actually some discussion of trying to make
BufferTag *smaller*. For example, we could rearrange things so that
pg_class.relfilenode is 64 bits wide. Then you could assume that it
never wraps around, and is unique across all relations in the cluster.
Then you could replace the 12-byte relfilenode+dbid+spcid triplet with
just the 8-byte relfilenode. Doing something like that might be the
solution here, too.
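
As a rough sketch of that idea, assuming a cluster-wide unique 64-bit relfilenode, the tag could shrink from 20 to 16 bytes. CompactBufferTag and compact_tag_equal are hypothetical names for illustration only; nothing like this exists in the tree, and LegacyBufferTag just restates the current layout for comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Oid;
typedef uint32_t BlockNumber;
typedef int32_t  ForkNumber;  /* an enum in PostgreSQL; assumed to be 4 bytes */

/* Current layout for comparison: 12-byte triplet + fork + block = 20 bytes. */
typedef struct
{
    Oid         spcNode;
    Oid         dbNode;
    Oid         relNode;
    ForkNumber  forkNum;
    BlockNumber blockNum;
} LegacyBufferTag;

/*
 * Hypothetical compact tag: one 64-bit relfilenode that is assumed never to
 * wrap around and to be unique across the whole cluster, so tablespace and
 * database no longer need to be part of the key.
 */
typedef struct
{
    uint64_t    relNode;
    ForkNumber  forkNum;
    BlockNumber blockNum;
} CompactBufferTag;

static_assert(sizeof(LegacyBufferTag) == 20, "legacy tag is expected to be 20 bytes");
static_assert(sizeof(CompactBufferTag) == 16, "compact tag is expected to be 16 bytes");

/* Field-wise comparison of two compact tags. */
static bool
compact_tag_equal(const CompactBufferTag *a, const CompactBufferTag *b)
{
    return a->relNode == b->relNode &&
           a->forkNum == b->forkNum &&
           a->blockNum == b->blockNum;
}
```

If temp tables drew their relfilenodes from the same never-wrapping 64-bit space, they could not collide with regular tables, and the tag would not need a backend ID at all.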

> Temp tables have unique filename on disk: t_<backendID>_<relfilenode>. The
> logic to assign OIDs and relfilenodes, however, doesn't differ. Given a
> RelFileNode, it is not possible to tell if it's a temp table or not.
> RelFileNodeBackend allows for that distinction but it's not used by buffer
> manager.

Could you store the backendid in BufferDesc, outside of BufferTag? Is it
possible for a normal table and a temporary table to have the same
relfilenode+dbid+spcid triplet?
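
A minimal sketch of what I mean by the first question, with the caveat that it only works if the answer to the second question is "no". BufferDescSketch, temp_owner, buffer_is_temp and buffer_usable_by are hypothetical names, and the descriptor is reduced to the two fields that matter here. The tag, and therefore the hash key, stays at 20 bytes; the owning backend is stored next to it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Oid;
typedef uint32_t BlockNumber;
typedef int32_t  ForkNumber;  /* an enum in PostgreSQL; assumed to be 4 bytes */
typedef int32_t  BackendId;

#define InvalidBackendId (-1)

typedef struct
{
    Oid spcNode;
    Oid dbNode;
    Oid relNode;
} RelFileNode;

/* Unchanged tag: this is what gets hashed. */
typedef struct
{
    RelFileNode rnode;
    ForkNumber  forkNum;
    BlockNumber blockNum;
} BufferTag;

/*
 * Hypothetical, heavily simplified descriptor: the owning backend is kept
 * outside the tag, so it adds no bytes to the hash key.
 */
typedef struct
{
    BufferTag tag;
    BackendId temp_owner;     /* InvalidBackendId for a regular relation */
} BufferDescSketch;

/* True if the buffer holds a page of some backend's temporary relation. */
static bool
buffer_is_temp(const BufferDescSketch *desc)
{
    return desc->temp_owner != InvalidBackendId;
}

/* Assumed policy: temp pages are usable only by the backend that owns them. */
static bool
buffer_usable_by(const BufferDescSketch *desc, BackendId me)
{
    return !buffer_is_temp(desc) || desc->temp_owner == me;
}
```

If a regular table and a temp table could share the same triplet, two different pages would map to the same tag and this scheme would break, which is why the uniqueness question matters.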

- Heikki
