From: | Simon Riggs <simon(at)2ndQuadrant(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Reducing the size of BufferTag & remodeling forks |
Date: | 2015-09-13 13:16:32 |
Message-ID: | CANP8+jLT+zQdj=6T2fsBNFj_wvqDdPvJ7fkTg++_RcOx5QoCEw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 12 September 2015 at 21:19, Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2015-09-12 13:12:26 +0100, Simon Riggs wrote:
> > Why do we have to do buffer lookups using the full buffer tag?
>
> We don't necessarily.
>
> > Why not just use (relNode, blockNum) and resolve hash collisions, if any?
>
> I tried that and unfortunately it came out as a negative - the number of
> collision gets higher and collisions in a hashtable will often imply a
> cache miss. You can even see the comparison function rising in the
> profile.
>
> I've actually changed course a bit and I'm trying something different: A
> two level structure. One hashtable that maps (RelFileNode, ForkNumber)
> to a 'open relation' data structure, and from there a radix tree over
> just the block number. To avoid having to look up in the hashtable
> frequently there's a pointer in RelationData to a fork's radix tree.
>
> This seems to have several advantages:
> * It doesn't require changes to the physical representation
>
Seems necessary
> * A radix tree allows ordered traversal, allowing for efficient
> prefetching et al.
>
Could be very important... some benchmarks of whether that was worthwhile
would really help
> * The ability to efficiently get all pages that belong to a relation
> makes dropping/truncating tables radically more efficient than now.
>
Not very important, since there are other approaches
> * A single lookup in the radix tree is on average significantly faster
> than in the hash table. I've measured ~350 vs 60 cycles in a
> nonconcurrent workload and ~820 vs 72~ cycles in a concurrent
> workload, using a recorded buffer access traces.
>
Good
> * Having a 'open relation' representation in shared memory also makes it
> much easier to implement more efficient relation extension and
> truncation - we can have pointers in there that signal how far the
> relation has been extended and how far it has been initialized.
>
Probably important, for extension
> The biggest disadvantage is that it's a good bit more complex than what
> we have right now. That we need dynamically many radix trees makes
> memory management nontrivial.
>
Which might mean we lose benefit by requiring many additional memory
accesses.
> Sounds halfway sane?
>
I think it is a good research direction, as long as we get the focus right.
Another similar approach is direct allocation of chunks of memory for
in-memory access. That would be much less flexible, but access would be in
<5 cycles. I mention this for comparison only.
So if the focus is on more efficient and reasonably flexible access to data
in memory, then it sounds good.
--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2015-09-13 13:17:28 | Re: New gist vacuum. |
Previous Message | Stephen Frost | 2015-09-13 12:49:34 | Re: [DOCS] Missing COMMENT ON POLICY |