From: | NikhilS <nikkhils(at)gmail(dot)com> |
---|---|
To: | "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | "Simon Riggs" <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: PrivateRefCount (for 8.3) |
Date: | 2007-01-16 09:55:58 |
Message-ID: | d3c4af540701160155j23e4e846uae30a7b919f49aff@mail.gmail.com |
Lists: | pgsql-hackers |
Hi,
> Most likely a waste of development effort --- have you got any evidence
> of a real effect here? With 200 max_connections the size of the arrays
> is still less than 10% of the space occupied by the buffers themselves,
> ergo there isn't going to be all that much cache-thrashing compared to
> what happens in the buffers themselves. You're going to be hard pressed
> to buy back the overhead of the hashing.
>
> It might be interesting to see whether we could shrink the refcount
> entries to int16 or int8. We'd need some scheme to deal with overflow,
> but given that the counts are now backed by ResourceOwner entries, maybe
> extra state could be kept in those entries to handle it.
I added some instrumentation and ran pgbench, dbt2, and queries involving
views and joins, and found the following:
(a) Maximum number of buffers pinned simultaneously by a backend: 6-9
(b) Maximum value of simultaneous pins on a given buffer by a backend: 4-6
(a) indicates that for a large shared_buffers value we waste space on a big
per-backend PrivateRefCount array (the current allocation is
sizeof(int32) * shared_buffers).
(b) indicates that the refcount to be tracked per buffer is small, so Tom's
suggestion of exploring int16 or int8 might be worthwhile.
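To put rough numbers on (a) and (b), here is a small sketch of the per-backend array size for different entry widths. The helper name refcount_array_bytes is made up for illustration and is not part of the backend.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not backend code): bytes needed per backend for an
 * array holding one refcount entry of entry_size bytes per shared buffer.
 */
static size_t
refcount_array_bytes(size_t nbuffers, size_t entry_size)
{
    assert(entry_size > 0);
    return nbuffers * entry_size;
}
```

Assuming 131072 shared buffers (1GB at the default 8kB block size), this gives 512kB per backend with int32 entries, 256kB with int16, and 128kB with int8, while (a) suggests only a handful of those entries are nonzero at any time.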
Here is a hash-table-based proposal that follows from the above readings:
- Do away with allocating the NBuffers-sized PrivateRefCount array, which is
an allocation of (NBuffers * sizeof(int)).
- Define Pvt_RefCnt_Size to be 64 (128?) or some such value, so that it is
several multiples above the ranges observed above. Define Overflow_Size to
be 8 or some similar small value to handle collisions.
- Define the following hash table entry to keep track of reference counts:
struct HashRefCntEnt
{
    int32 BufferId;
    int32 RefCnt;
    int32 NextEnt; /* To handle collisions */
};
- Define a similar Overflow Table entry as above to handle collisions.
An array HashRefCntTable of such HashRefCntEnt entries, of size
Pvt_RefCnt_Size, will get initialized in the InitBufferPoolAccess function.
An OverflowTable of size Overflow_Size will also be allocated. This array
will grow dynamically (to 2 * the current Overflow_Size) when it can no
longer accommodate further collisions from the main table.
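A minimal sketch of the two tables and the doubling step could look like the following. Only HashRefCntEnt and the two sizes come from the proposal; RefCntTables, clear_entries, init_tables, alloc_overflow_slot and the -1 sentinel are assumptions for illustration, and malloc/realloc stand in for whatever allocation routines the backend would actually use.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PVT_REFCNT_SIZE 64    /* main table size suggested above */
#define OVERFLOW_SIZE   8     /* initial overflow size suggested above */
#define NO_ENTRY        (-1)  /* assumed sentinel: unused slot / end of chain */

typedef struct HashRefCntEnt
{
    int32_t BufferId;
    int32_t RefCnt;
    int32_t NextEnt;          /* index into the overflow array, or NO_ENTRY */
} HashRefCntEnt;

/* Hypothetical container for the main table and the growable overflow array. */
typedef struct RefCntTables
{
    HashRefCntEnt  main_tab[PVT_REFCNT_SIZE];
    HashRefCntEnt *overflow;
    int32_t        overflow_size;
    int32_t        overflow_used;
} RefCntTables;

/* Mark n entries as unused. */
static void
clear_entries(HashRefCntEnt *ents, int32_t n)
{
    for (int32_t i = 0; i < n; i++)
    {
        ents[i].BufferId = NO_ENTRY;
        ents[i].RefCnt = 0;
        ents[i].NextEnt = NO_ENTRY;
    }
}

/* Set up both tables; returns 0 on success, -1 if allocation fails. */
static int
init_tables(RefCntTables *t)
{
    clear_entries(t->main_tab, PVT_REFCNT_SIZE);
    t->overflow = malloc(OVERFLOW_SIZE * sizeof(HashRefCntEnt));
    if (t->overflow == NULL)
        return -1;
    clear_entries(t->overflow, OVERFLOW_SIZE);
    t->overflow_size = OVERFLOW_SIZE;
    t->overflow_used = 0;
    return 0;
}

/* Hand out the next overflow slot, doubling the array when it is full. */
static int32_t
alloc_overflow_slot(RefCntTables *t)
{
    assert(t->overflow_used <= t->overflow_size);
    if (t->overflow_used == t->overflow_size)
    {
        int32_t        new_size = t->overflow_size * 2;
        HashRefCntEnt *p = realloc(t->overflow,
                                   (size_t) new_size * sizeof(HashRefCntEnt));

        if (p == NULL)
            return -1;
        clear_entries(p + t->overflow_size, new_size - t->overflow_size);
        t->overflow = p;
        t->overflow_size = new_size;
    }
    return t->overflow_used++;
}
```

Keeping NextEnt as an array index rather than a pointer means the chains stay valid when the overflow array is moved by a resize.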
We do not want the overhead of a costly hashing function, so we will use
the buffer id modulo Pvt_RefCnt_Size to get the index where the buffer
needs to go. In short, our hash function is (bufid % Pvt_RefCnt_Size), which
should be a cheap enough operation.
Since only 9-10 buffers will be needed at a time, the probability of
collisions should be low. Collisions will arise only if buffers with ids (x,
x + Pvt_RefCnt_Size, x + 2*Pvt_RefCnt_Size etc.) get used in the same
operation. This should be pretty rare.
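As a small illustration of the hash and of when two buffers collide, assuming non-negative buffer ids and a table size of 64 (refcnt_slot and buffers_collide are made-up names for this sketch):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PVT_REFCNT_SIZE 64   /* one of the sizes suggested above */

/* The proposed hash: buffer id modulo the table size. */
static int32_t
refcnt_slot(int32_t bufid)
{
    assert(bufid >= 0);      /* assumption: ids are non-negative here */
    return bufid % PVT_REFCNT_SIZE;
}

/* Two distinct buffers collide when they map to the same slot. */
static bool
buffers_collide(int32_t a, int32_t b)
{
    return a != b && refcnt_slot(a) == refcnt_slot(b);
}
```

With this table size, buffers 5, 69 and 133 all land in slot 5, while neighbouring ids such as 5 and 6 never collide.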
Functions PinBuffer, PinBuffer_Locked, IncrBufferRefCount, UnpinBuffer etc.
will be modified to use the above mechanism. The changes will be confined
to the buf_init.c and bufmgr.c files.
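To make the intended bookkeeping concrete, here is a rough, self-contained sketch of the per-backend side of pinning and unpinning. It is not the actual bufmgr.c code: find_entry, pin_local, unpin_local, local_refcount and the -1 sentinel are hypothetical, the shared buffer header updates are left out entirely, and the overflow table is kept at a fixed size for brevity instead of being doubled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PVT_REFCNT_SIZE 64
#define OVERFLOW_SIZE   8
#define INVALID_ID      (-1)  /* assumed sentinel: free entry / end of chain */

typedef struct HashRefCntEnt
{
    int32_t BufferId;
    int32_t RefCnt;
    int32_t NextEnt;          /* index into OverflowTable, or INVALID_ID */
} HashRefCntEnt;

static HashRefCntEnt HashRefCntTable[PVT_REFCNT_SIZE];
static HashRefCntEnt OverflowTable[OVERFLOW_SIZE];
static int32_t overflow_used;

static void
init_refcnt_tables(void)
{
    for (int i = 0; i < PVT_REFCNT_SIZE; i++)
        HashRefCntTable[i] = (HashRefCntEnt) {INVALID_ID, 0, INVALID_ID};
    for (int i = 0; i < OVERFLOW_SIZE; i++)
        OverflowTable[i] = (HashRefCntEnt) {INVALID_ID, 0, INVALID_ID};
    overflow_used = 0;
}

/*
 * Walk the chain for bufid. With create set, reuse a free entry on the chain
 * or append a new overflow entry; returns NULL if nothing is found or the
 * (fixed-size) overflow table is exhausted.
 */
static HashRefCntEnt *
find_entry(int32_t bufid, int create)
{
    HashRefCntEnt *ent;
    HashRefCntEnt *free_ent = NULL;

    assert(bufid >= 0);
    ent = &HashRefCntTable[bufid % PVT_REFCNT_SIZE];
    for (;;)
    {
        if (ent->BufferId == bufid)
            return ent;
        if (ent->BufferId == INVALID_ID && free_ent == NULL)
            free_ent = ent;
        if (ent->NextEnt == INVALID_ID)
            break;
        ent = &OverflowTable[ent->NextEnt];
    }
    if (!create)
        return NULL;
    if (free_ent == NULL)
    {
        if (overflow_used == OVERFLOW_SIZE)
            return NULL;      /* the proposal would double the table here */
        ent->NextEnt = overflow_used;
        free_ent = &OverflowTable[overflow_used++];
    }
    free_ent->BufferId = bufid;
    free_ent->RefCnt = 0;
    return free_ent;
}

/* Increment this backend's pin count on bufid; returns the new count. */
static int32_t
pin_local(int32_t bufid)
{
    HashRefCntEnt *ent = find_entry(bufid, 1);

    assert(ent != NULL);
    return ++ent->RefCnt;
}

/* Decrement the pin count; the entry is freed for reuse when it hits zero. */
static int32_t
unpin_local(int32_t bufid)
{
    HashRefCntEnt *ent = find_entry(bufid, 0);

    assert(ent != NULL && ent->RefCnt > 0);
    if (--ent->RefCnt == 0)
        ent->BufferId = INVALID_ID;   /* keep NextEnt so the chain survives */
    return ent->RefCnt;
}

/* Current pin count held by this backend on bufid (0 if none). */
static int32_t
local_refcount(int32_t bufid)
{
    HashRefCntEnt *ent = find_entry(bufid, 0);

    return ent ? ent->RefCnt : 0;
}
```

Freed entries stay linked into their chain and are simply reused, which avoids having to unlink a main-table entry that still has overflow entries hanging off it.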
Comments please.
Regards,
Nikhils
--
EnterpriseDB http://www.enterprisedb.com