Re: Process local hint bit cache

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <gsstark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Process local hint bit cache
Date: 2011-03-31 22:33:02
Message-ID: AANLkTi=fUKMYRkEES1SO=iyC+ynX92pv+q2yKfTSLzHq@mail.gmail.com
Lists: pgsql-hackers

On Wed, Mar 30, 2011 at 2:35 PM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
> On Wed, Mar 30, 2011 at 11:23 AM, Heikki Linnakangas
> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> On 30.03.2011 18:02, Robert Haas wrote:
>>>
>>> On Wed, Mar 30, 2011 at 10:40 AM, Greg Stark<gsstark(at)mit(dot)edu>  wrote:
>>>>
>>>> But one way or another the hint bits have to get set sometime. The
>>>> sooner that happens the less clog i/o has to happen in the meantime.
>>>
>>> I talked about this with Merlin a bit yesterday.  I think that his
>>> thought is that most transactions will access a small enough number of
>>> distinct CLOG pages, and the cache accesses might be fast enough, that
>>> we really wouldn't need to get the hint bits set, or at least that
>>> vacuum/freeze time would be soon enough.  I'm not sure if that's
>>> actually true, though - I think the overhead of the cache might be
>>> higher than he's imagining.  However, there's a sure-fire way to find
>>> out... code it up and see how it plays.
>>
>> I did a little experiment: I hacked SetHintBits() to do nothing, so that
>> hint bits are never set. I then created a table with 100000 rows in it:
>>
>> postgres=# \d foo
>>      Table "public.foo"
>>  Column |  Type   | Modifiers
>> --------+---------+-----------
>>  a      | integer |
>>
>> postgres=# INSERT INTO foo SELECT generate_series(1, 100000);
>> INSERT 0 100000
>>
>> And ran queries on the table:
>>
>> postgres=# do $$
>> declare
>>  i int4;
>> begin
>>  loop
>>    perform COUNT(*) FROM foo;
>>  end loop;
>> end;
>> $$;
>>
>> This test case is designed to exercise the visibility tests as much as
>> possible. However, all the tuples are loaded in one transaction, so the
>> one-element cache in TransactionLogFetch is 100% effective.
>>
>> I ran oprofile on that. It shows that about 15% of the time is spent in
>> HeapTupleSatisfiesMVCC and its subroutines. 6.6% is spent in
>> HeapTupleSatisfiesMVCC itself. Here's the breakdown of that:
>>
>> $ opreport  -c --global-percent
>>
>> CPU: Intel Architectural Perfmon, speed 2266 MHz (estimated)
>> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit
>> mask of 0x00 (No unit mask) count 100000
>> samples  %        app name                 symbol name
>> ...
>> -------------------------------------------------------------------------------
>>   2143      0.4419  postgres                 heapgettup_pagemode
>>   73277    15.1091  postgres                 heapgetpage
>> 31858      6.5688  postgres                 HeapTupleSatisfiesMVCC
>>   31858     6.5688  postgres                 HeapTupleSatisfiesMVCC [self]
>>   12809     2.6411  postgres                 TransactionIdIsInProgress
>>   12091     2.4931  postgres                 XidInMVCCSnapshot
>>   7150      1.4743  postgres                 TransactionIdIsCurrentTransactionId
>>   7056      1.4549  postgres                 TransactionIdDidCommit
>>   1839      0.3792  postgres                 TransactionIdPrecedes
>>   1467      0.3025  postgres                 SetHintBits
>>   1155      0.2382  postgres                 TransactionLogFetch
>> -------------------------------------------------------------------------------
>> ...
>>
>> I then ran the same test again with an unpatched version, to set the hint
>> bits. After the hint bits were set, I ran oprofile again:
>>
>> -------------------------------------------------------------------------------
>>  275       0.4986  postgres                 heapgettup_pagemode
>>  4459      8.0851  postgres                 heapgetpage
>> 3005      5.4487  postgres                 HeapTupleSatisfiesMVCC
>>  3005      5.4487  postgres                 HeapTupleSatisfiesMVCC [self]
>>  1620      2.9374  postgres                 XidInMVCCSnapshot
>>  110       0.1995  postgres                 TransactionIdPrecedes
>> -------------------------------------------------------------------------------
>>
>> So with hint bits set, HeapTupleSatisfiesMVCC accounts for 8% of the total
>> CPU time, and without hint bits, 15%.
>>
>> Even if clog access was free, hint bits still give a significant speedup
>> thanks to skipping all the other overhead like TransactionIdIsInProgress()
>> and TransactionIdIsCurrentTransactionId(). Speeding up clog access is
>> important; when the one-element cache isn't saving you, the clog access
>> becomes a dominant factor. But you have to address that other overhead too
>> before you can get rid of hint bits.
>
> Here is a patch demonstrating the caching action, but without the
> cache table, which isn't done yet -- It only works using the very last
> transaction id fetched.  I used macros so I could keep the changes
> quite localized.
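
(For reference, the one-element cache in TransactionLogFetch that Heikki
mentions is just a static xid/status pair. Roughly like the following,
simplified from src/backend/access/transam/transam.c; the special-case
handling of bootstrap/frozen xids is elided, so treat this as an
illustration rather than the exact code:)

static TransactionId cachedFetchXid = InvalidTransactionId;
static XidStatus cachedFetchXidStatus;

static XidStatus
TransactionLogFetch(TransactionId transactionId)
{
    XidStatus   xidstatus;
    XLogRecPtr  xidlsn;

    /* hit: the same xid we resolved a moment ago */
    if (TransactionIdEquals(transactionId, cachedFetchXid))
        return cachedFetchXidStatus;

    /* miss: consult the clog, then remember the answer */
    xidstatus = TransactionIdGetStatus(transactionId, &xidlsn);
    if (xidstatus != TRANSACTION_STATUS_IN_PROGRESS)
    {
        cachedFetchXid = transactionId;
        cachedFetchXidStatus = xidstatus;
    }
    return xidstatus;
}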

While playing with the data structure to implement an xid cache inside
TransactionLogFetch, I learned a nasty lesson: HeapTupleSatisfiesMVCC is
super sensitive to any changes, and even jumping through a couple of
extra calls into TransactionLogFetch was measurable in a couple of
performance test cases I came up with. Tom's warnings to beware of
unexpected performance downsides came back to haunt me.

In short, regardless of implementation, I unfortunately don't think any
caching at or under the clog abstraction is going to work without
impacting at least some cases.

OTOH, maybe some micro-optimization of HeapTupleSatisfiesMVCC could
provide the same benefit I'm going after here. If you save off the last
xid that came back committed in a static variable, you can check it at
the same level as a hint bit. From my testing, this is well and truly
'free' in the sense that it can defer/eliminate clog checks and hint
bit i/o.

The patch attached gives all the advantages I was after in the previous
patch without the downsides I found (extra calls to
TransactionIdIsInProgress and inefficient calls out to
TransactionLogFetch).

It remains to be determined if it's possible to make a structure fast
enough to expand the above to more than one xid. For starters, all the
calls would have to be inlined. Considering we are only after the
commit bits, a super tight hashing/bucketing algorithm might fit the
bill (see the sketch below).
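
Just to sketch the kind of structure I have in mind (hypothetical names,
not wired into any real backend infrastructure; a miss is always safe
and simply falls through to the normal clog lookup):

/* Tiny direct-mapped cache of xids known to have committed.
 * Assumes PostgreSQL's TransactionId (uint32) from postgres.h;
 * buckets start out zeroed, i.e. InvalidTransactionId, which never
 * matches a real xid.
 */
#define XID_CACHE_SIZE 64       /* must be a power of two */

static TransactionId committed_xid_cache[XID_CACHE_SIZE];

static inline bool
XidCacheKnownCommitted(TransactionId xid)
{
    return committed_xid_cache[xid & (XID_CACHE_SIZE - 1)] == xid;
}

static inline void
XidCacheRememberCommit(TransactionId xid)
{
    /* eviction is free: just overwrite whatever occupied the bucket */
    committed_xid_cache[xid & (XID_CACHE_SIZE - 1)] = xid;
}

Since only committed xids are ever stored, a false negative just costs
the clog visit we would have paid anyway, and there are no false
positives to corrupt visibility decisions.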

merlin

diff -x '*.sql' -x '*.o' -x '*.txt' -rupN postgresql-9.1alpha4/src/backend/utils/time/tqual.c postgresql-9.1alpha4_hbcache/src/backend/utils/time/tqual.c
--- postgresql-9.1alpha4/src/backend/utils/time/tqual.c 2011-03-09 08:19:24.000000000 -0600
+++ postgresql-9.1alpha4_hbcache/src/backend/utils/time/tqual.c 2011-03-31 17:03:43.135195002 -0500
@@ -912,7 +912,10 @@ bool
 HeapTupleSatisfiesMVCC(HeapTupleHeader tuple, Snapshot snapshot,
                        Buffer buffer)
 {
-    if (!(tuple->t_infomask & HEAP_XMIN_COMMITTED))
+    static TransactionId LastCommitXid = InvalidTransactionId;
+
+    if (!(tuple->t_infomask & HEAP_XMIN_COMMITTED)
+        && LastCommitXid != HeapTupleHeaderGetXmin(tuple))
     {
         if (tuple->t_infomask & HEAP_XMIN_INVALID)
             return false;
@@ -985,8 +988,11 @@ HeapTupleSatisfiesMVCC(HeapTupleHeader t
         else if (TransactionIdIsInProgress(HeapTupleHeaderGetXmin(tuple)))
             return false;
         else if (TransactionIdDidCommit(HeapTupleHeaderGetXmin(tuple)))
+        {
+            LastCommitXid = HeapTupleHeaderGetXmin(tuple);
             SetHintBits(tuple, buffer, HEAP_XMIN_COMMITTED,
                         HeapTupleHeaderGetXmin(tuple));
+        }
         else
         {
             /* it must have aborted or crashed */
