From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Greg Stark <gsstark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Process local hint bit cache
Date: 2011-03-30 19:35:42
Message-ID: AANLkTikpDH5b9=QogCxL+3H=rqc+sMXC3F02VvWp33Yx@mail.gmail.com
Lists: pgsql-hackers
On Wed, Mar 30, 2011 at 11:23 AM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> On 30.03.2011 18:02, Robert Haas wrote:
>>
>> On Wed, Mar 30, 2011 at 10:40 AM, Greg Stark<gsstark(at)mit(dot)edu> wrote:
>>>
>>> But one way or another the hint bits have to get set sometime. The
>>> sooner that happens the less clog i/o has to happen in the meantime.
>>
>> I talked about this with Merlin a bit yesterday. I think that his
>> thought is that most transactions will access a small enough number of
>> distinct CLOG pages, and the cache accesses might be fast enough, that
>> we really wouldn't need to get the hint bits set, or at least that
>> vacuum/freeze time would be soon enough. I'm not sure if that's
>> actually true, though - I think the overhead of the cache might be
>> higher than he's imagining. However, there's a sure-fire way to find
>> out... code it up and see how it plays.
>
> I did a little experiment: I hacked SetHintBits() to do nothing, so that
> hint bits are never set. I then created a table with 100000 rows in it:
>
> postgres=# \d foo
> Table "public.foo"
> Column | Type | Modifiers
> --------+---------+-----------
> a | integer |
>
> postgres=# INSERT INTO foo SELECT generate_series(1, 100000);
> INSERT 0 100000
>
> And ran queries on the table:
>
> postgres=# do $$
> declare
> i int4;
> begin
> loop
> perform COUNT(*) FROM foo;
> end loop;
> end;
> $$;
>
> This test case is designed to exercise the visibility tests as much as
> possible. However, all the tuples are loaded in one transaction, so the
> one-element cache in TransactionLogFetch is 100% effective.
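[The one-element cache Heikki refers to lives in TransactionLogFetch in transam.c. A minimal sketch of the idea, with invented names and a counter added so the effect is observable -- not the actual PostgreSQL code:]

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

/* Hypothetical stand-in for the real clog lookup, which would read an
 * SLRU page (and possibly hit disk).  Counts calls so the caching
 * effect is visible. */
static int clog_reads = 0;

static bool clog_says_committed(TransactionId xid)
{
    clog_reads++;
    return true;                /* pretend every xid committed */
}

/* One-element cache in the spirit of TransactionLogFetch's
 * cachedFetchXid: repeated lookups of the same xid skip clog entirely,
 * which is why loading all rows in a single transaction makes the
 * cache 100% effective. */
static TransactionId cachedXid = InvalidTransactionId;
static bool cachedCommitted;

static bool xid_did_commit(TransactionId xid)
{
    if (xid == cachedXid)
        return cachedCommitted; /* hit: no clog access at all */
    cachedCommitted = clog_says_committed(xid);
    cachedXid = xid;
    return cachedCommitted;
}
```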
>
> I ran oprofile on that. It shows that about 15% of the time is spent in
> HeapTupleSatisfiesMVCC and its subroutines. 6.6% is spent in
> HeapTupleSatisfiesMVCC itself. Here's the breakdown of that:
>
> $ opreport -c --global-percent
>
> CPU: Intel Architectural Perfmon, speed 2266 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit
> mask of 0x00 (No unit mask) count 100000
> samples % app name symbol name
> ...
> -------------------------------------------------------------------------------
>  2143   0.4419  postgres  postgres  heapgettup_pagemode
> 73277  15.1091  postgres  postgres  heapgetpage
> 31858   6.5688  postgres  postgres  HeapTupleSatisfiesMVCC
> 31858   6.5688  postgres  postgres  HeapTupleSatisfiesMVCC [self]
> 12809   2.6411  postgres  postgres  TransactionIdIsInProgress
> 12091   2.4931  postgres  postgres  XidInMVCCSnapshot
>  7150   1.4743  postgres  postgres  TransactionIdIsCurrentTransactionId
>  7056   1.4549  postgres  postgres  TransactionIdDidCommit
>  1839   0.3792  postgres  postgres  TransactionIdPrecedes
>  1467   0.3025  postgres  postgres  SetHintBits
>  1155   0.2382  postgres  postgres  TransactionLogFetch
> -------------------------------------------------------------------------------
> ...
>
> I then ran the same test again with an unpatched version, to set the hint
> bits. After the hint bits were set, I ran oprofile again:
>
> -------------------------------------------------------------------------------
> 275 0.4986 postgres heapgettup_pagemode
> 4459 8.0851 postgres heapgetpage
> 3005 5.4487 postgres HeapTupleSatisfiesMVCC
> 3005 5.4487 postgres HeapTupleSatisfiesMVCC [self]
> 1620 2.9374 postgres XidInMVCCSnapshot
> 110 0.1995 postgres TransactionIdPrecedes
> -------------------------------------------------------------------------------
>
> So with hint bits set, HeapTupleSatisfiesMVCC accounts for 8% of the total
> CPU time, and without hint bits, 15%.
>
> Even if clog access was free, hint bits still give a significant speedup
> thanks to skipping all the other overhead like TransactionIdIsInProgress()
> and TransactionIdIsCurrentTransactionId(). Speeding up clog access is
> important; when the one-element cache isn't saving you the clog access
> becomes a dominant factor. But you have to address that other overhead too
> before you can get rid of hint bits.
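[The overhead Heikki is pointing at comes from the branch structure of the visibility check: a set hint bit short-circuits before any of those subroutines run. A toy sketch of that fast path -- invented names and flag value; the real logic is HeapTupleSatisfiesMVCC testing HEAP_XMIN_COMMITTED:]

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag; the real bit is HEAP_XMIN_COMMITTED in the tuple
 * infomask, with a different value. */
#define XMIN_COMMITTED_HINT 0x0100

typedef struct
{
    uint32_t xmin;       /* inserting transaction id */
    uint16_t infomask;   /* hint bits live here */
} FakeTuple;

static int slow_path_calls = 0;

/* Stand-in for the TransactionIdIsInProgress /
 * TransactionIdIsCurrentTransactionId / TransactionIdDidCommit chain. */
static bool xmin_committed_the_hard_way(uint32_t xid)
{
    (void) xid;
    slow_path_calls++;
    return true;
}

/* With the hint set, the check is a single branch; without it, every
 * reader pays for the full subroutine chain until someone sets it. */
static bool xmin_visible(FakeTuple *tup)
{
    if (tup->infomask & XMIN_COMMITTED_HINT)
        return true;                        /* fast path */
    if (!xmin_committed_the_hard_way(tup->xmin))
        return false;
    tup->infomask |= XMIN_COMMITTED_HINT;   /* help the next reader */
    return true;
}
```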
Here is a patch demonstrating the caching action, but without the
cache table, which isn't done yet -- it only works using the very last
transaction id fetched. I used macros so I could keep the changes
quite localized.
The payoff is obvious:
stock postgres:
postgres=# create table v as select generate_series(1,50000000) v;
SELECT 50000000
Time: 70010.160 ms
postgres=# select count(*) from v;
run 1: 64.5 seconds <-- !
run 2: 21.3 seconds
run 3: 19.3 seconds

hint bit patch:
run 1: 19.2 seconds <-- the 'money shot'
run 2: 20.7 seconds
run 3: 19.3 seconds
Of course, until I get the cache table mechanism finished, you only
see real benefit if you have a significant # of pages all from the
same transaction. otoh, checking the last transaction id costs
essentially nothing, so your worst-case performance is the old
behavior. I'm pretty sure I can make a cache that is cheap to check
and maintain because I'm completely free from locks, side effects,
etc.
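[For illustration, a lock-free process-local cache table along these lines might look like the following rough sketch -- all names, the slot count, and the trivial modulo hash are invented here, not taken from the actual patch:]

```c
#include <stdbool.h>
#include <stdint.h>

/* Small direct-mapped array, private to the backend process: probes
 * and stores need no locks and have no side effects on shared state. */
#define XID_CACHE_SLOTS 64

typedef struct
{
    uint32_t xid;        /* 0 = empty slot */
    bool     committed;
} XidCacheSlot;

static XidCacheSlot xid_cache[XID_CACHE_SLOTS];

/* Returns true on a hit, filling *committed.  A miss simply means the
 * caller falls back to the normal clog path, so the worst case is the
 * old behavior plus one array probe. */
static bool xid_cache_probe(uint32_t xid, bool *committed)
{
    XidCacheSlot *slot = &xid_cache[xid % XID_CACHE_SLOTS];
    if (slot->xid == xid)
    {
        *committed = slot->committed;
        return true;
    }
    return false;
}

/* Colliding entries just evict each other; correctness never depends
 * on the cache, only performance does. */
static void xid_cache_store(uint32_t xid, bool committed)
{
    XidCacheSlot *slot = &xid_cache[xid % XID_CACHE_SLOTS];
    slot->xid = xid;
    slot->committed = committed;
}
```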
btw I haven't forgotten your idea to move TransactionIdIsInProgress
down. I think this is a good idea, and will experiment with it pre-
and post-cache.
merlin
Attachment: hbcache.patch (application/octet-stream, 6.1 KB)