Re: Sampling profiler updated

From: Dimitri Fontaine <dfontaine(at)hi-media(dot)com>
To: Itagaki Takahiro <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Sampling profiler updated
Date: 2009-07-18 20:09:01
Message-ID: 7E9C11D8-7471-48A5-A22D-F587D53CFB78@hi-media.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Le 14 juil. 09 à 11:47, Itagaki Takahiro a écrit :
> I updated Sampling profiler patch to be applied to HEAD cleanly.

Which I confirm, as I just spent some time to reviewing the patch (it
was left unassigned in the commit fest). Reviewing code didn't strike
anything so obvious that I'd notice...

> Basic concept of the patch is same as DTrace probes:
> Call pgstat_push_condition(condition) before a operation and call
> pgstat_pop_condition() at the end of the operation. Those functions
> should be light-weight because they only change a variable on shared
> memory without any locks.

...except that the typical integration is like this:

+ pgstat_push_condition(PGCOND_XLOG_OPEN);
fd = BasicOpenFile(path, O_RDWR | PG_BINARY |
get_sync_bit(sync_method),
S_IRUSR | S_IWUSR);
+ pgstat_pop_condition();

And we have this:

! void
! pgstat_push_condition(PgCondition condition)
! {
! volatile PgBackendStatus *beentry = MyBEEntry;
! PgCondition prev;
!
! if (profiling_interval <= 0 || !beentry)
! return;

I'm wondering if it'll be enough for people not interested into
profiling not to bother. The patch contains a lot of added call sites.
I'm not sure if it's possible to arrange for not calling the function
at all when profiling is disabled (GUC profiling_interval = 0).

On the same vein:

+ static void
+ LockPageContent(volatile BufferDesc *buf, LWLockMode mode)
+ {
+ pgstat_push_condition(PGCOND_LWLOCK_PAGE);
+ LWLockAcquire(buf->content_lock, mode);
+ pgstat_pop_condition();
+ }
+
+ static void
+ LockPageIO(volatile BufferDesc *buf, LWLockMode mode)
+ {
+ pgstat_push_condition(PGCOND_LWLOCK_IO);
+ LWLockAcquire(buf->io_in_progress_lock, mode);
+ pgstat_pop_condition();
+ }

With the call site being (in src/backend/storage/buffer/bufmgr.c,
FlushRelationBuffers(Relation rel), FlushDatabaseBuffers(Oid dbid)):
! LWLockAcquire(bufHdr->content_lock, LW_SHARED);
...
! LockPageContent(bufHdr, LW_SHARED);

Maybe there's nothing to worry about, but I figured I'd better raise
the concert for further reviewing.

Oh, and this too:
*************** LockBuffer(Buffer buffer, int mode)
*** 2372,2380 ****
if (mode == BUFFER_LOCK_UNLOCK)
LWLockRelease(buf->content_lock);
else if (mode == BUFFER_LOCK_SHARE)
! LWLockAcquire(buf->content_lock, LW_SHARED);
else if (mode == BUFFER_LOCK_EXCLUSIVE)
! LWLockAcquire(buf->content_lock, LW_EXCLUSIVE);

Now the build went fine, but the testing (default configuration) not
so much:

dim=# create table series(i integer);
dim=# insert into series select generate_series(1, 10000000);
LOG: checkpoints are occurring too frequently (4 seconds apart)
HINT: Consider increasing the configuration parameter
"checkpoint_segments".
WARNING: condition stack overflow: 10
...
WARNING: condition stack overflow: 11
WARNING: condition stack overflow: 11
WARNING: condition stack overflow: 11
ERROR: canceling statement due to user request

dim=# select count(*) from series;
count
-------
0
(1 row)

Time: 9504.624 ms

dim=# select * from pg_profiles;
profid | profname | profnum
--------+---------------------+---------
83000 | LWLock:Page | 15
58000 | Data:Extend | 7
80018 | LWLock:BgWriterComm | 1
10000 | CPU | 85
22000 | Network:Send | 128
80009 | LWLock:WALWrite | 6
55000 | Data:Read | 1
45000 | XLog:Write | 1
15000 | CPU:Utility | 5
15100 | CPU:Commit | 1
57000 | Data:Write | 15
42000 | XLog:Insert | 24
46000 | XLog:Flush | 4
(13 rows)

Time: 11.372 ms

The error I got is matching this part of the patch:
! void
! pgstat_push_condition(PgCondition condition)
! {
...
! if (condition_stack_top < MAX_COND_STACK)
! condition_stack[condition_stack_top] = prev;
! else
! elog(WARNING, "condition stack overflow: %d", condition_stack_top);

So I'm going to change patch state to "Returned with Feedback" as I
guess we'll need to talk about the issue and find a way to solve it,
and I don't think this state prevent from getting back to the patch in
this same fest.

Regards,
--
dim

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2009-07-18 20:38:20 Re: Sampling profiler updated
Previous Message Robert Haas 2009-07-18 20:01:58 Re: SE-PostgreSQL?