Re: Monitoring Object access

From: adi hirschtein <adiso75(at)gmail(dot)com>
To: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Monitoring Object access
Date: 2010-09-12 14:02:35
Message-ID: AANLkTik2-DbRDxmGSMuRv4Axm=wry8hhpJoHJOq50n+A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi Craig,

Thanks a lot for the quick response!
I'm coming from the Oracle side of the house and In oracle for instance, you
use shared buffer as well, but you are still able to see which session is
waiting for which blocks
and if one session is doing the "real" I/O then the other one wait on 'wait
for other session" event so you are able to know who did the actual I/O
the reason behind it is that you want to check which objects is being
heavily hit by which business processes or users and then tier your storage
accordingly.
I agree with your point about the OS buffer cache, I need to monitor it as
well.
is there any place rather than pg_stat_activity that you think I should take
a look at?

Best Regard,
Adi

On Sun, Sep 12, 2010 at 4:04 PM, Craig Ringer
<craig(at)postnewspapers(dot)com(dot)au>wrote:

> On 09/12/2010 06:52 PM, adi hirschtein wrote:
>
>> Hi,
>> Using the catalog tables, is there any way to correlate session id/user
>> id to which object (i.e. tables, indexes etc) it access and much how
>> disk reads or I/O wait has been done against the objects.
>> in general, I'd like to see which objects are being accessed by which
>> user and the time/amount of I/O wait/reads.
>>
>
> There isn't really anything like that, no. You have pg_stat_activity, but
> it's a pretty coarse tool.
>
> The shared buffer cache and the use of things like synchronized sequential
> scans means that it wouldn't even be possible to truly track who's causing
> load reliably. As I understand it, if Joe and Fred both to a "SELECT * FROM
> sometable", it's quite likely that only one of the backends will actually
> appear to read anything from disk. Which one is pretty much luck.
>
> Adding something like this would require instrumenting not only the
> backends' disk I/O code, but also their shared buffer access code. It'd
> potentially add a lot of overhead, and I'm not sure the results would mean
> very much because the caching effects would make "fred's backend did X I/O
> operations" less meaningful. Fred's might just have been the first that ran
> after Joe's giant seqscan cross join of two big tables that forced
> everything else out of shared_buffers.
>
> Even if you did have that instrumentation, you'd need OS-level
> instrumentation (dtrace, prof, etc) to track the OS's buffer cache, which
> PostgreSQL relies on heavily. Without that you can't tell the difference
> between a query that caused I/O calls from postgresql but could be satisfied
> by OS buffer cache and one that required expensive physical disk I/O to
> satisfy.
>
> Really, I don't know if it's realistic to do what you're suggesting unless
> every user's data set is isolated from every other user's, in which case you
> can maybe use OS-level tools like prof or DTrace to achieve it if you set
> postgresql up to log whichstarts which backend pid.
>
> If there's any significant overlap in the data sets used by the users (if
> they use the same databases or schema) I wouldn't think you'd be able to get
> any meaningful results.
>
> --
> Craig Ringer
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2010-09-12 15:11:27 Re: why can't see the updated value after SPI_execute("update ....", false, 1);
Previous Message Scott Marlowe 2010-09-12 13:54:28 Re: Incrementally Updated Backups