Re: Tracking wait event for latches

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Tracking wait event for latches
Date: 2016-09-23 04:32:03
Message-ID: CAA4eK1JJPueEUMMZCRsuak89b1KSEf=htmccLDdjCBPSOa+mKw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Sep 22, 2016 at 7:19 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Wed, Sep 21, 2016 at 5:49 PM, Thomas Munro
> <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
>
> So, I tried to classify these. Here's what I came up with.
>
> Activity: ArchiverMain, AutoVacuumMain, BgWriterMain,
> BgWriterHibernate, CheckpointerMain, PgStatMain, RecoveryWalAll,
> RecoveryWalStream, SysLoggerMain, WalReceiverMain, WalWriterMain
>
> Client: SecureRead, SecureWrite, SSLOpenServer, WalSenderMain,
> WalSenderWaitForWAL, WalSenderWriteData, WalReceiverWaitStart
>
> Timeout: BaseBackupThrottle, PgSleep, RecoveryApplyDelay
>
> IPC: BgWorkerShutdown, BgWorkerStartup, ExecuteGather,
> MessageQueueInternal, MessageQueuePutMessage, MessageQueueReceive,
> MessageQueueSend, ParallelFinish, ProcSignal, ProcSleep, SyncRep
>
> Extension: Extension
>

We already call lwlock waits from an extension as "extension", so I
think just naming this an Extension might create some confusion.

> I classified all of the main loop waits as waiting for activity; all
> of those are background processes that are waiting for something to
> happen and are more or less happy to sleep forever until it does. I
> also included the RecoveryWalAll and RecoveryWalStream events in
> there; those don't have the sort of "main loop" flavor of the others
> but they are happy to wait more or less indefinitely for something to
> occur. Likewise, it was pretty easy to find all of the events that
> were waiting for client I/O, and I grouped those all under "Client".
> A few of the remaining wait events seemed like they were clearly
> waiting for a particular timeout to expire, so I gave those their own
> "Timeout" category.
>
> I believe these categorizations are actually useful for users. For
> example, you might want to see all of the waits in the system but
> exclude the "Client", "Activity", and "Timeout" categories because
> those are things that aren't signs of a problem. A "Timeout" wait is
> one that you explicitly requested, a "Client" wait isn't the server's
> fault, and an "Activity" wait just means nothing is happening. In
> contrast, a "Lock" or "LWLock" or "IPC" wait shows that something is
> actually delaying work that we'd ideally prefer to have get done
> sooner.
>
> I grouped the rest of this stuff as "IPC" because all of these events
> are cases where one server process is waiting for another server
> processes . That could be further subdivided, of course: most of
> those events are only going to occur in relation to parallel query,
> but I didn't want to group it that way explicitly because both
> background workers and shm_mq have other potential uses. ProcSignal
> and ProcSleep are related to heavyweight locks and SyncRep is of
> course related to synchronous replication. But they're all related
> in that one server process is waiting for another server process to
> tell it that a certain state has been reached, so IPC seems like a
> good categorization.
>
> Finally, extensions got their own category in this taxonomy, though I
> wonder if it would be better to instead have
> Activity/ExtensionActivity, Client/ExtensionClient,
> Timeout/ExtensionTimeout, and IPC/ExtensionIPC instead of making it a
> separate toplevel category.
>

+1. It can avoid confusion.

> To me, this seems like a pretty solid toplevel categorization and a
> lot more useful than just throwing all of these in one bucket and
> saying "good luck".

Agreed. This categorisation is very good and can help patch author to proceed.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2016-09-23 04:43:47 Re: Tracking wait event for latches
Previous Message Peter Eisentraut 2016-09-23 04:10:12 Re: pageinspect: Hash index support