Re: Mysterious performance degradation in exceptional cases

From: Matthias Apitz <guru(at)unixarea(dot)de>
To: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
Cc: Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Mysterious performance degradation in exceptional cases
Date: 2022-09-18 09:30:40
Message-ID: YyblQJpDscQH9SGn@c720-r368166
Lists: pgsql-general

On Thursday, September 15, 2022 at 08:40:24 AM -0700, Adrian Klaver wrote:

> On 9/14/22 22:33, Matthias Apitz wrote:
> > On Wednesday, September 14, 2022 at 07:19:31 AM -0700, Adrian Klaver wrote:
> >
> > > On 9/14/22 01:31, Matthias Apitz wrote:
>
> > > Where is the inter library software, in your application or are you reaching
> > > out to another application?
> >
> > The above 'app-server' fulfills the search requested by the
> > 'ILL-software' (or the 'test search'), i.e. looks up for one single
> > librarian record (one row in the PostgreSQL database) and delivers
> > it to the 'ILL-software'. The request from the 'ILL-software' is not
> > a heavy duty, more or less 50 requests per day.
> >
> > > Is the search running across a remote network?
> >
> > The real search comes over the network through a stunnel. But we
> > watched with tcpdump the incoming search and the response by the
> > 'app-server' locally. In the case of the timeout, the 'app-server' does not
> > answer within 180 seconds, i.e. does not send anything into the stunnel,
> > and the remote 'ILL-software' terminates the connection with an F-packet.
>
> The 'app-server' does not answer, but does the database not answer also?
>
> Have you looked to see if the database is providing a response in a timely
> manner and if it is getting 'lost' in the 'app-server'?

I do not "think" that the time is spent elsewhere in the 'app-server'.
To turn that "thinking" into fact, I enabled tracing in our dblayer, which
shows how many milliseconds are spent between entering the dblayer and
returning the result to the application layers. This has been in effect
for 36 hours now. Since September 13 the problem has not shown up again.
We are waiting for it...
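
For illustration only, timing of this kind could look roughly like the
C sketch below. It is not taken from the real dblayer: the names
elapsed_ms, db_op_fn and traced_db_op are hypothetical, and it assumes
a POSIX system that provides clock_gettime() with CLOCK_MONOTONIC.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: whole milliseconds elapsed between two timestamps. */
static long elapsed_ms(const struct timespec *start, const struct timespec *end)
{
    long long ns = (long long)(end->tv_sec - start->tv_sec) * 1000000000LL
                 + (long long)(end->tv_nsec - start->tv_nsec);
    return (long)(ns / 1000000LL);
}

/* Hypothetical signature for a high-level dblayer operation. */
typedef int (*db_op_fn)(void *arg);

/*
 * Runs op(arg), writes one trace line with the elapsed time to `trace`,
 * optionally stores the milliseconds in *ms_out, and returns op's result.
 */
static int traced_db_op(FILE *trace, const char *name,
                        db_op_fn op, void *arg, long *ms_out)
{
    struct timespec t0, t1;
    long ms;
    int rc;

    assert(trace != NULL && name != NULL && op != NULL);

    clock_gettime(CLOCK_MONOTONIC, &t0);  /* monotonic: immune to clock jumps */
    rc = op(arg);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    ms = elapsed_ms(&t0, &t1);
    fprintf(trace, "dblayer %s rc=%d elapsed=%ld ms\n", name, rc, ms);
    fflush(trace);                        /* keep the line even if we hang later */

    if (ms_out != NULL)
        *ms_out = ms;
    return rc;
}
```

A caller would wrap each high-level operation in traced_db_op() and
later search the trace for lines whose elapsed value is far above normal.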

> Also have you considered Tom Lane's suggestion of using auto_explain?

I'm afraid that this would affect all other applications using the same
server. We still have other options. Once the tracing above shows a
result, i.e. which of the high-level operations of the dblayer takes
more milliseconds than normal, the next step is detailed logging of the
ESQL/C operations (to which we added time stamps, which this logging
normally does not have in its source code).
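
As a rough sketch of what adding time stamps to such a log can look
like, the C fragment below prefixes each message with a millisecond
time stamp. The functions format_log_line and log_esql_op are
hypothetical names, not the real logging code; the sketch assumes POSIX
gmtime_r() and clock_gettime(), and uses UTC only to keep the output
unambiguous.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <time.h>

/*
 * Hypothetical helper: formats "YYYY-MM-DD HH:MM:SS.mmm <msg>" (UTC) into buf.
 * Returns the snprintf() result, or -1 if the time could not be formatted.
 */
static int format_log_line(char *buf, size_t size,
                           const struct timespec *ts, const char *msg)
{
    struct tm tm_utc;
    char stamp[32];

    assert(buf != NULL && ts != NULL && msg != NULL);

    if (gmtime_r(&ts->tv_sec, &tm_utc) == NULL)
        return -1;
    if (strftime(stamp, sizeof stamp, "%Y-%m-%d %H:%M:%S", &tm_utc) == 0)
        return -1;

    /* tv_nsec / 1000000 yields the millisecond part, 0..999 */
    return snprintf(buf, size, "%s.%03ld %s",
                    stamp, (long)(ts->tv_nsec / 1000000L), msg);
}

/* Hypothetical logger: stamps msg with the current wall-clock time. */
static void log_esql_op(FILE *log, const char *msg)
{
    struct timespec now;
    char line[256];

    clock_gettime(CLOCK_REALTIME, &now);
    if (format_log_line(line, sizeof line, &now, msg) > 0) {
        fprintf(log, "%s\n", line);
        fflush(log);
    }
}
```

Calling log_esql_op() immediately before and after each ESQL/C statement
would make the gap between two consecutive lines show where the time goes.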

Thanks

matthias

--
Matthias Apitz, ✉ guru(at)unixarea(dot)de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub

Sahra Wagenknecht in the Bundestag: "But the idea that we punish Putin by
plunging millions of families in Germany into poverty and by destroying
our industry, while Gazprom makes record profits: well, how stupid is
that?" She is right!
