From: | Steeve Boulanger <sboulanger29(at)gmail(dot)com> |
---|---|
To: | Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> |
Cc: | pgsql-general <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Database stats ( pg_stat_database.stats_reset ) get reset on daily basis - why? |
Date: | 2024-11-21 23:50:13 |
Message-ID: | CAAiSvx-JPJCGc_VR+YrUNo75pb0yrmB8Ng3zBOCf+Mksdo5JYg@mail.gmail.com |
Lists: | pgsql-general |
> 1) Do the 77 share some trait the other 80 don't.
No pattern found yet, but I'm still verifying a few things.
> 2) Do the OS system logs reveal anything?
Nothing found in syslog
> 3) What was happening in the databases just prior to the time the stats
reset?
Here's an example (log extracts) for a stats reset occurrence:
select datname, stats_reset, now()-stats_reset as since_reset
from pg_stat_database
where ( now()-stats_reset ) < interval '1 day'
order by 3 limit 1;
datname | stats_reset | since_reset
----------------+-------------------------------+-----------------
MyDB | 2024-11-21 13:48:34.332785+00 | 00:00:22.266304
<--LOGS-->
2024-11-21 13:48:34.324 UTC pid=[322035][2] db=[MyDB] usr=[user1]
client=[host1] app=[[unknown]]LOG: connection authorized: user=user1
database=MyDB application_name=app1 <..>
<.. no calls at "2024-11-21 13:48:34.332" - WHY?? ..>
2024-11-21 13:48:34.336 UTC pid=[322035][3] db=[MyDB] usr=[user1]
client=[host1] app=[app1]LOG: duration: 1.071 ms parse <unnamed>: SELECT
<..>
<--LOGS-->
As you can see above, the stats for MyDB were reset at ".332". The only
log entries for the db just before and after were the connection (at .324)
and then the parse (at .336). NB: I also checked the logs at ".333" in case
the timestamp had been rounded up, but nothing relevant was found. That
said, I only verified one occurrence; tomorrow I'll check a few more to
validate.
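To repeat this check for more occurrences, the manual comparison above could be scripted. The following is only a minimal sketch, assuming every log entry starts with a timestamp in the same "YYYY-MM-DD HH:MM:SS.mmm UTC" layout as the extracts above; the function name `lines_near_reset` and the 50 ms default window are illustrative choices, not part of any PostgreSQL tooling.

```python
from datetime import datetime, timedelta

# Assumed layout of the leading timestamp in each log entry,
# e.g. "2024-11-21 13:48:34.324" (first 23 characters of the line).
LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def lines_near_reset(log_lines, reset_ts, window_ms=50):
    """Return (offset_ms, line) pairs for log entries within window_ms of reset_ts.

    offset_ms is negative for entries logged before the reset and
    positive for entries logged after it.
    """
    window = timedelta(milliseconds=window_ms)
    hits = []
    for line in log_lines:
        try:
            ts = datetime.strptime(line[:23], LOG_TS_FORMAT)
        except ValueError:
            # Wrapped continuation lines carry no timestamp; skip them.
            continue
        delta = ts - reset_ts
        if abs(delta) <= window:
            hits.append((delta / timedelta(milliseconds=1), line.rstrip()))
    return hits
```

Passing the stats_reset value from pg_stat_database as `reset_ts` (as a naive UTC datetime, since the logs here are in UTC) would list the entries closest to the reset together with their offset in milliseconds.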
> 4) Do you have external tools accessing these databases?
We have internal micro-services accessing the databases, as well as a
monitoring tool (Netdata), and some of the Devs use pgAdmin. I discarded
the scenario where someone would inadvertently do a "pg_stat_reset" via
pgAdmin, just because a lot of databases have their stats reset within a
short period of time.
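The "reset within a short period of time" observation can be made more concrete by grouping the reset timestamps into bursts. This is a rough sketch, assuming the (datname, stats_reset) pairs have already been fetched from pg_stat_database; the helper name `group_resets` and the 60-second gap are arbitrary assumptions for illustration.

```python
from datetime import datetime, timedelta


def group_resets(resets, max_gap_s=60):
    """Group (datname, stats_reset) pairs into bursts.

    Two consecutive resets (in time order) belong to the same burst
    when they are at most max_gap_s seconds apart.
    """
    max_gap = timedelta(seconds=max_gap_s)
    bursts = []
    for name, ts in sorted(resets, key=lambda r: r[1]):
        if bursts and ts - bursts[-1][-1][1] <= max_gap:
            bursts[-1].append((name, ts))
        else:
            bursts.append([(name, ts)])
    return bursts
```

A few large bursts would point at one process touching many databases in sequence, while many single-database bursts would look more like unrelated manual resets.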
On the other hand, Netdata does connect to most (if not all) databases
frequently by its nature - so as a test, I stopped the Netdata service
today to see if tomorrow we're still seeing the stats reset or not. I can
report back tomorrow on this.
> 5) Is the cluster directly open to the world?
No. It's an on-premise installation. Only local applications can connect to
it.
-Steeve
On Thu, Nov 21, 2024 at 4:32 PM Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
wrote:
> On 11/21/24 13:31, Steeve Boulanger wrote:
> > > All I can think to do is look at the logs around the stats_reset
> > > times for the databases and see if there is anything relevant.
> >
> > That was already done, but nothing relevant was found unfortunately.
>
> Unless it was not recognized as relevant. Since for the time being I am
> eliminating magic as the cause, something concrete is causing this and
> it should be leaving a trace. In your post you had this affecting 77 out
> of 157 databases in the cluster.
>
> 1) Do the 77 share some trait the other 80 don't.
>
> 2) Do the OS system logs reveal anything?
>
> 3) What was happening in the databases just prior to the time the stats
> reset?
>
> 4) Do you have external tools accessing these databases?
>
> 5) Is the cluster directly open to the world?
>
> >
> > -Steeve
> >
> > On Thu, Nov 21, 2024 at 3:12 PM Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com
> > <mailto:adrian(dot)klaver(at)aklaver(dot)com>> wrote:
> >
> > On 11/21/24 12:57, Steeve Boulanger wrote:
> > >
> > > > Please reply to list also.
> > >
> > > My apologies - I thought I did a "Reply all", but apparently not.
> > > I'm a little bit of a noob with email distrib lists.
> > >
> > > > 1) What is log_min_error_statement set to?
> > >
> > > name | setting | pending_restart
> > > -------------------------+---------+-----------------
> > > log_min_error_statement | error | f
> > >
> > > > 2) Did you reload the server when changing?:
> > >
> > > yes - pg_reload_conf()
> >
> > All I can think to do is look at the logs around the stats_reset
> > times for the databases and see if there is anything relevant.
> >
> > >
> > > -Steeve
> >
> >
> > --
> > Adrian Klaver
> > adrian(dot)klaver(at)aklaver(dot)com <mailto:adrian(dot)klaver(at)aklaver(dot)com>
> >
>
> --
> Adrian Klaver
> adrian(dot)klaver(at)aklaver(dot)com
>
>