Re: Checksum errors in pg_stat_database

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: David Steele <david(at)pgmasters(dot)net>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Checksum errors in pg_stat_database
Date: 2019-03-09 18:49:50
Message-ID: CABUevEwcfsiyvXGUQ=e4MrdFMmJc2P4Z955tUjHRsYvx43rV=g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Mar 9, 2019 at 10:41 AM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:

> On Sat, Mar 9, 2019 at 9:34 AM Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
> >
> > On Sat, Mar 9, 2019 at 12:35 AM Magnus Hagander <magnus(at)hagander(dot)net>
> wrote:
> > >
> > > On Mon, Mar 4, 2019 at 11:31 AM Julien Rouhaud <rjuju123(at)gmail(dot)com>
> wrote:
> > >>
> > >> On Fri, Feb 22, 2019 at 3:01 PM Magnus Hagander <magnus(at)hagander(dot)net>
> wrote:
> > >> >
> > >> > It tracks things that happen in the general backends. Possibly we
> should also consider counting the errors actually found when running base
> backups? OTOH, that part of the code doesn't really track things like
> databases (as it operates just on the raw data directory underneath), so
> that implementation would definitely not be as clean...
> > >>
> > >> Sorry I just realized that I totally forgot this part of the thread.
> > >>
> > >> While it's true that we operate on raw directory, I see that sendDir()
> > >> already setup a isDbDir var, and if this is true lastDir should
> > >> contain the oid of the underlying database. Wouldn't it be enough to
> > >> call sendFile() using this, something like (untested):
> > >>
> > >> if (!sizeonly)
> > >> - sent = sendFile(pathbuf, pathbuf + basepathlen + 1, &statbuf, true);
> > >> + sent = sendFile(pathbuf, pathbuf + basepathlen + 1, &statbuf, true,
> > >> isDbDir ? pg_atoi(lastDir+1, 4) : InvalidOid);
> > >>
> > >> and accordingly report any checksum error from sendFile()?
> > >
> > > That seems it was easy enough. PFA an updated patch that does this,
> and also rebased so it doesn't conflict on oid.
> > >
>
> Sorry, I have again new comments after a little bit more thinking.
> I'm wondering if we can do something about shared objects while we're
> at it. They don't belong to any database, so it's a little bit
> orthogonal to this proposal, but it seems quite important to track
> error on those too!
>
> What about adding a new field in PgStat_GlobalStats for that? We can
> use the same lastDir to easily detect such objects and slightly adapt
> sendFile again, which seems quite straightforward.
>

Ah, didn't spot that one until after I pushed :/ Sorry about that.

Hmm. That's an interesting thought. And then add a column to
pg_stat_bgwriter, I assume? (Which is an ever increasingly bad name for the
view, but that's unrelated to this)

Question is then what number that should show -- only the checksum counter
in non-database-fields, or the total number across the cluster?

--
Magnus Hagander
Me: https://www.hagander.net/ <http://www.hagander.net/>
Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2019-03-09 18:53:41 Re: Checksum errors in pg_stat_database
Previous Message Magnus Hagander 2019-03-09 18:48:07 Re: Checksum errors in pg_stat_database