Re: Losing records in PostgreSQL 9.6

From: A G <andreas(dot)grill(at)gmail(dot)com>
To: Robert Treat <rob(at)xzilla(dot)net>
Cc: pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: Re: Losing records in PostgreSQL 9.6
Date: 2022-06-04 09:19:21
Message-ID: CAJ_geA9BWqzG4d6agwkYQahMs2V=xb=cL7bqnDURXUkvPuCQYA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Thank you for your helpful input and sorry my late answer, I wanted to
check first the database statistics tables you suggested.
Unfortunately, we don't have access to them.
But that was great input. We are planning to improve the backup system to
include more information that will help us in the future to analyse what
happened on a customer database.

Best regards,
Andreas

On Sat, May 14, 2022 at 6:06 AM Robert Treat <rob(at)xzilla(dot)net> wrote:

> If you are asking if it is within the realm of possibility that an old
> version of Postgres with known bugs running on a presumably old
> version of Linux which probably also has known bugs that happens to
> also have some form of hardware failure that could include corrupted
> memory and/or corrupted storage (did I mention filesystem bugs?) could
> possibly lose data... well I reckon that probably is possible.
>
> However, because the pg_dumps produce what sound like working / proper
> output, I don't think it'd be any of those things. You didn't mention
> if you have any ability to look at any of the database statistics
> tables, which would be a first place to look to see if there are any
> DML statements or DML activity tracked. Beyond that, if I had access
> to the server in question (or better an exact physical copy, which
> might be difficult in your case), I'd want to see if I could find the
> old rows which might still be on-disk even if not visible due to
> deletion or transaction semantics, and would also want to rule out
> things like index corruption that might cause some weird side effects.
>
>
> Robert Treat
>
> On Fri, May 13, 2022 at 1:53 PM A G <andreas(dot)grill(at)gmail(dot)com> wrote:
> >
> > Thanks for your input!
> >
> > We checked the application that has access to the database, but it would
> never delete rows from that table. The missing rows in the database were
> stored at some point through committed transactions and had a lower
> sequential primary key. We don't think the transactions were rolled back
> since they were part of an older backup.
> >
> > We believe that there was probably a manual access through the customer
> or a service partner, but wanted to make sure that there is no other way
> that Postgres would lose rows during a pg_dump because of something like a
> hardware failure, for instance.
> >
> > Best regards,
> > Andreas
> >
> > On Sat, May 7, 2022 at 4:03 PM Ron <ronljohnsonjr(at)gmail(dot)com> wrote:
> >>
> >> On 5/4/22 09:55, A G wrote:
> >>
> >> Hi,
> >> thanks for your help.
> >>
> >> My team is using Postgres 9.6.10 for an on-premise application (we are
> planing on upgrading to a newer Postgres version). Our application comes
> with Postgres running in a docker container with its data stored in a
> docker volume. Our software uses pg_dump / pg_restore to backup and restore
> the database.
> >>
> >> Now we got a ticket from a customer where their database is missing
> rows from a table. There are 971 consecutive rows missing from the
> beginning of the table. The missing rows were inserted first. We find it
> also strange, that all the other tables don’t seem to be affected at all.
> It appears that there is only data loss in this single table.
> >> Unfortunately, we don’t have access to the original database anymore
> and need to find out what happened through the backups the customer
> provides. We have one backup right after they installed and initially
> configured the application, which seems complete. Then there is another
> backup 10 months later where the first 971 rows are already missing in this
> one table.
> >>
> >> If we exclude a manual deletion, which the customer denies,
> >>
> >>
> >> There's more to PEBKAC than manual deletion.
> >>
> >> we are wondering if it’s possible that Postgres 9.6 could lose some of
> its data through a storage or memory error and would create a “successful”
> pg_dump with only partial data? Is such a behaviour even thinkable with
> Postgres?
> >>
> >> Do you have an idea what else could cause this issue?
> >>
> >>
> >> Uncommitted transactions?
> >> * Purge job with a bug in it?
> >> * Two different date columns (for example "transaction_date" and
> "posted_date") which are expected to be the same apparently not always.
> Since the errors apparently happen at the beginning of the month, the purge
> job might have seen them as the previous month's records.
> >>
> >> These are our dump and restore commands:
> >> pg_dump -Fc --no-acl --no-owner -U acme -h 127.0.0.1 acme > acme.dump
> >> pg_restore -d acme -n public -U acme -h 127.0.0.1 --jobs=4 acme.dump
> >>
> >> We use just a single db user to access the database and we don’t use
> RLS.
> >>
> >> Thank you.
> >>
> >> Best regards,
> >> Andreas
> >>
> >>
> >> --
> >> Angular momentum makes the world go 'round.
>

In response to

Browse pgsql-admin by date

  From Date Subject
Next Message Nikhil Shetty 2022-06-04 09:38:58 Re: Fwd: Adding more detail to pg_upgrade documentation
Previous Message Raj kumar 2022-06-03 21:49:02 Changing Data Type on Production Database - Impact