Re: Mysterious DB reset

From: Israel Brewster <israel(at)eraalaska(dot)net>
To: Thom Brown <thom(at)linux(dot)com>
Cc: PGSQL Mailing List <pgsql-general(at)postgresql(dot)org>
Subject: Re: Mysterious DB reset
Date: 2014-03-06 17:48:33
Message-ID: 72A5D44B-2E05-4AD2-8388-9151B89C2475@eraalaska.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Mar 5, 2014, at 10:01 AM, Thom Brown <thom(at)linux(dot)com> wrote:

> On 5 March 2014 18:22, Israel Brewster <israel(at)eraalaska(dot)net> wrote:
> I have a Postgresql 9.2.4 database containing real-time tracking data for our aircraft for the past week (theoretically). It is populated by two different processes: one that runs every few minutes, retrieving data from a number of sources and storing it in the DB, and one that has an "always on" connection to the DB streaming data into the database in realtime (often several records per second). To keep the database size manageable I have a cron job that runs every evening to delete all records that are more than a week old, after archiving a subset of them in permanent storage.
>
> This morning my boss e-mailed me, complaining that only a couple of aircraft were showing up in the list (SELECT distinct(tail) FROM data being the command that populates the list). Upon looking at the data I saw that it only went back to 4am this morning, rather than the week I was expecting. My first thought was "Oh, I must have a typo in my cleanup routine, such that it is deleting all records rather than only those a week old, and it's just that no one has noticed until now". So I looked at that, but changing the delete to a select appeared to produce the proper results, in that no records were selected:
>
> DELETE FROM data WHERE pointtime<now() AT TIME ZONE 'UTC'-interval '7 days';
>
> Then I noticed something even more odd. My database has an id column, which is defined as a SERIAL. As we all know, a serial is a monotonically increasing number that is not affected by deletes. However, the oldest record in my database, from 4am this morning, had an id of 1. Even though I KNOW there was data in the system yesterday. Even if my DELETE command was wrong and deleted ALL records, that shouldn't have reset the SERIAL column to 1! I also know that I have not been in the database mucking around with the sequence value - to be completely honest, I don't even know the exact command to reset it - I'd have to google it if I wanted to.
>
> Also odd is that my cleanup script runs at 1am. I have records of there being new data in the database up to 3:51am, but the oldest record currently in the DB is from 4:45am (as specified by the default of now() on the column). So I know records were added after my delete command ran, but before this reset occurred.
>
> So my question is, aside from someone going in and mucking about in the wee hours of the morning, what could possibly cause this behavior? What sort of event could cause all data to be deleted from the table, and the sequence to be reset? Especially while there is an active connection? Thanks for any ideas, however wild or off the wall :-)
>
> That is odd. Even if it were an unlogged table, and there was a crash, the sequence wouldn't reset, and even if it was running in a very long-running transaction held open by a buggy connection pooler, the sequence would still progress as it's immune to the effects of transactions.

Not famillar with a logged vs unlogged table (still learning all the features of PostgreSQL), but as you said the sequence resetting is rather odd.

>
> So if all the data went missing, and the sequence reset, the only thing I can think of is:
>
> Someone ran:
>
> TRUNCATE data RESTART IDENTITY;

Considering that I'm the only one in the company that knows SQL at all beyond a simple single-table select (I keep having to explain joins and how they aren't evil to the other programmer here), not likely :-)

>
> or someone restored the table structure from a backup that deleted the original table.

Now that's a thought...Maybe my backup routine is working backwards (pushing from the backup archive to the primary machine rather than from the primary to the backup). I did switch primary and secondary machines a while ago, but I thought I had checked that. What would be the effect of overwriting the data files while the database is active?

>
> Do you log DDL?

Not sure what that is, so I'll assume no :-)

>
> Was the table partitioned?

Nope.

>
> You should also really be on 9.2.7, although I can't think of any bug that's been fixed which could be responsible for this issue.

I'll look at updating. Thanks.

>
> --
> Thom

-----------------------------------------------
Israel Brewster
Computer Support Technician II
Era Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7250 x7293
-----------------------------------------------

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Michael Nolan 2014-03-06 17:57:11 Re: Mysterious DB reset
Previous Message Israel Brewster 2014-03-06 17:41:50 Re: Mysterious DB reset