Re: Mysterious DB reset

From: Israel Brewster <israel(at)eraalaska(dot)net>
To: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Mysterious DB reset
Date: 2014-03-06 17:33:45
Message-ID: 0CD2D7E3-19A8-4A41-832C-49611375B5CD@eraalaska.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

For starters, this happened again this morning (no data prior to 4:45 am and sequence reset), so whatever is going on appears to be reoccurring. Also, I forgot to mention if it is significant: this is running on slackware liunux 14.0

On Mar 5, 2014, at 1:00 PM, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> wrote:

> On 03/05/2014 10:22 AM, Israel Brewster wrote:
>> My first thought was "Oh, I must have a typo in my
>> cleanup routine, such that it is deleting all records rather than only
>> those a week old, and it's just that no one has noticed until now". So I
>> looked at that, but changing the delete to a select appeared to produce
>> the proper results, in that no records were selected:
>
> Well it would, if the records only go back to 4 AM this morning. In other words if no records exist before 4 AM today, no records exist before 7 days ago also or am I missing something?

If the delete is correct, you are absolutely right. My first theory, however, was that I made a typo, and the delete was deleting ALL records, in which case changing it to a select would select all records. As it did not, that seems to confirm the delete is correct, and therefore not the problem.

> A sequence is just a special table.
>
> So what does SELECT * from the sequence show?

tracking=> SELECT * FROM data_id_seq;
sequence_name | last_value | start_value | increment_by | max_value | min_value | cache_value | log_cnt | is_cycled | is_called
---------------+------------+-------------+--------------+---------------------+-----------+-------------+---------+-----------+-----------
data_id_seq | 1184 | 1 | 1 | 9223372036854775807 | 1 | 1 | 16 | f | t

>
>>
>> Also odd is that my cleanup script runs at 1am. I have records of there
>> being new data in the database up to 3:51am, but the oldest record
>> currently in the DB is from 4:45am (as specified by the default of now()
>> on the column). So I know records were added after my delete command
>> ran, but before this reset occurred.
>
> I am not sure what you are calling the 'reset'?
> Did something happen between 3:51 AM and 4:45 AM?

Yes: All my data was deleted and the sequence reset to 1.

> Also not sure why you call the 4:45 AM record the oldest, when you say you can identify records from 3:51 AM?

As I mentioned, I archive the records to permanent storage. This archive process happens every hour (for various reasons). That is how I know we had records for 3:51 am: they exist in the permanent archive. However, they don't exist in the local database any more.

>
>>
>> So my question is, aside from someone going in and mucking about in the
>> wee hours of the morning, what could possibly cause this behavior? What
>> sort of event could cause all data to be deleted from the table, and the
>> sequence to be reset? Especially while there is an active connection?
>> Thanks for any ideas, however wild or off the wall :-)
>
> What is in the Postgres/system logs for the time period(s) you mention?

The postgres log has a lot of errors in it, some of which MAY explain the issue. For example:

cp: cannot create regular file '/mnt/pglogs/000000010000000400000094': Permission denied
LOG: archive command failed with exit code 1
DETAIL: The failed archive command was: test ! -f /mnt/pglogs/000000010000000400000094 && cp pg_xlog/000000010000000400000094 /mnt/pglogs/000000010000000400000094
WARNING: transaction log file "000000010000000400000094" could not be archived: too many failures
LOG: received smart shutdown request
LOG: autovacuum launcher shutting down
LOG: shutting down
LOG: database system is shut down

However, there are no timestamps on any of the entries (can I fix that?), so I don't know if those are current entries, or from back before I got the mount for the logs working. At this time, the mount point IS working correctly, and from what I can tell so is the archive command. The latest entry is from yesterday (modify date on the log shows Mar 5, 9:21, when I was messing with it yesterday), however, so there are no entries from this morning when it happened again. I don't see anything of interest in the syslog or messages log.

>
>>
>> -----------------------------------------------
>> Israel Brewster
>> Computer Support Technician II
>> Era Alaska
>> 5245 Airport Industrial Rd
>> Fairbanks, AK 99709
>> (907) 450-7250 x7293
>> -----------------------------------------------
>>
>>
>>
>>
>
>
> --
> Adrian Klaver
> adrian(dot)klaver(at)aklaver(dot)com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Israel Brewster 2014-03-06 17:41:50 Re: Mysterious DB reset
Previous Message Tom Lane 2014-03-06 17:09:26 Re: extract psql meta-commands into library?