Re: Mysterious DB reset

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Israel Brewster <israel(at)eraalaska(dot)net>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Mysterious DB reset
Date: 2014-03-06 18:03:06
Message-ID: 5318B85A.6030007@aklaver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 03/06/2014 09:33 AM, Israel Brewster wrote:
> For starters, this happened again this morning (no data prior to 4:45 am and sequence reset), so whatever is going on appears to be reoccurring. Also, I forgot to mention if it is significant: this is running on slackware liunux 14.0
>
> On Mar 5, 2014, at 1:00 PM, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> wrote:
>
>> On 03/05/2014 10:22 AM, Israel Brewster wrote:
>>> My first thought was "Oh, I must have a typo in my
>>> cleanup routine, such that it is deleting all records rather than only
>>> those a week old, and it's just that no one has noticed until now". So I
>>> looked at that, but changing the delete to a select appeared to produce
>>> the proper results, in that no records were selected:
>>
>> Well it would, if the records only go back to 4 AM this morning. In other words if no records exist before 4 AM today, no records exist before 7 days ago also or am I missing something?
>
> If the delete is correct, you are absolutely right. My first theory, however, was that I made a typo, and the delete was deleting ALL records, in which case changing it to a select would select all records. As it did not, that seems to confirm the delete is correct, and therefore not the problem.
>
>> A sequence is just a special table.
>>
>> So what does SELECT * from the sequence show?
>
> tracking=> SELECT * FROM data_id_seq;
> sequence_name | last_value | start_value | increment_by | max_value | min_value | cache_value | log_cnt | is_cycled | is_called
> ---------------+------------+-------------+--------------+---------------------+-----------+-------------+---------+-----------+-----------
> data_id_seq | 1184 | 1 | 1 | 9223372036854775807 | 1 | 1 | 16 | f | t
>
>
>>
>>>
>>> Also odd is that my cleanup script runs at 1am. I have records of there
>>> being new data in the database up to 3:51am, but the oldest record
>>> currently in the DB is from 4:45am (as specified by the default of now()
>>> on the column). So I know records were added after my delete command
>>> ran, but before this reset occurred.
>>
>> I am not sure what you are calling the 'reset'?
>> Did something happen between 3:51 AM and 4:45 AM?
>
> Yes: All my data was deleted and the sequence reset to 1.
>
>> Also not sure why you call the 4:45 AM record the oldest, when you say you can identify records from 3:51 AM?
>
> As I mentioned, I archive the records to permanent storage. This archive process happens every hour (for various reasons). That is how I know we had records for 3:51 am: they exist in the permanent archive. However, they don't exist in the local database any more.

Well something is happening. See my notes on logging below to help track
down the cause.

>
>>
>>>
>>> So my question is, aside from someone going in and mucking about in the
>>> wee hours of the morning, what could possibly cause this behavior? What
>>> sort of event could cause all data to be deleted from the table, and the
>>> sequence to be reset? Especially while there is an active connection?
>>> Thanks for any ideas, however wild or off the wall :-)
>>
>> What is in the Postgres/system logs for the time period(s) you mention?
>
> The postgres log has a lot of errors in it, some of which MAY explain the issue. For example:
>
> cp: cannot create regular file '/mnt/pglogs/000000010000000400000094': Permission denied
> LOG: archive command failed with exit code 1
> DETAIL: The failed archive command was: test ! -f /mnt/pglogs/000000010000000400000094 && cp pg_xlog/000000010000000400000094 /mnt/pglogs/000000010000000400000094
> WARNING: transaction log file "000000010000000400000094" could not be archived: too many failures
> LOG: received smart shutdown request
> LOG: autovacuum launcher shutting down
> LOG: shutting down
> LOG: database system is shut down

Might be good to explain your archive setup.

>
> However, there are no timestamps on any of the entries (can I fix that?), so I don't know if those are current entries, or from back before I got the mount for the logs working. At this time, the mount point IS working correctly, and from what I can tell so is the archive command. The latest entry is from yesterday (modify date on the log shows Mar 5, 9:21, when I was messing with it yesterday), however, so there are no entries from this morning when it happened again. I don't see anything of interest in the syslog or messages log.

Yes you can, timestamps and a lot more. For all the details go here:

http://www.postgresql.org/docs/9.2/interactive/runtime-config-logging.html

At the least I would:

log_statement = 'mod'

log_connections = on
log_disconnections = on

log_line_prefix = '%u-%m-%x'

>
>>
>>>
>>> -----------------------------------------------
>>> Israel Brewster
>>> Computer Support Technician II
>>> Era Alaska
>>> 5245 Airport Industrial Rd
>>> Fairbanks, AK 99709
>>> (907) 450-7250 x7293
>>> -----------------------------------------------
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Adrian Klaver
>> adrian(dot)klaver(at)aklaver(dot)com
>
>
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Israel Brewster 2014-03-06 18:43:32 Re: Mysterious DB reset
Previous Message Michael Nolan 2014-03-06 17:57:11 Re: Mysterious DB reset