From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | meixiangming(at)huawei(dot)com |
Cc: | pgsql-bugs(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: BUG #6748: sequence value may be conflict in some cases |
Date: | 2012-07-23 18:43:34 |
Message-ID: | 4360.1343069014@sss.pgh.pa.us |
Lists: | pgsql-bugs pgsql-hackers |
meixiangming(at)huawei(dot)com writes:
> [ freshly-created sequence has wrong state after crash ]

I didn't believe this at first, but sure enough, it fails just as
described if you force a crash between the first and second nextval
calls for the sequence. This used to work ...

The change that broke it turns out to be the ALTER SEQUENCE OWNED BY
call that we added to serial-column creation circa 8.2; although on
closer inspection I think any ALTER SEQUENCE before the first nextval
call would be problematic. The real issue is the ancient kluge in
sequence creation that writes something different into the WAL log
than what it leaves behind in shared buffers:

    /* We do not log first nextval call, so "advance" sequence here */
    /* Note we are scribbling on local tuple, not the disk buffer */
    newseq->is_called = true;
    newseq->log_cnt = 0;

The tuple in buffers has log_cnt = 1, is_called = false, but the initial
XLOG_SEQ_LOG record shows log_cnt = 0, is_called = true. So if we crash
at this point, after recovery it looks like one nextval() has already
been done. However, AlterSequence generates another XLOG_SEQ_LOG record
based on what's in shared buffers, so after replay of that, we're back
to the "original" state where it does not appear that any nextval() has
been done.
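
The sequence of events described above can be sketched as a toy model. This is a simplification for illustration, not PostgreSQL source: the names SeqState, ToyWal, create_sequence, alter_sequence, toy_nextval and replay are hypothetical, each WAL record is assumed to be a full image of the sequence state, and recovery is assumed to keep only the last image.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the sequence tuple; only the fields discussed here. */
typedef struct SeqState {
    long last_value;
    long log_cnt;
    bool is_called;
} SeqState;

#define MAX_WAL 8

/* Toy WAL: every record is a full image of the sequence state. */
typedef struct ToyWal {
    SeqState records[MAX_WAL];
    size_t   nrecords;
} ToyWal;

static void wal_append(ToyWal *wal, SeqState image)
{
    assert(wal->nrecords < MAX_WAL);
    wal->records[wal->nrecords++] = image;
}

/* Creation with the kluge: the logged image differs from the buffer image. */
static SeqState create_sequence(ToyWal *wal)
{
    SeqState buffer = { .last_value = 1, .log_cnt = 1, .is_called = false };
    SeqState logged = buffer;

    logged.is_called = true;   /* "advance" only the local copy */
    logged.log_cnt = 0;
    wal_append(wal, logged);
    return buffer;             /* what stays in shared buffers */
}

/* ALTER SEQUENCE logs whatever is currently in the buffer. */
static void alter_sequence(ToyWal *wal, const SeqState *buffer)
{
    wal_append(wal, *buffer);
}

/* First call hands out last_value without logging; later calls log (toy rule). */
static long toy_nextval(ToyWal *wal, SeqState *buffer)
{
    if (!buffer->is_called) {
        buffer->is_called = true;
        buffer->log_cnt = 0;
        return buffer->last_value;   /* no WAL record written here */
    }
    buffer->last_value += 1;
    wal_append(wal, *buffer);
    return buffer->last_value;
}

/* Crash recovery in this model: the last logged image wins. */
static SeqState replay(const ToyWal *wal)
{
    assert(wal->nrecords > 0);
    return wal->records[wal->nrecords - 1];
}
```

In this model, calling create_sequence, then alter_sequence, then toy_nextval once, and then replaying the WAL yields a state with is_called = false, so the next toy_nextval on the recovered state hands out the same value as the call made before the crash.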

I'm of the opinion that this kluge needs to be removed; it's just insane
that we're not logging the same state we leave in our buffers. To do
that, we need to fix nextval() so that the first nextval call generates
an xlog entry; that is, if we are changing is_called to true we ought to
consider that as a reason to force an xlog entry. I think way back when
we thought it was a good idea to avoid making two xlog entries during
creation and immediate use of a sequence, but considering all the other
xlog entries involved in creation of a sequence object, this is a pretty
silly "optimization". (Besides, it merely postpones the first
nextval-driven xlog entry from the first to the second nextval call.)
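
A minimal sketch of the proposed behavior, under stated assumptions: creation logs exactly what it leaves in the buffer, and the first nextval call forces a log record because it changes is_called to true. The names SeqImage, SeqLog, create_without_kluge and nextval_logging_first_call are hypothetical, and TOY_PREFETCH is an arbitrary stand-in for the number of values logged ahead; none of this is the actual patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct SeqImage {
    long last_value;
    long log_cnt;      /* values already covered by the last log record */
    bool is_called;
} SeqImage;

#define LOG_CAPACITY 8
#define TOY_PREFETCH 4 /* hypothetical number of values logged ahead */

typedef struct SeqLog {
    SeqImage records[LOG_CAPACITY];
    size_t   count;
} SeqLog;

static void log_image(SeqLog *log, const SeqImage *image)
{
    assert(log->count < LOG_CAPACITY);
    log->records[log->count++] = *image;
}

/* Creation without the kluge: the logged image equals the buffer image. */
static SeqImage create_without_kluge(SeqLog *log)
{
    SeqImage buffer = { .last_value = 1, .log_cnt = 0, .is_called = false };

    log_image(log, &buffer);
    return buffer;
}

static long nextval_logging_first_call(SeqLog *log, SeqImage *buffer)
{
    bool first_call = !buffer->is_called;
    long result = first_call ? buffer->last_value : buffer->last_value + 1;

    /*
     * Log when no pre-logged values remain, or when this call flips
     * is_called to true (the change argued for in the message).
     */
    if (buffer->log_cnt == 0 || first_call) {
        SeqImage ahead = {
            .last_value = result + TOY_PREFETCH,
            .log_cnt = 0,
            .is_called = true
        };
        log_image(log, &ahead);
        buffer->log_cnt = TOY_PREFETCH;
    } else {
        buffer->log_cnt--;
    }

    buffer->last_value = result;
    buffer->is_called = true;
    return result;
}
```

With this rule, a crash right after the first call recovers to a logged image that already has is_called = true and a last_value ahead of anything handed out, so the next call after recovery cannot repeat the first value (it skips some values instead).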

regards, tom lane