| From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> | 
|---|---|
| To: | pgsql-hackers(at)postgreSQL(dot)org | 
| Subject: | Re: Inadequate thought about buffer locking during hot standby replay | 
| Date: | 2012-11-10 18:16:46 | 
| Message-ID: | 25470.1352571406@sss.pgh.pa.us | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
I wrote:
> I'm inclined to think that we need to fix this by getting rid of
> RestoreBkpBlocks per se, and instead having the per-WAL-record restore
> routines dictate when each full-page image is restored (and whether or
> not to release the buffer lock immediately).  That's not going to be a
> small change unfortunately :-(
Here's a WIP patch that attacks it in this way.  I've only gone as far as
fixing btree_xlog_split() for the moment, since the main point is to see
whether the coding change seems reasonable.  At least in this function,
it seems that locking considerations are handled correctly as long as no
full-page image is used, so it's pretty straightforward to tweak it to
handle the case with full-page image(s) as well.  I've not looked
through any other replay functions yet --- some of them may need more
invasive hacking.  But so far this looks pretty good.
Note that this patch isn't correct yet even for btree_xlog_split(),
because I've not removed the RestoreBkpBlocks() call in btree_redo().
All the btree replay routines will have to get fixed before it's
testable at all.
One thing that seems a bit annoying is the use of zero-based backup
block indexes in RestoreBackupBlock, when most (not all) of the callers
are referencing macros with one-based names (XLR_BKP_BLOCK_1 etc).
That's a bug waiting to happen.  We could address it by changing
RestoreBackupBlock to accept a one-based argument, but what I'm thinking
of doing instead is getting rid of the one-based convention entirely;
that is, remove XLR_BKP_BLOCK_1 through XLR_BKP_BLOCK_4 and instead have
all code use the XLR_SET_BKP_BLOCK() macro, which is zero-based.  One
advantage of doing that is that it'll force inspection of all code
related to this.
Comments, opinions?
regards, tom lane
| Attachment | Content-Type | Size | 
|---|---|---|
| restore-backup-blocks-individually-0.patch | text/x-patch | 10.7 KB | 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Simon Riggs | 2012-11-10 20:59:13 | Re: Inadequate thought about buffer locking during hot standby replay | 
| Previous Message | Bruce Momjian | 2012-11-10 17:41:44 | Re: Further pg_upgrade analysis for many tables |