From: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com> |
---|---|
To: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> |
Cc: | Andres Freund <andres(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Subject: | Re: WAL format and API changes (9.5) |
Date: | 2014-08-19 08:07:30 |
Message-ID: | CAB7nPqQreciciXVLpQ8Sk_HZM4_6twEtb79TCXCbK00CF423Eg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Aug 18, 2014 at 10:55 PM, Heikki Linnakangas <
hlinnakangas(at)vmware(dot)com> wrote:
> All rightey.
>
> Here's the next version of this work. It now comes as two patches. The
first
> one refactors adds the XLogReplayBuffer() function and refactors all the
> redo functions to use it. It doesn't change the WAL record format in any
> way. The second patch applies over the first one, and changes the WAL
> format, and all the WAL-creating functions to use the new API for
> constructing WAL records. The second patch removes the relfilenode and
block
> number arguments from XLogReplayBuffer, because they're no longer needed
> when that information is included in the record format.
>
> Todo:
>
> * Performance testing. Do the new WAL construction functions add overhead
to
> WAL insertion?
> * Compare WAL record sizes before and after. I've tried to keep it as
> compact as possible, but I know that some records have gotten bigger. Need
> to do a more thorough analysis.
> * Rename XLogReplayBuffer. I don't particularly like any of the
suggestions
> so far, XLogReplayBuffer included. Discussion continues..
> * Anything else?
Patch 1 looks good. The main difference with the v1 submitted a couple of
days back is that the global variable approach is replaced with additional
arguments for the LSN position and record pointer in XLogReplayBuffer. I
have as well run a couple of tests with the page comparison tool, done some
tests based on installcheck-world with a slave replaying WAL behind, and
found no issues with it.
Perhaps we could consider pushing it to facilitate the next work? Even if
the second patch is dropped it is still a win IMO to have backup block
replay managed within a single function (being it named XLogReplayBuffer in
latest patch), and having it return a status flag.
Regarding patch 2:
- The main differences with the latest version are the modifications for
XLogReplayBuffer having new arguments (LSN position and record pointer).
XLogRecGetBlockRefIds has been changed to return a palloc'd array of block
IDs. xloginsert.h, containing all the functions for xlog record
construction is introduced as well.
- Tiny thing, be aware of tab padding. Here is heapam.c:
page = BufferGetPage(buffer);
PageSetAllVisible(page);
MarkBufferDirty(buffer);
- XLogRecGetBlockRefIds is not described in
src/backend/access/transam/README. Btw, pg_xlogdump drops a core dump when
using it:
--Output:
Assertion failed: (n == *num_refs), function XLogRecGetBlockRefIds, file
xlogreader.c, line 1157.
rmgr: Heap len (rec/tot): 14/ 12912, tx: 3, lsn:
0/01000148, prev 0/01000120, Abort trap: 6 (core dumped)
-- Backtrace:
frame #4: 0x0000000103870363
pg_xlogdump`XLogRecGetBlockRefIds(record=0x00007ff38a003200,
num_refs=0x00007fff5c394464) + 435 at xlogreader.c:1157
frame #5: 0x000000010386d610
pg_xlogdump`XLogDumpDisplayRecord(config=0x00007fff5c3945c8,
ReadRecPtr=16777544, record=0x00007ff38a003200) + 272 at pg_xlogdump.c:357
frame #6: 0x000000010386cad8 pg_xlogdump`main(argc=2,
argv=0x00007fff5c394658) + 3160 at pg_xlogdump.c:749
In order to reproduce that, simply run regression tests, followed by
pg_xlogdump on one of the WAL files generated.
- This patch includes some diffs from pg_receivexlog.c taken from 52bffe3.
- I have run installcheck-world and compared the size of the WAL generated
on HEAD and the patch (any hints to improve such analysis are of course
welcome)
name | start | stop | diff
----------------+-----------+------------+-----------
HEAD (8605bc7) | 0/16C6808 | 0/11A2C670 | 271998568
Patch 1+2 | 0/16D45D8 | 0/1267A4B0 | 284843736
(2 rows)
So that's a diff of more or less 13MB for this test set.
Looking forward for some performance numbers as well as more precise
comparison of WAL record length.
--
Michael
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff Davis | 2014-08-19 08:25:13 | Re: 9.5: Better memory accounting, towards memory-bounded HashAgg |
Previous Message | Jeff Davis | 2014-08-19 07:52:45 | Re: 9.5: Memory-bounded HashAgg |