Re: Excessive WAL generation and related performance issue

From: Joe Conway <mail(at)joeconway(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Hackers (PostgreSQL)" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Excessive WAL generation and related performance issue
Date: 2014-04-15 20:53:29
Message-ID: 534D9C49.5080007@joeconway.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 04/14/2014 04:34 PM, Joe Conway wrote:
> On 04/14/2014 04:25 PM, Andres Freund wrote:
>> On 2014-04-14 16:22:48 -0700, Joe Conway wrote:
>>> That'll help performance, but lets say I generally keep WAL
>>> files for PITR and don't turn that off before starting --
>>> shouldn't I be very surprised to need over 3TB of archive
>>> storage when loading a 50GB table with a couple of indexes?
>
>> The point is that more frequent checkpoints will increase the
>> WAL volume *significantly* because more full page writes will
>> have to be generated.
>
> OK, I'll see how much it can be brought down through checkpoint
> tuning and report back.

One more question before I get to that. I had applied the following
patch to XLogInsert

8<--------------------------
diff --git a/src/backend/access/transam/xlog.c
b/src/backend/access/transam/xlog.c
index 2f71590..e39cd37 100644
- --- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -737,10 +737,12 @@ XLogInsert(RmgrId rmid, uint8 info, XLogRecData
*rdata)
uint32 len,
write_len;
unsigned i;
+ unsigned inorm;
bool updrqst;
bool doPageWrites;
bool isLogSwitch = (rmid == RM_XLOG_ID && info ==
XLOG_SWITCH);
uint8 info_orig = info;
+ uint32 xl_tot_len;

/* cross-check on whether we should be here or not */
if (!XLogInsertAllowed())
@@ -924,8 +926,23 @@ begin:;
* header.
*/
INIT_CRC32(rdata_crc);
+ i = 0;
+ inorm = 0;
for (rdt = rdata; rdt != NULL; rdt = rdt->next)
+ {
COMP_CRC32(rdata_crc, rdt->data, rdt->len);
+
+ if (rdt_lastnormal == rdt)
+ {
+ inorm = i;
+ i = 0;
+ }
+ else
+ i++;
+ }
+ xl_tot_len = SizeOfXLogRecord + write_len;
+ if ((inorm + i) > 4 || xl_tot_len > 2000)
+ elog(LOG,
"XLogInsert;tot_nml_blks;%d;tot_bkp_blks;%d;tot_Xlog_Len;%d", inorm,
i, xl_tot_len);

START_CRIT_SECTION();
8<--------------------------

The idea was to record number of normal and backup blocks, and total
size of the record. I have quite a few entries in the log from the
test run which are like:

8<--------------------------
2014-04-11 08:42:06.904 PDT;LOG:
XLogInsert;tot_nml_blks;4;tot_bkp_blks;5;tot_Xlog_Len;16168
2014-04-11 09:03:12.790 PDT;LOG:
XLogInsert;tot_nml_blks;4;tot_bkp_blks;5;tot_Xlog_Len;16172
2014-04-11 10:16:57.949 PDT;LOG:
XLogInsert;tot_nml_blks;3;tot_bkp_blks;5;tot_Xlog_Len;16150
8<--------------------------

and

8<--------------------------
2014-04-11 11:17:08.313 PDT;LOG:
XLogInsert;tot_nml_blks;4;tot_bkp_blks;6;tot_Xlog_Len;12332
2014-04-11 11:17:08.338 PDT;LOG:
XLogInsert;tot_nml_blks;4;tot_bkp_blks;6;tot_Xlog_Len;16020
2014-04-11 11:17:08.389 PDT;LOG:
XLogInsert;tot_nml_blks;4;tot_bkp_blks;6;tot_Xlog_Len;12332
8<--------------------------

In other words, based on my inserted logic, it appears that there are
5 and 6 backup blocks on a fairly regular basis.

However in xlog.h it says:
8<--------------------------
* If we backed up any disk blocks with the XLOG record, we use flag
* bits in xl_info to signal it. We support backup of up to 4 disk
* blocks per XLOG record.
8<--------------------------

So is my logic to record number of backup blocks wrong, or is the
comment wrong, or am I otherwise misunderstanding something?

Thanks,

Joe

- --
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQIcBAEBAgAGBQJTTZxJAAoJEDfy90M199hl15MQAKTcv9BoZTsXDleSu9JrU1ha
UhHUnDALRmxWLgyPYsgtifxMQ3jLp5eLrkMHGnQbVD17619OgHckuOiEphc2bdQp
MfZlv3jrEqxnmsh6qKhK1J23mHj0cohWXQ9EUoyjE6tlZueLPyMigaIV662KP1d2
pUXCh6IEJYMMaPfqhR5Mxi62s+HMkpAULhafWeEeAwcU1eYNijFWlyxJWlsv7D6X
9ZuDSmRtqnAP0g23GcbxNkL/I9Yv090Uxar7um2Rw5SEUV+Uv1kMY0GVCjHluE0k
qZhSF1tE2jypThhSnv5xRHT3ZzdKoJtNmfLekjws7+dFZbSBLgNOj4EdV0H/wUgf
NqO71kkeRhd44uMRzii0cr03LwBiwqC2apCYoZy7s0X3rl10hZfKgVEKkyhaZ4VJ
QdfR1WdY/hC7mKW7NPnkycF+Es1ykEfuPnKHHsyJ3fHeFGxkKD3I6A8jGnNnS6VL
ba+jx+t3qnrcKQAW8lqQ3rAij5Jkb97Ljibc7o6w8cgnGA4S0tqsE6jDrdDR1FO4
ns5uULTs4REU8clFwiKNZnQfINRUUfqY1mtlRneJMANeafm0j2CyIzvqLqB2mdOH
YL9SS2lIngQlVSfgpu7EiSS7sJx8XGe3a3YFE9DoTBpq009scrscH40+kuN823wp
yruufkzaBN6lyAjo3zoR
=GQDN
-----END PGP SIGNATURE-----

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2014-04-15 21:13:10 Re: Excessive WAL generation and related performance issue
Previous Message Tom Lane 2014-04-15 19:34:40 Re: test script, was Re: [COMMITTERS] pgsql: psql: conditionally display oids and replication identity