| From: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com> | 
|---|---|
| To: | Andres Freund <andres(at)2ndquadrant(dot)com> | 
| Cc: | "Syed, Rahila" <Rahila(dot)Syed(at)nttdata(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Rahila Syed <rahilasyed(dot)90(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: [REVIEW] Re: Compression of full-page-writes | 
| Date: | 2014-11-28 06:48:26 | 
| Message-ID: | CAB7nPqRv6RaSx7hTnp=g3dYqOu++FeL0UioYqPLLBdbhAyB_jQ@mail.gmail.com | 
| Lists: | pgsql-hackers | 
So, I have been doing some more tests with this patch. I think the
compression numbers are in line with the previous tests.
Configuration
==========
3 sets are tested:
- HEAD (a5eb85e) + fpw = on
- patch + fpw = on
- patch + fpw = compress
With the following configuration:
shared_buffers=512MB
checkpoint_segments=1024
checkpoint_timeout = 5min
fsync=off
WAL quantity
===========
pgbench -s 30 -i (455MB of data)
pgbench -c 32 -j 32 -t 45000 -M prepared (roughly 11 min of run on
laptop, two checkpoints kick in)
1) patch + fpw = compress
tps = 2086.893948 (including connections establishing)
tps = 2087.031543 (excluding connections establishing)
start LSN: 0/19000090
stop LSN: 0/49F73D78
difference: 783MB
2) patch + fpw = on
start LSN: 0/1B000090
stop LSN: 0/8F4E1BD0
difference: 1861 MB
tps = 2106.812454 (including connections establishing)
tps = 2106.953329 (excluding connections establishing)
3) HEAD + fpw = on
start LSN: 0/1B0000C8
stop LSN:
difference:
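For reference, the "difference" figures can be rechecked from the start and stop LSNs. Below is a minimal Python sketch, not part of the patch; the helper names lsn_to_bytes and wal_volume_mb are made up for illustration. It assumes an LSN printed as "hi/lo" is a 64-bit byte position whose high and low 32 bits are given in hexadecimal.

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert an LSN printed as 'hi/lo' (two hex numbers) to a byte position."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def wal_volume_mb(start: str, stop: str) -> float:
    """WAL generated between two LSNs, in units of 1024 * 1024 bytes."""
    return (lsn_to_bytes(stop) - lsn_to_bytes(start)) / (1024 * 1024)


# Start/stop LSNs copied from the two runs reported above.
runs = {
    "patch + fpw = compress": ("0/19000090", "0/49F73D78"),
    "patch + fpw = on": ("0/1B000090", "0/8F4E1BD0"),
}

for name, (start, stop) in runs.items():
    print(f"{name}: {wal_volume_mb(start, stop):.0f} MB")
```

This prints 783 MB and 1861 MB for the two runs, matching the differences listed above.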
WAL replay performance
===================
I then measured the replay time of a standby replaying the WAL files
generated by the previous pgbench runs, by tracking the "redo starts"
and "redo done" log entries. The goal here is to check, for the same
amount of activity, how much block decompression weighs on replay. The
replay includes the pgbench initialization phase.
1) patch + fpw = compress
1-1) Try 1
2014-11-28 14:09:27.287 JST: LOG:  redo starts at 0/3000380
2014-11-28 14:10:19.836 JST: LOG:  redo done at 0/49F73E18
Result: 52.549s
1-2) Try 2
2014-11-28 14:15:04.196 JST: LOG:  redo starts at 0/3000380
2014-11-28 14:15:56.238 JST: LOG:  redo done at 0/49F73E18
Result: 52.042s
1-3) Try 3
2014-11-28 14:20:27.186 JST: LOG:  redo starts at 0/3000380
2014-11-28 14:21:19.350 JST: LOG:  redo done at 0/49F73E18
Result: 52.164s
2) patch + fpw = on
2-1) Try 1
2014-11-28 14:42:54.670 JST: LOG:  redo starts at 0/3000750
2014-11-28 14:43:56.221 JST: LOG:  redo done at 0/8F4E1BD0
Result: 61.5s
2-2) Try 2
2014-11-28 14:46:03.198 JST: LOG:  redo starts at 0/3000750
2014-11-28 14:47:03.545 JST: LOG:  redo done at 0/8F4E1BD0
Result: 60.3s
2-3) Try 3
2014-11-28 14:50:26.896 JST: LOG:  redo starts at 0/3000750
2014-11-28 14:51:30.950 JST: LOG:  redo done at 0/8F4E1BD0
Result: 64.0s
3) HEAD + fpw = on
3-1) Try 1
2014-11-28 15:21:48.153 JST: LOG:  redo starts at 0/3000750
2014-11-28 15:22:53.864 JST: LOG:  redo done at 0/8FFFFFA8
Result: 65.7s
3-2) Try 2
2014-11-28 15:27:16.271 JST: LOG:  redo starts at 0/3000750
2014-11-28 15:28:20.677 JST: LOG:  redo done at 0/8FFFFFA8
Result: 64.4s
3-3) Try 3
2014-11-28 15:36:30.434 JST: LOG:  redo starts at 0/3000750
2014-11-28 15:37:33.208 JST: LOG:  redo done at 0/8FFFFFA8
Result: 62.7s
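Each "Result" is simply the gap between the two log timestamps of a try. A small Python sketch of that subtraction follows; the function name replay_seconds is made up for illustration, and it assumes the log line starts with a date and a time with fractional seconds, as in the lines quoted above.

```python
from datetime import datetime

# Timestamp layout assumed from the log lines quoted in this mail.
FMT = "%Y-%m-%d %H:%M:%S.%f"


def replay_seconds(start_line: str, done_line: str) -> float:
    """Seconds elapsed between a 'redo starts' line and a 'redo done' line."""

    def stamp(line: str) -> datetime:
        # The first two whitespace-separated fields are the date and the time.
        date, time = line.split()[:2]
        return datetime.strptime(f"{date} {time}", FMT)

    return (stamp(done_line) - stamp(start_line)).total_seconds()


start = "2014-11-28 14:09:27.287 JST: LOG:  redo starts at 0/3000380"
done = "2014-11-28 14:10:19.836 JST: LOG:  redo done at 0/49F73E18"
print(f"{replay_seconds(start, done):.3f}s")
```

For the first try with compression this gives 52.549s, the value listed above. The time zone field is ignored, which is fine as long as both lines come from the same server.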
So HEAD and the patch generate an equivalent amount of WAL when
compression is not enabled; the small difference between them seems to
be noise. With compression enabled, the WAL volume drops by about 58%
(783MB against 1861MB) for a constant number of pgbench transactions.

Note that the patch adds a uint16 to XLogRecordBlockImageHeader to
store the compressed length of the block, which gives two levels of
compression (the first level being the removal of the page hole). The
records are therefore 2 bytes longer per block image, but that does
not seem to be much of a problem in those tests.

Regarding WAL replay, compressed blocks need extra CPU for
decompression in exchange for having less WAL to replay. This actually
reduces the replay time by ~15%, so the trade-off plays in favor of
putting the load on the CPU. Also, I haven't seen any difference with
or without the patch when compression is disabled.
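The relative gains can be rechecked from the figures reported in this mail. Here is a rough Python sketch using only the numbers listed above; the helper name reduction_pct is made up for illustration, and averaging the three tries is my own choice of summary.

```python
from statistics import mean

# Replay times in seconds, copied from the "Result" lines above.
replay_s = {
    "patch + fpw = compress": [52.549, 52.042, 52.164],
    "patch + fpw = on": [61.5, 60.3, 64.0],
    "HEAD + fpw = on": [65.7, 64.4, 62.7],
}

# WAL volumes in MB, copied from the "difference" lines above.
wal_mb = {"patch + fpw = compress": 783, "patch + fpw = on": 1861}


def reduction_pct(new: float, old: float) -> float:
    """Relative reduction of `new` compared with `old`, in percent."""
    return 100.0 * (1.0 - new / old)


avg = {name: mean(times) for name, times in replay_s.items()}

wal_gain = reduction_pct(wal_mb["patch + fpw = compress"],
                         wal_mb["patch + fpw = on"])
replay_gain = reduction_pct(avg["patch + fpw = compress"],
                            avg["patch + fpw = on"])

print(f"WAL volume reduction:  {wal_gain:.1f}%")
print(f"Replay time reduction: {replay_gain:.1f}%")
```

With three tries per configuration on a laptop, these percentages are only indicative.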
Updated patches are attached. I found a couple of issues with the code
this morning (issues that Rahila had more or less pointed out earlier
as well) before running those tests.
Regards,
-- 
Michael
| Attachment | Content-Type | Size | 
|---|---|---|
| 0001-Move-pg_lzcompress.c-to-src-common.patch.gz | application/x-gzip | 12.4 KB | 
| 0002-Support-compression-for-full-page-writes-in-WAL.patch.gz | application/x-gzip | 9.8 KB | 