From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Compression of full-page-writes
Date: 2014-12-10 08:36:20
Message-ID: CAB7nPqRN2rPkqv6jd=xg1Yx+oO8vwRj98x8+ZnZhfZ7WR2HukA@mail.gmail.com
Lists: pgsql-hackers
On Tue, Dec 9, 2014 at 2:15 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Mon, Dec 8, 2014 at 3:17 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> >
> > On 8 December 2014 at 11:46, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> wrote:
> > > I don't really like those new names, but I'd prefer
> > > wal_compression_level if we go down that road with 'none' as default
> > > value. We may still decide in the future to support compression at the
> > > record level instead of context level, particularly if we have an API
> > > able to do palloc_return_null_at_oom, so the idea of WAL compression
> > > is not related only to FPIs IMHO.
> >
> > We may yet decide, but the pglz implementation is not effective on
> > smaller record lengths. Nor has any testing been done to show that it
> > is even desirable.
> >
>
> It's even much worse for non-compressible (or less-compressible)
> WAL data. I am not clear how a simple on/off switch could address
> such cases, because the outcome can depend on which tables the user
> is operating on (the schema or data of some tables may be more
> compressible, in which case compression gives us a benefit). I think
> maybe we should consider something along the lines of what Robert
> touched on in one of his e-mails (a context-aware compression
> strategy).
>
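As a side note on the context-aware idea: pglz already exposes a couple of
knobs that act as a crude strategy (minimum input size, minimum compression
rate, early bail-out on incompressible data), so part of that logic could be
expressed as a custom PGLZ_Strategy plus a fallback to the raw image when
compression does not pay off. Here is a rough sketch of what I mean; the
function name, the strategy values and the header location are made up, and
I am assuming the pglz_compress() flavor that returns the compressed length
or a negative value when compression is not worth it:

#include "postgres.h"
#include <limits.h>
#include "common/pg_lzcompress.h"	/* assumed header location for pglz */

/* Illustrative strategy: skip tiny inputs, require decent savings. */
static const PGLZ_Strategy fpw_strategy_data = {
	.min_input_size = 32,		/* do not bother with very small images */
	.max_input_size = INT_MAX,
	.min_comp_rate = 25,		/* require at least 25% savings */
	.first_success_by = 1024,	/* give up early on incompressible data */
	.match_size_good = 128,
	.match_size_drop = 10
};

/*
 * Try to compress a block image.  dest is assumed to be at least
 * PGLZ_MAX_OUTPUT(len) bytes.  Returns true and sets *complen on success,
 * false when the caller should store the raw image instead.
 */
static bool
fpw_try_compress(const char *image, int32 len, char *dest, int32 *complen)
{
	int32		clen;

	clen = pglz_compress(image, len, dest, &fpw_strategy_data);
	if (clen < 0 || clen >= len)
		return false;			/* not compressible enough, keep raw image */

	*complen = clen;
	return true;
}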
So, I have been doing some measurements with the patch compressing FPWs,
looking at the transaction latency reported by pgbench -P 1, with these
parameters on my laptop:
shared_buffers=512MB
checkpoint_segments=1024
checkpoint_timeout = 5min
fsync=off
A checkpoint was executed just before each 20-min run, so at least 3
checkpoints kicked in during each measurement. The procedure was roughly
this:
pgbench -i -s 100
psql -c 'checkpoint;'
date > ~/report.txt
pgbench -P 1 -c 16 -j 16 -T 1200 2>> ~/report.txt &
1) Compression of FPW:
latency average: 9.007 ms
latency stddev: 25.527 ms
tps = 1775.614812 (including connections establishing)
Here is the latency when a checkpoint that wrote 28% of the buffers began
(around 570s):
progress: 568.0 s, 2000.9 tps, lat 8.098 ms stddev 23.799
progress: 569.0 s, 1873.9 tps, lat 8.442 ms stddev 22.837
progress: 570.2 s, 1622.4 tps, lat 9.533 ms stddev 24.027
progress: 571.0 s, 1633.4 tps, lat 10.302 ms stddev 27.331
progress: 572.1 s, 1588.4 tps, lat 9.908 ms stddev 25.728
progress: 573.1 s, 1579.3 tps, lat 10.186 ms stddev 25.782
All the other checkpoints have the same profile, showing that the
transaction latency increases by roughly 1.5~2ms, up to 10.5~11ms.
2) No compression of FPW:
latency average: 8.507 ms
latency stddev: 25.052 ms
tps = 1870.368880 (including connections establishing)
Here is the latency for a checkpoint that wrote 28% of buffers:
progress: 297.1 s, 1997.9 tps, lat 8.112 ms stddev 24.288
progress: 298.1 s, 1990.4 tps, lat 7.806 ms stddev 21.849
progress: 299.0 s, 1986.9 tps, lat 8.366 ms stddev 22.896
progress: 300.0 s, 1648.1 tps, lat 9.728 ms stddev 25.811
progress: 301.0 s, 1806.5 tps, lat 8.646 ms stddev 24.187
progress: 302.1 s, 1810.9 tps, lat 8.960 ms stddev 24.201
progress: 303.0 s, 1831.9 tps, lat 8.623 ms stddev 23.199
progress: 304.0 s, 1951.2 tps, lat 8.149 ms stddev 22.871
Here is another one that began around 600s (20% of buffers):
progress: 594.0 s, 1738.8 tps, lat 9.135 ms stddev 25.140
progress: 595.0 s, 893.2 tps, lat 18.153 ms stddev 67.186
progress: 596.1 s, 1671.0 tps, lat 9.470 ms stddev 25.691
progress: 597.1 s, 1580.3 tps, lat 10.189 ms stddev 26.430
progress: 598.0 s, 1570.9 tps, lat 10.089 ms stddev 23.684
progress: 599.2 s, 1657.0 tps, lat 9.385 ms stddev 23.794
progress: 600.0 s, 1665.5 tps, lat 10.280 ms stddev 25.857
progress: 601.1 s, 1571.7 tps, lat 9.851 ms stddev 25.341
progress: 602.1 s, 1577.7 tps, lat 10.056 ms stddev 25.331
progress: 603.0 s, 1600.1 tps, lat 10.329 ms stddev 25.429
progress: 604.0 s, 1593.8 tps, lat 10.004 ms stddev 26.816
Not sure what happened here; the burst has been a bit higher. However,
roughly speaking, the latency never went higher than 10.5ms in the
non-compression case. With those measurements I am getting more or less 1ms
of latency difference between the compression and non-compression cases
when checkpoints show up. Note that fsync is disabled.
Also, I am still planning to hack a patch able to compress records
directly, using a scratch buffer of up to 32kB, and to see the difference
with what I got here.
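To give an idea of the shape I have in mind, something like the following,
with a static scratch buffer sized for the worst case (again just a sketch:
the helper name and the exact pglz interface used here are assumptions on
my side):

#include "postgres.h"
#include "common/pg_lzcompress.h"	/* assumed header location for pglz */

/*
 * Scratch buffer able to hold the compressed version of a record of up
 * to 32kB, with the extra slack pglz may need in the worst case.
 */
#define COMPRESS_RECORD_MAX		(32 * 1024)
static char compression_scratch[PGLZ_MAX_OUTPUT(COMPRESS_RECORD_MAX)];

/*
 * Compress assembled record data into the scratch buffer.  Returns the
 * compressed length, or -1 when the record is too large for the buffer
 * or when compression does not reduce its size, in which case the record
 * would be inserted as-is.
 */
static int32
compress_record_data(const char *record, int32 len)
{
	int32		clen;

	if (len > COMPRESS_RECORD_MAX)
		return -1;				/* does not fit in the scratch buffer */

	clen = pglz_compress(record, len, compression_scratch,
						 PGLZ_strategy_default);
	if (clen < 0 || clen >= len)
		return -1;				/* compression would not save anything */

	return clen;
}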
For now, the results are attached.
Comments welcome.
--
Michael
Attachment: fpw_results.tar.gz (application/x-gzip, 29.2 KB)