From: | Rahila Syed <rahilasyed90(at)gmail(dot)com> |
---|---|
To: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com> |
Cc: | Andres Freund <andres(at)2ndquadrant(dot)com>, Rahila Syed <rahilasyed(dot)90(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [REVIEW] Re: Compression of full-page-writes |
Date: | 2014-12-08 06:42:30 |
Message-ID: | CAH2L28tJL7TBVZbFNzN7M1SjazSUu7b0Qtt0wepyKEmXYJfNOQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
>The important point to consider for this patch is the use of the
additional 2-bytes as uint16 in the block information structure to save the
length of a compressed
>block, which may be compressed without its hole to achieve a double level
of compression (image compressed without its hole). We may use a simple
flag on
>one or two bits using for example a bit from hole_length, but in this case
we would need to always compress images with their hole included, something
more
>expensive as the compression would take more time.
As you have mentioned here the idea to use bits from existing fields rather
than adding additional 2 bytes in header,
FWIW elaborating slightly on the way it was done in the initial patches,
We can use the following struct
unsigned hole_offset:15,
compress_flag:2,
hole_length:15;
Here compress_flag can be 0 or 1 depending on status of compression. We
can reduce the compress_flag to just 1 bit flag.
IIUC, the purpose of adding compress_len field in the latest patch is
to store length of compressed blocks which is used at the time of decoding
the blocks.
With this approach, length of compressed block can be stored in hole_length
as,
hole_length = BLCKSZ - compress_len.
Thus, hole_length can serve the purpose of storing length of a compressed
block without the need of additional 2-bytes. In DecodeXLogRecord,
hole_length can be used for tracking the length of data received in cases
of both compressed as well as uncompressed blocks.
As you already mentioned, this will need compressing images with hole but
we can MemSet hole to 0 in order to make compression of hole less
expensive and effective.
Thank you,
Rahila Syed
On Sat, Dec 6, 2014 at 7:37 PM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
wrote:
>
> On Sat, Dec 6, 2014 at 12:17 AM, Andres Freund <andres(at)2ndquadrant(dot)com>
> wrote:
>
>> On 2014-12-06 00:10:11 +0900, Michael Paquier wrote:
>> > On Sat, Dec 6, 2014 at 12:06 AM, Michael Paquier <
>> michael(dot)paquier(at)gmail(dot)com>
>> > wrote:
>> >
>> > >
>> > >
>> > >
>> > > On Fri, Dec 5, 2014 at 11:10 PM, Rahila Syed <rahilasyed(dot)90(at)gmail(dot)com
>> >
>> > > wrote:
>> > >
>> > >> I attempted quick review and could not come up with much except this
>> > >>
>> > >> + /*
>> > >> + * Calculate the amount of FPI data in the record. Each backup
>> block
>> > >> + * takes up BLCKSZ bytes, minus the "hole" length.
>> > >> + *
>> > >> + * XXX: We peek into xlogreader's private decoded backup blocks
>> for
>> > >> the
>> > >> + * hole_length. It doesn't seem worth it to add an accessor
>> macro for
>> > >> + * this.
>> > >> + */
>> > >> + fpi_len = 0;
>> > >> + for (block_id = 0; block_id <= record->max_block_id; block_id++)
>> > >> + {
>> > >> + if (XLogRecHasCompressedBlockImage(record, block_id))
>> > >> + fpi_len += BLCKSZ -
>> record->blocks[block_id].compress_len;
>> > >>
>> > >>
>> > >> IIUC, fpi_len in case of compressed block image should be
>> > >>
>> > >> fpi_len = record->blocks[block_id].compress_len;
>> > >>
>> > > Yep, true. Patches need a rebase btw as Heikki fixed a commit related
>> to
>> > > the stats of pg_xlogdump.
>> > >
>> >
>> > In any case, any opinions to switch this patch as "Ready for committer"?
>>
>> Needing a rebase is a obvious conflict to that... But I guess some wider
>> looks afterwards won't hurt.
>>
>
> Here are rebased versions, which are patches 1 and 2. And I am switching
> as well the patch to "Ready for Committer". The important point to consider
> for this patch is the use of the additional 2-bytes as uint16 in the block
> information structure to save the length of a compressed block, which may
> be compressed without its hole to achieve a double level of compression
> (image compressed without its hole). We may use a simple flag on one or two
> bits using for example a bit from hole_length, but in this case we would
> need to always compress images with their hole included, something more
> expensive as the compression would take more time.
>
> Robert wrote:
> > What I would suggest is instrument the backend with getrusage() at
> > startup and shutdown and have it print the difference in user time and
> > system time. Then, run tests for a fixed number of transactions and
> > see how the total CPU usage for the run differs.
> That's a nice idea, which is done with patch 3 as a simple hack calling
> twice getrusage at the beginning of PostgresMain and before proc_exit,
> calculating the difference time and logging it for each process (used as
> well log_line_prefix with %p).
>
> Then I just did a small test with a load of a pgbench-scale-100 database
> on fresh instances:
> 1) Compression = on:
> Stop LSN: 0/487E49B8
> getrusage: proc 11163: LOG: user diff: 63.071127, system diff: 10.898386
> pg_xlogdump: FPI size: 122296653 [90.52%]
> 2) Compression = off
> Stop LSN: 0/4E54EB88
> Result: proc 11648: LOG: user diff: 43.855212, system diff: 7.857965
> pg_xlogdump: FPI size: 204359192 [94.10%]
> And the CPU consumption is showing quite some difference... I'd expect as
> well pglz_compress to show up high in a perf profile for this case (don't
> have the time to do that now, but a perf record -a -g would be fine I
> guess).
> Regards,
> --
> Michael
>
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2014-12-08 06:48:52 | Re: [REVIEW] Re: Compression of full-page-writes |
Previous Message | Amit Kapila | 2014-12-08 06:23:28 | Re: On partitioning |