From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Mike Blackwell <mike(dot)blackwell(at)rrd(dot)com>
Subject: Re: Performance Improvement by reducing WAL for Update Operation
Date: 2014-01-29 10:11:50
Message-ID: 52E8D3E6.3020505@vmware.com
Lists: pgsql-hackers
On 01/28/2014 07:01 PM, Heikki Linnakangas wrote:
> On 01/27/2014 07:03 PM, Amit Kapila wrote:
>> I have tried to improve the algorithm in another way so that we can
>> get the benefit of same chunks during match finding (something
>> similar to LZ). The main change is to consider chunks at a fixed
>> boundary (4 bytes) and, after finding a match, try to find whether
>> there is a longer match than the current chunk. While finding a
>> longer match, it still takes care that the next bigger match should
>> be at a chunk boundary. I am not completely sure about the chunk
>> boundary; maybe 8 or 16 would give better results.
>
> Since you're only putting a value in the history every 4 bytes, you
> wouldn't need to calculate the hash in a rolling fashion. You could
> just take the next four bytes, calculate the hash, put it in the
> history table. Then the next four bytes, calculate the hash, and so
> on. That might save some cycles when building the history table...
On a closer look, you're putting a chunk in the history table only every
four bytes, but you're *also* checking the history table for a match
only every four bytes. That completely destroys the shift-resistance of
the algorithm. For example, if the new tuple is an exact copy of the old
tuple except for one additional byte at the beginning, the algorithm
would fail to recognize that. It would be good to add a test case like
that to the test suite.
You can skip bytes when building the history table, or when finding
matches, but not both. Or you could skip N bytes, and then check for
matches for the next four bytes, then skip again and so forth, as long
as you always check four consecutive bytes (because the hash function is
calculated from four bytes).
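To make that concrete, here's a minimal sketch in C of the "skip on one
side only" idea: entries are added to the history at every byte offset
of the old tuple, but matches are probed only every fourth byte of the
new tuple. All the names (hash4, HIST_SIZE, build_history, find_match)
are made up for illustration; this is not the code in any of the
patches:

#include <stdint.h>
#include <string.h>

#define HIST_SIZE 8192			/* hash buckets; power of two */

/* Hash of four consecutive bytes; any cheap mix will do here. */
static inline uint32_t
hash4(const unsigned char *p)
{
	uint32_t	v;

	v = (uint32_t) p[0] | ((uint32_t) p[1] << 8) |
		((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24);
	return (v * 2654435761u) & (HIST_SIZE - 1);
}

/* Build: enter a candidate match start at EVERY offset of the old tuple. */
static void
build_history(const unsigned char *old, int len,
			  int16_t *hist_head, int16_t *hist_next)
{
	memset(hist_head, 0xff, HIST_SIZE * sizeof(int16_t));	/* -1 = empty */
	for (int i = 0; i + 4 <= len; i++)
	{
		uint32_t	h = hash4(&old[i]);

		hist_next[i] = hist_head[h];
		hist_head[h] = (int16_t) i;
	}
}

/*
 * Find: the caller probes only every fourth offset of the new tuple,
 * i.e. for (pos = 0; pos + 4 <= new_len; pos += 4).  Because every
 * old-tuple offset is in the history, a copy that is shifted by one to
 * three bytes is still found; skipping on both sides would miss it.
 */
static int
find_match(const unsigned char *old, const unsigned char *new_data, int pos,
		   const int16_t *hist_head, const int16_t *hist_next)
{
	int16_t		i;

	for (i = hist_head[hash4(&new_data[pos])]; i >= 0; i = hist_next[i])
	{
		if (memcmp(&old[i], &new_data[pos], 4) == 0)
			return i;			/* found; caller extends the match */
	}
	return -1;					/* no match starting at 'pos' */
}

With this shape, a one-byte-shifted copy still lands on some entry in
the chain, which is exactly the property the test case suggested above
should exercise.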
I couldn't resist the challenge, and started hacking this. First, some
observations from your latest patch (pgrb_delta_encoding_v5.patch):
1. There are a lot of comments and code that refer to "chunks", which
seem obsolete. For example, the ck_size field in PGRB_HistEntry is
always set to a constant, 4, except maybe at the very end of the history
string. The algorithm has nothing to do with Rabin-Karp anymore.
2. The 'hindex' field in PGRB_HistEntry is unused. Also, ck_start_pos is
redundant with the index of the entry in the array, because the array is
filled in order. That leaves only the 'next' field, and that can be
represented as an int16 rather than a pointer. So we really only need a
simple int16 array as the history entries array.
3. You're not gaining much by calculating the hash in a rolling fashion.
A single rolling step requires two multiplications and two sums, plus
shifting the variables around. Calculating the hash from scratch
requires three multiplications and three sums.
4. Given that we're not doing the Rabin-Karp variable-length chunk
thing, we could use a cheaper hash function to begin with, like the one
used in pglz. The multiply-by-prime method probably gives fewer
collisions than pglz's shift/xor method, but I don't think that matters
as much as computation speed. No one has mentioned or measured the
effect of collisions in this thread; that either means it's a non-issue,
or that no one has realized yet how big a problem it is. I'm guessing
that it's not a problem, and if it is, it's mitigated by only trying to
find matches every N bytes; collisions would make finding matches
slower, and that's exactly what skipping helps with. (See the sketch
after this list for how points 2 and 4 might look in code.)
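To illustrate points 2 and 4 together, here's a rough sketch, again
with made-up names, of how the history state could shrink to a pair of
int16 arrays with a pglz-style shift/xor hash. This is my guess at what
that would look like, not an excerpt from the attached patch:

#include <stdint.h>
#include <string.h>

#define HIST_MASK	0x0fff		/* 4096 hash buckets */
#define MAX_OLD_LEN	8192		/* heap tuples fit in one 8 kB block */

/* pglz-style shift/xor hash of four bytes: cheap, collisions tolerable */
#define hist_idx(p) \
	((((p)[0] << 6) ^ ((p)[1] << 4) ^ ((p)[2] << 2) ^ (p)[3]) & HIST_MASK)

/*
 * The entire history state.  head[h] holds the offset of the newest
 * entry for hash bucket h, and next[i] chains to the previous entry
 * with the same hash.  An entry's offset in the old tuple *is* its
 * array index, so no ck_start_pos is needed, and because offsets stay
 * well under 32767, an int16 index replaces the pointer.
 */
typedef struct
{
	int16_t		head[HIST_MASK + 1];
	int16_t		next[MAX_OLD_LEN];
} history;

static void
history_build(history *hist, const unsigned char *old, int len)
{
	memset(hist->head, 0xff, sizeof(hist->head));	/* -1 everywhere */
	for (int i = 0; i + 4 <= len; i++)
	{
		int		h = hist_idx(&old[i]);

		hist->next[i] = hist->head[h];
		hist->head[h] = (int16_t) i;
	}
}

Per position, the hash costs three shifts and three XORs, with no
multiplications at all; cheaper than either variant counted in point 3.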
After addressing the above, we're pretty much back to the PGLZ approach.
I kept the change to only find matches every four bytes; that does make
some difference. And I like having this new encoding code in a separate
file, not mingled with the pglz stuff; it's sufficiently different that
that's better. I haven't done all that much testing with this, so take
it with a grain of salt.

I don't know if this is better or worse than the other patches that have
been floated around, but I thought I might as well share it...
- Heikki
Attachment: back-to-pglz-like-delta-encoding-1.patch (text/x-diff, 31.3 KB)