Re: pglz performance

From: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
To: Tels <nospam-pg-abuse(at)bloodgate(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Gasper Zejn <zejn(at)owca(dot)info>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: pglz performance
Date: 2019-11-04 07:14:35
Message-ID: 63DB0583-9789-402A-BB42-E827DF1889F2@yandex-team.ru
Lists: pgsql-hackers

Hi Tels!
Thanks for your interest in fast decompression.

> On 3 Nov 2019, at 12:24, Tels <nospam-pg-abuse(at)bloodgate(dot)com> wrote:
>
> I wonder if you agree and what would happen if you try this variant on your corpus tests.

I've tried some different optimizations for literals, for example loop unrolling[0] and bulk-copying of literals.
These approaches were bringing some performance improvement, but with noise: statistically they were better in some places and worse in others. A net win overall, but that "net win" depends on which data and which platforms we consider important. A condensed sketch of the bulk-copying idea follows below.
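For reference, here is the sketch (this is not the actual code from test_pglz; the function name is made up, and output bounds checking is omitted). The point is that in pglz each control byte covers 8 items, where a set bit means a back-reference tag and a clear bit means a single literal byte; instead of copying one literal per control bit, we count the run of consecutive clear bits and issue one memcpy for the whole run:

#include <string.h>

/* decompress_sketch -- hypothetical name, not the test_pglz API. */
static void
decompress_sketch(const unsigned char *sp, const unsigned char *srcend,
				  unsigned char *dp)
{
	while (sp < srcend)
	{
		unsigned char ctrl = *sp++;
		int			bit;

		for (bit = 0; bit < 8 && sp < srcend; bit++)
		{
			if (ctrl & (1 << bit))
			{
				/* Tag: 12-bit offset and 4-bit length, as in pglz. */
				int			len = (sp[0] & 0x0f) + 3;
				int			off = ((sp[0] & 0xf0) << 4) | sp[1];

				sp += 2;
				if (len == 18)		/* saturated length field: one extra byte */
					len += *sp++;
				while (len--)		/* byte-wise: source and dest may overlap */
				{
					*dp = dp[-off];
					dp++;
				}
			}
			else
			{
				/*
				 * Literal: count the run of consecutive literal bits and
				 * copy the whole run at once instead of byte by byte.
				 */
				int			run = 1;

				while (bit + run < 8 && !(ctrl & (1 << (bit + run))))
					run++;
				memcpy(dp, sp, run);
				dp += run;
				sp += run;
				bit += run - 1;
			}
		}
	}
}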

The proposed patch makes decompression clearly faster on any dataset and platform.
I believe improving pglz further is viable, but optimizations like a common data prefix seem more promising to me.
Also, I think we actually need real codecs like lz4, zstd and brotli instead of our own invented wheel.

If you have some spare time, pull requests to test_pglz are welcome; let's benchmark more micro-optimizations, it brings a lot of fun :)

--
Andrey Borodin
Open source RDBMS development team leader
Yandex.Cloud

[0] https://github.com/x4m/test_pglz/blob/master/pg_lzcompress_hacked.c#L166
