Re: A varint implementation for PG?

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com>
Subject: Re: A varint implementation for PG?
Date: 2021-08-04 23:05:25
Message-ID: 20210804230525.o3wtk4psy3dovyyq@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2021-08-04 23:44:10 +0200, Tomas Vondra wrote:
> How is that better than the two varint flavors that are already out there,
> i.e. the bitcoin [1] and protocol buffers [2]?

The protobuf one is *terrible* for CPU efficiency. You need to go through each
byte, do masking and shifting for each byte and then have a conditional
branch. That's bad from the the amount of instructions you need to execute,
and *really* bad for the branch predictor.

> The first one seems quite efficient in how it encodes the length into very
> few bits (which matters especially for small values). It's designed for
> integers with 1B, 2B, 4B or 8B, but it can be extended to arbitrary lengths
> fairly easily, I think:

> Look at the first byte, and
>
> 0 - 243 - encoded as is
> 244 - 1 byte
> 245 - 2 bytes
> 246 - 3 bytes
> 247 - 4 bytes
> 248 - 5 bytes
> 249 - 6 bytes
> 250 - 7 bytes
> 251 - 8 bytes
> 252 - next 1 byte is length
> 253 - next 2 bytes are length
> 254 - next 3 bytes are length
> 255 - next 4 bytes are length

> If we want to support longer lengths, we'd have to reserve an extra value
> (which reduces the number of values that require a single byte).

I think that's not a bad scheme. I think it may end up being a bit more
expensive to decode because you need more branches instead of using
find-first-set type instructions.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2021-08-04 23:21:18 Re: A varint implementation for PG?
Previous Message Tomas Vondra 2021-08-04 22:32:33 Re: Bug fix for cache lookup failure for statistic_ext type