From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [PoC] Improve dead tuple storage for lazy vacuum |
Date: | 2021-07-09 17:17:49 |
Message-ID: | 20210709171749.pt7ztzlsvpbbk7yj@alap3.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On 2021-07-07 20:46:38 +0900, Masahiko Sawada wrote:
> Currently, the TIDs of dead tuples are stored in an array that is
> collectively allocated at the start of lazy vacuum and TID lookup uses
> bsearch(). There are the following challenges and limitations:
> So I prototyped a new data structure dedicated to storing dead tuples
> during lazy vacuum while borrowing the idea from Roaring Bitmap[2].
> The authors provide an implementation of Roaring Bitmap[3] (Apache
> 2.0 license). But I've implemented this idea from scratch because we
> need to integrate it with Dynamic Shared Memory/Area to support
> parallel vacuum and need to support ItemPointerData, 6-bytes integer
> in total, whereas the implementation supports only 4-bytes integers.
> Also, when it comes to vacuum, we neither need to compute the
> intersection, the union, nor the difference between sets, but need
> only an existence check.
>
> The data structure is somewhat similar to TIDBitmap. It consists of
> the hash table and the container area; the hash table has entries per
> block and each block entry allocates its memory space, called a
> container, in the container area to store its offset numbers. The
> container area is actually an array of bytes and can be enlarged as
> needed. In the container area, the data representation of offset
> numbers varies depending on their cardinality. It has three container
> types: array, bitmap, and run.
How are you thinking of implementing iteration efficiently for rtbm? The
second heap pass needs that obviously... I think the only option would
be to qsort the whole thing?
Greetings,
Andres Freund
From | Date | Subject | |
---|---|---|---|
Next Message | Tomas Vondra | 2021-07-09 17:36:22 | Re: Parallel scan with SubTransGetTopmostTransaction assert coredump |
Previous Message | Zhihong Yu | 2021-07-09 16:49:53 | Re: short circuit suggestion in find_hash_columns() |