Re: PATCH: Add hooks for pg_total_relation_size and pg_indexes_size

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Andreas Karlsson <andreas(at)proxel(dot)se>, Abdoulaye Ba <abdoulayeba29(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: PATCH: Add hooks for pg_total_relation_size and pg_indexes_size
Date: 2024-08-28 18:01:59
Message-ID: 746a3813-e450-48a6-be90-17718a9c2203@vondra.me
Lists: pgsql-hackers

On 8/28/24 17:53, Andreas Karlsson wrote:
> On 8/9/24 6:59 PM, Abdoulaye Ba wrote:
>>     The primary use case for this hook is to allow extensions to account
>>     for additional storage mechanisms that are not covered by the
>>     default PostgreSQL relation size calculations. For instance, in our
>>     project, we are working with an external indexing system (Tantivy)
>>     that maintains additional data structures outside the standard
>>     PostgreSQL storage. This hook allows us to include the size of these
>>     additional structures in the total relation size calculations.
>>
>>     While I understand your suggestion about custom index AMs, the
>>     intent behind this hook is broader. It is not limited to custom
>>     index types but can also be used for other forms of external storage
>>     that need to be accounted for in relation size calculations. This is
>>     why a generic callback hook was chosen over extending the index AM
>>     interface.
>>
>>     However, if there is a consensus that such a hook would be better
>>     suited within the index AM interface for cases involving custom
>>     index storage, I'm open to discussing this further and exploring how
>>     it could be integrated more tightly with the existing PostgreSQL AM
>>     framework.
>
> Yeah, I strongly suspected it was ParadeDB. :)
>
> I am only one developer but I really do not like solving this with a
> hook, instead I think the proper solution is to integrate this properly
> with custom AMs and storage managers. I think we should do it properly
> or not at all.
>

Not sure. I'd agree if the index were something that could be implemented
through an index AM - then that'd be the way to go. It might require some
improvements to the index AM to use the correct index size, but I haven't
checked.

But it seems pg_search (which AFAIK is what ParadeDB uses to integrate
Tantivy indexes) uses the term "index" for something very different. I'm
not sure that's something that could be conveniently implemented as an
index AM, but I haven't checked. That just raises the question of why it
should be included in pg_total_relation_size and pg_indexes_size at all,
if it's not what we'd call an index.

regards

--
Tomas Vondra
