Re: Index AM API cleanup

From: Peter Eisentraut <peter(at)eisentraut(dot)org>
To: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(dot)dunstan(at)enterprisedb(dot)com>, Alex Wang <alex(dot)wang(at)enterprisedb(dot)com>
Subject: Re: Index AM API cleanup
Date: 2024-09-03 16:52:35
Message-ID: e0e699d4-516a-44b6-b011-079c32d5d078@eisentraut.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 26.08.24 17:10, Mark Dilger wrote:
>> To make a dent, I picked out something that should be mostly harmless: Stop calling directly into _bt_getrootheight() (patch 0004). I think this patch is ok, but I might call the API function amgettreeheight instead of amgetrootheight. The former seems more general.
> Peter, your proposed rename seems fine for the current implementation, but your suggestion below might indicate a different naming.
>
>> I notice that _bt_getrootheight() is called only to fill in the IndexOptInfo tree_height field, which is only used by btcostestimate(), so in some sense this is btree-internal data. But making it so that btcostestimate() calls _bt_getrootheight() directly to avoid all that intermediate business seems too complicated, and there was probably a reason that the cost estimation functions don't open the index.
>>
>> Interestingly, the cost estimation functions for gist and spgist also look at the tree_height field but nothing ever fills it on. So with your API restructuring, someone could provide the missing API functions for those index types. Might be interesting.
>>
>> That said, there might be value in generalizing this a bit. If you look at the cost estimation functions in pgvector (hnswcostestimate() and ivfflatcostestimate()), they both have this pattern that btcostestimate() tries to avoid: They open the index, look up some number, close the index, then make a cost estimate computation with the number looked up. So another idea would be to generalize the tree_height field to some "index size data" or even "internal data for cost estimation". This wouldn't need to change the API much, since these are all just integer values, but we'd label the functions and fields a bit differently.
> Would they be just integers? They could also be void*, with amgetrootheight_function returning data allocated in the current memory context. For btree, that would just be a four byte integer, but other indexes could return whatever they like. If you like that idea, I can code that up for v18, naming the field something like amgetcostestimateinfo_function.

Here is a cleaned-up version of the v17-0004 patch. I have applied the
renaming discussed above. I have also made a wrapper function
btgettreeheight() that calls _bt_getrootheight(). That way, if we ever
want to change the API, we don't have to change _bt_getrootheight(), or
disentangle it then. I've also added documentation and put in a source
code comment that the API could be expanded for additional uses. Also,
I have removed the addition to the IndexOptInfo struct; that didn't seem
necessary.

Attachment Content-Type Size
v17.1-0004-Add-amgettreeheight-index-AM-API-routine.patch text/plain 9.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jacob Champion 2024-09-03 17:20:31 Re: Proposal for implementing OCSP Stapling in PostgreSQL
Previous Message Tom Lane 2024-09-03 16:42:01 Re: Inline non-SQL SRFs using SupportRequestSimplify