| From: | "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com> | 
|---|---|
| To: | Andres Freund <andres(at)anarazel(dot)de> | 
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> | 
| Subject: | Re: Split index and table statistics into different types of stats | 
| Date: | 2022-11-02 08:58:36 | 
| Message-ID: | 4cc5476f-4360-1bed-c60d-2c21adf7db13@gmail.com | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
Hi,
On 11/1/22 1:30 AM, Andres Freund wrote:
> Hi,
> 
> On 2022-10-31 14:14:15 +0100, Drouvot, Bertrand wrote:
>> Please find attached a patch proposal to split index and table statistics
>> into different types of stats.
>>
>> This idea has been proposed by Andres in a couple of threads, see [1] and
>> [2].
> 
> Thanks for working on this!
> 
Thanks for looking at it!
> 
> 
>> diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
>> index 5b49cc5a09..8a715db82e 100644
>> --- a/src/backend/catalog/heap.c
>> +++ b/src/backend/catalog/heap.c
>> @@ -1853,7 +1853,7 @@ heap_drop_with_catalog(Oid relid)
>>   		RelationDropStorage(rel);
>>   
>>   	/* ensure that stats are dropped if transaction commits */
>> -	pgstat_drop_relation(rel);
>> +	pgstat_drop_heap(rel);
> 
> I don't think "heap" is a good name for these, even if there's some historical
> reasons for it. Particularly because you used "table" in some bits and pieces
> too.
> 
Agree, replaced by "table" where appropriate in V3 attached.
> 
>>   /*
>> @@ -168,39 +210,55 @@ pgstat_unlink_relation(Relation rel)
>>   void
>>   pgstat_create_relation(Relation rel)
>>   {
>> -	pgstat_create_transactional(PGSTAT_KIND_RELATION,
>> -								rel->rd_rel->relisshared ? InvalidOid : MyDatabaseId,
>> -								RelationGetRelid(rel));
>> +	if (rel->rd_rel->relkind == RELKIND_INDEX)
>> +		pgstat_create_transactional(PGSTAT_KIND_INDEX,
>> +									rel->rd_rel->relisshared ? InvalidOid : MyDatabaseId,
>> +									RelationGetRelid(rel));
>> +	else
>> +		pgstat_create_transactional(PGSTAT_KIND_TABLE,
>> +									rel->rd_rel->relisshared ? InvalidOid : MyDatabaseId,
>> +									RelationGetRelid(rel));
>> +}
> 
> Hm - why is this best handled on this level, rather than at the caller?
> 
> 
Agree that it should be split in 
pgstat_create_table()/pgstat_create_index() (also as it was already 
split for the "drop" case): done in V3.
>> +/*
>> + * Support function for the SQL-callable pgstat* functions. Returns
>> + * the collected statistics for one index or NULL. NULL doesn't mean
>> + * that the index doesn't exist, just that there are no statistics, so the
>> + * caller is better off to report ZERO instead.
>> + */
>> +PgStat_StatIndEntry *
>> +pgstat_fetch_stat_indentry(Oid relid)
>> +{
>> +	PgStat_StatIndEntry *indentry;
>> +
>> +	indentry = pgstat_fetch_stat_indentry_ext(false, relid);
>> +	if (indentry != NULL)
>> +		return indentry;
>> +
>> +	/*
>> +	 * If we didn't find it, maybe it's a shared index.
>> +	 */
>> +	indentry = pgstat_fetch_stat_indentry_ext(true, relid);
>> +	return indentry;
>> +}
>> +
>> +/*
>> + * More efficient version of pgstat_fetch_stat_indentry(), allowing to specify
>> + * whether the to-be-accessed index is shared or not.
>> + */
>> +PgStat_StatIndEntry *
>> +pgstat_fetch_stat_indentry_ext(bool shared, Oid reloid)
>> +{
>> +	Oid			dboid = (shared ? InvalidOid : MyDatabaseId);
>> +
>> +	return (PgStat_StatIndEntry *)
>> +		pgstat_fetch_entry(PGSTAT_KIND_INDEX, dboid, reloid);
>>   }
> 
> Do we need this split anywhere for now? I suspect not, the table case is
> mainly for the autovacuum launcher, which won't look at indexes "in isolation".
> 
Yes I think so as pgstat_fetch_stat_indentry_ext() has its use case in 
pgstat_copy_index_stats() (previously pgstat_copy_relation_stats()).
> 
> 
>> @@ -240,9 +293,23 @@ pg_stat_get_blocks_fetched(PG_FUNCTION_ARGS)
>>   	PG_RETURN_INT64(result);
>>   }
>>   
>> +Datum
>> +pg_stat_get_index_blocks_fetched(PG_FUNCTION_ARGS)
>> +{
>> +	Oid			relid = PG_GETARG_OID(0);
>> +	int64		result;
>> +	PgStat_StatIndEntry *indentry;
>> +
>> +	if ((indentry = pgstat_fetch_stat_indentry(relid)) == NULL)
>> +		result = 0;
>> +	else
>> +		result = (int64) (indentry->blocks_fetched);
>> +
>> +	PG_RETURN_INT64(result);
>> +}
> 
> We have so many copies of this by now - perhaps we first should deduplicate
> them somehow? Even if it's just a macro or such.
> 
Yeah good point, a new macro has been defined for the "int64" return 
case in V3 attached.
Regards,
-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
| Attachment | Content-Type | Size | 
|---|---|---|
| v3-0001-split_tables_indexes_stats.patch | text/plain | 100.8 KB | 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Yuya Watari | 2022-11-02 09:27:52 | Re: [PoC] Reducing planning time when tables have many partitions | 
| Previous Message | Aleksander Alekseev | 2022-11-02 08:53:26 | Re: Adding doubly linked list type which stores the number of items in the list |