From: | Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com> |
---|---|
To: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: some namespace.c refactoring |
Date: | 2023-02-23 11:07:58 |
Message-ID: | 064767e5-c6af-a7e5-36a5-14013e6795f3@enterprisedb.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 20.02.23 15:03, Peter Eisentraut wrote:
> On 15.02.23 19:04, Alvaro Herrera wrote:
>> That said, I think most of this code is invoked for DDL, where
>> performance is not so critical; probably just fixing
>> get_object_property_data to not be so naïve would suffice.
>
> Ok, I'll look into that.
I did a variety of performance testing on this now.
I wrote a C function that calls the "is visible" functions in a tight loop:
Datum
pg_test_visible(PG_FUNCTION_ARGS)
{
int32 count = PG_GETARG_INT32(0);
Oid relid = PG_GETARG_OID(1);
Oid typid = PG_GETARG_OID(2);
for (int i = 0; i < count; i++)
{
RelationIsVisible(relid);
TypeIsVisible(typid);
//ObjectIsVisible(RelationRelationId, relid);
//ObjectIsVisible(TypeRelationId, typid);
}
PG_RETURN_VOID();
}
(It's calling two different ones to defeat the caching in
get_object_property_data().)
Here are some run times:
unpatched:
select pg_test_visible(100_000_000, 'pg_class', 'int4');
Time: 4536.747 ms (00:04.537)
select pg_test_visible(100_000_000, 'tenk1', 'widget');
Time: 10828.802 ms (00:10.829)
(Note that the "is visible" functions special case system catalogs.)
patched:
select pg_test_visible(100_000_000, 'pg_class', 'int4');
Time: 11409.948 ms (00:11.410)
select pg_test_visible(100_000_000, 'tenk1', 'widget');
Time: 18649.496 ms (00:18.649)
So, it's slower, but it's not clear whether it matters in practice,
considering this test.
I also wondered if this is visible through a normal external function
call, so I tried
do $$ begin perform pg_get_statisticsobjdef(28999) from
generate_series(1, 1_000_000); end $$;
(where that is the OID of the first object from select * from
pg_statistic_ext; in the regression database).
unpatched:
Time: 6952.259 ms (00:06.952)
patched (first patch only):
Time: 6993.655 ms (00:06.994)
patched (both patches):
Time: 7114.290 ms (00:07.114)
So there is some visible impact, but again, the test isn't realistic.
Then I tried a few ways to make get_object_property_data() faster. I
tried building a class_id+index cache that is qsort'ed (once) and then
bsearch'ed, that helped only minimally, not enough to make up the
difference. I also tried just searching the class_id+index cache
linearly, hoping maybe that if the cache is smaller it would be more
efficient to access, but that actually made things (minimally) worse,
probably because of the indirection. So it might be hard to get much
more out of this. I also thought about PerfectHash, but I didn't code
that up yet.
Another way would be to not use get_object_property_data() at all but
write a "common" function that we pass in all it needs hardcodedly, like
bool
RelationIsVisible(Oid relid)
{
return IsVisible_common(RELOID,
Anum_pg_class_relname
Anum_pg_class_relnamespace);
}
This would still save a lot of duplicate code.
But again, I don't think the micro-performance really matters here.
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Eisentraut | 2023-02-23 11:20:23 | Re: Proposal: %T Prompt parameter for psql for current time (like Oracle has) |
Previous Message | marekmosiewicz | 2023-02-23 11:04:05 | Disable vacuuming to provide data history |