Re: pg_get_indexdef() modification to use TxnSnapshot

From: kuroda(dot)keisuke(at)nttcom(dot)co(dot)jp
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_get_indexdef() modification to use TxnSnapshot
Date: 2023-10-06 03:01:13
Message-ID: b95d124276b9d6ba5ce4b6ec40f92262@nttcom.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi hackers,

With '0001-pg_get_indexdef-modification-to-access-catalog-based.patch'
patch,
I confirmed that definition information
can be collected even if the index is droped during pg_dump.
The regression test (make check-world) has passed.

I also tested the view definition for a similar problem.
As per the attached patch and test case,
By using SetupHistoricSnapshot for pg_get_viewdef() as well,
Similarly, definition information can be collected
for VIEW definitions even if the view droped during pg_dump.
The regression test (make check-world) has passed,
and pg_dump's SERIALIZABLE results are improved.
However, in a SERIALIZABLE transaction,
If you actually try to access it, it will cause ERROR,
so it seems to cause confusion.
I think the scope of this improvement should be limited
to the functions listed as pg_get_*** functions at the moment.

---

# test2 pg_get_indexdef,pg_get_viewdef

## create table,index,view
drop table test1;
create table test1(id int);
create index idx1_test1 on test1(id);
create view view1_test1 as select * from test1;

## begin Transaction-A
begin transaction isolation level serializable;
select pg_current_xact_id();

## begin Transaction-B
begin transaction isolation level serializable;
select pg_current_xact_id();

## drop index,view in Transaction-A
drop index idx1_test1;
drop view view1_test1;
commit;

## SELECT pg_get_indexdef,pg_get_viewdef in Transaction-B
SELECT pg_get_indexdef(oid) FROM pg_class WHERE relname = 'idx1_test1';
SELECT pg_get_viewdef(oid) FROM pg_class WHERE relname = 'view1_test1';

## correct info from index and view
postgres=*# SELECT pg_get_indexdef(oid) FROM pg_class WHERE relname =
'idx1_test1';
pg_get_indexdef
----------------------------------------------------------
CREATE INDEX idx1_test1 ON public.test1 USING btree (id)
(1 row)

postgres=*# SELECT pg_get_viewdef(oid) FROM pg_class WHERE relname =
'view1_test1';
pg_get_viewdef
----------------
SELECT id +
FROM test1;
(1 row)

## However, SELECT * FROM view1_test1 cause ERROR because view does not
exist
postgres=*# SELECT * FROM view1_test1;
ERROR: relation "view1_test1" does not exist
LINE 1: SELECT * FROM view1_test1;

Best Regards,
Keisuke Kuroda
NTT COMWARE

On 2023-06-14 15:31, vignesh C wrote:
> On Tue, 13 Jun 2023 at 11:49, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
> wrote:
>>
>> On Fri, May 26, 2023 at 6:55 PM shveta malik <shveta(dot)malik(at)gmail(dot)com>
>> wrote:
>> >
>> > I have attempted to convert pg_get_indexdef() to use
>> > systable_beginscan() based on transaction-snapshot rather than using
>> > SearchSysCache(). The latter does not have any old info and thus
>> > provides only the latest info as per the committed txns, which could
>> > result in errors in some scenarios. One such case is mentioned atop
>> > pg_dump.c. The patch is an attempt to fix the same pg_dump's issue.
>> > Any feedback is welcome.
>>
>> Even only in pg_get_indexdef_worker(), there are still many places
>> where we use syscache lookup: generate_relation_name(),
>> get_relation_name(), get_attname(), get_atttypetypmodcoll(),
>> get_opclass_name() etc.
>>
>> >
>> > There is a long list of pg_get_* functions which use SearchSysCache()
>> > and thus may expose similar issues. I can give it a try to review the
>> > possibility of converting all of them. Thoughts?
>> >
>>
>> it would be going to be a large refactoring and potentially make the
>> future implementations such as pg_get_tabledef() etc hard. Have you
>> considered changes to the SearchSysCache() family so that they
>> internally use a transaction snapshot that is registered in advance.
>> Since we are already doing similar things for catalog lookup in
>> logical decoding, it might be feasible. That way, the changes to
>> pg_get_XXXdef() functions would be much easier.
>
> I feel registering an active snapshot before accessing the system
> catalog as suggested by Sawada-san will be a better fix to solve the
> problem. If this approach is fine, we will have to similarly fix the
> other pg_get_*** functions like pg_get_constraintdef,
> pg_get_function_arguments, pg_get_function_result,
> pg_get_function_identity_arguments, pg_get_function_sqlbody,
> pg_get_expr, pg_get_partkeydef, pg_get_statisticsobjdef and
> pg_get_triggerdef.
> The Attached patch has the changes for the same.
>
> Regards,
> Vignesh

Attachment Content-Type Size
pg_get_indexdef-and-pg_get_viewdef.patch text/x-diff 1.6 KB
pg_get_indexdef-and-pg_get_viewdef-patch-test.txt text/plain 2.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2023-10-06 03:11:20 Re: pg_get_indexdef() modification to use TxnSnapshot
Previous Message Masahiro Ikeda 2023-10-06 02:02:18 Re: Rethink the wait event names for postgres_fdw, dblink and etc