Re: TRUNCATE on foreign table

From: Kohei KaiGai <kaigai(at)heterodb(dot)com>
To: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: Kazutaka Onishi <onishi(at)heterodb(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Zhihong Yu <zyu(at)yugabyte(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Subject: Re: TRUNCATE on foreign table
Date: 2021-04-09 03:33:07
Message-ID: CAOP8fzaWkX7O3QmKtJ9DEtBfwaR5+OL2KfycnYNrxPC9zEG2QQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Apr 8, 2021 at 22:14, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>
> On 2021/04/08 22:02, Kohei KaiGai wrote:
> >> Anyway, attached is the updated version of the patch. This is still based on the latest Kazutaka-san's patch. That is, extra list for ONLY is still passed to FDW. What about committing this version at first? Then we can continue the discussion and change the behavior later if necessary.
>
> Pushed! Thank all involved in this development!!
> For record, I attached the final patch I committed.
>
>
> > Ok, it's fair enough for me.
> >
> > I'll try to sort out my thought, then raise a follow-up discussion if necessary.
>
> Thanks!
>
> The following are the open items and discussion points that I'm thinking of.
>
> 1. Currently the extra information (TRUNCATE_REL_CONTEXT_NORMAL, TRUNCATE_REL_CONTEXT_ONLY or TRUNCATE_REL_CONTEXT_CASCADING) about how a foreign table was specified as the target to truncate in TRUNCATE command is collected and passed to FDW. Does this really need to be passed to FDW? Seems Stephen, Michael and I think that's necessary. But Kaigai-san does not. I also think that TRUNCATE_REL_CONTEXT_CASCADING can be removed because there seems no use case for that maybe.
>
> 2. Currently when the same foreign table is specified multiple times in the command, the extra information only for the foreign table found first is collected. For example, when "TRUNCATE ft, ONLY ft" is executed, TRUNCATE_REL_CONTEXT_NORMAL is collected and _ONLY is ignored because "ft" is found first. Is this OK? Or we should collect all, e.g., both _NORMAL and _ONLY should be collected in that example? I think that the current approach (i.e., collect the extra info about table found first if the same table is specified multiple times) is good because even local tables are also treated the same way. But Kaigai-san does not.
>
> 3. Currently postgres_fdw specifies ONLY clause in TRUNCATE command that it constructs. That is, if the foreign table is specified with ONLY, postgres_fdw also issues the TRUNCATE command for the corresponding remote table with ONLY to the remote server. Then only root table is truncated in remote server side, and the tables inheriting that are not truncated. Is this behavior desirable? Seems Michael and I think this behavior is OK. But Kaigai-san does not.
>
Before discussing items 1-3, I'd like to clarify the role of foreign tables.
(It will likely lead to a natural conclusion for the open items above.)

As the name SQL/MED (Management of External Data) literally says, a
foreign table is a representation of external data in PostgreSQL.
It allows us to read and (optionally) write the external data wrapped
by FDW drivers, just as we usually read / write heap tables.
Thanks to the FDW APIs, the core PostgreSQL does not care about the
structure, location, volume and other characteristics of the external
data itself. It expects that invoking the FDW APIs will behave as if
we were accessing a regular heap table.

On the other hand, we can say local tables are a representation of
"internal" data in PostgreSQL.
A heap table consists of one or more files (one per BLCKSZ *
RELSEG_SIZE bytes), and the table AM mediates between the on-disk
data and the in-memory structure (TupleTableSlot).
There is no big difference in concept here. OK?

As you know, the ONLY clause controls whether the TRUNCATE command
also runs on the child tables, not only on the parent.
If "ONLY parent_table" is given, its child tables are not picked up by
ExecuteTruncate() unless they are listed individually.
Then, once ExecuteTruncate() has picked up the relations, it makes
them empty using the table AM (relation_set_new_filenode), and the
callee (heapam_relation_set_new_filenode) does not care whether the
table was specified with ONLY or not. It just makes the data
represented by the table empty (in a transactional way).
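
To make the two stages above concrete, here is a toy model in Python. It
is only a sketch, not PostgreSQL code: Table, collect_targets(),
make_empty() and truncate() are hypothetical names invented for
illustration. collect_targets() stands in for the relation-collection
step of ExecuteTruncate(), and make_empty() stands in for the table AM
callback.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """Hypothetical stand-in for a relation in an inheritance tree."""
    name: str
    children: list = field(default_factory=list)
    rows: list = field(default_factory=list)


def collect_targets(targets):
    """Stage 1: decide which relations to truncate.

    `targets` is a list of (table, only) pairs, as written in the
    command. ONLY is consumed here and nowhere else.
    """
    picked = []

    def add(table):
        # Compare by identity so each relation is picked at most once.
        if all(table is not p for p in picked):
            picked.append(table)

    def walk(table):
        add(table)
        for child in table.children:
            walk(child)

    for table, only in targets:
        if only:
            add(table)   # parent only; children need their own entry
        else:
            walk(table)  # parent and all descendants
    return picked


def make_empty(table):
    """Stage 2: empty one relation. Note there is no ONLY argument."""
    table.rows.clear()


def truncate(targets):
    for table in collect_targets(targets):
        make_empty(table)
```

In this model, truncate([(parent, True)]) leaves the rows of parent's
children untouched, while truncate([(parent, False)]) empties them too.
In both cases make_empty() is called without knowing how the table was
named in the command.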

So, how should foreign tables behave?

Once ExecuteTruncate() has picked up a foreign table according to the
ONLY clause, should the FDW driver consider the context in which the
foreign table was specified? And what behavior is consistent?
I think the FDW driver should make the external data represented by
the foreign table empty, regardless of its structure, location, volume
and so on.

Therefore, if we follow the above assumption, we don't need to pass
the context in which foreign tables were picked up
(TRUNCATE_REL_CONTEXT_*), and postgres_fdw should not adjust the
remote TRUNCATE query according to those flags. It should always
truncate all of the remote tables (if there are multiple) that the
foreign tables represent.

As an aside, if postgres_fdw mapped a remote table with an "ONLY"
clause, that would be exactly the situation where we should add the
"ONLY" clause to the remote TRUNCATE command, because the foreign
table is a representation of the remote "ONLY parent_table" in this
case.
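
The proposal can be sketched as follows, again as a toy model in Python
and not as postgres_fdw code. ForeignTable, its remote_only attribute and
build_remote_truncate() are hypothetical; in particular, remote_only
models the mapping described in the aside and is not an existing
postgres_fdw option. The point is that the query builder takes no
TRUNCATE_REL_CONTEXT_* style argument: whether "ONLY" appears depends
solely on how the foreign table maps the remote data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForeignTable:
    """Hypothetical description of what a foreign table represents."""
    remote_name: str           # assumed to be already schema-qualified and quoted
    remote_only: bool = False  # hypothetical: maps remote "ONLY parent_table"


def build_remote_truncate(foreign_tables):
    """Build the remote TRUNCATE text for the picked-up foreign tables.

    No information about how each table was written in the local
    command (with or without ONLY) is passed in.
    """
    parts = []
    for ft in foreign_tables:
        prefix = "ONLY " if ft.remote_only else ""
        parts.append(prefix + ft.remote_name)
    return "TRUNCATE " + ", ".join(parts)
```

With this sketch, a foreign table that maps a plain remote table always
produces a remote TRUNCATE without "ONLY", whether or not the local
command said "ONLY ft".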

What do you think?
--
HeteroDB, Inc / The PG-Strom Project
KaiGai Kohei <kaigai(at)heterodb(dot)com>
