RE: Remove duplicate table scan in logical apply worker and code refactoring

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: Remove duplicate table scan in logical apply worker and code refactoring
Date: 2024-07-31 09:07:07
Message-ID: TYAPR01MB5692EF150E63473FAC5FC0D2F5B12@TYAPR01MB5692.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dear Hou,

Thanks for creating a patch!

> When reviewing the code in logical/worker.c, I noticed that when applying a
> cross-partition update action, it scans the old partition twice.
> I am attaching the patch 0001 to remove this duplicate table scan.

Just to clarify, you meant that FindReplTupleInLocalRel() are called in
apply_handle_tuple_routing() and apply_handle_tuple_routing()->apply_handle_delete_internal(),
which requires the index or sequential scan, right? LGTM.

> Apart from above, I found there are quite a few duplicate codes related to partition
> handling(e.g. apply_handle_tuple_routing), so I tried to extract some
> common logic to simplify the codes. Please see 0002 for this refactoring.

IIUC, you wanted to remove the application code from apply_handle_tuple_routing()
and put only a part partition detection. Is it right? Anyway, here are comments.

01. apply_handle_insert()

```
+ targetRelInfo = edata->targetRelInfo;
+
/* For a partitioned table, insert the tuple into a partition. */
if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
- apply_handle_tuple_routing(edata,
- remoteslot, NULL, CMD_INSERT);
- else
- apply_handle_insert_internal(edata, edata->targetRelInfo,
- remoteslot);
+ remoteslot = apply_handle_tuple_routing(edata, CMD_INSERT, remoteslot,
+ &targetRelInfo);
+
+ /* For a partitioned table, insert the tuple into a partition. */
+ apply_handle_insert_internal(edata, targetRelInfo, remoteslot);
```

This part contains same comments, and no need to subsctitute in case of normal tables.
How about:

```
- /* For a partitioned table, insert the tuple into a partition. */
+ /*
+ * Find the actual target table if the table is partitioned. Otherwise, use
+ * the same table as the remote one.
+ */
if (rel->localrel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
- apply_handle_tuple_routing(edata,
- remoteslot, NULL, CMD_INSERT);
+ remoteslot = apply_handle_tuple_routing(edata, CMD_INSERT, remoteslot,
+ &targetRelInfo);
else
- apply_handle_insert_internal(edata, edata->targetRelInfo,
- remoteslot);
+ targetRelInfo = edata->targetRelInfo;
+
+ /* Insert a tuple to the target table */
+ apply_handle_insert_internal(edata, targetRelInfo, remoteslot);
```

02. apply_handle_tuple_routing()

```
/*
- * This handles insert, update, delete on a partitioned table.
+ * Determine the partition in which the tuple in slot is to be inserted, and
...
```

But this function is called from delete_internal(). How about "Determine the
partition to which the tuple in the slot belongs."?

03. apply_handle_tuple_routing()

Do you have a reason why this does not return `ResultRelInfo *` but `TupleTableSlot *`?
Not sure, but it is more proper for me to return the table info because this is a
routing function.

04. apply_handle_update()

```
+ targetRelInfo = edata->targetRelInfo;
+ targetrel = rel;
+ remoteslot_root = remoteslot;
```

Here I can say the same thing as 1.

05. apply_handle_update_internal()

It looks the ordering of function's implementations are changed. Is it intentaional?

before

apply_handle_update
apply_handle_update_internal
apply_handle_delete
apply_handle_delete_internal
FindReplTupleInLocalRel
apply_handle_tuple_routing

after

apply_handle_update
apply_handle_delete
apply_handle_delete_internal
FindReplTupleInLocalRel
apply_handle_tuple_routing
apply_handle_update_internal

06. apply_handle_delete_internal()

```
+ targetRelInfo = edata->targetRelInfo;
+ targetrel = rel;
+
```

Here I can say the same thing as 1.

Best regards,
Hayato Kuroda
FUJITSU LIMITED

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message shveta malik 2024-07-31 09:08:46 Re: Logical Replication of sequences
Previous Message Heikki Linnakangas 2024-07-31 08:47:48 Re: Remove last traces of HPPA support