From: | gkokolatos(at)pm(dot)me |
---|---|
To: | Amit Langote <amitlangote09(at)gmail(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)fujitsu(dot)com> |
Subject: | Re: Allow batched insert during cross-partition updates |
Date: | 2021-03-16 08:11:57 |
Message-ID: | NB-VVc_lc77vRhkSv4QA3Nzew1g-qiLD1zw6arRgwnDnpNKkKE77z123nmotxPU1e0UQV5v9DXXgMKHLXoxLiC9EFuHXlPO7U3vyJ2gb82o=@pm.me |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Tuesday, March 16, 2021 6:13 AM, Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> Hi Georgios,
>
> On Fri, Mar 12, 2021 at 7:59 PM gkokolatos(at)pm(dot)me wrote:
>
> > On Friday, March 12, 2021 3:45 AM, Amit Langote amitlangote09(at)gmail(dot)com wrote:
> >
> > > On Thu, Mar 11, 2021 at 8:36 PM gkokolatos(at)pm(dot)me wrote:
> > >
> > > > On Thursday, March 11, 2021 9:42 AM, Amit Langote amitlangote09(at)gmail(dot)com wrote:
> > > >
> > > > > What we do support however is moving rows from a local partition to a
> > > > > remote partition and that involves performing an INSERT on the latter.
> > > > > This patch is for teaching those INSERTs to use batched mode if
> > > > > allowed, which is currently prohibited. So with this patch, if an
> > > > > UPDATE moves 10 rows from a local partition to a remote partition,
> > > > > then they will be inserted with a single INSERT command containing all
> > > > > 10 rows, instead of 10 separate INSERT commands.
> > > >
> > > > So, if I understand correctly then in my previously attached repro I
> > > > should have written instead:
> > > >
> > > > CREATE TABLE local_root_remote_partitions (a int) PARTITION BY LIST ( a );
> > > > CREATE TABLE
> > > > local_root_local_partition_1
> > > > PARTITION OF
> > > > local_root_remote_partitions FOR VALUES IN (1);
> > > >
> > > > CREATE FOREIGN TABLE
> > > > local_root_remote_partition_2
> > > > PARTITION OF
> > > > local_root_remote_partitions FOR VALUES IN (2)
> > > > SERVER
> > > > remote_server
> > > > OPTIONS (
> > > > table_name 'remote_partition_2',
> > > > batch_size '10'
> > > > );
> > > >
> > > > INSERT INTO local_root_remote_partitions VALUES (1), (1);
> > > > -- Everything should be on local_root_local_partition_1 and on the same transaction
> > > > SELECT ctid, xmin, xmax, cmax, tableoid::regclass, a FROM local_root_remote_partitions;
> > > >
> > > > UPDATE local_root_remote_partitions SET a = 2;
> > > > -- Everything should be on remote_partition_2 and on the same transaction
> > > > SELECT ctid, xmin, xmax, cmax, tableoid::regclass, a FROM local_root_remote_partitions;
> > > >
> > > >
> > > > I am guessing that I am still wrong because the UPDATE operation above will
> > > > fail due to the restrictions imposed in postgresBeginForeignInsert regarding
> > > > UPDATES.
> > >
> > > Yeah, for the move to work without hitting the restriction you
> > > mention, you will need to write the UPDATE query such that
> > > local_root_remote_partition_2 is not updated. For example, as
> > > follows:
> > > UPDATE local_root_remote_partitions SET a = 2 WHERE a <> 2;
> > > With this query, the remote partition is not one of the result
> > > relations to be updated, so is able to escape that restriction.
> >
> > Excellent. Thank you for the explanation and patience.
> >
> > > > Would it be too much to ask for the addition of a test case that will
> > > > demonstrate the change of behaviour found in patch.
> > >
> > > Hmm, I don't think there's a way to display whether the INSERT done on
> > > the remote partition as a part of an (tuple-moving) UPDATE used
> > > batching or not. That's because that INSERT's state is hidden from
> > > EXPLAIN. Maybe we should change EXPLAIN to make it show such hidden
> > > INSERT's state (especially its batch size) under the original UPDATE
> > > node, but I am not sure.
> >
> > Yeah, there does not seem to be a way for explain to do show that information
> > with the current code.
> >
> > > By the way, the test case added by commit 927f453a94106 does exercise
> > > the code added by this patch, but as I said in the last paragraph, we
> > > can't verify that by inspecting EXPLAIN output.
> >
> > I never doubted that. However, there is a difference. The current patch
> > changes the query to be executed in the remote from:
> > INSERT INTO <snip> VALUES ($1);
> > to:
> > INSERT INTO <snip> VALUES ($1), ($2) ... ($n);
> > When this patch gets in, it would be very helpful to know that subsequent
> > code changes will not cause regressions. So I was wondering if there is
> > a way to craft a test case that would break for the code in 927f453a94106
> > yet succeed with the current patch.
>
> The test case "works" both before and after the patch, with the
> difference being in the form of the remote query. It seems to me
> though that you do get that.
>
> > I attach version 2 of my small reproduction. I am under the impression that
> > in this version, examining the value of cmin in the remote table should
> > give an indication of whether the remote received a multiple insert queries
> > with a single value, or a single insert query with multiple values.
> > Or is this a wrong assumption of mine?
>
> No, I think you have a good idea here.
Thank you.
>
> I've adjusted that test case to confirm that the batching indeed works
> by checking cmin of the moved rows, as you suggest. Please check the
> attached updated patch.
Excellent. The patch in the current version with the added test seems
ready to me.
I would still vote to have accessor functions instead of exposing the
whole PartitionTupleRouting struct, but I am not going to hold a too
strong stance about it.
If you agree with me, please provide an updated version of the patch.
Otherwise let it be known and I will flag the patch as RfC in the
commitfest app.
Cheers,
//Georgios
>
> --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Amit Langote
> EDB: http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | Kyotaro Horiguchi | 2021-03-16 08:17:08 | Re: shared-memory based stats collector |
Previous Message | Fujii Masao | 2021-03-16 08:10:28 | Re: fdatasync performance problem with large number of DB files |