From: | Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com> |
---|---|
To: | Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> |
Cc: | Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie> |
Subject: | Re: ON CONFLICT DO UPDATE for partitioned tables |
Date: | 2018-03-22 04:34:51 |
Message-ID: | CABOikdOQ3iLmSWpgmYnOmMXo05cG8rRDWpajVmEE8jnXeQF-=A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Mar 20, 2018 at 10:09 AM, Amit Langote <
Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> On 2018/03/20 13:30, Amit Langote wrote:
> > I have incorporated your patch in the main patch after updating the
> > comments a bit. Also, now that 6666ee49f49 is in [1], the transition
> > table related tests I proposed yesterday pass nicely. Instead of posting
> > as a separate patch, I have merged it with the main patch. So now that
> > planner refactoring is unnecessary, attached is just one patch.
>
> Sorry, I forgot to remove a hunk in the patch affecting
> src/include/optimizer/prep.h. Fixed in the attached updated version.
>
Thanks for writing a new version. A few comments:
<listitem>
<para>
- Using the <literal>ON CONFLICT</literal> clause with partitioned
tables
- will cause an error if the conflict target is specified (see
- <xref linkend="sql-on-conflict" /> for more details on how the
clause
- works). Therefore, it is not possible to specify
- <literal>DO UPDATE</literal> as the alternative action, because
- specifying the conflict target is mandatory in that case. On the
other
- hand, specifying <literal>DO NOTHING</literal> as the alternative
action
- works fine provided the conflict target is not specified. In that
case,
- unique constraints (or exclusion constraints) of the individual leaf
- partitions are considered.
- </para>
- </listitem>
We should document it somewhere that partition key update is not supported
by
ON CONFLICT DO UPDATE
/*
* get_partition_parent
+ * Obtain direct parent or topmost ancestor of given relation
*
- * Returns inheritance parent of a partition by scanning pg_inherits
+ * Returns direct inheritance parent of a partition by scanning
pg_inherits;
+ * or, if 'getroot' is true, the topmost parent in the inheritance
hierarchy.
*
* Note: Because this function assumes that the relation whose OID is
passed
* as an argument will have precisely one parent, it should only be called
* when it is known that the relation is a partition.
*/
Given that most callers only look for immediate parent, I wonder if it makes
sense to have a new function, get_partition_root(), instead of changing
signature of the current function. That will reduce foot-print of this patch
quite a lot.
@@ -36,6 +38,7 @@ static char
*ExecBuildSlotPartitionKeyDescription(Relation rel,
Datum *values,
bool *isnull,
int maxfieldlen);
+static List *adjust_onconflictset_tlist(List *tlist, TupleConversionMap
*map);
We should name this function in a more generic way, given that it's going
to be
used for other things too. What about adjust_partition_tlist?
+
+ /*
+ * If partition's tuple descriptor differs from the root parent,
+ * we need to adjust the onConflictSet target list to account for
+ * differences in attribute numbers.
+ */
+ if (map != NULL)
+ {
+ /*
+ * First convert Vars to contain partition's atttribute
+ * numbers.
+ */
+
+ /* Convert Vars referencing EXCLUDED pseudo-relation. */
+ onconflset = map_partition_varattnos(node->onConflictSet,
+ INNER_VAR,
+ partrel,
+ firstResultRel, NULL);
Are we not modifying node->onConflictSet in place? Or does
map_partition_varattnos() create a fresh copy before scribbling on the
input?
If it's former then I guess that's a problem. If it's latter then we ought
to
have comments explaining that.
+ tupDesc = ExecTypeFromTL(onconflset, partrelDesc->tdhasoid);
+ PinTupleDesc(tupDesc);
Why do we need to pin the descriptor? If we do need, why don't we need
corresponding ReleaseTupDesc() call?
I am still confused if the partition_conflproj_tdescs really belongs to
PartitionTupleRouting or should it be a part of the ResultRelInfo. FWIW for
the
MERGE patch, I moved everything to a new struct and made it part of the
ResultRelInfo. If no re-mapping is necessary, we can just point to the
struct
in the root relation's ResultRelInfo. Otherwise create and populate a new
one
in the partition relation's ResultRelInfo.
+ leaf_part_rri->ri_onConflictSetWhere =
+ ExecInitQual(onconflwhere, &mtstate->ps);
+ }
So ri_onConflictSetWhere and ri_onConflictSetProj are part of the
ResultRelInfo. Why not move mt_conflproj_tupdesc,
partition_conflproj_tdescs,
partition_arbiter_indexes etc to the ResultRelInfo as well?
+
+/*
+ * Adjust the targetlist entries of an inherited ON CONFLICT DO UPDATE
+ * operation for a given partition
+ *
As I said above, we should disassociate this function from ON CONFLICT DO
UPDATE and rather have it as a general purpose facility.
+ * The expressions have already been fixed, but we have to make sure that
the
+ * target resnos match the partition. In some cases, this can force us to
+ * re-order the tlist to preserve resno ordering.
+ *
Can we have some explanation regarding how the targetlist is reordered? I
know
the function does that by updating the resno in place, but some explanation
would help. Also, should we add an assertion-build check to confirm that the
resultant list is actually ordered?
@@ -66,7 +67,8 @@ static TupleTableSlot
*ExecPrepareTupleRouting(ModifyTableState *mtstate,
EState *estate,
PartitionTupleRouting *proute,
ResultRelInfo *targetRelInfo,
- TupleTableSlot *slot);
+ TupleTableSlot *slot,
+ int *partition_index);
static ResultRelInfo *getTargetResultRelInfo(ModifyTableState *node);
static void ExecSetupChildParentMapForTcs(ModifyTableState *mtstate);
static void ExecSetupChildParentMapForSubplan(ModifyTableState *mtstate);
@@ -264,6 +266,7 @@ ExecInsert(ModifyTableState *mtstate,
TupleTableSlot *slot,
TupleTableSlot *planSlot,
EState *estate,
+ int partition_index,
bool canSetTag)
{
HeapTuple tuple;
If we move the list of arbiter indexes and the tuple desc to ResultRelInfo,
as
suggested above, I think we can avoid making any API changes to
ExecPrepareTupleRouting and ExecInsert.
Thanks,
Pavan
--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Ashutosh Bapat | 2018-03-22 04:36:07 | Re: [HACKERS] Partition-wise aggregation/grouping |
Previous Message | Amit Langote | 2018-03-22 04:07:05 | Re: [HACKERS] path toward faster partition pruning |