Re: Declarative partitioning - another take

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Declarative partitioning - another take
Date: 2016-11-28 14:42:21
Message-ID: CAFjFpRedgcToMdWTDvTD933Qm4H7gGFLsr3wHoF087BdgvXa6g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here are some comments on 0003 patch.
1. In ALTER TABLE documentation we should refer to CREATE TABLE documentation
for partition_bound_spec syntax, similar ADD table_constraint note.

2. I think, "of the target table" is missing after all the ... constraints
+ match. Also, it must have all the <literal>NOT NULL</literal> and
+ <literal>CHECK</literal> constraints present in the target table.

3. I think, using "any of the" instead of "some" is preferred in the following
sentence. In the following sentence, we should use "such a constraint" instead
of "constraints" to avoid mixed singular and plural usage.
+ If some <literal>CHECK</literal> constraint of the table being attached

4. This construction is ambiguous. "are not considered" - considered where? for
what? Constraints on which object?
+ clause. Currently <literal>UNIQUE</literal>, <literal>PRIMARY
KEY</literal>,
+ and <literal>FOREIGN KEY</literal> constraints are not considered.

5. What is a partition constraint? Do we define that term anywhere in the
documentation? If not we should define that term and then use it here.
+ A full table scan is performed on the table being attached to check that
+ no existing row in the table violates the partition constraint. It is

6. The paragraph following
+ A full table scan is performed on the table
being attached to check that seems to be confusing. It says that the
potentially expensive scan can be avoided by adding a constraint equivalent to
partition constraint. That seems to be wrong. The table will be scanned anyway,
when adding a valid constraint per ALTER TABLE ADD table_constraint
documentation. Instead you might want to say that the table scan will not be
carried out if the existing constraints imply the partition constraints.

7. You might want to specify the fate of range or lists covered by the
partition being detached in DETACH PARTITION section.

8. Since partition bound specification is more than a few words, you might want
to refer to the actual syntax in CREATE TABLE command.
+ <term><replaceable
class="PARAMETER">partition_bound_spec</replaceable></term>
+ <listitem>
+ <para>
+ The partition bound specification for a new partition.
+ </para>
+ </listitem>
+ </varlistentry>

9. I think, we need to use "may be scanned" instead of "is scanned", given that
some other part of the documentation talks about "avoiding" the table scan. May
be refer to that documentation to clarify the uncertainty.
+ Similarly, when attaching a new partition it is scanned to verify that

10. The same paragraph talks about multiple ALTER TABLE commands being run
together to avoid multiple table rewrites. Do we want to document whether
ATTACH/DETACH partition can be run with any other command or not.

11. ATTACH partition documentation is not talking about the unlogged or
temporary tables being attached to is logged or non-temporary. What is current
code doing about these cases? Do we allow such mixed partition hierarchy?
+ existing rows meet the partition constraint.

I will continue to look at the patch.

On Thu, Nov 24, 2016 at 4:04 PM, Amit Langote
<Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>
> Hi Rushabh,
>
> On 2016/11/22 22:11, Rushabh Lathia wrote:
>> Hi Amit,
>>
>> I was just reading through your patches and here are some quick review
>> comments
>> for 0001-Catalog-and-DDL-for-partitioned-tables-17.patch.
>
> Thanks for the review!
>
>>
>> Review comments for 0001-Catalog-and-DDL-for-partitioned-tables-17.patch:
>>
>> 1)
>> @@ -1102,9 +1104,10 @@ heap_create_with_catalog(const char *relname,
>> {
>> /* Use binary-upgrade override for pg_class.oid/relfilenode? */
>> if (IsBinaryUpgrade &&
>> - (relkind == RELKIND_RELATION || relkind == RELKIND_SEQUENCE ||
>> - relkind == RELKIND_VIEW || relkind == RELKIND_MATVIEW ||
>> - relkind == RELKIND_COMPOSITE_TYPE || relkind ==
>> RELKIND_FOREIGN_TABLE))
>> + (relkind == RELKIND_RELATION || relkind ==
>> RELKIND_PARTITIONED_TABLE ||
>> + relkind == RELKIND_SEQUENCE || relkind == RELKIND_VIEW ||
>> + relkind == RELKIND_MATVIEW || relkind ==
>> RELKIND_COMPOSITE_TYPE ||
>> + relkind == RELKIND_FOREIGN_TABLE))
>>
>> You should add the RELKIND_PARTITIONED_TABLE at the end of if condition that
>> will make diff minimal. While reading through the patch I noticed that there
>> is inconsistency - someplace its been added at the end and at few places its
>> at the start. I think you can make add it at the end of condition and be
>> consistent with each place.
>
> OK, done.
>
>>
>> 2)
>>
>> + /*
>> + * We need to transform the raw parsetrees corresponding to
>> partition
>> + * expressions into executable expression trees. Like column
>> defaults
>> + * and CHECK constraints, we could not have done the transformation
>> + * earlier.
>> + */
>>
>>
>> Additional space before "Like column defaults".
>
> I think it's a common practice to add two spaces after a sentence-ending
> period [https://www.gnu.org/prep/standards/html_node/Comments.html] which
> it seems, is followed more or less regularly in the formatting of the
> comments in the PostgreSQL source code.
>
>> 3)
>> - char relkind;
>> + char relkind,
>> + expected_relkind;
>>
>> Newly added variable should be define separately with its type. Something
>> like:
>>
>> char relkind;
>> + char expected_relkind;
>
> OK, done.
>
>>
>> 4)
>>
>> a)
>> + /* Prevent partitioned tables from becoming inheritance parents */
>> + if (parent_rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
>> + ereport(ERROR,
>> + (errcode(ERRCODE_WRONG_OBJECT_TYPE),
>> + errmsg("cannot inherit from partitioned table \"%s\"",
>> + parent->relname)));
>> +
>>
>> need alignment for last line.
>
> Fixed.
>
>> b)
>> + atttuple = SearchSysCacheAttName(RelationGetRelid(rel),
>> pelem->name);
>> + if (!HeapTupleIsValid(atttuple))
>> + ereport(ERROR,
>> + (errcode(ERRCODE_UNDEFINED_COLUMN),
>> + errmsg("column \"%s\" named in partition key does
>> not exist",
>> + pelem->name)));
>> + attform = (Form_pg_attribute) GETSTRUCT(atttuple);
>> +
>> + if (attform->attnum <= 0)
>> + ereport(ERROR,
>> + (errcode(ERRCODE_UNDEFINED_COLUMN),
>> + errmsg("cannot use system column \"%s\" in
>> partition key",
>> + pelem->name)));
>>
>> need alignment for last line of ereport
>
> Fixed.
>
>> c)
>> + /* Disallow ROW triggers on partitioned tables */
>> + if (stmt->row && rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
>> + ereport(ERROR,
>> + (errcode(ERRCODE_WRONG_OBJECT_TYPE),
>> + errmsg("\"%s\" is a partitioned table",
>> + RelationGetRelationName(rel)),
>> + errdetail("Partitioned tables cannot have ROW triggers.")));
>>
>> need alignment
>
> Fixed.
>
>> 5)
>>
>> @@ -2512,6 +2579,7 @@ transformAlterTableStmt(Oid relid, AlterTableStmt
>> *stmt,
>> cxt.blist = NIL;
>> cxt.alist = NIL;
>> cxt.pkey = NULL;
>> + cxt.ispartitioned = rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE;
>>
>> I think adding bracket will look code more clear.
>>
>> + cxt.ispartitioned = (rel->rd_rel->relkind ==
>> RELKIND_PARTITIONED_TABLE);
>
> Agreed, done.
>
>> 6)
>>
>> + * RelationBuildPartitionKey
>> + * Build and attach to relcache partition key data of relation
>> + *
>> + * Partitioning key data is stored in CacheMemoryContext to ensure it
>> survives
>> + * as long as the relcache. To avoid leaking memory in that context in
>> case
>> + * of an error partway through this function, we build the structure in the
>> + * working context (which must be short-lived) and copy the completed
>> + * structure into the cache memory.
>>
>> extra space before "To avoid leaking memory"
>
> Same thing I said above.
>
>> 7)
>> + /* variable-length fields start here, but we allow direct access to
>> partattrs */
>> + int2vector partattrs; /* attribute numbers of columns in
>> the
>>
>> Why partattrs is allow direct access - its not really clear from the
>> comments.
>
> I have copied the comment from another catalog header file (a number of
> catalog headers have the same comment). I updated the new file's comment
> to say a little bit more; I wonder if the comment should be updated in
> other files as well? However, I noticed that there are explanatory notes
> elsewhere (for example, around the code that reads such a field from the
> catalog) about why the first variable-length field of a catalog's tuple
> (especially of type int2vector or oidvector) are directly accessible via
> their C struct offsets.
>
>> I will continue reading more patch and testing functionality.. will share
>> the
>> comments as I have it.
>
> Updated patches attached. I have also considered Robert's comments [1] as
> follows and fixed a bug that Ashutosh reported [2]:
>
> - Force partition key columns to be NOT NULL at the table creation time
> if using range partitioning
> - Use enum to represent the content of individual range datums (viz.
> finite datum, -infinity, +infinity) in the relcache struct
>
> Thanks,
> Amit
>
> [1]
> https://www.postgresql.org/message-id/CA+TgmoZ-feQsxc7U_JerM_AFChp3Qf6btK708SAe7M8Vdv5=jA@mail.gmail.com
> [2]
> https://www.postgresql.org/message-id/306c85e9-c702-3742-eeff-9b7a40498afc%40lab.ntt.co.jp

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2016-11-28 14:49:01 Re: Mail thread references in commits
Previous Message Paul Ramsey 2016-11-28 14:33:13 Re: User-defined Operator Pushdown and Collations