Re: Virtual generated columns

From: Peter Eisentraut <peter(at)eisentraut(dot)org>
To: jian he <jian(dot)universality(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Virtual generated columns
Date: 2024-08-20 10:38:25
Message-ID: 5e20ded7-4074-4c40-91e2-bfe86fe373f1@eisentraut.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks for the great testing again. Here is an updated patch that
addresses the issues you have pointed out.

On 14.08.24 02:00, jian he wrote:
> On Thu, Aug 8, 2024 at 2:23 PM Peter Eisentraut <peter(at)eisentraut(dot)org> wrote:
>>
>> Thank you for your extensive testing. Here is a new patch set that has
>> fixed all the issues you have reported (MERGE, sublinks, statistics,
>> ANALYZE).
>
> if (coldef->generated && restdef->generated &&
> coldef->generated != restdef->generated)
> ereport(ERROR,
> (errcode(ERRCODE_INVALID_COLUMN_DEFINITION),
> errmsg("column \"%s\" inherits from
> generated column of different kind",
> restdef->colname)));
> the error message is not informal. maybe add errhint that
> "column \"%s\" should be same as parent table's generated column kind:
> %s", "virtual"|"stored"

Ok, I added an errdetail().

> .../regress/expected/create_table_like.out | 23 +-
> .../regress/expected/generated_stored.out | 27 +-
> ...rated_stored.out => generated_virtual.out} | 835 +++++++++---------
> src/test/regress/parallel_schedule | 3 +
> src/test/regress/sql/create_table_like.sql | 2 +-
> src/test/regress/sql/generated_stored.sql | 10 +-
> ...rated_stored.sql => generated_virtual.sql} | 301 ++++---
> src/test/subscription/t/011_generated.pl | 38 +-
> 55 files changed, 1280 insertions(+), 711 deletions(-)
> copy src/test/regress/expected/{generated_stored.out
> generated_virtual.out} (69%)
> copy src/test/regress/sql/{generated_stored.sql => generated_virtual.sql} (72%)
>
> I don't understand the "copy =>" part, I guess related to copy content
> from stored to virtual.
> anyway. some minor issue:

That's just what git format-patch produces. It shows that 72% of the
new file is a copy of the existing file.

> -- alter generation expression of parent and all its children altogether
> ALTER TABLE gtest_parent ALTER COLUMN f3 SET EXPRESSION AS (f2 * 2);
> \d gtest_parent
> \d gtest_child
> \d gtest_child2
> \d gtest_child3
> SELECT tableoid::regclass, * FROM gtest_parent ORDER BY 1, 2, 3;
>
> The first line ALTER TABLE will fail for
> src/test/regress/sql/generated_virtual.sql.
> so no need
> """
> \d gtest_parent
> \d gtest_child
> \d gtest_child2
> \d gtest_child3
> SELECT tableoid::regclass, * FROM gtest_parent ORDER BY 1, 2, 3;
> """
>
> Similarly the following tests for gtest29 may aslo need change
> -- ALTER TABLE ... ALTER COLUMN ... DROP EXPRESSION
>
> since we cannot do ALTER TABLE SET EXPRESSION for virtual generated columns.

I left all these tests in place from the equivalent STORED tests, in
case we want to add support for the VIRTUAL case as well. I expect that
we'll add support for some of these before too long.

> -- ALTER TABLE ... ALTER COLUMN
> CREATE TABLE gtest27 (
> a int,
> b int,
> x int GENERATED ALWAYS AS ((a + b) * 2) VIRTUAL
> );
> INSERT INTO gtest27 (a, b) VALUES (3, 7), (4, 11);
> ALTER TABLE gtest27 ALTER COLUMN a TYPE text; -- error
> ALTER TABLE gtest27 ALTER COLUMN x TYPE numeric;
>
> will
> ALTER TABLE gtest27 ALTER COLUMN a TYPE int4;
> be a no-op?

Changing the type of a column that is used by a generated column is
already prohibited. Are you proposing to change anything here?

> do we need a document that virtual generated columns will use the
> expression's collation.
> see:
> drop table if exists t5;
> CREATE TABLE t5 (
> a text collate "C",
> b text collate "C" GENERATED ALWAYS AS (a collate case_insensitive) ,
> d int DEFAULT 22
> );
> INSERT INTO t5(a,d) values ('d1',28), ('D2',27), ('D1',26);
> select * from t5 order by b asc, d asc;

I have fixed this. It will now apply the collation of the column.

> create domain mydomain as int4;
> create type mydomainrange as range(subtype=mydomain);
> CREATE TABLE t3( b bigint, c mydomain GENERATED ALWAYS AS ('11') VIRTUAL);
> CREATE TABLE t3( b bigint, c mydomainrange GENERATED ALWAYS AS
> ('[4,50)') VIRTUAL);
> domain will error out, domain over range is ok, is this fine?

Fixed. The check is now in CheckAttributeType() in heap.c, which has
the ability to recurse into composite data types.

> + When <literal>VIRTUAL</literal> is specified, the column will be
> + computed when it is read, and it will not occupy any storage. When
> + <literal>STORED</literal> is specified, the column will be computed on
> + write and will be stored on disk. <literal>VIRTUAL</literal> is the
> + default.
> drop table if exists t5;
> CREATE TABLE t5 (
> a int,
> b text storage extended collate "C" GENERATED ALWAYS AS (a::text
> collate case_insensitive) ,
> d int DEFAULT 22
> );
> select reltoastrelid <> 0 as has_toast_table from pg_class where oid =
> 't5'::regclass;
>
> if really no storage, should table t5 have an associated toast table or not?
> also check ALTER TABLE variant:
> alter table t5 alter b set storage extended;

Fixed. It does not trigger a toast table now.

> Do we need to do something in ATExecSetStatistics for cases like:
> ALTER TABLE t5 ALTER b SET STATISTICS 2000;
> (b is a generated virtual column).
> because of
> examine_attribute
> if (attr->attgenerated == ATTRIBUTE_GENERATED_VIRTUAL)
> return NULL;
> i guess, this won't have a big impact.

This is also an error now.

> There are some issues with changing virtual generated column type.
> like:
> drop table if exists another;
> create table another (f4 int, f2 text, f3 text, f1 int GENERATED
> ALWAYS AS (f4));
> insert into another values(1, 'one', 'uno'), (2, 'two', 'due'),(3,
> 'three', 'tre');
> alter table another
> alter f1 type text using f2 || ' and ' || f3 || ' more';
> table another;
>
> or
> alter table another
> alter f1 type text using f2 || ' and ' || f3 || ' more',
> drop column f1;
> ERROR: column "f1" of relation "another" does not exist
>
> These two command outputs seem not right.
> the stored generated column which works as expected.

I noticed this is already buggy for stored generated columns. It should
prevent the use of the USING clause here. I'll propose a fix for that
in a separate thread. There might be further adjustments needed for
changing the types of virtual columns, but I'll come back to that after
the existing bug is fixed.

Attachment Content-Type Size
v3-0001-Rename-regress-test-generated-to-generated_stored.patch text/plain 2.1 KB
v3-0002-Put-generated_stored-test-objects-in-a-schema.patch text/plain 19.0 KB
v3-0003-Virtual-generated-columns.patch text/plain 186.6 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2024-08-20 10:40:22 Re: long-standing data loss bug in initial sync of logical replication
Previous Message Amul Sul 2024-08-20 10:26:23 Re: pg_verifybackup: TAR format backup verification