Re: Virtual generated columns

From: jian he <jian(dot)universality(at)gmail(dot)com>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(at)eisentraut(dot)org>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Virtual generated columns
Date: 2024-08-29 09:01:00
Message-ID: CACJufxE46Gb+crfP5pySqKPsix4vPtCUavV4xKoYA8+T9YiPRg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Aug 21, 2024 at 6:52 PM Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
>
> On Wed, 21 Aug 2024 at 08:00, Peter Eisentraut <peter(at)eisentraut(dot)org> wrote:
> >
> > On 08.08.24 20:22, Dean Rasheed wrote:
> > > Looking at the rewriter changes, it occurred to me that it could
> > > perhaps be done more simply using ReplaceVarsFromTargetList() for each
> > > RTE with virtual generated columns. That function already has the
> > > required wholerow handling code, so there'd be less code duplication.
> >
> > Hmm, I don't quite see how ReplaceVarsFromTargetList() could be used
> > here. It does have the wholerow logic that we need somehow, but other
> > than that it seems to target something different?
> >
>
> Well what I was thinking was that (in fireRIRrules()'s final loop over
> relations in the rtable), if the relation had any virtual generated
> columns, you'd build a targetlist containing a TLE for each one,
> containing the generated expression. Then you could just call
> ReplaceVarsFromTargetList() to replace any Vars in the query with the
> corresponding generated expressions. That takes care of descending
> into subqueries, adjusting varlevelsup, and expanding wholerow Vars
> that might refer to the generated expression.
>
> I also have half an eye on how this patch will interact with my patch
> to support RETURNING OLD/NEW values. If you use
> ReplaceVarsFromTargetList(), it should just do the right thing for
> RETURNING OLD/NEW generated expressions.
>
> > > I think it might be better to do this from within fireRIRrules(), just
> > > after RLS policies are applied, so it wouldn't need to worry about
> > > CTEs and sublink subqueries. That would also make the
> > > hasGeneratedVirtual flags unnecessary, since we'd already only be
> > > doing the extra work for tables with virtual generated columns. That
> > > would eliminate possible bugs caused by failing to set those flags.
> >
> > Yes, ideally, we'd piggy-back this into fireRIRrules(). One thing I'm
> > missing is that if you're descending into subqueries, there is no link
> > to the upper levels' range tables, which we need to lookup the
> > pg_attribute entries of column referencing Vars. That's why there is
> > this whole custom walk with its own context data. Maybe there is a way
> > to do this already that I missed?
> >
>
> That link to the upper levels' range tables wouldn't be needed because
> essentially using ReplaceVarsFromTargetList() flips the whole thing
> round: instead of traversing the tree looking for Var nodes that need
> to be replaced (possibly from upper query levels), you build a list of
> replacement expressions to be applied and apply them from the top,
> descending into subqueries as needed.
>
> Another argument for doing it that way round is to not add too many
> extra cycles to the processing of existing queries that don't
> reference generated expressions. ISTM that this patch is potentially
> adding quite a lot of additional overhead -- it looks like, for every
> Var in the tree, it's calling get_attgenerated(), which involves a
> syscache lookup to see if that column is a generated expression (which
> most won't be). Ideally, we should be trying to do the minimum amount
> of extra work in the common case where there are no generated
> expressions.
>
> Looking ahead, I can also imagine that one day we might want to
> support subqueries in generated expressions. That would require
> recursive processing of generated expressions in the generated
> expression's subquery, as well as applying RLS policies to the new
> relations pulled in, and checks to guard against infinite recursion.
> fireRIRrules() already has the infrastructure to support all of that,
> so that feels like a much more natural place to do this.
>

Is the attached something you are thinking of?
(mainly see changes of src/backend/rewrite/rewriteHandler.c)

i bloated rewriteHandler.c a lot, mainly because
expand_generated_columns_in_expr
not using ReplaceVarsFromTargetList, only expand_generated_columns_in_query do.

if we are using ReplaceVarsFromTargetList, then
expand_generated_columns_in_expr also needs to use ReplaceVarsFromTargetList?

I don't think we can call ReplaceVarsFromTargetList within
expand_generated_columns_in_expr.

if so, then the pattern would be like:
{
Node *tgqual;
tgqual = (Node *) expand_generated_columns_in_expr(tgqual,
relinfo->ri_RelationDesc, context);
ReplaceVarsFromTargetList
}

There are 6 of expand_generated_columns_in_expr called.

Attachment Content-Type Size
v4-0001-misc.no-cfbot application/octet-stream 196.0 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Korotkov 2024-08-29 09:01:56 Re: type cache cleanup improvements
Previous Message Jakub Wartak 2024-08-29 07:28:16 Re: Redux: Throttle WAL inserts before commit