Re: Set query_id for query contained in utility statement

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Anthonin Bonnefoy <anthonin(dot)bonnefoy(at)datadoghq(dot)com>
Cc: jian he <jian(dot)universality(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Set query_id for query contained in utility statement
Date: 2024-10-23 06:10:19
Message-ID: ZxiTS_v6mYpcPHsa@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Oct 22, 2024 at 11:34:55AM +0200, Anthonin Bonnefoy wrote:
> On Tue, Oct 22, 2024 at 7:06 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>> Something that
>> worries me a bit is that this changes makes the code less clean, by
>> having a SELECT-INTO specific routine called in the parse-analyze
>> paths, while adding three individual paths in charge of setting
>> pstate->p_stmt_len and p_stmt_location.
>
> I've moved pstate's p_stmt_len and p_stmt_location assignment to
> transformTopLevelStmt (and also restored transformTopLevelStmt). This
> will remove the multiple assignment paths.
>
>> + n->stmt_len = @3 - @2;
>>
>> This specific case deserves a comment.
>
> I think I went over this 3 times thinking "maybe I should add a
> comment here". Added.

Thanks. These changes look OK.

> This is doable. I've moved the query's location and length assignment
> to the end of transformStmt which will call setQueryLocationAndLength.
> The logic of manipulating locations and lengths will only happen in a
> single place. That creates an additional switch on the node's type as
> a small trade off.

Grouping both assignments in a single setQueryLocationAndLength() is
less confusing.

> Also, there was an unnecessary cast in analyze.c "result->utilityStmt
> = (Node *) parseTree;" as parseTree is already a Node. I removed it in
> 0001.

Indeed. It does not matter one way or another and we have plenty of
these in the tree.

I have some more minor comments.

- if (@$ < 0) /* see comments for YYLLOC_DEFAULT */
- @$ = @2;

With 14e5680eee19 now in the tree (interesting timing as this did not
exist until yesterday), it looks like we don't need these ones
anymore, no?

@@ -18943,11 +19004,13 @@ insertSelectOptions(SelectStmt *stmt,
+ /* Update SelectStmt's location to the start of the with clause */
+ stmt->stmt_location = withClause->location;

I have been wondering for a bit what this is about, and indeed this
makes the location setup easier to think about with the various SELECT
rules we need to deal with (the ones calling insertSelectOptions), and
WITH is just in a subset of them. So LGTM.

+ ParseLoc p_stmt_location; /* start location, or -1 if unknown */
+ ParseLoc p_stmt_len; /* length in bytes; 0 means "rest of string" */

So, the reason why these two fields are added to the ParseState is
that the lower layers in charge of the query transforms don't have to
RawStmt so as the location and lengths can be adjusted when queries
are between parenthesis. I was first wondering if we should push
RawStmt deeper into the argument stack, but based on the stmt_location
assignments for the DMLs and WITH, storing that in the ParseState
looks neater. The patch is lacking a description of these two fields
at the top of the ParseState structure in parse_node.h. This stuff
needs to be explained, aka we need them to be able to adjust the
locations and lengths depending on inner clauses of the queries we are
dealing with, or something like that.

While reviewing the whole, I've done some changes, mostly stylistic.
Please see the attach about them, that I have kept outside of your
v11-0001 for clarity. I still need to dive deeper into v11-0002 (not
attached here), but let's take one step at a time and conclude on
v11-0001 first..
--
Michael

Attachment Content-Type Size
v12-0001-Track-location-to-extract-relevant-part-in-neste.patch text/x-diff 40.2 KB
v12-0002-Some-edits-from-me.patch text/x-diff 6.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2024-10-23 06:21:17 Re: Pgoutput not capturing the generated columns
Previous Message Peter Smith 2024-10-23 05:00:05 Re: Pgoutput not capturing the generated columns