| From: | Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> | 
|---|---|
| To: | Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> | 
| Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: pgbench - allow to store select results into variables | 
| Date: | 2016-09-12 07:33:39 | 
| Message-ID: | de93db2a-539e-4044-bca8-47f2afa01ce1@lab.ntt.co.jp | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
Hi Fabien,
On 2016/09/07 23:01, Fabien COELHO wrote:
>> Custom script looks like:
>>
>> \;
>> select a \into a
>>    from tab where a = 1;
>> \set i debug(:a)
>>
>> I get the following error:
>>
>> undefined variable "a"
>> client 0 aborted in state 1; execution of meta-command failed
> 
> Good catch!
> 
> Indeed, there is a problem with empty commands which are simply ignored by
> libpq/postgres if there are other commands around, so my synchronization
> between results & commands was too naive.
> 
> In order to fix this, I made the scanner also count empty commands and
> ignore comments. I guessed that proposing to change libpq/postgres
> behavior was not an option.
Seems to be fixed.
>> Comments on the patch follow:
>>
>> +    <listitem>
>> +     <para>
>> +      Stores the first fields of the resulting row from the current or
>> preceding
>> +      <command>SELECT</> command into these variables.
>>
>> Any command returning rows ought to work, no?
> 
> Yes. I put "SQL command" instead.
Check.
>> -    char       *line;           /* text of command line */
>> +    char       *line;           /* first line for short display */
>> +    char       *lines;          /* full multi-line text of command */
>>
>> When I looked at this and related hunks (and also the hunks related to
>> semicolons), it made me wonder whether this patch contains two logical
>> changes.  Is this a just a refactoring for the \into implementation or
>> does this provide some new externally visible feature?
> 
> There is essentially a refactoring that I did when updating how Command is
> managed because it has to be done in several stages to fit "into" into it
> and to take care of compounds.
> 
> However there was small "new externally visible feature": on -r, instead
> of cutting abruptly a multiline command at the end of the first line it
> appended "..." as an ellipsis because it looked nicer.
> I've removed this small visible changed.
There still seems to be a change in behavior of the -r option due to the
patch. Consider the following example:
# no \into used in the script
$ cat /tmp/into.sql
select a from a where a = 1 \;
select a+1 from a where a = 1;
\set a 1
\set b 2
\set i debug(:a)
\set i debug(:b)
$ pgbench -r -n -t 1 -f /tmp/into.sql postgres
<snip>
 - statement latencies in milliseconds:
         2.889  select a from a where a = 1 ;
         0.012  \set a 1
         0.009  \set b 2
         0.031  \set i debug(:a)
         0.014  \set i debug(:b)
# behavior wrt compound statement changes when \into is used
$ cat /tmp/into.sql
select a from a where a = 1 \; \into a
select a+1 from a where a = 1; \into b
\set i debug(:a)
\set i debug(:b)
$ pgbench -r -n -t 1 -f /tmp/into.sql postgres
<snip>
 - statement latencies in milliseconds:
         2.093  select a from a where a = 1 ; select a+1 from a where a = 1;
         0.034  \set i debug(:a)
         0.015  \set i debug(:b)
One more thing I observed which I am not sure if it's a fault of this
patch is illustrated below:
$ cat /tmp/into.sql
\;
select a from a where a = 1 \;
select a+1 from a where a = 1;
\set a 1
\set b 2
\set i debug(:a)
\set i debug(:b)
$ pgbench -r -n -t 1 -f /tmp/into.sql postgres
<snip>
 - statement latencies in milliseconds:
         2.349  ;
         0.013  \set a 1
         0.009  \set b 2
         0.029  \set i debug(:a)
         0.015  \set i debug(:b)
Note that the compound select statement is nowhere to be seen in the
latencies output. The output remains the same even if I use the \into's.
What seems to be going on is that the empty statement on the first line
(\;) is the only part kept of the compound statement spanning lines 1-3.
>> Also I wonder if intonames or intoargs would be a slightly better name
>> for the field.
> 
> I put "intovars" as they are variable names.
Sounds fine.
>> +                fprintf(stderr,
>> +                        "client %d state %d compound %d: "
>> +                        "cannot apply \\into to non SELECT statement\n",
>> +                        st->id, st->state, compound);
>>
>> How about make this error message say something like other messages
>> related to \into, perhaps something like: "\\into cannot follow non SELECT
>> statements\n"
> 
> As you pointed out above, there may be statements without "SELECT" which
> which return a row. I wrote "\\into expects a row" instead.
Sounds fine.
> 
>>             /*
>>              * Read and discard the query result; note this is not
>> included in
>> -             * the statement latency numbers.
>> +             * the statement latency numbers (above), thus if reading the
>> +             * response fails the transaction is counted nevertheless.
>>              */
>>
>> Does this comment need to mention that the result is not discarded when
>> \into is specified?
> 
> Hmmm. The result structure is discarded, but the results are copied into
> variables before that, so the comments seems ok...
Hmm, OK.
>> +    bool        sql_command_in_progress = false;
>>
>> This variable's name could be slightly confusing to readers. If I
>> understand its purpose correctly, perhaps it could be called
>> sql_command_continues.
> 
> It is possible. I like 'in progress' though. Why is it confusing?
> It means that the current command is not finished yet and more is
> expected, that is the final ';' has not been encountered.
I understood that it refers to what you explain here.  But to me it
sounded like the name is referring to the progress of *execution* of a SQL
command whereas the code in question is simply expecting to finish
*parsing* the SQL command using the next lines.  It may be fine though.
>> +                    if (index == 0)
>> +                        syntax_error(desc, lineno, NULL, NULL,
>> +                                     "\\into cannot start a script",
>> +                                     NULL, -1);
>>
>> How about: "\\into cannot be at the beginning of a script" ?
> 
> It is possible, but it's quite longer... I'm not a native speaker, so I'm
> do not know whether it would be better.
Me neither, let's leave it for the committer to decide.
> The attached patch takes into all your comments but:
>  - comment about discarded results...
>  - the sql_command_in_progress variable name change
>  - the longer message on into at the start of a script
The patch seems fine without these, although please consider the concern I
raised with regard to the -r option above.
Thanks,
Amit
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Simon Riggs | 2016-09-12 07:59:27 | Re: Re: [COMMITTERS] pgsql: Use LEFT JOINs in some system views in case referenced row doesn | 
| Previous Message | Michael Paquier | 2016-09-12 07:28:41 | Re: Re: [COMMITTERS] pgsql: Use LEFT JOINs in some system views in case referenced row doesn |