Re: ToDo: allow to get a number of processed rows by COPY statement [Review of Patch]

From: Amit Kapila <amit(dot)kapila(at)huawei(dot)com>
To: "'Pavel Stehule'" <pavel(dot)stehule(at)gmail(dot)com>
Cc: "'Heikki Linnakangas'" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, "'Pg Hackers'" <pgsql-hackers(at)postgresql(dot)org>, "'Bruce Momjian'" <bruce(at)momjian(dot)us>
Subject: Re: ToDo: allow to get a number of processed rows by COPY statement [Review of Patch]
Date: 2012-09-28 04:47:21
Message-ID: 004e01cd9d34$59b6b9e0$0d242da0$@kapila@huawei.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Friday, September 28, 2012 2:28 AM Pavel Stehule wrote:
> Hello
>
> I reduced this patch as just plpgsql (SPI) problem solution.
>
> Correct generic solution needs a discussion about enhancing our libpq
> interface - we can take a number of returned rows, but we should to get
> number of processed rows too. A users should to parse this number from
> completationTag, but this problem is not too hot and usually is not
> blocker, because anybody can process completationTag usually.
>
> So this issue is primary for PL/pgSQL, where this information is not
> possible. There is precedent - CREATE AS SELECT (thanks for sample :)),
> and COPY should to have same impact on ROW_COUNT like this statement
> (last patch implements it).

IMO now this patch is okay. I have marked it as Ready For Committer.

With Regards,
Amit Kapila.

>
>
> 2012/9/21 Amit Kapila <amit(dot)kapila(at)huawei(dot)com>:
> > On Friday, September 21, 2012 1:23 AM Pavel Stehule wrote:
> >
> >>> Basic stuff:
> >>> ------------
> >>> - Patch applies OK. but offset difference in line numbers.
> >>> - Compiles with errors in contrib [pg_stat_statements, sepgsql]
> >>> modules
> >>> - Regression failed; one test-case in COPY due to incomplete
> >>> test-case attached patch. – same as reported by Heikki
> >>
> >>fixed patch is in attachment
> >
> > After modifications:
> > ---------------------------
> > - Patch applies OK
> > - Compiles cleanly without any errors/warnings
> > - Regression tests pass.
> >
> >>>
> >>> What it does:
> >>> --------------
> >>> Modification to get the number of processed rows evaluated via SPI.
> >>> The changes are to add extra parameter in ProcessUtility to get the
> >>> number of rows processed by COPY command.
> >>>
> >>> Code Review Comments:
> >>> ---------------------
> >>> 1. New parameter is added to ProcessUtility_hook_type function
> >>> but the functions which get assigned to these functions like
> >>> sepgsql_utility_command, pgss_ProcessUtility, prototype &
> >>> definition is not modified.
> >
> > Functionality is not fixed correctly for hook functions, In function
> > pgss_ProcessUtility for bellow snippet of code processed parameter is
> passed NULL, as well as not initialized.
> > because of this when "pg_stat_statements" extention is utilized COPY
> command is giving garbage values.
> > if (prev_ProcessUtility)
> > prev_ProcessUtility(parsetree, queryString, params,
> > dest,
> completionTag, context, NULL);
> > else
> > standard_ProcessUtility(parsetree, queryString,
> params,
> > dest,
> > completionTag, context, NULL);
> >
> > Testcase is attached.
> > In this testcase table has only 1000 records but it show garbage
> value.
> > postgres=# show shared_preload_libraries ;
> > shared_preload_libraries
> > --------------------------
> > pg_stat_statements
> > (1 row)
> > postgres=# CREATE TABLE tbl (a int);
> > CREATE TABLE
> > postgres=# INSERT INTO tbl VALUES(generate_series(1,1000));
> > INSERT 0 1000
> > postgres=# do $$
> > declare r int;
> > begin
> > copy tbl to '/home/kiran/copytest.csv' csv;
> > get diagnostics r = row_count;
> > raise notice 'exported % rows', r;
> > truncate tbl;
> > copy tbl from '/home/kiran/copytest.csv' csv;
> > get diagnostics r = row_count;
> > raise notice 'imported % rows', r;
> > end;
> > $$ language plpgsql;
> > postgres$#
> > NOTICE: exported 13281616 rows
> > NOTICE: imported 13281616 rows
> > DO
> >
> >>>
> >>> 2. Why to add the new parameter if completionTag hold the number of
> >>> processed tuple information; can be extracted
> >>>
> >>> from it as follows:
> >>> _SPI_current->processed = strtoul(completionTag
> >>> + 7, NULL, 10);
> >>
> >>this is basic question. I prefer a natural type for counter - uint64
> >>instead text. And there are no simply way to get offset (7 in this
> >>case)
> >
> > I agree with your point, but currently in few other places we are
> > parsing the completion tag for getting number of tuples processed. So
> > may be in future we can change those places as well. For example
>
> yes, this step can be done in future - together with libpq enhancing.
> Is not adequate change API (and break lot of extensions) for this
> isolated problem. So I propose fix plpgsql issue, and add to ToDo -
> "enhance libpq to return a number of processed rows as number" - but
> this change breaking compatibility.
>
> Pavel
>
> >
> > pgss_ProcessUtility
> > {
> > ..
> >
> > /* parse command tag to retrieve the number of affected rows. */ if
> > (completionTag &&
> > sscanf(completionTag, "COPY " UINT64_FORMAT, &rows) != 1)
> > rows = 0;
> >
> > }
> >
> > _SPI_execute_plan
> > {
> > ..
> > ..
> > if (IsA(stmt, CreateTableAsStmt))
> > {
> > Assert(strncmp(completionTag, "SELECT ", 7) == 0);
> > _SPI_current->processed = strtoul(completionTag + 7,
> >
> > NULL, 10);
> >
> > ..
> > }
> >
> >
> > With Regards,
> > Amit Kapila.
> >
> >
> >
> > --
> > Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org) To
> > make changes to your subscription:
> > http://www.postgresql.org/mailpref/pgsql-hackers

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2012-09-28 05:19:06 Re: ToDo: allow to get a number of processed rows by COPY statement [Review of Patch]
Previous Message Peter Eisentraut 2012-09-28 03:29:44 setting per-database/role parameters checks them against wrong context