Re: Make COPY format extendable: Extract COPY TO format implementations

From: Sutou Kouhei <kou(at)clear-code(dot)com>
To: zhjwpku(at)gmail(dot)com
Cc: tomas(dot)vondra(at)enterprisedb(dot)com, andres(at)anarazel(dot)de, michael(at)paquier(dot)xyz, sawada(dot)mshk(at)gmail(dot)com, andrew(at)dunslane(dot)net, nathandbossart(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Make COPY format extendable: Extract COPY TO format implementations
Date: 2024-07-30 02:58:24
Message-ID: 20240730.115824.1120108664579849899.kou@clear-code.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

In <CAEG8a3+KN=uofw5ksnCwh5s3m_VcfFYd=jTzcpO5uVLBHwSQEg(at)mail(dot)gmail(dot)com>
"Re: Make COPY format extendable: Extract COPY TO format implementations" on Sun, 28 Jul 2024 22:49:47 +0800,
Junwang Zhao <zhjwpku(at)gmail(dot)com> wrote:

> Thanks for updating the patches, I applied them and test
> in my local machine, I did not use tmpfs in my test, I guess
> if I run the tests enough rounds, the OS will cache the
> pages, below is my numbers(I run each test 30 times, I
> count for the last 10 ones):
>
> HEAD PATCHED
>
> COPY to_tab_30 TO '/dev/null' WITH (FORMAT text);
>
> 5628.280 ms 5679.860 ms
> 5583.144 ms 5588.078 ms
> 5604.444 ms 5628.029 ms
> 5617.133 ms 5613.926 ms
> 5575.570 ms 5601.045 ms
> 5634.828 ms 5616.409 ms
> 5693.489 ms 5637.434 ms
> 5585.857 ms 5609.531 ms
> 5613.948 ms 5643.629 ms
> 5610.394 ms 5580.206 ms
>
> COPY from_tab_30 FROM '/tmp/to_tab_30.txt' WITH (FORMAT text);
>
> 3929.955 ms 4050.895 ms
> 3909.061 ms 3890.156 ms
> 3940.272 ms 3927.614 ms
> 3907.535 ms 3925.560 ms
> 3952.719 ms 3942.141 ms
> 3933.751 ms 3904.250 ms
> 3958.274 ms 4025.581 ms
> 3937.065 ms 3894.149 ms
> 3949.896 ms 3933.878 ms
> 3925.399 ms 3936.170 ms
>
> I did not see obvious performance degradation, maybe it's
> because I did not use tmpfs, but I think this OTH means
> that the *function call* and *if branch* added for each row
> is not the bottleneck of the whole execution path.

Thanks for sharing your numbers. I agree with there are no
obvious performance degradation.

> In 0001,
>
> +typedef struct CopyFromRoutine
> +{
> + /*
> + * Called when COPY FROM is started to set up the input functions
> + * associated to the relation's attributes writing to. `finfo` can be
> + * optionally filled to provide the catalog information of the input
> + * function. `typioparam` can be optionally filled to define the OID of
> + * the type to pass to the input function. `atttypid` is the OID of data
> + * type used by the relation's attribute.
>
> +typedef struct CopyToRoutine
> +{
> + /*
> + * Called when COPY TO is started to set up the output functions
> + * associated to the relation's attributes reading from. `finfo` can be
> + * optionally filled. `atttypid` is the OID of data type used by the
> + * relation's attribute.
>
> The second comment has a simplified description for `finfo`, I think it
> should match the first by:
>
> `finfo` can be optionally filled to provide the catalog information of the
> output function.

Good catch. I'll update it as suggested in the next patch set.

> After I post the patch diffs, the gmail grammer shows some hints that
> it should be *associated with* rather than *associated to*, but I'm
> not sure about this one.

Thanks. I'll use "associated with".

> I think the patches are in good shape, I can help to do some
> further tests if needed, thanks for working on this.

Thanks!

--
kou

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Zhang Mingli 2024-07-30 03:50:54 COPY FROM crash
Previous Message Thomas Munro 2024-07-30 00:39:44 Re: Remove last traces of HPPA support