From: | Sutou Kouhei <kou(at)clear-code(dot)com> |
---|---|
To: | sawada(dot)mshk(at)gmail(dot)com |
Cc: | michael(at)paquier(dot)xyz, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Make COPY format extendable: Extract COPY TO format implementations |
Date: | 2024-11-27 07:53:44 |
Message-ID: | 20241127.165344.1671821093365528062.kou@clear-code.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
In <CAD21AoBW5dEv=Gd2iF_BYNZGEsF=3KTG7fpq=vP5qwpC1CAOeA(at)mail(dot)gmail(dot)com>
"Re: Make COPY format extendable: Extract COPY TO format implementations" on Mon, 25 Nov 2024 23:10:50 -0800,
Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>> Custom COPY format extensions need to use
>> CopyToStateData/CopyFromStateData. For example,
>> CopyToStateData::rel is used to retrieve table schema. If we
>> move CopyToStateData to copyto_internal.h not copyapi.h,
>> custom COPY format extensions need to include
>> copyto_internal.h. I feel that it's strange that extensions
>> need to use internal headers.
>>
>> What is your real concern? If you don't want to export
>> CopyToStateData/CopyFromStateData entirely, we can provide
>> accessors only for some members of them.
>
> I'm not against exposing CopyToStateData and CopyFromStateData. My
> concern is that if we move all copy-related structs to copyapi.h,
> other copy-related files would need to include copyapi.h even if the
> file is not related to copy format APIs. IMO copyapi.h should have
> only copy-format-API-related variables structs such as CopyFromRoutine
> and CopyToRoutine and functions that custom COPY format extension can
> utilize to access data source and destination, such as CopyGetData().
>
> When it comes to CopyToStateData and CopyFromStateData, I feel that
> they have mixed fields of common fields (e.g., rel, num_errors, and
> transition_capture) and format-specific fields (e.g., input_buf,
> line_buf, and eol_type). While it makes sense to me that custom copy
> format extensions can access the common fields, it seems odd to me
> that they can access text-and-csv-format-specific fields such as
> input_buf. We might want to sort out these fields but it could be a
> huge task.
I understand you concern.
How about using Copy{To,From}StateData::opaque to store
text-and-csv-format-specific data? I feel that this
refactoring doesn't block the 0001/0002 patches. Do you
think that this is a blocker of the 0001/0002 patches?
I think that this may block the 0004/0007 patches that
export Copy{To,From}StateData. But we can work on it after
we merge the 0004/0007 patches. Which is preferred?
> Also, I realized that CopyFromTextLikeOneRow() does input function
> calls and handle soft errors based on ON_ERROR and LOG_VERBOSITY
> options. I think these should be done in the core side.
How about extracting the following part in NextCopyFrom() as
a function and provide it for extensions?
----
Assert(cstate->opts.on_error != COPY_ON_ERROR_STOP);
cstate->num_errors++;
if (cstate->opts.log_verbosity == COPY_LOG_VERBOSITY_VERBOSE)
{
/*
* Since we emit line number and column info in the below
* notice message, we suppress error context information
* other than the relation name.
*/
Assert(!cstate->relname_only);
cstate->relname_only = true;
if (cstate->cur_attval)
{
char *attval;
attval = CopyLimitPrintoutLength(cstate->cur_attval);
ereport(NOTICE,
errmsg("skipping row due to data type incompatibility at line %llu for column \"%s\": \"%s\"",
(unsigned long long) cstate->cur_lineno,
cstate->cur_attname,
attval));
pfree(attval);
}
else
ereport(NOTICE,
errmsg("skipping row due to data type incompatibility at line %llu for column \"%s\": null input",
(unsigned long long) cstate->cur_lineno,
cstate->cur_attname));
/* reset relname_only */
cstate->relname_only = false;
}
----
See the attached v27 patch set for this idea.
0001-0008 are almost same as the v26 patch set.
("format" -> "FORMAT" in COPY test changes are included.)
0009 exports the above code as
CopyFromSkipErrorRow(). Extensions should call it when they
use errsave() for a soft error in CopyFromOneRow callback.
Thanks,
--
kou
Attachment | Content-Type | Size |
---|---|---|
v27-0001-Refactor-COPY-TO-to-use-format-callback-function.patch | text/x-patch | 17.9 KB |
v27-0002-Refactor-COPY-FROM-to-use-format-callback-functi.patch | text/x-patch | 32.5 KB |
v27-0003-Add-support-for-adding-custom-COPY-TO-format.patch | text/x-patch | 17.6 KB |
v27-0004-Export-CopyToStateData.patch | text/x-patch | 9.1 KB |
v27-0005-Add-support-for-implementing-custom-COPY-TO-form.patch | text/x-patch | 2.0 KB |
v27-0006-Add-support-for-adding-custom-COPY-FROM-format.patch | text/x-patch | 9.1 KB |
v27-0007-Export-CopyFromStateData.patch | text/x-patch | 17.4 KB |
v27-0008-Add-support-for-implementing-custom-COPY-FROM-fo.patch | text/x-patch | 1.9 KB |
v27-0009-Add-CopyFromSkipErrorRow-for-custom-COPY-format-.patch | text/x-patch | 10.8 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Eisentraut | 2024-11-27 08:01:26 | Re: Support LIKE with nondeterministic collations |
Previous Message | wenhui qiu | 2024-11-27 07:51:05 | Re: Auto Vacuum optimisation |