From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> |
Cc: | Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Make COPY extendable in order to support Parquet and other formats |
Date: | 2022-06-22 23:49:08 |
Message-ID: | 20220622234908.jkmc6qg352dsh5x5@alap3.anarazel.de |
Lists: | pgsql-hackers |
Hi,
On 2022-06-22 16:59:16 +0530, Ashutosh Bapat wrote:
> On Tue, Jun 21, 2022 at 3:26 PM Aleksander Alekseev
> <aleksander(at)timescale(dot)com> wrote:
>
> >
> > In other words, personally I'm unaware of use cases when somebody
> > needs a complete read/write FDW or TableAM implementation for formats
> > like Parquet, ORC, etc. Also to my knowledge they are not particularly
> > optimized for this.
> >
>
> IIUC, you want extensibility in FORMAT argument to COPY command
> https://www.postgresql.org/docs/current/sql-copy.html. Where the
> format is pluggable. That seems useful.
Agreed.
But I think it needs quite a bit of care. Just plugging in a bunch of per-row
(or worse, per-field) switches to COPY's input / output parsing will make the
code even harder to read and even slower.
I suspect that we'd first need a patch to refactor the existing copy code a
good bit to clean things up. After that it hopefully will be possible to plug
in a new format without being too intrusive.
I know little about Parquet - can it support FROM STDIN efficiently?
Greetings,
Andres Freund
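
To illustrate the concern raised above about per-row and per-field switches, here is a minimal sketch of the alternative: resolve the format once, then call through a table of function pointers for every row. Nothing here is an existing PostgreSQL API. `CopyFormatRoutine`, `lookup_copy_format`, and the `write_row_*` functions are hypothetical names invented for this example, and the writers deliberately skip the escaping and quoting that real text and CSV output need.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-format callback table; NOT an existing PostgreSQL API. */
typedef struct CopyFormatRoutine
{
    const char *name;
    /* Write one row of text fields into buf; returns bytes written, 0 on overflow. */
    size_t      (*write_row) (char *buf, size_t buflen,
                              const char *const *fields, int nfields);
} CopyFormatRoutine;

/* Toy writer: joins fields with a delimiter, no escaping or quoting. */
static size_t
write_row_delimited(char *buf, size_t buflen,
                    const char *const *fields, int nfields, char delim)
{
    size_t      used = 0;

    for (int i = 0; i < nfields; i++)
    {
        char        sep = (i + 1 < nfields) ? delim : '\n';
        int         n = snprintf(buf + used, buflen - used, "%s%c", fields[i], sep);

        if (n < 0 || (size_t) n >= buflen - used)
            return 0;           /* out of space; toy error handling */
        used += (size_t) n;
    }
    return used;
}

static size_t
write_row_text(char *buf, size_t buflen, const char *const *fields, int nfields)
{
    return write_row_delimited(buf, buflen, fields, nfields, '\t');
}

static size_t
write_row_csv(char *buf, size_t buflen, const char *const *fields, int nfields)
{
    return write_row_delimited(buf, buflen, fields, nfields, ',');
}

/* Registry of known formats; a pluggable design would let extensions add entries. */
static const CopyFormatRoutine copy_formats[] = {
    {"text", write_row_text},
    {"csv", write_row_csv},
};

/* Resolve the format once, before any rows are processed. */
static const CopyFormatRoutine *
lookup_copy_format(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof(copy_formats) / sizeof(copy_formats[0]); i++)
    {
        if (strcmp(copy_formats[i].name, name) == 0)
            return &copy_formats[i];
    }
    return NULL;                /* unknown format */
}
```

A caller would look up the routine once per COPY and then invoke `fmt->write_row(...)` in the row loop, so the loop itself contains no format-specific branches. Whether the indirect call is cheap enough in practice, and whether a columnar format such as Parquet fits a row-at-a-time callback at all, are open questions this sketch does not answer.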