Re: How to import Apache parquet files?

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Softwarelimits <softwarelimits(at)gmail(dot)com>
Cc: "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: How to import Apache parquet files?
Date: 2019-11-05 16:05:48
Message-ID: 20191105160548.i6dbennbjapxmnuy@development
Lists: pgsql-general

On Tue, Nov 05, 2019 at 04:21:45PM +0100, Softwarelimits wrote:
>Hi Imre, thanks for the quick response - yes, I found that, but I was not
>sure if it is already production ready - also I would like to use the data
>with the timescale extension, that is why I need a full import.
>

Well, we're not in a position to decide whether parquet_fdw is
production ready; that's something you need to ask the author of the
extension (and then also judge for yourself).

That being said, I think FDW is probably the best way to do this. It's
explicitly designed to work with foreign data, so using it to access
parquet files seems somewhat natural.

The alternative is probably transforming the data into COPY format, and
then loading it into Postgres using COPY (either from a file or stdin).
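
To make the COPY route a bit more concrete, here is a minimal sketch of
the "transform into COPY format" step. It assumes the rows have already
been read from the parquet file by some reader (not shown here) and are
available as plain Python tuples; the function names copy_escape and
rows_to_copy are made up for illustration. It only covers the default
COPY text format (tab-separated, \N for NULL, backslash escapes), not
every data type.

```python
def copy_escape(value):
    """Encode one value for the default COPY text format."""
    if value is None:
        return r"\N"                      # COPY's default NULL marker
    if isinstance(value, bool):
        return "t" if value else "f"      # accepted boolean literals
    text = str(value)
    # Backslash must be escaped first, then the delimiter and line breaks.
    return (text.replace("\\", "\\\\")
                .replace("\t", "\\t")
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def rows_to_copy(rows):
    """Yield one COPY text-format line per input row (hypothetical helper)."""
    for row in rows:
        yield "\t".join(copy_escape(v) for v in row) + "\n"


if __name__ == "__main__":
    import sys
    # Sample rows standing in for data read from a parquet file.
    sample = [(1, "plain", 3.5), (2, "tab\there", None), (3, "back\\slash", True)]
    sys.stdout.writelines(rows_to_copy(sample))
```

The output can then be written to a file or piped to something that
runs COPY ... FROM STDIN against the target table.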

Which of these options is the right one depends on your requirements.
FDW is more convenient, but row-based and probably significantly less
efficient than COPY. So if you have a lot of these parquet files, I'd
probably use COPY. But maybe the ability to query the parquet files
directly (with FDW) is useful for you.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
