How to import Apache parquet files?

From: Softwarelimits <softwarelimits(at)gmail(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: How to import Apache parquet files?
Date: 2019-11-05 14:56:26
Message-ID: CALnJc4WZ1EYO29W9wSAP4ZQOf0BTD11ZPsydg422tfKOshufjQ@mail.gmail.com
Lists: pgsql-general

Hi, I need to ask here because I did not find enough information, so I
hope I am just having a bad day or somebody is censoring my search results
for fun... :)

I would like to import (lots of) Apache Parquet files into a PostgreSQL 11
cluster. Yes, I believe it should be done with the Python pyarrow module,
but before digging into the possible traps I would like to ask here whether
there is some common, well-understood and documented tool that may be
helpful with that process.

It seems that the COPY command can import binary data, but I am not able to
allocate enough resources to understand how to implement a Parquet file
import with that.
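
For illustration only, here is a rough sketch of what a pyarrow-based import
through COPY might look like. It sidesteps COPY's binary format and uses the
plain text format instead (tab-separated fields, \N for NULL). The function
names, the batch size and the use of psycopg2 are all assumptions made for
this sketch, not an established tool, and only simple scalar columns are
handled.

```python
import io


def copy_escape(value):
    """Render one Python value as a field in PostgreSQL's COPY text format."""
    if value is None:
        return r"\N"  # COPY's default NULL marker in text format
    text = str(value)  # assumption: simple scalars only (no bytes, lists, structs)
    # Escape the backslash first so later escapes are not doubled.
    return (
        text.replace("\\", "\\\\")
        .replace("\t", "\\t")
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def rows_to_copy_text(rows):
    """Build an in-memory file of tab-separated lines suitable for COPY ... FROM STDIN."""
    buf = io.StringIO()
    for row in rows:
        buf.write("\t".join(copy_escape(v) for v in row))
        buf.write("\n")
    buf.seek(0)
    return buf


def load_parquet(path, conn, table, batch_size=10000):
    """Hypothetical helper: stream one Parquet file into an existing table.

    Assumes `conn` is a psycopg2 connection, `table` is a trusted table name,
    and the table's column order matches the Parquet schema.
    """
    import pyarrow.parquet as pq  # third-party, imported lazily

    parquet_file = pq.ParquetFile(path)
    with conn.cursor() as cur:
        for batch in parquet_file.iter_batches(batch_size=batch_size):
            columns = batch.to_pydict()      # {column name: list of values}
            rows = zip(*columns.values())    # turn columns into row tuples
            cur.copy_expert(f"COPY {table} FROM STDIN", rows_to_copy_text(rows))
    conn.commit()
```

Reading in batches keeps memory bounded for large files; going through
str() is the weak point, so binary, nested or timezone-sensitive columns
would need explicit conversion before anything like this could be trusted.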

I would really like to follow a person with much more knowledge than me about
either PostgreSQL or the Apache Parquet format instead of inventing a bad
wheel.

Any hints are very welcome,
thank you very much for your attention!
John
