Re: New Copy Formats - avro/orc/parquet

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Nicolas Paris <niparisco(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: New Copy Formats - avro/orc/parquet
Date: 2018-02-11 21:19:31
Message-ID: 69a4a33f-c63d-a614-8687-c746ae015fb3@aklaver.com
Lists: pgsql-general

On 02/11/2018 12:57 PM, Nicolas Paris wrote:
> On 11 Feb 2018 at 21:53, Andres Freund wrote:
>> On 2018-02-11 21:41:26 +0100, Nicolas Paris wrote:
>>> I also have the storage and network transfer overhead in mind:
>>> all those new formats are compressed; this is not true of the current
>>> postgres BINARY format, and obviously not of the text-based formats. In
>>> my experience, the binary format is 10 to 30% larger than the text one.
>>> By contrast, an ORC file can be up to 10 times smaller than a
>>> text-based format.
>>
>> That seems largely irrelevant when arguing about using PROGRAM though,
>> right?
>>
>
> Indeed, those storage and network transfer costs are only relevant when
> comparing against the CSV/BINARY formats. They have no bearing on the
> PROGRAM aspect.
>

Just wondering what your time frame is on this? I ask because this
would be considered a new feature, so it would need to go into a
major release of Postgres. Work is currently under way on Postgres
version 11, to be released (just a guess) in late Fall 2018/early
Winter 2019. The CommitFest (https://commitfest.postgresql.org/) for
this release is currently approximately 3/4 of the way through, so I
am not sure that new code could make it in at this point. That means
it would be bumped to version 12, for 2019/2020.

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com
