Re: break table into portions for writing to separate files

From: Seb <spluque(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: break table into portions for writing to separate files
Date: 2014-05-01 18:50:22
Message-ID: 87d2fxpc41.fsf@net82.ceos.umanitoba.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Thu, 1 May 2014 20:20:23 +0200,
Francisco Olarte <folarte(at)peoplecall(dot)com> wrote:

[...]

> As you mention looping and a shell, I suppose you are in something
> unix like, with pipes et al. You can pipe COPY ( either with the pipe
> options for copy, or piping a psql command, or whichever thing you
> like ) through a script which spits ecah data chunk into its
> corresponding file. If your data is somehow clustered into the table (
> by chance or by design ) you don't even need a to sort the data, just
> use an open file pool, I did that once with call files, chunked them
> into day sized files and it worked like a charm ( and if you need the
> files sorted, you can then use sort on each of them, which normally is
> quite fast ).

> For your description of data, with a pipe, you could read a line,
> extract a key for the record ( the timestamp rounded down to 20
> minutes would be a good one ), get and open output file for append (
> using a small caching layer ) , write it.

> Depending on how many files you expect, how many RAM you have and how
> many files your OS allows you to open, other solutions exists. And if
> you do not have enough ram / openfiles / clustering for any of them
> there are multitude of tricks ( if, say, you have 3 years worth, no
> correlation, and can only open/buffer about 1000 files you could split
> from the db into day sized chunks and then split each of them into 20
> minutes ones.

Thanks, I'm glad to hear you've used this approach successfully. It
seems as though the best solution is to do a single SELECT to get the
data out of the server (it is a view with a very complex query plan
joining several other similar views), and then pipe the output through
say awk to break down into chunks for writing the files, as you
describe.

Cheers,

--
Seb

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Seb 2014-05-01 19:01:26 Re: break table into portions for writing to separate files
Previous Message Stephan Fabel 2014-05-01 18:40:17 Ubuntu Packages / Config Files