Re: Loading 500m json files to database

From: Rob Sargent <robjsargent(at)gmail(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Loading 500m json files to database
Date: 2020-03-24 17:32:55
Message-ID: 3f30c24a-d5b0-8b20-b626-5a674bb9b2f1@gmail.com
Lists: pgsql-general

On 3/24/20 11:29 AM, Kevin Brannen wrote:
> From: pinker <pinker(at)onet(dot)eu>
>
>> it's a cloud and no plpythonu extension available unfortunately
>
> You're misunderstanding him. See David's post for an example, but the point was that you can control all of this from an *external* Perl, Python, Bash, whatever program on the command line at the shell.
>
> In pseudo-code, probably fed by a "find" command piping filenames to it:
>
> while more files
>     do { read in a file name & add to list } while (list.length < 1000);
>     process entire list with \copy commands to 1 psql command
>
> I've left all kinds of checks out of that, but that's the basic thing you need; implement it in whatever scripting language you're comfortable with.
>
> HTH,
> Kevin
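
For illustration, the batching loop in the pseudo-code above might look roughly like this in Python. This is only a sketch under several assumptions that are not in the thread: the target table `json_docs` and its column `doc` are hypothetical names, each file is assumed to hold a single JSON document on one line, the bytes 0x01 and 0x02 are assumed never to appear in the data (they are used as CSV quote and delimiter so the JSON passes through untouched), and psql is assumed to pick up its connection settings from the environment. Paths containing a single quote or a newline are rejected rather than escaped.

```python
import subprocess
import sys
from itertools import islice

BATCH_SIZE = 1000  # batch size taken from the pseudo-code above


def batches(names, size=BATCH_SIZE):
    """Yield lists of at most `size` items from any iterable of file names."""
    it = iter(names)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def copy_script(files, table="json_docs", column="doc"):
    """Build one psql script containing a \\copy line per file.

    `table` and `column` are hypothetical names. The CSV quote/delimiter
    settings assume one JSON document per line and that the bytes 0x01
    and 0x02 never occur in the data.
    """
    lines = []
    for path in files:
        if "'" in path or "\n" in path:
            # Keep the sketch simple: refuse names that would need escaping.
            raise ValueError(f"unsupported file name: {path!r}")
        lines.append(
            f"\\copy {table} ({column}) from '{path}' "
            "csv quote e'\\x01' delimiter e'\\x02'"
        )
    return "\n".join(lines) + "\n"


def load_batch(files, psql_cmd=("psql", "-v", "ON_ERROR_STOP=1")):
    """Feed the whole batch to a single psql process on its stdin."""
    subprocess.run(list(psql_cmd), input=copy_script(files),
                   text=True, check=True)


def main(stream=sys.stdin):
    """Read one file name per line (e.g. from `find`) and load in batches."""
    names = (line.rstrip("\n") for line in stream if line.strip())
    for chunk in batches(names):
        load_batch(chunk)
```

Calling `main()` at the end of the script and piping `find` output into it would start one psql process per 1000 files rather than one per file. The checks Kevin mentions leaving out (retries, logging of failed batches, validation of the JSON) are left out here as well.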

Sorry if I missed it, but have we seen the size range of these json files?
