From: | Kevin Brannen <KBrannen(at)efji(dot)com> |
---|---|
To: | pinker <pinker(at)onet(dot)eu>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | RE: Loading 500m json files to database |
Date: | 2020-03-24 17:29:23 |
Message-ID: | SA0PR19MB42555877CA8D229BA3F405ACA4F10@SA0PR19MB4255.namprd19.prod.outlook.com |
Lists: | pgsql-general |
From: pinker <pinker(at)onet(dot)eu>
> it's a cloud and no plpythonu extension available unfortunately
You're misunderstanding him. See David's post for an example, but the point was that you can control all of this from an *external* Perl, Python, Bash, whatever program on the command line at the shell.
In pseudo-code, probably fed by a "find" command piping filenames to it:
    while more files
        do { read in a file name & add to list } while (list.length < 1000);
        process entire list with \copy commands to 1 psql command
I've left all kinds of checks out of that, but that's the basic idea; implement it in whatever scripting language you're comfortable with.
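As a rough illustration of the pseudo-code above, here is a minimal Python sketch, not a finished loader. The table and column names (`json_docs`, `doc`) are hypothetical placeholders, the function names are made up for this example, and connection settings are assumed to come from the usual PG* environment variables. It also assumes each file holds one JSON document on a single line with no backslashes, since COPY's text format treats every line as a row and interprets backslash escapes.

```python
import subprocess
from itertools import islice


def batches(names, size=1000):
    """Yield lists of up to `size` file names from any iterable."""
    it = iter(names)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def copy_script(paths, table="json_docs", column="doc"):
    """Build one psql script with a \\copy line per file.

    `table` and `column` are hypothetical names; substitute your own.
    """
    lines = []
    for path in paths:
        # Simplification: refuse names that would need quoting or escaping.
        if "'" in path or "\n" in path:
            raise ValueError(f"unsupported file name: {path!r}")
        lines.append(f"\\copy {table} ({column}) from '{path}'")
    return "\n".join(lines) + "\n"


def load_from(stream, size=1000, run=subprocess.run):
    """Read file names (one per line) and feed each batch to one psql call.

    Returns the number of psql invocations. `run` is injectable so the
    batching can be exercised without a database.
    """
    names = (line.rstrip("\n") for line in stream if line.strip())
    count = 0
    for chunk in batches(names, size):
        # psql reads the script from standard input; stop at the first error.
        run(
            ["psql", "-X", "-v", "ON_ERROR_STOP=1"],
            input=copy_script(chunk),
            text=True,
            check=True,
        )
        count += 1
    return count
```

To use it, call `load_from(sys.stdin)` at the bottom of the script (after `import sys`) and pipe the output of "find" into it. Each batch of up to 1000 files then costs one psql start-up and one connection instead of one per file.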
HTH,
Kevin