Re: Importing a Large .ndjson file

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Sankar P <sankar(dot)curiosity(at)gmail(dot)com>
Cc:	pgsql-general(at)lists(dot)postgresql(dot)org
Subject:	Re: Importing a Large .ndjson file
Date:	2020-06-17 15:18:21
Message-ID:	1446220.1592407101@sss.pgh.pa.us
Lists:	pgsql-general

Sankar P <sankar(dot)curiosity(at)gmail(dot)com> writes:
> I have a .ndjson file. It is a new-line-delimited JSON file. It is
> about 10GB and has about 100,000 records.
> Some sample records:
> { "key11": "value11", "key12": [ "value12.1", "value12.2"], "key13": {
> "k111": "v111" } } \n\r
> { "key21": "value21", "key22": [ "value22.1", "value22.2"] }

> What is the best way to do this on a postgresql database, deployed in
> kubernetes, with a 1 GB RAM allocated ?

It looks like plain old COPY would do this just fine, along the lines
of (in psql)

\copy myTable(content) from 'myfile.ndjson'
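
One thing to watch for with a command like this: in COPY's default text format a backslash starts an escape sequence and a tab separates columns, so JSON lines containing sequences such as \" or \n could be altered or rejected on the way in. Below is a minimal sketch of a pre-processing step in Python, assuming this applies to your data. The function name escape_for_copy_text is made up for illustration.

```python
def escape_for_copy_text(line: bytes) -> bytes:
    """Escape one line so COPY's text format stores it unchanged.

    Backslashes are doubled first; doing the tab replacement first would
    cause the backslash it introduces to be doubled as well.
    """
    line = line.replace(b"\\", b"\\\\")  # \  becomes \\
    line = line.replace(b"\t", b"\\t")   # raw tab becomes the two characters \t
    return line
```

Each escaped line would then be written to a new file for \copy to read. Whether the step is needed depends on the data; a file with no backslashes or tabs loads unchanged.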

If the newlines actually are \n\r rather than the more usual \r\n,
you might have to clean that up to stop COPY from thinking they
represent two line endings not one.
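
If cleanup does turn out to be necessary, it can be done in a streaming fashion so that memory use stays small. The following is a sketch in Python, not a tested recipe for your file. The name normalize_newlines and the choice to drop empty lines are my assumptions.

```python
def normalize_newlines(src_path: str, dst_path: str) -> int:
    """Rewrite src_path to dst_path with exactly one \\n after each record.

    Returns the number of records written. Only one line is held in
    memory at a time, so the file size does not matter much.
    """
    count = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for raw in src:  # binary iteration splits on b"\n" only
            # Strip stray \r and \n bytes from both ends: a "\n\r" ending
            # leaves a leading \r on the following line.
            record = raw.strip(b"\r\n")
            if not record:
                continue  # assumption: empty lines carry no record
            dst.write(record + b"\n")
            count += 1
    return count
```

For example, normalize_newlines('myfile.ndjson', 'clean.ndjson') would produce a file that \copy can read with ordinary line endings.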

I'd advise extracting the first hundred or so lines of the file and doing
a test import into a temporary table, just to verify the process.
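
One way to produce such a sample is sketched below in Python. It assumes the file already has ordinary \n line endings; the name write_sample and the JSON check on each line are additions of mine, not something the import requires.

```python
import itertools
import json


def write_sample(src_path: str, dst_path: str, limit: int = 100) -> int:
    """Copy at most `limit` lines from src_path to dst_path.

    Each line is parsed with json.loads first, so a malformed record
    raises an error here rather than during the import. Returns the
    number of lines written.
    """
    written = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line in itertools.islice(src, limit):
            json.loads(line)  # raises ValueError on invalid JSON
            dst.write(line)
            written += 1
    return written
```

Running write_sample('myfile.ndjson', 'sample.ndjson') and then pointing \copy at sample.ndjson, with a temporary table as the target, gives a quick check before committing to the full 10GB load.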

regards, tom lane

In response to

Importing a Large .ndjson file at 2020-06-17 11:21:21 from Sankar P

Responses

Re: Importing a Large .ndjson file at 2020-06-18 07:46:43 from Sankar P
