Re: Using COPY to import large xml file

From: Tim Cross <theophilusx(at)gmail(dot)com>
To: Anto Aravinth <anto(dot)aravinth(dot)cse(at)gmail(dot)com>
Cc: Adrien Nayrat <adrien(dot)nayrat(at)anayrat(dot)info>, "pgsql-generallists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Using COPY to import large xml file
Date: 2018-06-25 22:10:29
Message-ID: 8736xaseju.fsf@gmail.com
Lists: pgsql-general


Anto Aravinth <anto(dot)aravinth(dot)cse(at)gmail(dot)com> writes:

> Thanks a lot. But I've got a lot of challenges! It looks like the SO
> data contains lots of tabs within itself, so a tab delimiter didn't
> work for me. I thought I could give a special delimiter, but it looks
> like PostgreSQL's COPY allows only one character as the delimiter :(
>
> Sad, I guess the only way is to insert, or do a thorough serialization
> of my data into something that COPY can understand.
>

The COPY command has a number of options, including setting what is used
as the delimiter - it doesn't have to be a tab. You also need to look at
the logs/output to see exactly why the copy fails.
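
For example, a minimal sketch (the table and file names here are purely
hypothetical): if the data is in CSV format, fields containing tabs,
commas or newlines are simply quoted, so embedded tabs stop mattering:

  -- Hypothetical table/file. In CSV format, fields containing the
  -- delimiter, quotes or newlines are quoted, so tabs inside the
  -- data don't need escaping or a special delimiter.
  -- (COPY FROM a file reads on the server and needs server-side
  -- file access; \copy is the client-side equivalent.)
  COPY posts (id, title, body)
      FROM '/path/to/posts.csv'
      WITH (FORMAT csv, HEADER true);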

I'd recommend first pre-processing your input data to make sure it is
'clean' and all the fields actually match the DDL you have used to
define your tables. I'd then select a small subset and try different
parameters to the COPY command until you get the right combination of
data format and copy definition.
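
A minimal sketch of that test loop in psql, again with hypothetical
table and file names (\copy runs client-side, so the sample file only
needs to be readable by you, not by the server):

  -- throwaway table with the same structure as the real target
  CREATE TABLE posts_test (LIKE posts INCLUDING ALL);

  -- load a small sample (e.g. the first few thousand rows of the
  -- cleaned file) and adjust the FORMAT/DELIMITER/QUOTE options
  -- until it loads without errors
  \copy posts_test FROM 'sample.csv' WITH (FORMAT csv, HEADER true)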

It may take some effort to get the right combination, but given the
size of your data set, the result is probably worth it i.e. the
difference between a load that takes hours and one that takes days.

--
Tim Cross
