From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Srinivas Karthik V <skarthikv(dot)iitb(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Bulk Insert into PostgreSQL
Date: 2018-06-27 11:25:06
Message-ID: CAFj8pRA+3xfHkTWnEr89xXWhC9dBMMF9a1z5ZdaQMmEemghDkg@mail.gmail.com
Lists: pgsql-hackers
2018-06-27 13:18 GMT+02:00 Srinivas Karthik V <skarthikv(dot)iitb(at)gmail(dot)com>:
> Hi,
> I am performing a bulk insert of 1 TB of TPC-DS benchmark data into PostgreSQL
> 9.4. It is taking around two days to insert 100 GB of data. Please let me
> know your suggestions to improve the performance. Below are the
> configuration parameters I am using:
> shared_buffers = 12GB
> maintenance_work_mem = 8GB
> work_mem = 1GB
> fsync = off
> synchronous_commit = off
> checkpoint_segments = 256
> checkpoint_timeout = 1h
> checkpoint_completion_target = 0.9
> checkpoint_warning = 0
> autovacuum = off
> Other parameters are set to their default values. Moreover, I have specified the
> primary key constraint during table creation. This is the only index created
> before data loading, and I am sure there are no other indexes apart from the
> primary key column(s).
>
The main factor is using COPY instead of INSERTs.
Loading a 100 GB database should take a few hours, not two days.
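For example, a single table could be loaded roughly like this (the table name, file path, and delimiter are only illustrative; TPC-DS dsdgen output is pipe-delimited and may need its trailing delimiter stripped first):

    -- server-side COPY: the file must be readable by the PostgreSQL server process
    COPY store_sales FROM '/data/tpcds/store_sales.dat' WITH (FORMAT text, DELIMITER '|');

    -- or from psql, reading the file on the client side:
    -- \copy store_sales FROM '/data/tpcds/store_sales.dat' WITH (FORMAT text, DELIMITER '|')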
Regards
Pavel
> Regards,
> Srinivas Karthik
>
>
>