| From: | Srinivas Karthik V <skarthikv(dot)iitb(at)gmail(dot)com> |
|---|---|
| To: | Peter Geoghegan <pg(at)bowt(dot)ie> |
| Cc: | "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>, Don Seiler <don(at)seiler(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
| Subject: | Re: Bulk Insert into PostgreSQL |
| Date: | 2018-07-03 23:34:31 |
| Message-ID: | CAEfuzeRWJufow_pn7bPJ8MB9jzUNpHn0piY6xiv+NBhTbRtFuA@mail.gmail.com |
| Lists: | pgsql-hackers |
@Peter: I was indexing the primary key of all the tables in TPC-DS. Some of
the fact tables have multiple columns as part of the primary key. Also, most
of the key columns are of numeric type.
On Mon, Jul 2, 2018 at 7:09 AM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
> On Sun, Jul 1, 2018 at 5:19 PM, Tsunakawa, Takayuki
> <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com> wrote:
> > 400 GB / 15 hours = 7.6 MB/s
> >
> > That looks too slow. I experienced a similar slowness. While our user
> > tried to INSERT (not COPY) a billion records, they reported INSERTs slowed
> > down by 10 times or so after inserting about 500 million records. Periodic
> > pstack runs on Linux showed that the backend was busy in btree operations.
> > I didn't pursue the cause due to other business, but there might be
> > something to be improved.
>
> What kind of data was indexed? Was it a bigserial primary key, or
> something else?
>
> --
> Peter Geoghegan
>