RE: [EXTERNAL] Re: Inserts and bad performance

From: "Godfrin, Philippe E" <Philippe(dot)Godfrin(at)nov(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: RE: [EXTERNAL] Re: Inserts and bad performance
Date: 2021-11-24 19:59:38
Message-ID: SA0PR15MB3933EA7C7372814AAEFC74DF82619@SA0PR15MB3933.namprd15.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi Tom. Good point about the index paging out of the buffer. I did that and no change. I do have the shared buffers at 40GB, so there's a good bit there, but I also did all those things on the page you referred, except for using copy. At this point the data has not been scrubbed, so I'm trapping data errors and duplicates. I am curios though, as sidebar, why copy is considered faster than inserts. I was unable to get COPY faster than around 25K inserts a second (pretty fast anyway). Frankly, initially I was running 3 concurrent insert jobs and getting 90K ins/sec ! but after a certain number of records, the speed just dropped off.

pg

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Sent: Wednesday, November 24, 2021 1:32 PM
To: Godfrin, Philippe E <Philippe(dot)Godfrin(at)nov(dot)com>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: [EXTERNAL] Re: Inserts and bad performance

"Godfrin, Philippe E" <Philippe(dot)Godfrin(at)nov(dot)com<mailto:Philippe(dot)Godfrin(at)nov(dot)com>> writes:
> I am inserting a large number of rows, 5,10, 15 million. The python code commits every 5000 inserts. The table has partitioned children.
> At first, when there were a low number of rows inserted, the inserts would run at a good clip - 30 - 50K inserts per second. Now, after inserting oh say 1.5 Billion rows, the insert rate has dropped to around 5000 inserts per second. I dropped the unique index , rebuilt the other indexes and no change. The instance is 16 vcpu and 64GB ram.

Can you drop the indexes and not rebuild them till after the bulk load is
done? Once the indexes exceed available RAM, insert performance is going
to fall off a cliff, except maybe for indexes that are receiving purely
sequential inserts (so that only the right end of the index gets touched).

Also see

https://www.postgresql.org/docs/current/populate.html<https://www.postgresql.org/docs/current/populate.html>

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Godfrin, Philippe E 2021-11-24 20:27:48 RE: [EXTERNAL] Re: Inserts and bad performance
Previous Message Gavin Roy 2021-11-24 19:50:16 Re: Inserts and bad performance