| From: | Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> |
|---|---|
| To: | bill(dot)poole(at)ymail(dot)com, pgsql-performance(at)postgresql(dot)org |
| Subject: | Re: Bulk DML performance |
| Date: | 2025-03-13 11:28:09 |
| Message-ID: | d58bb88b06a40233aef8e07a39fd234458abc009.camel@cybertec.at |
| Lists: | pgsql-performance |
On Thu, 2025-03-13 at 18:13 +0800, bill(dot)poole(at)ymail(dot)com wrote:
> it is noteworthy that inserting 27 MB of data into a newly created table creates
> 191 MB of data including the index and 127 MB of data excluding the index.
PostgreSQL has a lot of overhead per row.
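Each row carries a 23-byte heap tuple header (usually padded to 24 bytes) plus a 4-byte line pointer in its page, and tuples are padded to 8-byte alignment, so small rows can easily double or triple in on-disk size before the index is even counted. If you want to see where the space goes, something like the following should work ("test" is just a placeholder for your table name):

```sql
-- Break the total size down into heap and index parts:
SELECT pg_size_pretty(pg_table_size('test'))          AS heap_size,
       pg_size_pretty(pg_indexes_size('test'))        AS index_size,
       pg_size_pretty(pg_total_relation_size('test')) AS total_size;
```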
> Can you help me understand why performing 3 million lookups on a b-tree index
> with all pages cached in memory takes so long?
It is probably not the lookup, but the *modification* of the index that is slow.
> It seems like deleting 3 million rows identified by 3 million IDs should be
> faster than updating 3 million rows (also identified by 3 million IDs).
It should be, yes.
To find out where the time is spent, use EXPLAIN (ANALYZE, BUFFERS) on the
statement.
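For example, something like this (table and column names are placeholders for your actual statement):

```sql
BEGIN;
EXPLAIN (ANALYZE, BUFFERS)
DELETE FROM test
WHERE id = ANY ('{1,2,3}'::bigint[]);  -- your real list of IDs goes here
ROLLBACK;  -- EXPLAIN ANALYZE executes the statement, so roll it back
```

The per-node timings and the "Buffers: shared hit/read/dirtied" counts in the output show where the time goes, how many pages came from cache rather than disk, and how many pages get dirtied by the modification.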
> > To get better performance, the best I can think of is to parallelize loading
> > the data until you saturate CPU, disk or hit internal contention in the database.
>
> Sadly, I cannot do that because I need all rows to be inserted in a single
> database transaction, which I cannot do over multiple database connections.
Then the best you can do is to use COPY rather than INSERT.
It will perform better (but not vastly better).
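If you drive the load from psql, \copy streams the rows over the existing connection and runs inside whatever transaction is open, so the single-transaction requirement is not a problem. A rough sketch (table, columns and file name are placeholders):

```sql
BEGIN;
\copy test (id, payload) FROM '/path/to/data.csv' WITH (FORMAT csv)
COMMIT;
```

Most drivers expose the same thing through the COPY ... FROM STDIN protocol, which avoids the intermediate file.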
Yours,
Laurenz Albe