Re: Improve COPY performance into table with indexes.

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: James Brauman <james(dot)brauman(at)envato(dot)com>, pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Improve COPY performance into table with indexes.
Date: 2020-04-03 15:52:21
Message-ID: fb5477a3-c46a-e7f1-433c-444774942f31@aklaver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 4/2/20 9:42 PM, James Brauman wrote:
> I am using the COPY command to insert 10 million rows from a CSV file
> into a database table and I am finding the performance is
> unacceptable. When the COPY command is executed the disk I/O is
> extremely high which leads to degraded query performance for other
> queries being executed on the database.
>
> I have tried removing the indexes from the database table and this
> dramatically improved performance (both reducing the execution time
> and reducing disk I/O).

Is there the option to use tablespaces to move the index(s) to another
disk?:

https://www.postgresql.org/docs/12/sql-alterindex.html

"
SET TABLESPACE

This form changes the index's tablespace to the specified
tablespace and moves the data file(s) associated with the index to the
new tablespace. ..."

>
> The constraints I am working with are:
> - Indexes must be in place to ensure that other queries that are
> executed on the table while the COPY command is executing have
> acceptable performance.
> - The table should not be locked for reads during the COPY (it is
> fine to lock for writes).
>
> I don't know how COPY is implemented, but it seems that the indexes
> are updated as data is inserted into the table. Ideally I would like
> to delay updating the indexes until the COPY command has completed and
> just update the indexes a single time, I think this would lead to much
> better performance. Is this possible?
>
> Another idea that I had was to start a transaction, drop the indexes,
> run the COPY command and then recreate the indexes. The problem with
> this approach is that DROP INDEX acquires an exclusive lock on the
> table, which is not acceptable in this system (otherwise other queries
> won't be able to execute). I tried using DROP INDEX CONCURRENTLY, but
> that cannot be executed inside a transaction.
>
> What are my options in this scenario?
>
> Thanks
>
>

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Rob Sargent 2020-04-03 16:08:04 Re: Improve COPY performance into table with indexes.
Previous Message Adrian Klaver 2020-04-03 15:39:53 Re: Cstore_fdw issue.