From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: pinker <pinker(at)onet(dot)eu>
Cc: postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Big data INSERT optimization - ExclusiveLock on extension of the table
Date: 2016-08-19 15:01:11
Message-ID: CAHyXU0x85omPv2P=pcDSPsd_9kDNkyzsPwQjLEWHKyGWdiz-2g@mail.gmail.com
Lists: pgsql-performance

On Wed, Aug 17, 2016 at 6:45 AM, pinker <pinker(at)onet(dot)eu> wrote:
> Hi,
> the problem I'm dealing with is long-held locks during extension of a table:
> process xxx still waiting for ExclusiveLock on extension of relation xxx of
> database xxx after 3000.158 ms
> My application is write intensive, in one round I need to insert about 1M
> rows. The general scheme of the process looks as follows:
> 1. rename table t01 to t02
> 2. insert into t02 1M rows in chunks of about 100k
> 3. from t01 (the previously loaded table) insert data through a stored
> procedure into b01 - this happens in parallel in over a dozen sessions
> 4. truncate t01
>
> Some data:
> PostgreSQL version 9.5
>
> commit_delay | 0
> | Sets the delay in microseconds between transaction commit and flushing WAL
> to disk.
> checkpoint_completion_target | 0.9
> | Time spent flushing dirty buffers during checkpoint, as fraction of
> checkpoint interval
> maintenance_work_mem | 2GB
> | Sets the maximum memory to be used for maintenance operations.
> shared_buffers | 2GB
>
> wal_block_size | 8192
> | Shows the block size in the write ahead log.
> wal_buffers | 16MB
> | Sets the number of disk-page buffers in shared memory for WAL.
> wal_compression | off
> | Compresses full-page writes written in WAL file.
> wal_keep_segments | 0
> | Sets the number of WAL files held for standby servers.
> wal_level | minimal
> | Set the level of information written to the WAL.
> wal_log_hints | off
> | Writes full pages to WAL when first modified after a checkpoint, even for
> non-critical modifications.
> wal_receiver_status_interval | 10s
> | Sets the maximum interval between WAL receiver status reports to the
> primary.
> wal_receiver_timeout | 1min
> | Sets the maximum wait time to receive data from the primary.
> wal_retrieve_retry_interval | 5s
> | Sets the time to wait before retrying to retrieve WAL after a failed
> attempt.
> wal_segment_size | 16MB
> | Shows the number of pages per write ahead log segment.
> wal_sender_timeout | 1min
> | Sets the maximum time to wait for WAL replication.
> wal_sync_method | fdatasync
> | Selects the method used for forcing WAL updates to disk.
> wal_writer_delay | 200ms
> | WAL writer sleep time between WAL flushes.
> work_mem | 32MB
> | Sets the maximum memory to be used for query workspaces.
>
> Checkpoints occur every ~ 30sec.
>
> Following the advice from this mailing list, shared_buffers was changed from
> 12GB to 2GB, but nothing changed.
>
> I'm not sure whether my bottleneck is the I/O subsystem, or whether there is
> anything else I can do to make it faster. What I came up with is (but I'm not
> sure if any of this makes sense):
> * change settings for bgwriter/wal?
> * make sure huge pages are in use by setting the huge_pages parameter to on
> * replace truncate with DROP/CREATE command?
> * turning off fsync for loading?
> * increase commit_delay value?
> * move temporary tables to a different tablespace
>
> Your advice or suggestions will be much appreciated.

Here's how I do it:
CREATE TABLE t_new (LIKE t INCLUDING ALL);
<insert from n threads to t_new>

BEGIN;
DROP TABLE t;
ALTER TABLE t_new RENAME TO t;
<recreate views etc as needed>
COMMIT;

If moving multiple tables in a single transaction I do a looped lock
probe with NOWAIT to avoid deadlocks. Postgres's deadlock resolution
behavior is such that longer-running processes seem to get killed
first; in these scenarios it almost always seems to kill the one you
*don't* want killed :-).
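
A minimal sketch of that probe, fitted to the t/t_new example above
(plpgsql; run it inside the same transaction as the DROP/RENAME so the
locks are still held when you swap):

BEGIN;
DO $$
BEGIN
  LOOP
    BEGIN
      -- take both locks in one statement; NOWAIT errors instead of queueing
      LOCK TABLE t, t_new IN ACCESS EXCLUSIVE MODE NOWAIT;
      EXIT;  -- got them; they are held until COMMIT
    EXCEPTION WHEN lock_not_available THEN
      PERFORM pg_sleep(0.1);  -- back off briefly, then retry
    END;
  END LOOP;
END $$;
DROP TABLE t;
ALTER TABLE t_new RENAME TO t;
COMMIT;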

This strategy works even in complicated scenarios, for example with
partitioned tables; you can build up the partition on the side and
swap the new one in over the old one in a transaction.
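
For example, with the inheritance-style partitioning available in 9.5
(parent/child table names here are hypothetical):

BEGIN;
ALTER TABLE t_2016_08 NO INHERIT t_parent;  -- detach the stale partition
ALTER TABLE t_2016_08 RENAME TO t_2016_08_old;
ALTER TABLE t_2016_08_new RENAME TO t_2016_08;
ALTER TABLE t_2016_08 INHERIT t_parent;     -- attach the rebuilt one
DROP TABLE t_2016_08_old;
COMMIT;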

The above is all about avoiding locks. If your problem is i/o bound,
here are some general strategies to improve insert performance:
*) UNLOGGED tables (beware: contents are lost after a crash or unclean restart)
*) synchronous_commit = false
*) ensure shared_buffers is high enough (too low and backends end up
flushing dirty buffers themselves)
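
For instance, a bulk-load session might look like this (illustrative
only; synchronous_commit = off can lose the last few commits on a crash
but cannot corrupt the database):

SET synchronous_commit = off;  -- per session, no restart needed
CREATE UNLOGGED TABLE t_new (LIKE t INCLUDING ALL);
<load t_new from n threads>
ALTER TABLE t_new SET LOGGED;  -- 9.5+: rewrites the table, making it crash-safe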

merlin
