Re: EXPLAIN (no ANALYZE) taking an hour for INSERT FROM SELECT

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: EXPLAIN (no ANALYZE) taking an hour for INSERT FROM SELECT
Date: 2015-03-06 00:20:11
Message-ID: 54F8F2BB.2010402@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

On 5.3.2015 16:01, Gunnlaugur Thor Briem wrote:
> Hi, thanks for your follow-up questions.
>
> - postgres version is 9.1.13
> - the number of rows (in this latest instance) is 28,474,842
> - I've clustered and vacuum-full-ed and analyzed this table frequently,
> attempting to troubleshoot this. (Running vacuum full on the whole
> catalog seems a little excessive, and unlikely to help.)
> - no other processes are likely to be interfering; nothing other than
> PostgreSQL runs on this machine (except for normal OS processes and New
> Relic server monitoring service); concurrent activity in PostgreSQL is
> low-level and unrelated, and this effect is observed systematically
> whenever this kind of operation is performed on this table
> - no override for this table; the system default_statistics_target is
> 100 (the default)
> - yes, an ANALYZE is performed on the temp table after the COPY and
> before the INSERT
> - no index on the temp table (but I'm scanning the whole thing anyway).
> There are indexes on f_foo as detailed in my original post, and I expect
> the PK to make the WHERE NOT EXISTS filtering efficient (as it filters
> on exactly all columns of the PK) ... but even if it didn't, I would
> expect that to only slow down the actual insert execution, not the EXPLAIN.

The only thing I can think of is some sort of memory exhaustion,
resulting in swapping out large amounts of memory. That'd explain the
I/O load. Can you run something like vmstat to see if this really is swap?

The fact that plain INSERT does not do that contradicts that, as it
should be able to plan either both queries (plain and EXPLAIN), or none
of them.

Can you prepare a self-contained test case? I.e. a script that
demonstrates the issue? I tried to reproduce the issue using the
information provided so far, but unsuccessfully :-(

Even if you could reproduce the problem on another machine (because of
keeping the data internal) on a server with debug symbols and see where
most of the time is spent (e.g. using 'perf top'), that'd be useful.

regards

--
Tomas Vondra http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2015-03-06 00:44:54 Re: EXPLAIN (no ANALYZE) taking an hour for INSERT FROM SELECT
Previous Message Marc Mamin 2015-03-05 21:13:47 Re: EXPLAIN (no ANALYZE) taking an hour for INSERT FROM SELECT