Re: Odd sudden performance degradation related to temp object churn

From: Jerry Sievers <gsievers19(at)comcast(dot)net>
To: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
Cc: Jeremy Finzel <finzelj(at)gmail(dot)com>, postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Odd sudden performance degradation related to temp object churn
Date: 2017-08-14 22:16:36
Message-ID: 87r2wdeqwr.fsf@jsievers.enova.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com> writes:

> On Mon, Aug 14, 2017 at 2:46 PM, Jeremy Finzel <finzelj(at)gmail(dot)com> wrote:
>
>> On Mon, Aug 14, 2017 at 3:01 PM, Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
>> wrote:
>>>
>>> On Mon, Aug 14, 2017 at 1:53 PM, Jeremy Finzel <finzelj(at)gmail(dot)com> wrote:
>>> >
>>> > Any insights would be greatly appreciated, as we are concerned not
>>> > knowing
>>> > the root cause.
>>>
>>> How are your disks setup? One big drive with everything on it?
>>> Separate disks for pg_xlog and pg's data dir and the OS logging? IO
>>> contention is one of the big killers of db performance.
>>
>>
>> It's one san volume ssd for the data and wal files. But logging and memory
>> spilling and archived xlogs go to a local ssd disk.
>>
>>>
>>> Logging likely isn't your problem, but yeah you don't need to log
>>> ERRYTHANG to see the problem either. Log long running queries temp
>>> usage, buffer usage, query plans on slow queries, stuff like that.
>>>
>>> You've likely hit a "tipping point" in terms of data size. Either it's
>>> cause the query planner to make a bad decision, or you're spilling to
>>> disk a lot more than you used to.
>>>
>>> Be sure to log temporary stuff with log_temp_files = 0 in your
>>> postgresql.conf and then look for temporary file in your logs. I bet
>>> you've started spilling into the same place as your temp tables are
>>> going, and by default that's your data directory. Adding another drive
>>> and moving pgsql's temp table space to it might help.
>>
>>
>> We would not have competition between disk spilling and temp tables because
>> what I described above - they are going to two different places. Also, I
>> neglected to mention that we turned on auto-explain during this crisis, and
>> found the query plan was good, it was just taking forever due to thrashing
>> just seconds after we kicked off the batches. I did NOT turn on log_analyze
>> and timing but it was enough to see there was no apparent query plan
>> regression. Also, we had no change in the performance/plan after
>> re-analyzing all tables.
>
> You do know that temp tables go into the default temp table space,
> just like sorts, right?

Not so.

This system has no defined temp_tablespace however spillage due to
sorting/hashing that exceeds work_mem goes to base/pgsql_tmp which we
have symlinked out to a local SSD drive.

We do run a few of our other systems with temp_tablespace defined and
for these the heap/index files do share same volume as other temp usage.

Thx

>
> Have you used something like iostat to see which volume is getting all the IO?
>
>>
>>>
>>> Also increasing work_mem (but don't go crazy, it's per sort, so can
>>> multiply fast on a busy server)
>>
>>
>> We are already up at 400MB, and this query was using memory in the low KB
>> levels because it is very small (1 - 20 rows of data per temp table, and no
>> expensive selects with missing indexes or anything).
>
> Ahh so it doesn't sound like it's spilling to disk then. Do the logs
> say yes or no on that?
>
> Basically use unix tools to look for where you're thrashing. iotop can
> be handy too.

--
Jerry Sievers
Postgres DBA/Development Consulting
e: postgres(dot)consulting(at)comcast(dot)net
p: 312.241.7800

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Peter Geoghegan 2017-08-14 22:43:40 Re: Odd sudden performance degradation related to temp object churn
Previous Message Scott Marlowe 2017-08-14 21:02:22 Re: Odd sudden performance degradation related to temp object churn