Re: Performance degradation when using auto_explain

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Performance degradation when using auto_explain
Date: 2017-01-05 00:08:25
Message-ID: a225f624-60ae-1e67-62aa-a9f975e2cd43@2ndquadrant.com
Lists: pgsql-general

On 01/04/2017 08:54 PM, Kisung Kim wrote:
>
>
> On Wed, Jan 4, 2017 at 1:21 AM, Andreas Kretschmer
> <akretschmer(at)spamfence(dot)net <mailto:akretschmer(at)spamfence(dot)net>> wrote:
>
> Kisung Kim <kskim(at)bitnine(dot)net <mailto:kskim(at)bitnine(dot)net>> wrote:
>
> > And finally I found that auto_explain is the cause of the problem.
>
> real hardware or virtual hardware? On virtual there are sometimes
> problems with exact timings, please read:
>
> https://www.postgresql.org/docs/current/static/pgtesttiming.html
> <https://www.postgresql.org/docs/current/static/pgtesttiming.html>
>
>
> Thank you for your reply.
> I use real hardware.
> I am curious whether timing functions suffer contention when multiple threads call them.
>

It's not so much about contention between threads/processes, but mostly
about the cost of actually reading data from the clock source. So even
if you run on physical hardware, the clock source may be slow. Actually,
there may be multiple clock sources available, differing in precision
and overhead.

See:

/sys/devices/system/clocksource/*/available_clocksource
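
For example, both the currently selected source and the alternatives can be read straight from sysfs (clocksource0 is the standard device name on Linux; the values will of course differ per machine):

```shell
# Show which clock source the kernel is currently using, and the alternatives.
# These sysfs paths are standard on Linux; the output varies by machine.
cat /sys/devices/system/clocksource/clocksource0/current_clocksource
cat /sys/devices/system/clocksource/clocksource0/available_clocksource
```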

On my machine I see three different sources ("tsc hpet acpi_pm"). After
running the pg_test_timing tool linked by Andreas, I get this for the
'tsc' clock source:

Testing timing overhead for 3 seconds.
Per loop time including overhead: 29.87 nsec
Histogram of timing durations:
   < usec   % of total      count
        1     97.09159   97499400
        2      2.90085    2913031
        4      0.00193       1936
        8      0.00089        891
       16      0.00140       1405
       32      0.00335       3366
       64      0.00000          2

suggesting that 97% of calls took less than 1 usec, which is quite good.
For comparison, using 'hpet' gives me this:

Testing timing overhead for 3 seconds.
Per loop time including overhead: 766.92 nsec
Histogram of timing durations:
   < usec   % of total      count
        1     27.69558    1083389
        2     71.28437    2788485
        4      0.75748      29631
        8      0.02886       1129
       16      0.06578       2573
       32      0.16755       6554
       64      0.00033         13
      128      0.00003          1
      256      0.00003          1

That is clearly much worse: it increases the per-loop cost from ~30 ns
to ~767 ns, roughly 25x.

So check which clock source you have selected, and test how expensive
that is. But even with a fast clock source, the additional timing
overhead may make EXPLAIN ANALYZE considerably slower. There's not much
we can do about it :-(

And to make matters worse, it may affect different plans differently
(e.g. nested loop joins do many more gettimeofday calls than other join
types, amplifying the timing costs).

But the question is whether you actually need the timing - the total
duration + row counts are far more important in my experience, so by setting

auto_explain.log_analyze = on
auto_explain.log_timing = off

you may significantly reduce the impact on query performance, while
retaining the most valuable information.
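
As a sketch, a postgresql.conf fragment for that setup might look like this (the preload line and the 500ms threshold are illustrative; auto_explain can also be loaded per session with LOAD):

```
# Illustrative postgresql.conf fragment; pick a log_min_duration
# threshold that fits your workload.
shared_preload_libraries = 'auto_explain'
auto_explain.log_min_duration = '500ms'   # log plans of statements slower than this
auto_explain.log_analyze = on             # include actual row counts
auto_explain.log_timing = off             # skip the per-node clock reads
```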

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
