From: | Ben Snaidero <bensnaidero(at)geotab(dot)com> |
---|---|
To: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
Cc: | "pgsql-performa(dot)" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Many DataFileRead - IO waits |
Date: | 2020-03-02 22:39:37 |
Message-ID: | CAEPE5bPn=1n+=QTQYG-+qdquW_Pf9ekycSfDjk+bNKpHAte7sw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On Sat, Feb 29, 2020 at 11:22 AM Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> On Thu, Feb 27, 2020 at 11:33 AM Ben Snaidero <bensnaidero(at)geotab(dot)com>
> wrote:
>
>
>> I have the following query that was on average running in ~2ms suddenly
>> jump up to on average ~25ms.
>>
>
> What are you averaging over? The plan you show us is slow enough that if
> you were averaging over the last 1000 executions, that one execution could
> skew the entire average just by itself. When individual execution times
> can vary over 4 powers of 10, I don't think averages are a very good way of
> analyzing things.
>
>
>
>> This query is called millions of time per day and there were cases of the
>> query taking 20-30 seconds. Below is the explain analyze of one such
>> example.
>> When seeing this issue, the server was under some CPU pressure but even
>> with that, I would not think it should get as slow as shown below as we are
>> using SSDs and none of the windows disk counters (IOPS, queue length) show
>> any value that would be of concern.
>>
>
> What is the average and expected random read latency on your SSDs? Have
> you benchmarked them (outside of the database system) to see if they are
> performing as expected?
>
>
>> Rows Removed by Filter: 91686
>> Buffers: shared hit=12102 read=13259 written=111
>>
>
> Do the faster executions have fewer rows removed by the filter (and fewer
> buffers read), or are they just faster despite having about the same values?
>
>
>
>> We took a perfview during the issue and below is the call stack from a
>> process running this query, two call paths are shown.
>>
>
> I've never used perfview. But if I try to naively interpret it similar to
> gdb backtrace, it doesn't make much sense to me. InitBufTable is only
> called by "postmaster" while starting the database, how could it be part of
> call paths during regular operations? Are these views of the slow-running
> back end itself, or of some other postgresql process which was idle at the
> time the snapshot was taken?
>
> Cheers,
>
> Jeff
>
Query statistics were averaged over ~3million calls so I don't think a
single outlier would skew the results too much.
The perfview call stack is similar to gdb backtrace. I am 99% sure that
this call path is from the backend running this query as we queried
pg_stat_activity at the time of the perfview and cross-referenced the
PIDs. That said I am going to try building on windows with debug symbols
enabled and see if I can use gdb to debug and confirm.
From | Date | Subject | |
---|---|---|---|
Next Message | legrand legrand | 2020-03-02 23:09:53 | Re: Many DataFileRead - IO waits |
Previous Message | Ben Snaidero | 2020-03-02 22:31:02 | Re: Many DataFileRead - IO waits |