Re: Predicting query runtime

From: Istvan Soos <istvan(dot)soos(at)gmail(dot)com>
To: Vinicius Segalin <vinisegalin(at)gmail(dot)com>
Cc: pgsql general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Predicting query runtime
Date: 2016-09-13 19:54:14
Message-ID: CALdQGguuuVm0-WtDKy+_pDjghXbRxEWULAbn-t+26q2VTkvSPw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Tue, Sep 13, 2016 at 2:06 AM, Vinicius Segalin <vinisegalin(at)gmail(dot)com> wrote:
> 2016-09-12 18:22 GMT-03:00 Istvan Soos <istvan(dot)soos(at)gmail(dot)com>:
>> At Heap we have non-trivial complexity in our analytical queries, and
>> some of them can take a long time to complete. We did analyze features
>> like the query planner's output, our query properties (type,
>> parameters, complexity) and tried to automatically identify factors
>> that contribute the most into the total query time. It turns out that
>> you don't need to use machine learning for the basics, but at this
>> point we were not aiming for predictions yet.
>
> And how did you do that? Manually analyzing some queries?

In this case, it was automatic analysis and feature discovery. We were
generating features out of our query parameters, out of the SQL
string, and also out of the explain analyze output. For each of these
features, we have examined the P(query is slow | feature is present),
and measured its statistical properties (precision, recall,
correlations...).

With these we have built a decision tree-based partitioning, where our
feature-predicates divided the queries into subsets. Such a tree could
be used for predictions, or if we would like to be fancy, we could use
the feature vectors to train a neural network.

Hope this helps for now,
Istvan

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message carpenterc17 2016-09-13 19:55:24 Testers Needed For New Database Monitoring Platform
Previous Message Steve Crawford 2016-09-13 18:23:25 Re: Installing 9.6 RC on Ubuntu [Solved]