Re: Slow Query

From: robbyc <robcampbell73(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Slow Query
Date: 2015-08-13 00:50:39
Message-ID: CA+emuv+t0qxtY0o5GUvWjkkyH0P9R4NSF1Q2xP1BODsQ9pPGDQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi Vik,

Thanks for your feedback, very helpful.

I modified your query slightly, this will return all vacancy templates and
all level 1 vacancies which arent templates, and does so in about
~800-900ms less, an great improvement on the original query.

SELECT "Vacancy"."ID",
"Vacancy"."JobTitle",
"Vacancy"."DateCreated",
"Vacancy"."CustomAccess",
"Department"."Name" as "Department",
list("Occupation"."Name") as "Occupation",
"Vacancy"."PositionNo",
"Vacancy"."Template"
FROM "Vacancy"
JOIN "CategoryOption_TableRow" as "c_50" ON (
"c_50"."Category_TableID"= 50
AND "c_50"."RowID" = "Vacancy"."ID"
AND "c_50"."CategoryOptionID"=19205)
LEFT JOIN "CategoryOption_TableRow" as "c_22" ON (
"c_22"."RowID" = "Vacancy"."ID"
AND "c_22"."Category_TableID" = 22)
LEFT JOIN "CategoryOption" as "Occupation" ON (
"Occupation"."ID" = "c_22"."CategoryOptionID")
LEFT JOIN "TableRow_TableRow" as "t_33" ON (
"t_33"."Table1RowID" = "Vacancy"."ID"
AND "t_33"."Table_TableID" = 33)
LEFT JOIN "Department" ON (
"Department"."ID" = "t_33"."Table2RowID"
AND "Department"."Active" = 't'
AND "Department"."ClientID" = 263)
WHERE "Vacancy"."ClientID" = 263
AND NOT EXISTS (
SELECT 1
FROM "Vacancy" as "v"
JOIN "CategoryOption_TableRow" "ct126" on (
"ct126"."Category_TableID" = 126
AND "RowID" = "v"."ID")
WHERE "v"."Template"
AND "ct126"."CategoryOptionID" IN (34024,35254,35255,35256)
AND "v"."ID" = "Vacancy"."ID")
AND ("Vacancy"."Template" OR ("Vacancy"."Template" = 'f' AND
"Vacancy"."Level" = 1))
GROUP BY "Vacancy"."ID", "Vacancy"."JobTitle", "Vacancy"."DateCreated",
"Vacancy"."CustomAccess", "Department"."Name",
"Vacancy"."PositionNo", "Vacancy"."Template"

On Wed, Aug 12, 2015 at 9:35 PM, Vik Fearing-3 [via PostgreSQL] <
ml-node+s1045698n5861873h96(at)n5(dot)nabble(dot)com> wrote:

> On 08/12/2015 04:34 AM, robbyc wrote:
> > Hi,
> >
> > I am new to optimizing queries and i'm getting a slow running time
> > (~1.5secs) with the following SQL:
>
> Before mucking about with work_mem and indexes, the first thing to do is
> rewrite this query correctly. Here are just some of the things wrong
> with the query as written:
>
> * You're doing a DISTINCT on the same set of columns also in a GROUP BY.
> This is redundant and causes needless deduplication.
>
> * You're joining two GROUPed BY then DISTINCTed queries using the UNION
> operator which will do yet another pass for deduplication.
>
> * You've got the entire query repeated for just a simple difference in
> the global WHERE clause. These can be merged.
>
> * You've kept LEFT JOINs in the subquery but you don't use any values
> from them. These can be safely removed altogether.
>
> * You're using a NOT IN clause which is almost never what you want. Use
> NOT EXISTS instead.
>
> What is this list() function? How is it defined? Can it be replaced
> with string_agg()?
>
> You're not doing yourself any favors at all with all this quoting and
> mixed case stuff.
>
> Here is a rewritten version, please let me know how it performs:
>
> SELECT "Vacancy"."ID",
> "Vacancy"."JobTitle",
> "Vacancy"."DateCreated",
> "Vacancy"."CustomAccess",
> "Department"."Name" as "Department",
> list("Occupation"."Name") as "Occupation",
> "Vacancy"."PositionNo",
> "Vacancy"."Template"
> FROM "Vacancy"
> JOIN "CategoryOption_TableRow" as "c_50" ON (
> "c_50"."Category_TableID"= 50
> AND "c_50"."RowID" = "Vacancy"."ID"
> AND "c_50"."CategoryOptionID"=19205)
> LEFT JOIN "CategoryOption_TableRow" as "c_22" ON (
> "c_22"."RowID" = "Vacancy"."ID"
> AND "c_22"."Category_TableID" = 22)
> LEFT JOIN "CategoryOption" as "Occupation" ON (
> "Occupation"."ID" = "c_22"."CategoryOptionID")
> LEFT JOIN "TableRow_TableRow" as "t_33" ON (
> "t_33"."Table1RowID" = "Vacancy"."ID"
> AND "t_33"."Table_TableID" = 33)
> LEFT JOIN "Department" ON (
> "Department"."ID" = "t_33"."Table2RowID"
> AND "Department"."Active" = 't'
> AND "Department"."ClientID" = 263)
> WHERE "Vacancy"."ClientID" = 263
> AND NOT EXISTS (
> SELECT 1
> FROM "Vacancy" as _Vacancy
> JOIN "CategoryOption_TableRow" "ct126" on (
> "ct126"."Category_TableID" = 126
> AND "RowID" = _Vacancy."ID")
> WHERE _Vacancy."Template"
> AND "ct126"."CategoryOptionID" IN (34024,35254,35255,35256)
> AND _Vacancy."ID" = "Vacancy"."ID")
> AND ("Vacancy"."Template" = 't' OR "Vacancy"."Level" = 1)
> GROUP BY "Vacancy"."ID", "Vacancy"."JobTitle", "Vacancy"."DateCreated",
> "Vacancy"."CustomAccess", "Department"."Name",
> "Vacancy"."PositionNo", "Vacancy"."Template"
>
>
> > SELECT distinct("Vacancy"."ID"), "Vacancy"."JobTitle",
> > "Vacancy"."DateCreated", "Vacancy"."CustomAccess"
> > , "Department"."Name" as "Department", list("Occupation"."Name") as
> > "Occupation", "Vacancy"."PositionNo"
> > , "Vacancy"."Template" from
> > "Vacancy"
> > LEFT JOIN "CategoryOption_TableRow" as "c_22"
> > ON ("c_22"."RowID" = "Vacancy"."ID"
> > and "c_22"."Category_TableID" = 22)
> > LEFT JOIN "CategoryOption" as "Occupation"
> > ON ("Occupation"."ID" = "c_22"."CategoryOptionID")
> > LEFT JOIN "TableRow_TableRow" as "t_33"
> > ON ("t_33"."Table1RowID" = "Vacancy"."ID"
> > and "t_33"."Table_TableID" = 33 )
> > LEFT JOIN "Department"
> > ON ("Department"."ID" = "t_33"."Table2RowID" and
> > "Department"."Active" = 't' and "Department"
> > ."ClientID" = 263)
> > JOIN "CategoryOption_TableRow" as "c_50" ON ("c_50"."Category_TableID"=
> 50
> > and "c_50"."RowID" = "Vacancy"
> > ."ID" and "c_50"."CategoryOptionID"=19205)
> > WHERE "Vacancy"."ClientID" = 263 AND "Vacancy"."ID" NOT IN(SELECT
> > DISTINCT("Vacancy"."ID")
> > FROM "Vacancy" join "CategoryOption_TableRow" "ct126" on
> > ("ct126"."Category_TableID" = 126
> > and "RowID" = "Vacancy"."ID")
> > left join "Workflow" on ("Workflow"."VacancyID" =
> "Vacancy"."ID"
> > and "Workflow"."Level"
> > = 1)
> > left join "CategoryOption_TableRow" "c30" on ("c30"."RowID"
> =
> > "Workflow"."ID" and "c30"."Category_TableID"
> > = 30 and "c30"."CategoryOptionID" = 21923)
> > WHERE "Template" AND "ct126"."CategoryOptionID"
> > IN(34024,35254,35255,35256)) and "Vacancy"
> > ."Template" = 't'
> > GROUP BY "Vacancy"."ID", "Vacancy"."JobTitle", "Vacancy"."DateCreated",
> > "Vacancy"."CustomAccess", "Department"
> > ."Name", "Vacancy"."PositionNo", "Vacancy"."Template"
> > UNION
> > SELECT distinct("Vacancy"."ID"), "Vacancy"."JobTitle",
> > "Vacancy"."DateCreated", "Vacancy"."CustomAccess"
> > , "Department"."Name" as "Department", list("Occupation"."Name") as
> > "Occupation", "Vacancy"."PositionNo"
> > , "Vacancy"."Template" from
> > "Vacancy"
> > LEFT JOIN "CategoryOption_TableRow" as "c_22"
> > ON ("c_22"."RowID" = "Vacancy"."ID"
> > and "c_22"."Category_TableID" = 22)
> > LEFT JOIN "CategoryOption" as "Occupation"
> > ON ("Occupation"."ID" = "c_22"."CategoryOptionID")
> > LEFT JOIN "TableRow_TableRow" as "t_33"
> > ON ("t_33"."Table1RowID" = "Vacancy"."ID"
> > and "t_33"."Table_TableID" = 33 )
> > LEFT JOIN "Department"
> > ON ("Department"."ID" = "t_33"."Table2RowID" and
> > "Department"."Active" = 't' and "Department"
> > ."ClientID" = 263)
> > JOIN "CategoryOption_TableRow" as "c_50" ON ("c_50"."Category_TableID"=
> 50
> > and "c_50"."RowID" = "Vacancy"
> > ."ID" and "c_50"."CategoryOptionID"=19205)
> > WHERE "Vacancy"."ClientID" = 263 AND "Vacancy"."ID" NOT IN(SELECT
> > DISTINCT("Vacancy"."ID")
> > FROM "Vacancy" join "CategoryOption_TableRow" "ct126" on
> > ("ct126"."Category_TableID" = 126
> > and "RowID" = "Vacancy"."ID")
> > left join "Workflow" on ("Workflow"."VacancyID" =
> "Vacancy"."ID"
> > and "Workflow"."Level"
> > = 1)
> > left join "CategoryOption_TableRow" "c30" on ("c30"."RowID"
> =
> > "Workflow"."ID" and "c30"."Category_TableID"
> > = 30 and "c30"."CategoryOptionID" = 21923)
> > WHERE "Template" AND "ct126"."CategoryOptionID"
> > IN(34024,35254,35255,35256)) and "Vacancy"
> > ."Template" <> 't' AND "Vacancy"."Level" = 1
> > GROUP BY "Vacancy"."ID", "Vacancy"."JobTitle", "Vacancy"."DateCreated",
> > "Vacancy"."CustomAccess", "Department"
> > ."Name", "Vacancy"."PositionNo", "Vacancy"."Template"
> > ORDER BY "JobTitle"
> >
> > Running explain analyze gives me the following information:
> > http://explain.depesz.com/s/pdC <http://explain.depesz.com/s/pdC>
> >
> > For a total runtime: 2877.157 ms
> >
> > If i remove the left joins on Department and TableRow_TableRow this
> reduces
> > the run time by about a third.
> > Additionally removing CategoryOption and CategoryOption_TableRow joins
> > further reduces by a about a third.
> >
> > Given that i need both these joins for the information retrieved by
> them,
> > what would be the best way to re-factor this query so it runs faster?
> >
> > Looking at the output of explain analyze the hash aggregates and sort
> seem
> > to be the primary issue.
> >
> > Thanks in advance
>
>
> --
> Vik Fearing +33 6 46 75 15 36
> http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
>
>
> --
> Sent via pgsql-performance mailing list ([hidden email]
> <http:///user/SendEmail.jtp?type=node&node=5861873&i=0>)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>
>
> ------------------------------
> If you reply to this email, your message will be added to the discussion
> below:
> http://postgresql.nabble.com/Slow-Query-tp5861835p5861873.html
> To unsubscribe from Slow Query, click here
> <http://postgresql.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_code&node=5861835&code=cm9iY2FtcGJlbGw3M0BnbWFpbC5jb218NTg2MTgzNXwxOTc1MDc2ODM4>
> .
> NAML
> <http://postgresql.nabble.com/template/NamlServlet.jtp?macro=macro_viewer&id=instant_html%21nabble%3Aemail.naml&base=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace&breadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml>
>

--
Regards

Robert Campbell
+61412062971
robcampbell73(at)gmail(dot)com

--
View this message in context: http://postgresql.nabble.com/Slow-Query-tp5861835p5861961.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message 林士博 2015-08-13 09:21:23 Re: Slow Query
Previous Message Vik Fearing 2015-08-12 11:34:44 Re: Slow Query