Re: Very slow query (3-4mn) on a table with 25millions rows

From: Félix GERZAGUET <felix(dot)gerzaguet(at)gmail(dot)com>
To: Abadie Lana <Lana(dot)Abadie(at)iter(dot)org>
Cc: Martín Marqués <martin(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Very slow query (3-4mn) on a table with 25millions rows
Date: 2016-07-27 09:15:49
Message-ID: CANVwZts5xFa9Wioz46eJfATtGrU_U0sR4wP0Y3LkR02fNgmULQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hello Lana,

On Wed, Jul 27, 2016 at 8:03 AM, Abadie Lana <Lana(dot)Abadie(at)iter(dot)org> wrote:

> Here the result of explain (analyse, buffer). Thanks for your help and let
> me know if you need more information.

I noticed 3 things in your query:

1. In the second part (after the except), the 2 tables utva and utv are not
joined against the others table. Is there a missing join somewhere ?

Let that snipset:

select s.attvalue
from functionalvarattributes s
, tags t
, variableattributetypes vat
where t.id=s.tag_id
and t.status!='Internal'
and vat.fieldtype in ('DBF_INLINK','DBF_OUTLINK','DBF_FWDLINK')
and vat.id=s.atttype_id
and split_part(split_part(s.attvalue,' ',1),'.',1) in ( select e.name
from
functionalvariables e
,
usertemplatevariable ut
where
e.usertemplatevar_id=ut.id
and
ut.usertempl_id=15
)

be called A

Let that snipset:

select *
from usertemplvarattribute utva
, usertemplatevariable utv
where utv.id=utva.usertempvariable_fk
and utv.usertempl_id=15

be called B

Then you query is:

A
except
A CROSS JOIN B

If B is not the empty set, than the above query is guaranteed to always
have 0 row.

2. Assuming your query is right (even if I failed to understand its point),
we could only do the A snipset once instead of twice using a with clause as
in:

with filtered_s as (
select s.attvalue
from functionalvarattributes s
, tags t
, variableattributetypes vat
where t.id=s.tag_id
and t.status!='Internal'
and vat.fieldtype in ('DBF_INLINK','DBF_OUTLINK','DBF_FWDLINK')
and vat.id=s.atttype_id
and split_part(split_part(s.attvalue,' ',1),'.',1) in ( select e.name
from
functionalvariables e
,
usertemplatevariable ut
where
e.usertemplatevar_id=ut.id
and
ut.usertempl_id=15
)
)
select s.attvalue
from filtered_s s
except
select s.attvalue
from filtered_s s
, usertemplvarattribute utva
, usertemplatevariable utv
where utv.id=utva.usertempvariable_fk
and utv.usertempl_id=15
;

This rewritten query should run about 2x. faster.

3. The planner believe that the e.name subselect will give 4926 rows
(instead of 16 in reality), due to this wrong estimate it will consider the
vat_funcvaratt_multi_idx index as not usefull. I don't know how to give the
planner more accurate info ...

--
Félix

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Félix GERZAGUET 2016-07-27 09:36:44 Re: Very slow query (3-4mn) on a table with 25millions rows
Previous Message Abadie Lana 2016-07-27 06:03:22 Re: Very slow query (3-4mn) on a table with 25millions rows