Re: Adding LIMIT changes PostgreSQL plan from good to a bad one

From: Michael Korbakov <rmihael(at)gmail(dot)com>
To: Michael Lewis <mlewis(at)entrata(dot)com>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Adding LIMIT changes PostgreSQL plan from good to a bad one
Date: 2019-11-25 12:01:31
Message-ID: CACLnFeE5X3HQKD7CLBeCW5ZmGm1cmHOUz2+Ouwdr4MiybdkPmg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On November 22, 2019 at 20:28:39, Michael Lewis (mlewis(at)entrata(dot)com) wrote:

I try to avoid DISTINCT and use GROUP BY when feasible, as well as avoiding
OR condition. If you combined anon1 and anon2 with UNION ALL, and did
(inner) join instead of left, or even moved all of that to EXISTS, perhaps
that gives you better consistent performance. Something like this-

SELECT contacts.id
FROM contacts
JOIN contacts__aggregated AS contacts__aggregated_1 ON
contacts__aggregated_1.company_id = contacts.company_id
AND contacts__aggregated_1.contact_id = contacts.id
AND contacts__aggregated_1.field_name = 'names'
WHERE contacts.company_id = '\x4c2118ad54397f271b000000'

AND EXISTS (
SELECT
FROM contacts_values
WHERE contacts_values.field_id =
'\x000000000000000000000000'
AND contacts_values.field_name = 'facebook'
AND
nimble_contact_value_normalize(nimble_skip_long_values(contacts_values.value))
= nimble_contact_value_normalize('http://www.facebook.com/jon.ferrara'))
AND contacts_values.company_id = contacts.company_id
AND contacts_values.id = contacts.id
UNION ALL
SELECT
FROM contacts_values
WHERE contacts_values.field_id =
'\x000000000000000000000000'
AND contacts_values.field_name = 'last_name'
AND
nimble_contact_value_normalize(nimble_skip_long_values(contacts_values.value))
= nimble_contact_value_normalize('Ferrara')
AND contacts_values.company_id = contacts.company_id
AND contacts_values.id = contacts.id
)

ORDER BY contacts__aggregated_1.value ASC
LIMIT 30 OFFSET 0;

Hi Michael!

Thank you, your suggestion about replacing DISTINCT with GROUP BY improved
the plan for my query significantly. It is still not perfect, but at least
it is something we can live with.

However, rewriting OR with UNION does not change things. Here's the plan
for it: https://explain.depesz.com/s/c6Ec. Judging from the number of loops
it uses some form of nested loop internally.

It appears that any way of expressing OR ruins something for the planner.
My performance is great for a single criterion and for multiple criteria
joined by AND. Adding any OR into the mix results in giant JOINs and
misestimations.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message İlyas Derse 2019-11-25 14:46:29 Call Stored Procedure with inout parameters in c#
Previous Message Jill Jade 2019-11-25 11:02:26 Re: Pgadmin 4 schema visibility