From: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
---|---|
To: | "Erdmann, Markus (at) Bellevue" <Markus(dot)Erdmann(at)cbre(dot)com> |
Cc: | "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Regression in query optimizer when upgrading from 9.4.5 to 9.5.2? |
Date: | 2016-06-17 18:08:43 |
Message-ID: | CAKFQuwZ5yyz0b-9EhSM4p0qyRFwAFt8qRR7UPVAHtBTnJ1YvJA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Thu, Jun 16, 2016 at 8:40 PM, Erdmann, Markus @ Bellevue <
Markus(dot)Erdmann(at)cbre(dot)com> wrote:
> Hello,
>
> We’re trying to debug a performance issue affecting our staging database,
> and we’ve narrowed it down to a difference in the query optimizer in 9.5.2.
> Upgrading to 9.5 is important for us because we need the ability to import
> foreign schemas.
>
> This is the query we’re running:
>
> CREATE TEMP TABLE tmp_joined_transactions_9gkgptn5xcp9 ( transaction_id
> integer PRIMARY KEY );
>
> INSERT INTO tmp_joined_transactions_9gkgptn5xcp9 (transaction_id)
>
This...
> SELECT DISTINCT ON ("transactions_transaction"."id")
> "transactions_transaction"."id"
>
DISTINCT is a code smell. DISTINCT ON less so - it helps to avoid
self-joins - but your inclusion of ON here is pointless since the only
output column is "id".
As written there should be no way to get duplicate "id"s into the output
result. Or, if the tmp_joined_transactions relationship is 1-to-many you
should instead use a semi-join instead of an inner join.
FROM "transactions_transaction" , "tmp_joined_transactions_75chlsokrsev"
> WHERE
>
Here...
(NOT ("transactions_transaction"."id"
> IN (SELECT U0."id" AS Col1
> FROM "transactions_transaction" U0
> LEFT OUTER JOIN "transactions_commission" U1
> ON ( U0."id" = U1."transaction_id" )
> WHERE U1."id" IS NULL))
>
Making this an anti-join (NOT EXISTS instead of NOT IN) stands out as an
easy improvement to try:
It also makes the logic clearer since you seem to have a double-negative
here which means you really want a semi-join (which I wrote below)
WHERE EXISTS (SELECT 1 FROM transactions_commission WHERE
transactions_transaction.id = transactions_commission.transaction_id)
I won't promise this gives the same answer...I don't have enough spare
brain power or the ability to test it...but its seems correct.
AND "transactions_transaction"."date_created" >= '2010-01-01'::date
> AND "transactions_transaction"."date_created" <= '2015-12-31'::date
> AND "transactions_transaction"."deal_status" IN (1)
>
Also...
> AND (transactions_transaction.id =
> tmp_joined_transactions_75chlsokrsev.transaction_id))
>
This is style but I'm really a fan of using ANSI JOIN syntax...turning the
above into a <JOIN tmp_* ON>
ORDER BY "transactions_transaction"."id" ASC;
>
>
The regression itself someone else would need to comment on.
David J.
From | Date | Subject | |
---|---|---|---|
Next Message | Paul Jungwirth | 2016-06-17 18:09:28 | Re: PostgresSQL and HIPAA compliance |
Previous Message | Durgamahesh Manne | 2016-06-17 18:07:35 | Re: Re: regarding schema only migration from sqlserver to postgres with runmtk.sh |