Re: Performance Issue -- "Materialize"

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: anand086 <anand086(at)gmail(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Performance Issue -- "Materialize"
Date: 2017-08-21 20:28:07
Message-ID: CAMkU=1zhOzfWd_nSAjMsxL3Bnj8XAwcxU10tDRcas2ZjwoMrTA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Sat, Aug 19, 2017 at 10:37 AM, anand086 <anand086(at)gmail(dot)com> wrote:

Your email is very hard to read, the formatting and line wrapping is
heavily mangled. You might want to attach the plans as files attachments
instead of or in addition to putting the in the body.

> -> Index Only Scan using ui_nkey_test_tab on test_tab test_tab_1
> (cost=0.56..2.03 rows=1 width=8) |
>
> Index Cond: ((import_num = '4520460'::numeric) AND (login IS NOT NULL))
>
>
It looks like the statistics for your table are desperately out of date, as
a later query showed there are 762599 rows (unless login is null for all of
them) but the above is estimating there is only one. When was the table
last analyzed?

Cheers,

Jeff

On Sat, Aug 19, 2017 at 10:37 AM, anand086 <anand086(at)gmail(dot)com> wrote:

> I am a Postgres Newbie and trying to learn :) We have a scenario wherein,
> one of the SQL with different input value for import_num showing different
> execution plan. As an example, with import_num = '4520440' the execution
> plan shows Nested Loop and is taking ~12secs. With import_num = '4520460'
> execution plan showed using "Materialize" and never completed. After I set
> enable_material to off, the execution plan is changed using Hash Semi Join
> and completes in less than 3 secs. SELECT count(*) FROM test_tab WHERE
> login IN (SELECT DISTINCT login FROM test_tab WHERE import_num = '4520440'
> AND login IS NOT NULL EXCEPT SELECT DISTINCT login FROM test_tab WHERE
> import_num = '0' AND login IS NOT NULL) AND import_num = '4520440';
> +--------+ | count | +--------+ | 746982 | +--------+ (1 row) Time:
> 12054.274 ms
>
> +-----------------------------------------------------------------------------------------------------------------------------------------------------------+
> | QUERY PLAN |
> +-----------------------------------------------------------------------------------------------------------------------------------------------------------+
> | Aggregate (cost=351405.08..351405.09 rows=1 width=8) |
> | -> Nested Loop (cost=349846.23..350366.17 rows=415562 width=0) |
> | -> HashAggregate (cost=349845.67..349847.67 rows=200 width=96) |
> | Group Key: ("ANY_subquery".login)::text |
> | -> Subquery Scan on "ANY_subquery" (cost=340828.23..348557.47 rows=515282 width=96) |
> | -> SetOp Except (cost=340828.23..343404.65 rows=515282 width=100) |
> | -> Sort (cost=340828.23..342116.44 rows=515283 width=100) |
> | Sort Key: "*SELECT* 1".login |
> | -> Append (cost=0.56..275836.74 rows=515283 width=100) |
> | -> Subquery Scan on "*SELECT* 1" (cost=0.56..275834.70 rows=515282 width=12) |
> | -> Unique (cost=0.56..270681.88 rows=515282 width=8) |
> | -> Index Only Scan using ui_nkey_test_tab on test_tab test_tab_1 (cost=0.56..268604.07 rows=831125 width=8) |
> | Index Cond: ((import_num = '4520440'::numeric) AND (login IS NOT NULL)) |
> | -> Subquery Scan on "*SELECT* 2" (cost=0.56..2.04 rows=1 width=12) |
> | -> Unique (cost=0.56..2.03 rows=1 width=8) |
> | -> Index Only Scan using ui_nkey_test_tab on test_tab test_tab_2 (cost=0.56..2.03 rows=1 width=8) |
> | Index Cond: ((import_num = '0'::numeric) AND (login IS NOT NULL)) |
> | -> Index Only Scan using ui_nkey_test_tab on test_tab (cost=0.56..2.58 rows=1 width=8) |
> | Index Cond: ((import_num = '4520440'::numeric) AND (login = ("ANY_subquery".login)::text)) |
> +-----------------------------------------------------------------------------------------------------------------------------------------------------------+
> (19 rows)
>
> SELECT count(*) FROM test_tab WHERE import_num = '4520460' and login IN
> (SELECT DISTINCT login FROM test_tab WHERE import_num = '4520460' AND login
> IS NOT NULL EXCEPT SELECT DISTINCT login FROM test_tab WHERE import_num =
> '0' AND login IS NOT NULL); The SQL was never completing and had the below
> SQL execution plan --
>
> +-------------------------------------------------------------------------------------------------------------------------------------------+
> | QUERY PLAN |
> +-------------------------------------------------------------------------------------------------------------------------------------------+
> | Aggregate (cost=6.14..6.15 rows=1 width=8) |
> | -> Nested Loop Semi Join (cost=1.12..6.13 rows=1 width=0) |
> | Join Filter: ((test_tab.login)::text = ("ANY_subquery".login)::text) |
> | -> Index Only Scan using ui_nkey_test_tab on test_tab (cost=0.56..2.02 rows=1 width=8) |
> | Index Cond: (import_num = '4520460'::numeric) |
> | -> Materialize (cost=0.56..4.10 rows=1 width=96) |
> | -> Subquery Scan on "ANY_subquery" (cost=0.56..4.09 rows=1 width=96) |
> | -> HashSetOp Except (cost=0.56..4.08 rows=1 width=100) |
> | -> Append (cost=0.56..4.08 rows=2 width=100) |
> | -> Subquery Scan on "*SELECT* 1" (cost=0.56..2.04 rows=1 width=12) |
> | -> Unique (cost=0.56..2.03 rows=1 width=8) |
> | -> Index Only Scan using ui_nkey_test_tab on test_tab test_tab_1 (cost=0.56..2.03 rows=1 width=8) |
> | Index Cond: ((import_num = '4520460'::numeric) AND (login IS NOT NULL)) |
> | -> Subquery Scan on "*SELECT* 2" (cost=0.56..2.04 rows=1 width=12) |
> | -> Unique (cost=0.56..2.03 rows=1 width=8) |
> | -> Index Only Scan using ui_nkey_test_tab on test_tab test_tab_2 (cost=0.56..2.03 rows=1 width=8) |
> | Index Cond: ((import_num = '0'::numeric) AND (login IS NOT NULL)) |
> +-------------------------------------------------------------------------------------------------------------------------------------------+
> (17 rows)
>
> ############################################# # After I set
> enable_material to off; #############################################
> SELECT count(*) FROM test_tab WHERE import_num = '4520460' and login IN
> (SELECT DISTINCT login FROM test_tab WHERE import_num = '4520460' AND login
> IS NOT NULL EXCEPT SELECT DISTINCT login FROM test_tab WHERE import_num =
> '0' AND login IS NOT NULL); +--------+ | count | +--------+ | 762599 |
> +--------+ (1 row) Time: 2116.889 ms
>
> +-------------------------------------------------------------------------------------------------------------------------------------------+
> | QUERY PLAN |
> +-------------------------------------------------------------------------------------------------------------------------------------------+
> | Aggregate (cost=6.13..6.14 rows=1 width=8) |
> | -> Hash Semi Join (cost=4.67..6.13 rows=1 width=0) |
> | Hash Cond: ((test_tab.login)::text = ("ANY_subquery".login)::text) |
> | -> Index Only Scan using ui_nkey_test_tab on test_tab (cost=0.56..2.02 rows=1 width=8) |
> | Index Cond: (import_num = '4520460'::numeric) |
> | -> Hash (cost=4.09..4.09 rows=1 width=96) |
> | -> Subquery Scan on "ANY_subquery" (cost=0.56..4.09 rows=1 width=96) |
> | -> HashSetOp Except (cost=0.56..4.08 rows=1 width=100) |
> | -> Append (cost=0.56..4.08 rows=2 width=100) |
> | -> Subquery Scan on "*SELECT* 1" (cost=0.56..2.04 rows=1 width=12) |
> | -> Unique (cost=0.56..2.03 rows=1 width=8) |
> | -> Index Only Scan using ui_nkey_test_tab on test_tab test_tab_1 (cost=0.56..2.03 rows=1 width=8) |
> | Index Cond: ((import_num = '4520460'::numeric) AND (login IS NOT NULL)) |
> | -> Subquery Scan on "*SELECT* 2" (cost=0.56..2.04 rows=1 width=12) |
> | -> Unique (cost=0.56..2.03 rows=1 width=8) |
> | -> Index Only Scan using ui_nkey_test_tab on test_tab test_tab_2 (cost=0.56..2.03 rows=1 width=8) |
> | Index Cond: ((import_num = '0'::numeric) AND (login IS NOT NULL)) |
> +-------------------------------------------------------------------------------------------------------------------------------------------+
> (17 rows)
>
> Looking at the row count for import_num select import_num, count(*) from
> test_tab group by import_num order by 2; +------------+--------+ |
> import_num | count | +------------+--------+ | 4520440 | 746982 | | 4520460
> | 762599 | +------------+--------+ (37 rows) With different value of
> import_num we are having different execution plan. Is there a way to force
> the same Hash semi Join plan to sql with import_num 4520440, currently
> doing nested loop. I tried /*+HashJoin(a1 ANY_subquery)*/ but the sql
> execution plan doesn't change. SELECT /*+HashJoin(a1 ANY_subquery)*/
> count(*) FROM test_tab a1 WHERE import_num = '4520440' and login IN (SELECT
> DISTINCT login FROM test_tab a2 WHERE import_num = '4520440' AND login IS
> NOT NULL EXCEPT SELECT DISTINCT login FROM test_tab a3 WHERE import_num =
> '0' AND login IS NOT NULL); Regards, Anand
> ------------------------------
> View this message in context: Performance Issue -- "Materialize"
> <http://www.postgresql-archive.org/Performance-Issue-Materialize-tp5979128.html>
> Sent from the PostgreSQL - performance mailing list archive
> <http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html>
> at Nabble.com.
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message kimaidou 2017-08-21 20:34:27 Re: Query plan for views and WHERE clauses, Luke is not using the index
Previous Message Carlos Augusto Machado 2017-08-21 19:19:10 Re: Performance Issue -- "Materialize"