Re: Force a merge join?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Doug Fields <dfields-pg-general(at)pexicom(dot)com>
Cc: "Ian Harding" <ianh(at)tpchd(dot)org>, pgsql-general(at)postgresql(dot)org
Subject: Re: Force a merge join?
Date: 2002-05-18 23:24:07
Message-ID: 24819.1021764247@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Doug Fields <dfields-pg-general(at)pexicom(dot)com> writes:
> At 04:19 PM 5/18/2002, Tom Lane wrote:
>> Doug Fields <dfields-pg-general(at)pexicom(dot)com> writes:
> In fact, yes it does. How do I know? Very simple: I did a SELECT * INTO
>> ...
> to copy my real table to a testing table so I could refactor it. Then I
>> did
> the usual EXPLAIN ANALYZE queries, and it was using merge joins. Then, I
> did an "ANALYZE" (which is like VACUUM ANALYZE without the slow VACUUM)
>> and
> voila - nested loops and half second queries turning into five minute
> nightmares. Then enable_nestloop would fix the problem again after that.
>>
>> Could we see the usual details here? Before and after EXPLAIN ANALYZE,
>> and the schemas and pg_stats rows for the tables involved.

> Hi Tom,

> I gave some of this information in my first post, but I will repeat it here
> with the pg_stats and the exact before and after information. Thanks.

> delete from pg_statistic where
> starelid = (select oid from pg_class where relname =
> 'test_list_entries');

> EXPLAIN ANALYZE SELECT 745, a.list_entry_id
> FROM test_list_entries a, test_list_entries b
> WHERE (a.list_id=148 OR a.list_id=146 OR a.list_id=145 OR
> a.list_id=147 OR a.list_id=144)
> AND (b.list_id=247 OR b.list_id=433 OR b.list_id=249 OR
> b.list_id=434 OR b.list_id=238 OR b.list_id=340 OR b.list_id=339 OR
> b.list_id=418)
> AND a.lower_email = b.lower_email;

> NOTICE: QUERY PLAN:

> Merge Join (cost=1926.12..1940.21 rows=247 width=140) (actual
> time=448.32..521.99 rows=3674 loops=1)
> -> Sort (cost=739.66..739.66 rows=176 width=72) (actual
> time=437.18..447.71 rows=15859 loops=1)
> -> Index Scan using test_list_id_idx, test_list_id_idx,
> test_list_id_idx, test_list_id_idx, test_list_id_idx on test_list_entries
> a (cost=0.00..733.09 rows=176 width=72) (actual time=0.04..139.32
> rows=15859 loops=1)
> -> Sort (cost=1186.45..1186.45 rows=280 width=68) (actual
> time=11.12..13.67 rows=3783 loops=1)
> -> Index Scan using test_list_id_idx, test_list_id_idx,
> test_list_id_idx, test_list_id_idx, test_list_id_idx, test_list_id_idx,
> test_list_id_idx, test_list_id_idx on test_list_entries
> b (cost=0.00..1175.08 rows=280 width=68) (actual time=0.06..4.75 rows=573
> loops=1)
> Total runtime: 528.83 msec

> ANALYZE;

> EXPLAIN SELECT 745, a.list_entry_id
> FROM test_list_entries a, test_list_entries b
> WHERE (a.list_id=148 OR a.list_id=146 OR a.list_id=145 OR
> a.list_id=147 OR a.list_id=144)
> AND (b.list_id=247 OR b.list_id=433 OR b.list_id=249 OR
> b.list_id=434 OR b.list_id=238 OR b.list_id=340 OR b.list_id=339 OR
> b.list_id=418)
> AND a.lower_email = b.lower_email;

> NOTICE: QUERY PLAN:

> Nested Loop (cost=0.00..1176.19 rows=4 width=140)
> -> Index Scan using test_list_id_idx, test_list_id_idx,
> test_list_id_idx, test_list_id_idx, test_list_id_idx on test_list_entries
> a (cost=0.00..454.33 rows=84 width=72)
> -> Index Scan using test_lower_email_idx on test_list_entries
> b (cost=0.00..8.52 rows=1 width=68)

> EXPLAIN ANALYZE same thing...

> NOTICE: QUERY PLAN:

> Nested Loop (cost=0.00..1176.19 rows=4 width=140) (actual
> time=11.52..388320.86 rows=3674 loops=1)
> -> Index Scan using test_list_id_idx, test_list_id_idx,
> test_list_id_idx, test_list_id_idx, test_list_id_idx on test_list_entries
> a (cost=0.00..454.33 rows=84 width=72) (actual time=0.06..190.30
> rows=15859 loops=1)
> -> Index Scan using test_lower_email_idx on test_list_entries
> b (cost=0.00..8.52 rows=1 width=68) (actual time=14.86..24.47 rows=0
> loops=15859)
> Total runtime: 388330.10 msec

> select attname, null_frac, avg_width, n_distinct, most_common_freqs,
> correlation from pg_stats where tablename = 'test_list_entries';
> -- Note, I removed some irrelevant rows

> attname | null_frac | avg_width | n_distinct
> | most_common_freqs
> | correlation
> ---------------+-----------+-----------+------------+-----------------------------------------------------------------------------------------------------------------+-------------
> list_id | 0 | 4 | 184 |
> {0.383667,0.115667,0.0163333,0.0126667,0.0113333,0.01,0.00966667,0.00933333,0.009,0.009}
> | 0.842899
> list_entry_id | 0 | 4 | -1
> |
> | 0.719433
> lower_email | 0.387667 | 68 | -0.38239 |
> {0.0156667,0.000666667,0.000666667,0.000666667,0.000666667}
> | 0.00150877
> (27 rows)

> \d test_list_entries
> -- Note I removed some irrelevant columns
> Table "test_list_entries"
> Column | Type | Modifiers
> ---------------+--------------------------+-----------
> list_id | integer |
> list_entry_id | integer |
> lower_email | character(64) |
> Indexes: test_list_entries_pkey,
> test_list_id_idx,
> test_lower_email_idx

> Cheers,

> Doug

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2002-05-18 23:34:47 Re: Force a merge join?
Previous Message Robert J. Sanford, Jr. 2002-05-18 22:56:26 how to get id of last insert on a serial type?