Re: Analyze makes queries slow...

From: Jacek Rembisz <jr178783(at)zodiac(dot)mimuw(dot)edu(dot)pl>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Analyze makes queries slow...
Date: 2003-08-12 09:57:07
Message-ID: 20030812095707.GA22395@melina.szuwary.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Mon, Aug 11, 2003 at 03:58:41PM +0200, Stef wrote:

> I have attached an sql statement that normally runs under 3 minutes.
> That is, until I vacuum analyze the database (or just the tables in the query),
> then the same query runs longer than 12 hours, and I have to kill it.

Hmm, I have noticed similar problem with a query with order by ... limit clause.Although it runs only 10 times slower after analyze :)

The query joins one big table (20 000 rows) with several small tables
(200-4000 rows) than order by "primary key of big table" limit 20

Without this order by ... limit clause the query is 5 times faster after
analyze.

Looking into explain analyze outputs:
1. Before vacuum analyze a planer chooses nested loop, the deepest is:
-> Nested Loop (cost=0.00..116866.54 rows=19286 width=96) (actual time=0.14..1.39 rows=21 loops=1)
-> Index Scan Backward using big_table_pkey on big_table k (cost=0.00..1461.15 rows=19286 width=52) (actual time=0.07..0.47 rows=21 loops=1)
-> Index Scan using 4000rows_table_pkey on 4000rows_table zs (cost=0.00..5.97 rows=1 width=44) (actual time=0.02..0.02 rows=0 loops=21)

2. After analyze uses hashjoins

When I remove this order by limit clause the query after analyze takes
the same time and the query before analyze is much more slower.

I won't blame the planer. How he could learn that he should first
take those 20 rows and than perform joins? There is a where clause
with complex exists(subquery) condition regarding one of big_table fields,
but removing this condition does not change the query plan.

Pure joining without any additional conditions and only primary key of big
table in select clause runs 4 times slower then whole query before
vacuuum analyze :)

Does in all the planer take in the consideration the limit clause?

Probably I'm missing something. I don't know much about the planer.

Finaly I have redesigned the query.

Regards,
Jacek

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2003-08-12 13:11:53 Re: Odd problem with performance in duplicate database
Previous Message mixo 2003-08-12 07:16:25 Re: Perfomance Tuning