Re: INDEX Performance Issue

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Mark Davidson <mark(at)4each(dot)co(dot)uk>
Cc: Kevin Grittner <kgrittn(at)ymail(dot)com>, Greg Williamson <gwilliamson39(at)yahoo(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: INDEX Performance Issue
Date: 2013-04-15 20:16:17
Message-ID: CAMkU=1zbqMMxpJ+nkE=WyCiGiYscDikv10T45O9HnWrnss95=A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Mon, Apr 8, 2013 at 10:02 AM, Mark Davidson <mark(at)4each(dot)co(dot)uk> wrote:

> Been trying to progress with this today. Decided to setup the database on
> my local machine to try a few things and I'm getting much more sensible
> results and a totally different query plan http://explain.depesz.com/s/KGdin this case the query took about a minute but does sometimes take around
> 80 seconds.
>
> The config is exactly the same between the two database. The databases
> them selves are identical with all indexes the same on the tables.
>
> The server has an 2 x Intel Xeon E5420 running at 2.5Ghz each, 16GB RAM
> and the database is just on a SATA HDD which is a Western Digital
> WD5000AAKS.
> My desktop has a single i5-3570K running at 3.4Ghz, 16GB RAM and the
> database is running on a SATA HDD which is a Western Digital WD1002FAEX-0
>
> Could anyone offer any reasoning as to why the plan would be so different
> across the two machines?
>

The estimated costs of the two plans are very close to each other, so it
doesn't take much to cause a switch to happen.

Is the test instance a binary copy of the production one (i.e. created from
a base backup) or is it only a logical copy (e.g. pg_dump followed by a
restore)? A logical copy will probably be more compact than the original
and so will have different slightly estimates.

You could check pg_class for relpages on all relevant tables and indexes on
both servers.

Also, since ANALYZE uses a random sampling for large tables, the estimates
can move around just by chance. If you repeat the query several times with
an ANALYZE in between, does the plan change, or if not how much does the
estimated cost change within the plan? You could check pg_stats for the
relevant tables and columns between the two servers to see how similar they
are.

The estimated cost of a hash join is very dependent on how frequent the
most common value of the hashed column is thought to be. And the estimate
of this number can be very fragile if ANALYZE is based on a small fraction
of the table. Turning up the statistics for those columns might be
worthwhile.

Cheers,

Jeff

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Anne Rosset 2013-04-15 20:59:50 Poor performance on an aggregate query
Previous Message Jeff Janes 2013-04-15 19:37:33 Re: INDEX Performance Issue