Re: INDEX Performance Issue

From: Mark Davidson <mark(at)4each(dot)co(dot)uk>
To: Kevin Grittner <kgrittn(at)ymail(dot)com>
Cc: Greg Williamson <gwilliamson39(at)yahoo(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: INDEX Performance Issue
Date: 2013-04-08 17:02:59
Message-ID: CAPpwVW36gmHUG1d6KJ6tNyKXT7udJBsUDyUNf6gkBM_28rSe3w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Been trying to progress with this today. Decided to setup the database on
my local machine to try a few things and I'm getting much more sensible
results and a totally different query plan
http://explain.depesz.com/s/KGdin this case the query took about a
minute but does sometimes take around
80 seconds.

The config is exactly the same between the two database. The databases them
selves are identical with all indexes the same on the tables.

The server has an 2 x Intel Xeon E5420 running at 2.5Ghz each, 16GB RAM and
the database is just on a SATA HDD which is a Western Digital WD5000AAKS.
My desktop has a single i5-3570K running at 3.4Ghz, 16GB RAM and the
database is running on a SATA HDD which is a Western Digital WD1002FAEX-0

Could anyone offer any reasoning as to why the plan would be so different
across the two machines? I would have thought that the server would perform
a lot better since it has more cores or is postgres more affected by the
CPU speed? Could anyone suggest a way to bench mark the machines for their
postgres performance?

Thanks again for everyones input,

Mark

On 7 April 2013 23:22, Mark Davidson <mark(at)4each(dot)co(dot)uk> wrote:

> Takes a little longer with the INNER join unfortunately. Takes about ~3.5
> minutes, here is the query plan http://explain.depesz.com/s/EgBl.
>
> With the JOIN there might not be a match if the data does not fall within
> one of the areas that is selected in the IN query.
>
> So if we have data id (10) that might fall in areas ( 1, 5, 8, 167 ) but
> the user might be querying areas ( 200 ... 500 ) so no match in area would
> be found just to be absolutely clear.
>
> Is it worth considering adding additional statistics on any of the
> columns? And / Or additional INDEXES or different types INDEX? Would it be
> worth restructuring the query starting with areas and working to join data
> to that?
>
>
> On 7 April 2013 16:15, Kevin Grittner <kgrittn(at)ymail(dot)com> wrote:
>
>> Greg Williamson <gwilliamson39(at)yahoo(dot)com> wrote:
>>
>> >> Thanks for your response. I tried doing what you suggested so
>> >> that table now has a primary key of
>> >> ' CONSTRAINT data_area_pkey PRIMARY KEY(area_id , data_id ); '
>> >> and I've added the INDEX of
>> >> 'CREATE INDEX data_area_data_id_index ON data_area USING btree
>> (data_id );'
>>
>> Yeah, that is what I was suggesting.
>>
>> >> unfortunately it hasn't resulted in an improvement of the query
>> >> performance.
>>
>> > Did you run analyze on the table after creating the index ?
>>
>> That probably isn't necessary. Statistics are normally on relations
>> and columns; there are only certain special cases where an ANALYZE
>> is needed after an index build, like if the index is on an
>> expression rather than a list of columns.
>>
>> Mark, what happens if you change that left join to a normal (inner)
>> join? Since you're doing an inner join to data_area and that has a
>> foreign key to area, there should always be a match anyway, right?
>> The optimizer doesn't recognize that, so it can't start from the
>> area and just match to the appropriate points.
>>
>> --
>> Kevin Grittner
>> EnterpriseDB: http://www.enterprisedb.com
>> The Enterprise PostgreSQL Company
>>
>
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Vasilis Ventirozos 2013-04-08 17:19:21 Re: INDEX Performance Issue
Previous Message Tom Lane 2013-04-08 16:46:26 Re: What happens between end of explain analyze and end of query execution ?