Re: INDEX Performance Issue

From: Vasilis Ventirozos <v(dot)ventirozos(at)gmail(dot)com>
To: postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: INDEX Performance Issue
Date: 2013-04-08 20:02:58
Message-ID: CAF8jcqrCi1ad3AOWWrd=bkzaieedwSeX2up5pgmhLjdec9_ydw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

-c 10 means 10 clients so that should take advantage of all your cores (see
bellow)

%Cpu0 : 39.3 us, 21.1 sy, 0.0 ni, 38.7 id, 0.9 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu1 : 38.0 us, 25.0 sy, 0.0 ni, 26.0 id, 4.2 wa, 0.0 hi, 6.8 si, 0.0 st
%Cpu2 : 39.3 us, 20.4 sy, 0.0 ni, 39.0 id, 1.3 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu3 : 40.0 us, 18.7 sy, 0.0 ni, 40.0 id, 1.3 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu4 : 13.9 us, 7.1 sy, 0.0 ni, 79.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu5 : 13.1 us, 8.4 sy, 0.0 ni, 78.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu6 : 14.8 us, 6.4 sy, 0.0 ni, 78.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
%Cpu7 : 15.7 us, 6.7 sy, 0.0 ni, 77.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st

i am pasting you the results of the same test on a i7-2600 16gb with a
sata3 SSD and the results from a VM with 2 cores and a normal 7200 rpm hdd

-- DESKTOP
vasilis(at)Disorder ~ $ pgbench -c 10 -t 10000 bench
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 1
number of transactions per client: 10000
number of transactions actually processed: 100000/100000
tps = 1713.338111 (including connections establishing)
tps = 1713.948478 (excluding connections establishing)

-- VM

postgres(at)pglab1:~/postgresql-9.2.4/contrib/pgbench$ ./pgbench -c 10 -t
10000 bench
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 1
number of transactions per client: 10000
number of transactions actually processed: 100000/100000
tps = 1118.976496 (including connections establishing)
tps = 1119.180126 (excluding connections establishing)

i am assuming that you didn't populate your pgbench db with the default
values , if you tell me how you did i will be happy to re run the test and
see the differences.

On Mon, Apr 8, 2013 at 10:31 PM, Mark Davidson <mark(at)4each(dot)co(dot)uk> wrote:

> Thanks for your response Vasillis. I've run pgbench on both machines
> `./pgbench -c 10 -t 10000 pgbench` getting 99.800650 tps on my local
> machine and 23.825332 tps on the server so quite a significant difference.
> Could this purely be down to the CPU clock speed or is it likely something
> else causing the issue?
> I have run ANALYZE on both databases and tried the queries a number of
> times on each to make sure the results are consistent, this is the case.
>
>
> On 8 April 2013 18:19, Vasilis Ventirozos <v(dot)ventirozos(at)gmail(dot)com> wrote:
>
>>
>> Hello Mark,
>> PostgreSQL currently doesn't support parallel query so a faster cpu even
>> if it has less cores would be faster for a single query, about benchmarking
>> you can try pgbench that you will find in the contrib,
>> the execution plan may be different because of different statistics, have
>> you analyzed both databases when you compared the execution plans ?
>>
>> Vasilis Ventirozos
>>
>>
>> Been trying to progress with this today. Decided to setup the database
>>> on my local machine to try a few things and I'm getting much more sensible
>>> results and a totally different query plan
>>> http://explain.depesz.com/s/KGd in this case the query took about a
>>> minute but does sometimes take around 80 seconds.
>>>
>>> The config is exactly the same between the two database. The databases
>>> them selves are identical with all indexes the same on the tables.
>>>
>>> The server has an 2 x Intel Xeon E5420 running at 2.5Ghz each, 16GB RAM
>>> and the database is just on a SATA HDD which is a Western Digital
>>> WD5000AAKS.
>>> My desktop has a single i5-3570K running at 3.4Ghz, 16GB RAM and the
>>> database is running on a SATA HDD which is a Western Digital WD1002FAEX-0
>>>
>>> Could anyone offer any reasoning as to why the plan would be so
>>> different across the two machines? I would have thought that the server
>>> would perform a lot better since it has more cores or is postgres more
>>> affected by the CPU speed? Could anyone suggest a way to bench mark the
>>> machines for their postgres performance?
>>>
>>> Thanks again for everyones input,
>>>
>>> Mark
>>>
>>>
>>> On 7 April 2013 23:22, Mark Davidson <mark(at)4each(dot)co(dot)uk> wrote:
>>>
>>>> Takes a little longer with the INNER join unfortunately. Takes about
>>>> ~3.5 minutes, here is the query plan http://explain.depesz.com/s/EgBl.
>>>>
>>>> With the JOIN there might not be a match if the data does not fall
>>>> within one of the areas that is selected in the IN query.
>>>>
>>>> So if we have data id (10) that might fall in areas ( 1, 5, 8, 167 )
>>>> but the user might be querying areas ( 200 ... 500 ) so no match in area
>>>> would be found just to be absolutely clear.
>>>>
>>>> Is it worth considering adding additional statistics on any of the
>>>> columns? And / Or additional INDEXES or different types INDEX? Would it be
>>>> worth restructuring the query starting with areas and working to join data
>>>> to that?
>>>>
>>>>
>>>> On 7 April 2013 16:15, Kevin Grittner <kgrittn(at)ymail(dot)com> wrote:
>>>>
>>>>> Greg Williamson <gwilliamson39(at)yahoo(dot)com> wrote:
>>>>>
>>>>> >> Thanks for your response. I tried doing what you suggested so
>>>>> >> that table now has a primary key of
>>>>> >> ' CONSTRAINT data_area_pkey PRIMARY KEY(area_id , data_id ); '
>>>>> >> and I've added the INDEX of
>>>>> >> 'CREATE INDEX data_area_data_id_index ON data_area USING btree
>>>>> (data_id );'
>>>>>
>>>>> Yeah, that is what I was suggesting.
>>>>>
>>>>> >> unfortunately it hasn't resulted in an improvement of the query
>>>>> >> performance.
>>>>>
>>>>> > Did you run analyze on the table after creating the index ?
>>>>>
>>>>> That probably isn't necessary. Statistics are normally on relations
>>>>> and columns; there are only certain special cases where an ANALYZE
>>>>> is needed after an index build, like if the index is on an
>>>>> expression rather than a list of columns.
>>>>>
>>>>> Mark, what happens if you change that left join to a normal (inner)
>>>>> join? Since you're doing an inner join to data_area and that has a
>>>>> foreign key to area, there should always be a match anyway, right?
>>>>> The optimizer doesn't recognize that, so it can't start from the
>>>>> area and just match to the appropriate points.
>>>>>
>>>>> --
>>>>> Kevin Grittner
>>>>> EnterpriseDB: http://www.enterprisedb.com
>>>>> The Enterprise PostgreSQL Company
>>>>>
>>>>
>>>>
>>>
>>
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Mark Davidson 2013-04-08 20:18:52 Re: INDEX Performance Issue
Previous Message Mark Davidson 2013-04-08 19:31:02 Re: INDEX Performance Issue