Re: INDEX Performance Issue

From: Mark Davidson <mark(at)4each(dot)co(dot)uk>
To: Vasilis Ventirozos <v(dot)ventirozos(at)gmail(dot)com>
Cc: postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: INDEX Performance Issue
Date: 2013-04-08 20:57:04
Message-ID: CAPpwVW301WHdHmRj6Lv=tLGXqu0PmoxF8SNJuu8BWDbs1hffDQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Sorry Vasillis I missed you asking that I just did './pgbench -i pgbench'
didn't specific set any values. I can try some specific ones if you can
suggest any.

On 8 April 2013 21:28, Vasilis Ventirozos <v(dot)ventirozos(at)gmail(dot)com> wrote:

>
>
>
> On Mon, Apr 8, 2013 at 11:18 PM, Mark Davidson <mark(at)4each(dot)co(dot)uk> wrote:
>
>> Wow my results are absolutely appalling compared to both of those which
>> is really interesting. Are you running postgres 9.2.4 on both instances?
>> Any specific configuration changes?
>> Thinking there must be something up with my setup to be getting such a
>> low tps compared with you.
>>
>
> Both installations are 9.2.4 and both db's have absolutely default
> configurations, i can't really explain why there is so much difference
> between our results, i can only imagine the initialization, thats why i
> asked how you populated your pgbench database (scale factor / fill factor).
>
> Vasilis Ventirozos
>
>
>> On 8 April 2013 21:02, Vasilis Ventirozos <v(dot)ventirozos(at)gmail(dot)com> wrote:
>>
>>>
>>> -c 10 means 10 clients so that should take advantage of all your cores
>>> (see bellow)
>>>
>>> %Cpu0 : 39.3 us, 21.1 sy, 0.0 ni, 38.7 id, 0.9 wa, 0.0 hi, 0.0 si, 0.0 st
>>> %Cpu1 : 38.0 us, 25.0 sy, 0.0 ni, 26.0 id, 4.2 wa, 0.0 hi, 6.8 si, 0.0 st
>>> %Cpu2 : 39.3 us, 20.4 sy, 0.0 ni, 39.0 id, 1.3 wa, 0.0 hi, 0.0 si, 0.0 st
>>> %Cpu3 : 40.0 us, 18.7 sy, 0.0 ni, 40.0 id, 1.3 wa, 0.0 hi, 0.0 si, 0.0 st
>>> %Cpu4 : 13.9 us, 7.1 sy, 0.0 ni, 79.1 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
>>> %Cpu5 : 13.1 us, 8.4 sy, 0.0 ni, 78.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
>>> %Cpu6 : 14.8 us, 6.4 sy, 0.0 ni, 78.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
>>> %Cpu7 : 15.7 us, 6.7 sy, 0.0 ni, 77.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
>>>
>>> i am pasting you the results of the same test on a i7-2600 16gb with a
>>> sata3 SSD and the results from a VM with 2 cores and a normal 7200 rpm hdd
>>>
>>> -- DESKTOP
>>> vasilis(at)Disorder ~ $ pgbench -c 10 -t 10000 bench
>>> starting vacuum...end.
>>> transaction type: TPC-B (sort of)
>>> scaling factor: 1
>>> query mode: simple
>>> number of clients: 10
>>> number of threads: 1
>>> number of transactions per client: 10000
>>> number of transactions actually processed: 100000/100000
>>> tps = 1713.338111 (including connections establishing)
>>> tps = 1713.948478 (excluding connections establishing)
>>>
>>> -- VM
>>>
>>> postgres(at)pglab1:~/postgresql-9.2.4/contrib/pgbench$ ./pgbench -c 10 -t
>>> 10000 bench
>>> starting vacuum...end.
>>> transaction type: TPC-B (sort of)
>>> scaling factor: 1
>>> query mode: simple
>>> number of clients: 10
>>> number of threads: 1
>>> number of transactions per client: 10000
>>> number of transactions actually processed: 100000/100000
>>> tps = 1118.976496 (including connections establishing)
>>> tps = 1119.180126 (excluding connections establishing)
>>>
>>> i am assuming that you didn't populate your pgbench db with the default
>>> values , if you tell me how you did i will be happy to re run the test and
>>> see the differences.
>>>
>>>
>>>
>>> On Mon, Apr 8, 2013 at 10:31 PM, Mark Davidson <mark(at)4each(dot)co(dot)uk> wrote:
>>>
>>>> Thanks for your response Vasillis. I've run pgbench on both machines
>>>> `./pgbench -c 10 -t 10000 pgbench` getting 99.800650 tps on my local
>>>> machine and 23.825332 tps on the server so quite a significant difference.
>>>> Could this purely be down to the CPU clock speed or is it likely
>>>> something else causing the issue?
>>>> I have run ANALYZE on both databases and tried the queries a number of
>>>> times on each to make sure the results are consistent, this is the case.
>>>>
>>>>
>>>> On 8 April 2013 18:19, Vasilis Ventirozos <v(dot)ventirozos(at)gmail(dot)com>wrote:
>>>>
>>>>>
>>>>> Hello Mark,
>>>>> PostgreSQL currently doesn't support parallel query so a faster cpu
>>>>> even if it has less cores would be faster for a single query, about
>>>>> benchmarking you can try pgbench that you will find in the contrib,
>>>>> the execution plan may be different because of different statistics,
>>>>> have you analyzed both databases when you compared the execution plans ?
>>>>>
>>>>> Vasilis Ventirozos
>>>>>
>>>>>
>>>>> Been trying to progress with this today. Decided to setup the
>>>>>> database on my local machine to try a few things and I'm getting much more
>>>>>> sensible results and a totally different query plan
>>>>>> http://explain.depesz.com/s/KGd in this case the query took about a
>>>>>> minute but does sometimes take around 80 seconds.
>>>>>>
>>>>>> The config is exactly the same between the two database. The
>>>>>> databases them selves are identical with all indexes the same on the
>>>>>> tables.
>>>>>>
>>>>>> The server has an 2 x Intel Xeon E5420 running at 2.5Ghz each, 16GB
>>>>>> RAM and the database is just on a SATA HDD which is a Western Digital
>>>>>> WD5000AAKS.
>>>>>> My desktop has a single i5-3570K running at 3.4Ghz, 16GB RAM and the
>>>>>> database is running on a SATA HDD which is a Western Digital WD1002FAEX-0
>>>>>>
>>>>>> Could anyone offer any reasoning as to why the plan would be so
>>>>>> different across the two machines? I would have thought that the server
>>>>>> would perform a lot better since it has more cores or is postgres more
>>>>>> affected by the CPU speed? Could anyone suggest a way to bench mark the
>>>>>> machines for their postgres performance?
>>>>>>
>>>>>> Thanks again for everyones input,
>>>>>>
>>>>>> Mark
>>>>>>
>>>>>>
>>>>>> On 7 April 2013 23:22, Mark Davidson <mark(at)4each(dot)co(dot)uk> wrote:
>>>>>>
>>>>>>> Takes a little longer with the INNER join unfortunately. Takes about
>>>>>>> ~3.5 minutes, here is the query plan
>>>>>>> http://explain.depesz.com/s/EgBl.
>>>>>>>
>>>>>>> With the JOIN there might not be a match if the data does not fall
>>>>>>> within one of the areas that is selected in the IN query.
>>>>>>>
>>>>>>> So if we have data id (10) that might fall in areas ( 1, 5, 8, 167 )
>>>>>>> but the user might be querying areas ( 200 ... 500 ) so no match in area
>>>>>>> would be found just to be absolutely clear.
>>>>>>>
>>>>>>> Is it worth considering adding additional statistics on any of the
>>>>>>> columns? And / Or additional INDEXES or different types INDEX? Would it be
>>>>>>> worth restructuring the query starting with areas and working to join data
>>>>>>> to that?
>>>>>>>
>>>>>>>
>>>>>>> On 7 April 2013 16:15, Kevin Grittner <kgrittn(at)ymail(dot)com> wrote:
>>>>>>>
>>>>>>>> Greg Williamson <gwilliamson39(at)yahoo(dot)com> wrote:
>>>>>>>>
>>>>>>>> >> Thanks for your response. I tried doing what you suggested so
>>>>>>>> >> that table now has a primary key of
>>>>>>>> >> ' CONSTRAINT data_area_pkey PRIMARY KEY(area_id , data_id ); '
>>>>>>>> >> and I've added the INDEX of
>>>>>>>> >> 'CREATE INDEX data_area_data_id_index ON data_area USING btree
>>>>>>>> (data_id );'
>>>>>>>>
>>>>>>>> Yeah, that is what I was suggesting.
>>>>>>>>
>>>>>>>> >> unfortunately it hasn't resulted in an improvement of the query
>>>>>>>> >> performance.
>>>>>>>>
>>>>>>>> > Did you run analyze on the table after creating the index ?
>>>>>>>>
>>>>>>>> That probably isn't necessary. Statistics are normally on relations
>>>>>>>> and columns; there are only certain special cases where an ANALYZE
>>>>>>>> is needed after an index build, like if the index is on an
>>>>>>>> expression rather than a list of columns.
>>>>>>>>
>>>>>>>> Mark, what happens if you change that left join to a normal (inner)
>>>>>>>> join? Since you're doing an inner join to data_area and that has a
>>>>>>>> foreign key to area, there should always be a match anyway, right?
>>>>>>>> The optimizer doesn't recognize that, so it can't start from the
>>>>>>>> area and just match to the appropriate points.
>>>>>>>>
>>>>>>>> --
>>>>>>>> Kevin Grittner
>>>>>>>> EnterpriseDB: http://www.enterprisedb.com
>>>>>>>> The Enterprise PostgreSQL Company
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Mark Davidson 2013-04-08 21:01:03 Re: INDEX Performance Issue
Previous Message Jeff Janes 2013-04-08 20:39:23 Re: INDEX Performance Issue