Re: High rate of transaction failure with the Serializable Isolation Level

From: Ryan Johnson <ryan(dot)johnson(at)cs(dot)utoronto(dot)ca>
To: Reza Taheri <rtaheri(at)vmware(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: High rate of transaction failure with the Serializable Isolation Level
Date: 2014-08-02 18:12:26
Message-ID: 53DD2A0A.2020904@cs.utoronto.ca
Lists: pgsql-performance

Great, thanks. I'll look into it when I get a few minutes.

Ryan

On 28/07/2014 11:57 PM, Reza Taheri wrote:
> Hi Ryan,
> We presented a paper at the TPCTC of last year's VLDB (attached). It described the architecture of the kit, and some of the tuning. Another tuning change was setting /proc/sys/vm/dirty_background_bytes to a small value (like 10000000) on very-large memory machines, which was a problem I brought up on this same mailing list a while ago and got great advice. Also, make sure you do a SQLFreeStmt(stmt, SQL_DROP) at the end of transactions, not SQL_CLOSE.
>
> Let me know if you have any questions about the paper.
>
> Thanks,
> Reza
>
>> -----Original Message-----
>> From: Ryan Johnson [mailto:ryan(dot)johnson(at)cs(dot)utoronto(dot)ca]
>> Sent: Saturday, July 26, 2014 4:51 PM
>> To: Reza Taheri
>> Cc: pgsql-performance(at)postgresql(dot)org
>> Subject: Re: High rate of transaction failure with the Serializable Isolation
>> Level
>>
>> That does sound pretty similar, modulo the raw performance difference. I
>> have no idea how many MEE threads there were; it was just a quick run with
>> exactly zero tuning, so I used whatever dbt5 does out of the box.
>> Actually, though, if you have any general tuning tips for TPC-E I'd be
>> interested to learn them (PM if that's off topic for this discussion).
>>
>> Regards,
>> Ryan
>>
>> On 26/07/2014 7:33 PM, Reza Taheri wrote:
>>> Hi Ryan,
>>> Thanks a lot for sharing this. When I run with 12 CE threads and 3-5 MEE
>>> threads (how many MEE threads do you have?) @ 80-90 tps, I get something
>>> in the range of 20-30% of trade-result transactions rolled back, depending
>>> on how I count. E.g., in a 5.5-minute run with 3 MEE threads, I saw 87.5 tps.
>>> There were 29200 successful trade-result transactions. Of these, 5800 were
>>> rolled back, some more than once, for a total of 8450 rollbacks. So I'd say
>>> your results and ours tell similar stories!
>>> Thanks,
>>> Reza
>>>
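The "20-30% depending on how I count" range in the message above can be reproduced from the three figures it quotes. The sketch below is only an illustration: the variable names are made up, and it assumes that the 29200 "successful" transactions are the ones that eventually committed and that every rollback was followed by a retry of the same transaction.

```python
# Figures quoted in the message above (5.5-minute run, 3 MEE threads).
successful = 29_200        # trade-result transactions that eventually committed (assumed meaning)
rolled_back_txns = 5_800   # of those, transactions rolled back at least once
total_rollbacks = 8_450    # rollback events, counting repeated rollbacks


def pct(numerator: float, denominator: float) -> float:
    """Return numerator/denominator as a percentage."""
    return 100.0 * numerator / denominator


# (a) Share of committed transactions that were rolled back at least once.
share_retried = pct(rolled_back_txns, successful)

# (b) Rollback events per 100 committed transactions.
rollbacks_per_100_commits = pct(total_rollbacks, successful)

# (c) Share of all attempts (commits + rollbacks) that ended in a rollback.
#     Assumes each rollback corresponds to exactly one failed attempt.
share_of_attempts = pct(total_rollbacks, successful + total_rollbacks)

# Average number of rollbacks for a transaction that rolled back at all.
avg_rollbacks_per_retried_txn = total_rollbacks / rolled_back_txns

print(f"(a) transactions rolled back at least once: {share_retried:.1f}%")
print(f"(b) rollbacks per 100 commits:              {rollbacks_per_100_commits:.1f}%")
print(f"(c) rollbacks as a share of all attempts:   {share_of_attempts:.1f}%")
print(f"average rollbacks per retried transaction:  {avg_rollbacks_per_retried_txn:.2f}")
```

Counting by transaction gives roughly 20%, while counting every rollback event against the commits gives close to 29%, which brackets the range quoted in the message.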
>>>> -----Original Message-----
>>>> From: pgsql-performance-owner(at)postgresql(dot)org [mailto:pgsql-performance-owner(at)postgresql(dot)org] On Behalf Of Ryan Johnson
>>>> Sent: Saturday, July 26, 2014 2:06 PM
>>>> To: Reza Taheri
>>>> Cc: pgsql-performance(at)postgresql(dot)org
>>>> Subject: Re: High rate of transaction failure with the Serializable
>>>> Isolation Level
>>>>
>>>> Dredging through some old run logs, 12 dbt-5 clients gave the
>>>> following when everything was run under SSI (fully serializable, even
>>>> the transactions that allow repeatable read isolation). Not sure how that
>>>> translates to your results.
>>>> Abort rates were admittedly rather high, though perhaps lower than
>>>> what you report.
>>>>
>>>> Transaction             % Average: 90th %   Total Rollbacks    % Warning Invalid
>>>> ----------------- ------- --------------- ------- -------------- ------- -------
>>>> Trade Result        5.568    0.022: 0.056    2118    417  19.69%       0      91
>>>> Broker Volume       5.097    0.009: 0.014    1557      0   0.00%       0       0
>>>> Customer Position  13.530    0.016: 0.034    4134      1   0.02%       0       0
>>>> Market Feed         0.547    0.033: 0.065     212     45  21.23%       0      69
>>>> Market Watch       18.604    0.031: 0.061    5683      0   0.00%       0       0
>>>> Security Detail    14.462    0.015: 0.020    4418      0   0.00%       0       0
>>>> Trade Lookup        8.325    0.059: 0.146    2543      0   0.00%     432       0
>>>> Trade Order         9.110    0.006: 0.008    3227    444  13.76%       0       0
>>>> Trade Status       19.795    0.030: 0.046    6047      0   0.00%       0       0
>>>> Trade Update        1.990    0.064: 0.145     608      0   0.00%     432       0
>>>> Data Maintenance      N/A    0.012: 0.012       1      0   0.00%       0       0
>>>> ----------------- ------- --------------- ------- -------------- ------- -------
>>>> 28.35 trade-result transactions per second (trtps)
>>>>
>>>> Regards,
>>>> Ryan
>>>>
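As a quick check on how to read the report above, the rollback percentage in each row matches Rollbacks divided by Total. The sketch below recomputes it for the four transaction types with non-zero rollbacks; the dictionary layout and function name are illustrative only, and the numbers are copied from the table.

```python
# (total, rollbacks) per transaction type, copied from the dbt-5 report above.
report = {
    "Trade Result": (2118, 417),
    "Customer Position": (4134, 1),
    "Market Feed": (212, 45),
    "Trade Order": (3227, 444),
}


def rollback_pct(total: int, rollbacks: int) -> float:
    """Rollbacks as a percentage of the Total column."""
    return 100.0 * rollbacks / total


for name, (total, rollbacks) in report.items():
    print(f"{name:<17} {rollback_pct(total, rollbacks):6.2f}%")
```

Read this way, the 19.69% for Trade Result is the figure most directly comparable to the trade-result rollback rates discussed elsewhere in the thread.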
>>>> On 26/07/2014 3:55 PM, Reza Taheri wrote:
>>>>> Hi Ryan,
>>>>> That's a very good point. We are looking at dbt5. One question: what
>>>>> throughput rate, and how many threads of execution did you use for
>>>>> dbt5?
>>>>> The failure rates I reported were at ~120 tps with 15 trade-result threads.
>>>>> Thanks,
>>>>> Reza
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: pgsql-performance-owner(at)postgresql(dot)org [mailto:pgsql-performance-owner(at)postgresql(dot)org] On Behalf Of Ryan Johnson
>>>>>> Sent: Friday, July 25, 2014 2:36 PM
>>>>>> To: pgsql-performance(at)postgresql(dot)org
>>>>>> Subject: Re: High rate of transaction failure with the Serializable
>>>>>> Isolation Level
>>>>>>
>>>>>> On 25/07/2014 2:58 PM, Reza Taheri wrote:
>>>>>>> Hi Craig,
>>>>>>>
>>>>>>>> According to the attached SQL, each frame is a separate phase in
>>>>>>>> the operation and performs many different operations.
>>>>>>>> There's a *lot* going on here, so identifying possible
>>>>>>>> interdependencies isn't something I can do in a ten minute skim
>>>>>>>> read over my morning coffee.
>>>>>>> You didn't think I was going to bug you all with a trivial
>>>>>>> problem, did you? :-) :-)
>>>>>>>
>>>>>>> Yes, I am going to have to take an axe to the code and see what
>>>>>>> pops out.
>>>>>>> Just to put this in perspective, the transaction flow and its
>>>>>>> statements are borrowed verbatim from the TPC-E benchmark. There
>>>>>>> have been dozens of TPC-E disclosures with MS SQL Server, and there
>>>>>>> are Oracle and DB2 kits that, although not used in public disclosures
>>>>>>> for various non-technical reasons, are used internally by the DB
>>>>>>> and server companies. These 3 products, and perhaps more, were used
>>>>>>> extensively in the prototyping phase of TPC-E.
>>>>>>> So, my hope is that if there is a "previously unidentified
>>>>>>> interdependency between transactions" as you point out, it will be
>>>>>>> due to a mistake we made in coding this for PGSQL. Otherwise, we will
>>>>>>> have a hard time convincing all the council member companies that we
>>>>>>> need to change the schema or the business logic to make the kit work
>>>>>>> with PGSQL.
>>>>>>> Just pointing out my uphill battle!!
>>>>>> You might compare against dbt-5 [1], just to see if the same
>>>>>> problem occurs. I didn't notice such high abort rates when I ran
>>>>>> that workload a few weeks ago. Just make sure to use the latest
>>>>>> commit, because the "released" version has fatal bugs.
>>>>>>
>>>>>> [1] https://urldefense.proofpoint.com/v1/url?u=https://github.com/petergeoghegan/dbt5&k=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0A&r=b9TKmA0CPjroD2HLPTHU27nI9PJr8wgKO2rU9QZyZZU%3D%0A&m=6E%2F9fWJPMGjpMyPxtY0nsamLLW%2FNsTXu7FP9Wzauj10%3D%0A&s=b3f269216d419410f3f07bb774a27b7d377744c9d423df52a3e62324d9279958
>>>>>>
>>>>>> Ryan
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Sent via pgsql-performance mailing list
>>>>>> (pgsql-performance(at)postgresql(dot)org)
>>>>>> To make changes to your subscription:
>>>>>> https://urldefense.proofpoint.com/v1/url?u=http://www.postgresql.org/mailpref/pgsql-performance&k=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0A&r=b9TKmA0CPjroD2HLPTHU27nI9PJr8wgKO2rU9QZyZZU%3D%0A&m=6E%2F9fWJPMGjpMyPxtY0nsamLLW%2FNsTXu7FP9Wzauj10%3D%0A&s=45ab94ce068dbe28956af8bb3f999e9a91138dd1e3c3345c036e87e902da1ef1
>>>>
