Re: Introducing the TPC-V benchmark, and its relationship to PostgreSQL

From: Reza Taheri <rtaheri(at)vmware(dot)com>
To: Greg Smith <greg(at)2ndQuadrant(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Introducing the TPC-V benchmark, and its relationship to PostgreSQL
Date: 2012-07-06 03:22:36
Message-ID: 66CE997FB523C04E9749452273184C6C137CC5C395@exch-mbx-113.vmware.com
Lists: pgsql-performance

Hi Greg,
Yes, a single-instance benchmark is a natural fall-out from the TPC-V kit. Our coding team (4 people working directly on the benchmark with another 3-4 folks helping in various consulting capacities) is tasked with creating a multi-VM benchmark. The benchmark is still missing the critical Market Exchange Emulator function. Once that's done, it would be natural for someone else in the TPC to take our working prototype and simplify it for a single-system, TPC-E (not V) reference kit. The conversion is not technically difficult, but releasing kits is a new path for the TPC, and will take some work.

Cheers,
Reza

> -----Original Message-----
> From: Greg Smith [mailto:greg(at)2ndQuadrant(dot)com]
> Sent: Thursday, July 05, 2012 6:25 PM
> To: Reza Taheri
> Cc: pgsql-performance(at)postgresql(dot)org; Andy Bond (abond(at)redhat(dot)com);
> Greg Kopczynski; Jignesh Shah
> Subject: Re: [PERFORM] Introducing the TPC-V benchmark, and its
> relationship to PostgreSQL
>
> On 07/03/2012 07:08 PM, Reza Taheri wrote:
> > TPC-V is a new benchmark under development for virtualized databases.
> > A TPC-V configuration has:
> >
> > - multiple virtual machines running a mix of DSS, OLTP, and business
> > logic apps
> >
> > - VMs running with throughputs ranging from 10% to 40% of the total
> > system ..
>
> I think this would be a lot more interesting to the traditional, dedicated
> hardware part of the PostgreSQL community if there was a clear way to run
> this with only a single active machine too. If it's possible for us to use this to
> compare instances of PostgreSQL on dedicated hardware, too, that is
> enormously more valuable to people with larger installations. It might be
> helpful to VMware as well. Being able to say "this VM install gets X% of the
> performance of a bare-metal install" answers a question I get asked all the
> time--when people want to decide between dedicated and virtual setups.
>
> The PostgreSQL community could use a benchmark like this for its own
> performance regression testing too. A lot of that work is going to happen on
> dedicated machines.
>
> > After waving our hands through a number
> > of small differences between the platforms, we have calculated a CPU
> > cost of around 3.2ms/transaction for the published MS SQL results,
> > versus a measurement of 8.6ms/transaction for PostgreSQL. (TPC
> > benchmarks are typically pushed to full CPU utilization. One removes
> > all bottlenecks in storage, networking, etc., to achieve the 100% CPU
> > usage.
> > So CPU cost/tran is the final decider of performance.) So we need to
> > cut the CPU cost of transactions in half to make publications with
> > PostgreSQL comparable to commercial databases.
>
> I appreciate that getting close to parity here is valuable. This situation is so
> synthetic though--removing other bottlenecks and looking at CPU timing--
> that it's hard to get too excited about optimizing for it. There's a lot of
> things in PostgreSQL that we know are slower than commercial databases
> because they're optimized for flexibility (the way operators are
> implemented is the best example) or for long-term code maintenance.
> Microsoft doesn't care if they have terribly ugly code that runs faster,
> because no one sees that code. PostgreSQL does care.
>
> The fairer measure is one based on system cost. What I've found is
> that a fair number of people note PostgreSQL's low-level code isn't quite as
> fast as some of the less flexible alternatives--hard coding operators is surely
> cheaper than looking them up each time--but the license cost savings more
> than pays for bigger hardware to offset that. I wish I had any customer
> whose database was CPU bound; that would be an awesome world to live
> in.
>
> Anyway, guessing at causes here is premature speculation. When there's
> some code for the test kit published, at that point discussing the particulars
> of why it's not running well will get interesting.
>
> --
> Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
> PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
