Re: [GENERAL] PostgreSQL TPC-H test result?

From: "Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>
To: "Jignesh K(dot) Shah" <J(dot)K(dot)Shah(at)sun(dot)com>
Cc: "Greg Smith" <gsmith(at)gregsmith(dot)com>, Amber <guxiaobo1982(at)hotmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: [GENERAL] PostgreSQL TPC-H test result?
Date: 2008-09-12 04:46:55
Message-ID: 36e682920809112146y2a1f43cdk5bc2b085de92e16d@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-performance

On Thu, Sep 11, 2008 at 11:30 PM, Jignesh K. Shah <J(dot)K(dot)Shah(at)sun(dot)com> wrote:
> Moving this thread to Performance alias as it might make more sense for
> folks searching on this topic:

You should be using DBT-3. Similarly, a scale factor of 10 is
pointless. How many data warehouses are only 10GB? Also, it's
well-known that MonetDB will quickly fall over when you run a test
larger than can fit in memory. In the real benchmark, the minimum
scale factor is 100GB; try it and see what you get.

If you have the resources and want to compare it to something, compare
it with Oracle on the exact same system. If tuned properly, Oracle
10g (Standard Edition with the exact same tables/indexes/queries as
Postgres) is ~5-10x faster and Enterprise Edition is ~50-100x faster.
To be fair, an Oracle Enterprise Edition configuration for TPC-H uses
advanced partitioning and materialized views, both of which Postgres
does not support, which makes it an apples-to-oranges comparison. I
haven't tried 11g, but I expect it will perform a bit better in this
area given several of the enhancements. Also, while it's not widely
known, if you wanted to compare systems and don't want to set it all
up yourself, Oracle released Oracle-compatible versions of OSDL's
Database Test Suite for DBT-2 (TPC-C) and DBT-3 (TPC-H) as part of
their Linux Test Kit which can be found at oss.oracle.com. Due to
Oracle's license, I can't give you exact timings, but I have confirmed
with several other benchmark professionals that the results mentioned
above have been confirmed by others as well.

To be clear, I'm not trying to bash on PG and I don't want to start a
flame-war. I just think that people should be aware of where we stand
in comparison to commercial systems and understand that there's quite
a bit of work to be done in the VLDB area. Specifically, I don't
think we should be striving for great TPC-H performance, but I believe
there is some areas we could improve based on it. Similarly, this is
an area where a properly-utilized fadvise may show some benefit.

As for running the TPC-H on Postgres, you need a
default_statistics_target of at least 250. IIRC, the last time I
checked (on 8.3), you really needed a statistics target around
400-500. For the most part, the planner is choosing a bad plan for
several of the queries. After you resolve that, you'll quickly notice
that Postgres' buffer manager design and the lack of a good
multi-block read quickly comes into play. The hash join
implementation also has a couple issues which I've recently seen
mentioned in other threads.

Use DBT-3, it will save you quite a few headaches :)

--
Jonah H. Harris, Senior DBA
myYearbook.com

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Artacus 2008-09-12 05:11:13 Re: declare column update expression
Previous Message Tom Lane 2008-09-12 03:48:06 Re: [GENERAL] PostgreSQL TPC-H test result?

Browse pgsql-performance by date

  From Date Subject
Next Message Matthew Wakeling 2008-09-12 07:45:10 Re: Choosing a filesystem
Previous Message Greg Smith 2008-09-12 04:09:21 Re: Effects of setting linux block device readahead size