Re: TPC-R benchmarks

From: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>
To: Oleg Lebedev <oleg(dot)lebedev(at)waterford(dot)org>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: TPC-R benchmarks
Date: 2003-10-01 21:59:41
Message-ID: Pine.LNX.4.33.0310011556110.22864-100000@css120.ihs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

For troubleshooting, can you try it with "set enable_nestloop = false" and
rerun the query and see how long it takes?

It looks like the estimates of rows returned is WAY off (estimate is too
low compared to what really comes back.)

Also, you might try to alter the table.column to have a higher target on
the rows p_partkey and ps_partkey and any others where the estimate is so
far off of the reality.

On Wed, 1 Oct 2003, Oleg Lebedev wrote:

> All right, my query just finished running with EXPLAIN ANALYZE.
> I show the plan below and also attached it as a file.
> Any ideas?
>
> -> Sort (cost=54597.49..54597.50 rows=1 width=121) (actual
> time=6674562.03..6674562.15 rows=175 loops=1)
> Sort Key: nation.n_name, date_part('year'::text,
> orders.o_orderdate)
> -> Aggregate (cost=54597.45..54597.48 rows=1 width=121)
> (actual time=6668919.41..6674522.48 rows=175 loops=1)
> -> Group (cost=54597.45..54597.47 rows=3 width=121)
> (actual time=6668872.68..6672136.96 rows=348760 loops=1)
> -> Sort (cost=54597.45..54597.46 rows=3
> width=121) (actual time=6668872.65..6669499.95 rows=348760 loops=1)
> Sort Key: nation.n_name,
> date_part('year'::text, orders.o_orderdate)
> -> Hash Join (cost=54596.00..54597.42
> rows=3
> width=121) (actual time=6632768.89..6650192.67 rows=348760 loops=1)
> Hash Cond: ("outer".n_nationkey =
> "inner".s_nationkey)
> -> Seq Scan on nation
> (cost=0.00..1.25 rows=25 width=33) (actual time=6.75..7.13 rows=25
> loops=1)
> -> Hash (cost=54596.00..54596.00
> rows=3
> width=88) (actual time=6632671.96..6632671.96 rows=0 loops=1)
> -> Nested Loop
> (cost=0.00..54596.00 rows=3 width=88) (actual time=482.41..6630601.46
> rows=348760 loops=1)
> Join Filter:
> ("inner".s_suppkey = "outer".l_suppkey)
> -> Nested Loop
> (cost=0.00..54586.18 rows=3 width=80) (actual time=383.87..6594984.40
> rows=348760 loops=1)
> -> Nested Loop
> (cost=0.00..54575.47 rows=4 width=68) (actual time=199.95..3580882.07
> rows=348760 loops=1)
> Join Filter:
> ("outer".p_partkey = "inner".ps_partkey)
> -> Nested Loop
> (cost=0.00..22753.33 rows=9343 width=49) (actual time=146.85..3541433.10
> rows=348760 loops=1)
> -> Seq
> Scan on part (cost=0.00..7868.00 rows=320 width=4) (actual
> time=33.64..15651.90 rows=11637 loops=1)
>
> Filter: (p_name ~~ '%green%'::text)
> -> Index
> Scan using i_l_partkey on lineitem (cost=0.00..46.15 rows=29 width=45)
> (actual time=10.71..302.67 rows=30 loops=11637)
>
> Index
> Cond: ("outer".p_partkey = lineitem.l_partkey)
> -> Index Scan
> using pk_partsupp on partsupp (cost=0.00..3.39 rows=1 width=19) (actual
> time=0.09..0.09 rows=1 loops=348760)
> Index
> Cond: ((partsupp.ps_partkey = "outer".l_partkey) AND
> (partsupp.ps_suppkey =
> "outer".l_suppkey))
> -> Index Scan using
> pk_orders on orders (cost=0.00..3.01 rows=1 width=12) (actual
> time=8.62..8.62 rows=1 loops=348760)
> Index Cond:
> (orders.o_orderkey = "outer".l_orderkey)
> -> Index Scan using
> pk_supplier on supplier (cost=0.00..3.01 rows=1 width=8) (actual
> time=0.08..0.08 rows=1 loops=348760)
> Index Cond:
> ("outer".ps_suppkey = supplier.s_suppkey) Total runtime: 6674724.23
> msec (28 rows)
>
>
> -----Original Message-----
> From: Oleg Lebedev
> Sent: Wednesday, October 01, 2003 12:00 PM
> To: Josh Berkus; scott.marlowe
> Cc: pgsql-performance(at)postgresql(dot)org
> Subject: Re: [PERFORM] TPC-R benchmarks
> Importance: Low
>
>
> Sure, below is the query. I attached the plan to this posting.
>
> select
> nation,
> o_year,
> sum(amount) as sum_profit
> from
> (
> select
> n_name as nation,
> extract(year from o_orderdate) as o_year,
> l_extendedprice * (1 - l_discount) -
> ps_supplycost * l_quantity as amount
> from
> part,
> supplier,
> lineitem,
> partsupp,
> orders,
> nation
> where
> s_suppkey = l_suppkey
> and ps_suppkey = l_suppkey
> and ps_partkey = l_partkey
> and p_partkey = l_partkey
> and o_orderkey = l_orderkey
> and s_nationkey = n_nationkey
> and p_name like '%green%'
> ) as profit
> group by
> nation,
> o_year
> order by
> nation,
> o_year desc;
>
>
> -----Original Message-----
> From: Josh Berkus [mailto:josh(at)agliodbs(dot)com]
> Sent: Wednesday, October 01, 2003 11:42 AM
> To: Oleg Lebedev; scott.marlowe
> Cc: pgsql-performance(at)postgresql(dot)org
> Subject: Re: [PERFORM] TPC-R benchmarks
>
>
> Oleg,
>
> > The output of the query should contain about 200 rows. So, I guess the
>
> > planer is off assuming that the query should return 1 row.
>
> Oh, also did you post the query before? Can you re-post it with the
> planner
> results?
>
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2003-10-01 22:19:43 Re: TPC-R benchmarks
Previous Message Oleg Lebedev 2003-10-01 21:02:43 Re: TPC-R benchmarks