dbt3 data with 10GB scale factor

From: Mark Wong <markw(at)osdl(dot)org>
To: pgsql-hackers(at)postgresql(dot)org, testperf-general(at)pgfoundry(dot)org
Cc: maryedie(at)osdl(dot)org
Subject: dbt3 data with 10GB scale factor
Date: 2005-08-02 21:22:43
Message-ID: 200508022122.j72LMDjA022577@smtp.osdl.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I've started scaling dbt3 up to the 10GB scale factor against CVS and
the fast COPY patch:

http://www.testing.osdl.org/projects/dbt3testing/results/dev4-010/53/

I'm sure there are some better database parameters I should use so
please let me know what to try. ;) What I've found interesting is the
difference in the time it takes for Q9 to run in the power test than the
throughput test:

Power Test 00:56:27
Throughput Stream 1 00:38:13
Throughput Stream 2 00:41:33
Throughput Stream 3 00:20:16
Throughput Stream 4 00:18:11

Diffing the query plans between the Power Test and the individual
streams in the Throughput Test, I only see a Materialize and Seq Scan on
the nation table versus just a Seq Scan on the nation table between two
of the plans. But this doesn't appear to account for the execution time
difference as the query with the same plan executes just as fast during
the Throughput Test. Here are the plans in full:

EXPLAIN ANALYZE in the Power Test for Q9:

EXPLAIN ANALYZE select nation, o_year, sum(amount) as sum_profit from ( select n_name as nation, extract(year from o_orderdate) as o_year, l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount from part, supplier, lineitem, partsupp, orders, nation where s_suppkey = l_suppkey and ps_suppkey = l_suppkey and ps_partkey = l_partkey and p_partkey = l_partkey and o_orderkey = l_orderkey and s_nationkey = n_nationkey and p_name like '%light%' ) as profit group by nation, o_year order by nation, o_year desc;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=182648.08..182648.22 rows=5 width=49) (actual time=3375680.779..3387407.186 rows=175 loops=1)
-> Sort (cost=182648.08..182648.09 rows=5 width=49) (actual time=3375639.390..3379116.547 rows=3254907 loops=1)
Sort Key: nation.n_name, date_part('year'::text, (orders.o_orderdate)::timestamp without time zone)
-> Nested Loop (cost=1.27..182648.02 rows=5 width=49) (actual time=43.412..3276365.695 rows=3254907 loops=1)
-> Nested Loop (cost=1.27..182632.87 rows=5 width=49) (actual time=14.798..2510769.669 rows=3254907 loops=1)
Join Filter: ("outer".s_nationkey = "inner".n_nationkey)
-> Nested Loop (cost=0.00..182628.78 rows=5 width=24) (actual time=14.685..2415773.175 rows=3254907 loops=1)
-> Nested Loop (cost=0.00..182613.65 rows=5 width=28) (actual time=14.654..2377308.988 rows=3254907 loops=1)
-> Nested Loop (cost=0.00..105392.76 rows=12804 width=16) (actual time=0.071..10991.718 rows=433752 loops=1)
-> Seq Scan on part (cost=0.00..73877.55 rows=3200 width=4) (actual time=0.023..3698.415 rows=108438 loops=1)
Filter: ((p_name)::text ~~ '%light%'::text)
-> Index Scan using i_ps_partkey on partsupp (cost=0.00..8.00 rows=148 width=12) (actual time=0.045..0.059 rows=4 loops=108438)
Index Cond: ("outer".p_partkey = partsupp.ps_partkey)
-> Index Scan using i_l_suppkey_partkey on lineitem (cost=0.00..6.02 rows=1 width=24) (actual time=0.891..5.441 rows=8 loops=433752)
Index Cond: (("outer".ps_partkey = lineitem.l_partkey) AND ("outer".ps_suppkey = lineitem.l_suppkey))
-> Index Scan using pk_supplier on supplier (cost=0.00..3.01 rows=1 width=8) (actual time=0.008..0.009 rows=1 loops=3254907)
Index Cond: (supplier.s_suppkey = "outer".l_suppkey)
-> Materialize (cost=1.27..1.52 rows=25 width=33) (actual time=0.000..0.011 rows=25 loops=3254907)
-> Seq Scan on nation (cost=0.00..1.25 rows=25 width=33) (actual time=0.005..0.047 rows=25 loops=1)
-> Index Scan using pk_orders on orders (cost=0.00..3.01 rows=1 width=8) (actual time=0.229..0.230 rows=1 loops=3254907)
Index Cond: (orders.o_orderkey = "outer".l_orderkey)
Total runtime: 3387491.286 ms
(22 rows)

Throughput Stream 1 Q9:

EXPLAIN ANALYZE select nation, o_year, sum(amount) as sum_profit from ( select n_name as nation, extract(year from o_orderdate) as o_year, l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount from part, supplier, lineitem, partsupp, orders, nation where s_suppkey = l_suppkey and ps_suppkey = l_suppkey and ps_partkey = l_partkey and p_partkey = l_partkey and o_orderkey = l_orderkey and s_nationkey = n_nationkey and p_name like '%ivory%' ) as profit group by nation, o_year order by nation, o_year desc;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=182648.08..182648.22 rows=5 width=49) (actual time=2280062.422..2293138.084 rows=175 loops=1)
-> Sort (cost=182648.08..182648.09 rows=5 width=49) (actual time=2280017.683..2283991.830 rows=3401050 loops=1)
Sort Key: nation.n_name, date_part('year'::text, (orders.o_orderdate)::timestamp without time zone)
-> Nested Loop (cost=1.27..182648.02 rows=5 width=49) (actual time=7.639..2171994.940 rows=3401050 loops=1)
-> Nested Loop (cost=1.27..182632.87 rows=5 width=49) (actual time=7.572..1383633.029 rows=3401050 loops=1)
Join Filter: ("outer".s_nationkey = "inner".n_nationkey)
-> Nested Loop (cost=0.00..182628.78 rows=5 width=24) (actual time=7.450..1266801.542 rows=3401050 loops=1)
-> Nested Loop (cost=0.00..182613.65 rows=5 width=28) (actual time=7.391..1219339.594 rows=3401050 loops=1)
-> Nested Loop (cost=0.00..105392.76 rows=12804 width=16) (actual time=7.262..18789.674 rows=451992 loops=1)
-> Seq Scan on part (cost=0.00..73877.55 rows=3200 width=4) (actual time=0.026..7964.747 rows=112998 loops=1)
Filter: ((p_name)::text ~~ '%ivory%'::text)
-> Index Scan using i_ps_partkey on partsupp (cost=0.00..8.00 rows=148 width=12) (actual time=0.075..0.086 rows=4 loops=112998)
Index Cond: ("outer".p_partkey = partsupp.ps_partkey)
-> Index Scan using i_l_suppkey_partkey on lineitem (cost=0.00..6.02 rows=1 width=24) (actual time=0.564..2.641 rows=8 loops=451992)
Index Cond: (("outer".ps_partkey = lineitem.l_partkey) AND ("outer".ps_suppkey = lineitem.l_suppkey))
-> Index Scan using pk_supplier on supplier (cost=0.00..3.01 rows=1 width=8) (actual time=0.009..0.010 rows=1 loops=3401050)
Index Cond: (supplier.s_suppkey = "outer".l_suppkey)
-> Materialize (cost=1.27..1.52 rows=25 width=33) (actual time=0.001..0.014 rows=25 loops=3401050)
-> Seq Scan on nation (cost=0.00..1.25 rows=25 width=33) (actual time=0.017..0.062 rows=25 loops=1)
-> Index Scan using pk_orders on orders (cost=0.00..3.01 rows=1 width=8) (actual time=0.225..0.226 rows=1 loops=3401050)
Index Cond: (orders.o_orderkey = "outer".l_orderkey)
Total runtime: 2293355.531 ms
(22 rows)

Throughput Stream 2 Q9:

EXPLAIN ANALYZE select nation, o_year, sum(amount) as sum_profit from ( select n_name as nation, extract(year from o_orderdate) as o_year, l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount from part, supplier, lineitem, partsupp, orders, nation where s_suppkey = l_suppkey and ps_suppkey = l_suppkey and ps_partkey = l_partkey and p_partkey = l_partkey and o_orderkey = l_orderkey and s_nationkey = n_nationkey and p_name like '%goldenrod%' ) as profit group by nation, o_year order by nation, o_year desc;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=74055.04..74055.07 rows=1 width=49) (actual time=2476208.097..2493189.788 rows=175 loops=1)
-> Sort (cost=74055.04..74055.05 rows=1 width=49) (actual time=2476149.653..2482032.976 rows=3538739 loops=1)
Sort Key: nation.n_name, date_part('year'::text, (orders.o_orderdate)::timestamp without time zone)
-> Nested Loop (cost=0.00..74055.03 rows=1 width=49) (actual time=50.669..2362896.628 rows=3538739 loops=1)
-> Nested Loop (cost=0.00..74052.00 rows=1 width=49) (actual time=36.915..1110819.931 rows=3538739 loops=1)
Join Filter: ("outer".s_nationkey = "inner".n_nationkey)
-> Nested Loop (cost=0.00..74050.44 rows=1 width=24) (actual time=36.534..827220.019 rows=3538739 loops=1)
-> Nested Loop (cost=0.00..74047.41 rows=1 width=28) (actual time=31.045..739895.872 rows=3538739 loops=1)
-> Nested Loop (cost=0.00..73926.79 rows=20 width=16) (actual time=15.880..25654.447 rows=471212 loops=1)
-> Seq Scan on part (cost=0.00..73877.55 rows=5 width=4) (actual time=0.008..6843.928 rows=117803 loops=1)
Filter: ((p_name)::text ~~ '%goldenrod%'::text)
-> Index Scan using i_ps_partkey on partsupp (cost=0.00..8.00 rows=148 width=12) (actual time=0.133..0.149 rows=4 loops=117803)
Index Cond: ("outer".p_partkey = partsupp.ps_partkey)
-> Index Scan using i_l_suppkey_partkey on lineitem (cost=0.00..6.02 rows=1 width=24) (actual time=0.293..1.498 rows=8 loops=471212)
Index Cond: (("outer".ps_partkey = lineitem.l_partkey) AND ("outer".ps_suppkey = lineitem.l_suppkey))
-> Index Scan using pk_supplier on supplier (cost=0.00..3.01 rows=1 width=8) (actual time=0.019..0.021 rows=1 loops=3538739)
Index Cond: (supplier.s_suppkey = "outer".l_suppkey)
-> Seq Scan on nation (cost=0.00..1.25 rows=25 width=33) (actual time=0.005..0.050 rows=25 loops=3538739)
-> Index Scan using pk_orders on orders (cost=0.00..3.01 rows=1 width=8) (actual time=0.346..0.347 rows=1 loops=3538739)
Index Cond: (orders.o_orderkey = "outer".l_orderkey)
Total runtime: 2493372.155 ms
(21 rows)

Throughput Stream 3 Q9:

EXPLAIN ANALYZE select nation, o_year, sum(amount) as sum_profit from ( select n_name as nation, extract(year from o_orderdate) as o_year, l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount from part, supplier, lineitem, partsupp, orders, nation where s_suppkey = l_suppkey and ps_suppkey = l_suppkey and ps_partkey = l_partkey and p_partkey = l_partkey and o_orderkey = l_orderkey and s_nationkey = n_nationkey and p_name like '%firebrick%' ) as profit group by nation, o_year order by nation, o_year desc;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=74055.04..74055.07 rows=1 width=49) (actual time=1202351.569..1216140.153 rows=175 loops=1)
-> Sort (cost=74055.04..74055.05 rows=1 width=49) (actual time=1202303.507..1206470.228 rows=3692230 loops=1)
Sort Key: nation.n_name, date_part('year'::text, (orders.o_orderdate)::timestamp without time zone)
-> Nested Loop (cost=0.00..74055.03 rows=1 width=49) (actual time=8.594..1086270.124 rows=3692230 loops=1)
-> Nested Loop (cost=0.00..74052.00 rows=1 width=49) (actual time=4.970..772224.811 rows=3692230 loops=1)
Join Filter: ("outer".s_nationkey = "inner".n_nationkey)
-> Nested Loop (cost=0.00..74050.44 rows=1 width=24) (actual time=4.915..476254.721 rows=3692230 loops=1)
-> Nested Loop (cost=0.00..74047.41 rows=1 width=28) (actual time=4.873..388805.585 rows=3692230 loops=1)
-> Nested Loop (cost=0.00..73926.79 rows=20 width=16) (actual time=3.056..13289.660 rows=490268 loops=1)
-> Seq Scan on part (cost=0.00..73877.55 rows=5 width=4) (actual time=0.026..3358.087 rows=122567 loops=1)
Filter: ((p_name)::text ~~ '%firebrick%'::text)
-> Index Scan using i_ps_partkey on partsupp (cost=0.00..8.00 rows=148 width=12) (actual time=0.060..0.071 rows=4 loops=122567)
Index Cond: ("outer".p_partkey = partsupp.ps_partkey)
-> Index Scan using i_l_suppkey_partkey on lineitem (cost=0.00..6.02 rows=1 width=24) (actual time=0.130..0.749 rows=8 loops=490268)
Index Cond: (("outer".ps_partkey = lineitem.l_partkey) AND ("outer".ps_suppkey = lineitem.l_suppkey))
-> Index Scan using pk_supplier on supplier (cost=0.00..3.01 rows=1 width=8) (actual time=0.018..0.020 rows=1 loops=3692230)
Index Cond: (supplier.s_suppkey = "outer".l_suppkey)
-> Seq Scan on nation (cost=0.00..1.25 rows=25 width=33) (actual time=0.005..0.050 rows=25 loops=3692230)
-> Index Scan using pk_orders on orders (cost=0.00..3.01 rows=1 width=8) (actual time=0.077..0.079 rows=1 loops=3692230)
Index Cond: (orders.o_orderkey = "outer".l_orderkey)
Total runtime: 1216237.044 ms
(21 rows)

Throughput Stream 4 Q9:

EXPLAIN ANALYZE select nation, o_year, sum(amount) as sum_profit from ( select n_name as nation, extract(year from o_orderdate) as o_year, l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity as amount from part, supplier, lineitem, partsupp, orders, nation where s_suppkey = l_suppkey and ps_suppkey = l_suppkey and ps_partkey = l_partkey and p_partkey = l_partkey and o_orderkey = l_orderkey and s_nationkey = n_nationkey and p_name like '%cyan%' ) as profit group by nation, o_year order by nation, o_year desc;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=617707.12..617707.84 rows=26 width=49) (actual time=1076603.089..1091550.720 rows=175 loops=1)
-> Sort (cost=617707.12..617707.19 rows=26 width=49) (actual time=1076549.073..1081134.008 rows=3877975 loops=1)
Sort Key: nation.n_name, date_part('year'::text, (orders.o_orderdate)::timestamp without time zone)
-> Nested Loop (cost=1.27..617706.51 rows=26 width=49) (actual time=25.395..954764.745 rows=3877975 loops=1)
-> Nested Loop (cost=1.27..617627.70 rows=26 width=49) (actual time=17.100..615326.489 rows=3877975 loops=1)
Join Filter: ("outer".s_nationkey = "inner".n_nationkey)
-> Nested Loop (cost=0.00..617611.80 rows=26 width=24) (actual time=17.076..448739.290 rows=3877975 loops=1)
-> Nested Loop (cost=0.00..617530.10 rows=27 width=28) (actual time=17.029..355106.680 rows=3877975 loops=1)
-> Nested Loop (cost=0.00..231443.74 rows=64017 width=16) (actual time=10.339..19179.080 rows=515500 loops=1)
-> Seq Scan on part (cost=0.00..73877.55 rows=15999 width=4) (actual time=0.033..3776.024 rows=128875 loops=1)
Filter: ((p_name)::text ~~ '%cyan%'::text)
-> Index Scan using i_ps_partkey on partsupp (cost=0.00..8.00 rows=148 width=12) (actual time=0.095..0.109 rows=4 loops=128875)
Index Cond: ("outer".p_partkey = partsupp.ps_partkey)
-> Index Scan using i_l_suppkey_partkey on lineitem (cost=0.00..6.02 rows=1 width=24) (actual time=0.142..0.635 rows=8 loops=515500)
Index Cond: (("outer".ps_partkey = lineitem.l_partkey) AND ("outer".ps_suppkey = lineitem.l_suppkey))
-> Index Scan using pk_supplier on supplier (cost=0.00..3.01 rows=1 width=8) (actual time=0.019..0.020 rows=1 loops=3877975)
Index Cond: (supplier.s_suppkey = "outer".l_suppkey)
-> Materialize (cost=1.27..1.52 rows=25 width=33) (actual time=0.001..0.018 rows=25 loops=3877975)
-> Seq Scan on nation (cost=0.00..1.25 rows=25 width=33) (actual time=0.005..0.038 rows=25 loops=1)
-> Index Scan using pk_orders on orders (cost=0.00..3.01 rows=1 width=8) (actual time=0.080..0.081 rows=1 loops=3877975)
Index Cond: (orders.o_orderkey = "outer".l_orderkey)
Total runtime: 1091647.945 ms
(22 rows)

--
Mark

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2005-08-02 21:30:54 Re: [8.0.3] Not dumping all sequences ...
Previous Message Tom Lane 2005-08-02 20:34:54 Re: bgwriter, inherited temp tables TODO items?