Fast insert, but slow join and updates for table with 4 billion rows

From: Lars Aksel Opsahl <Lars(dot)Opsahl(at)nibio(dot)no>
To: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Fast insert, but slow join and updates for table with 4 billion rows
Date: 2016-10-24 08:11:48
Message-ID: ecea7a5ae84345d2bbe12bb9e3f1edf6@nibio.no
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


Hi

I have two main problems and that is slow updates and joins, but when I build up the table met_vaer_wisline.nora_bc25_observation with more than 4 billion we are able to insert about 85.000 rows pr sekund so thats ok.

The problems start when I need to update or joins with other tables using this table.

In this example I have two tables one with 4 billion rows and another with 50000 rows and then I try to do a standard simple join between this two tables and this takes 397391 ms. with this SQL (the query plan is added is further down)

SELECT o.*

FROM

met_vaer_wisline.nora_bc25_observation o,

met_vaer_wisline.new_data n

WHERE o.point_uid_ref = n.id_point AND o.epoch = n.epoch

but if I use this SQL it takes 25727 ms (the query plan is added is further down).

SELECT

o.*

FROM

(

SELECT o.*

FROM

met_vaer_wisline.nora_bc25_observation o

WHERE

EXISTS (SELECT 1 FROM (SELECT distinct epoch FROM met_vaer_wisline.new_data) AS n WHERE n.epoch = o.epoch )

AND

EXISTS (SELECT 1 FROM (SELECT distinct id_point FROM met_vaer_wisline.new_data) AS n WHERE n.id_point = o.point_uid_ref )

) AS o,

met_vaer_wisline.new_data n

WHERE o.point_uid_ref = n.id_point AND o.epoch = n.epoch

The columns are indexed and I did run vacuum analyze on both tables before I tested. work_mem is 200MB but I also tested with much more work_mem but that does not change the execution time.

The CPU goes to 100% when the query is running and there is no IOWait while the SQL is running.

Why is the second SQL 15 times faster ?

Is this normal or have I done something wrong here ?

I have tested clustering around a index but that did not help.

Is the only way to fix slow updates and joins to use partitioning ?

https://www.postgresql.org/docs/9.6/static/ddl-partitioning.html

Here are the SQL and more info

EXPLAIN analyze

SELECT o.*

FROM

met_vaer_wisline.nora_bc25_observation o,

met_vaer_wisline.new_data n

WHERE o.point_uid_ref = n.id_point AND o.epoch = n.epoch

-[ RECORD 1 ]-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

QUERY PLAN | Merge Join (cost=0.87..34374722.51 rows=52579 width=16) (actual time=0.127..397379.844 rows=50000 loops=1)

-[ RECORD 2 ]-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

QUERY PLAN | Merge Cond: (n.id_point = o.point_uid_ref)

-[ RECORD 3 ]-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

QUERY PLAN | Join Filter: (o.epoch = n.epoch)

-[ RECORD 4 ]-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

QUERY PLAN | Rows Removed by Join Filter: 2179150000

-[ RECORD 5 ]-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

QUERY PLAN | -> Index Scan using idx_met_vaer_wisline_new_data_id_point on new_data n (cost=0.29..23802.89 rows=50000 width=8) (actual time=0.024..16.736 rows=50000 loops=1)

-[ RECORD 6 ]-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

QUERY PLAN | -> Index Scan using idx_met_vaer_wisline_nora_bc25_observation_point_uid_ref on nora_bc25_observation o (cost=0.58..2927642364.25 rows=4263866624 width=16) (actual time=0.016..210486.136 rows=2179200001 loops=1)

-[ RECORD 7 ]-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

QUERY PLAN | Total runtime: 397383.663 ms

Time: 397391.388 ms

EXPLAIN analyze

SELECT

o.*

FROM

(

SELECT o.*

FROM

met_vaer_wisline.nora_bc25_observation o

WHERE

EXISTS (SELECT 1 FROM (SELECT distinct epoch FROM met_vaer_wisline.new_data) AS n WHERE n.epoch = o.epoch )

AND

EXISTS (SELECT 1 FROM (SELECT distinct id_point FROM met_vaer_wisline.new_data) AS n WHERE n.id_point = o.point_uid_ref )

) AS o,

met_vaer_wisline.new_data n

WHERE o.point_uid_ref = n.id_point AND o.epoch = n.epoch

-[ RECORD 1 ]----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

QUERY PLAN | Hash Semi Join (cost=1019.70..1039762.81 rows=54862 width=16) (actual time=359.284..25717.838 rows=50096 loops=1)

-[ RECORD 2 ]----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

QUERY PLAN | Hash Cond: (o.point_uid_ref = new_data_1.id_point)

-[ RECORD 3 ]----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

QUERY PLAN | -> Nested Loop (cost=0.87..972602.28 rows=24964326 width=16) (actual time=0.287..24412.088 rows=24262088 loops=1)

-[ RECORD 4 ]----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

QUERY PLAN | -> Unique (cost=0.29..1014.29 rows=248 width=4) (actual time=0.117..6.849 rows=248 loops=1)

-[ RECORD 5 ]----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

QUERY PLAN | -> Index Only Scan using idx_met_vaer_wisline_new_data_epoch on new_data (cost=0.29..889.29 rows=50000 width=4) (actual time=0.115..4.521 rows=50000 loops=1)

-[ RECORD 6 ]----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

QUERY PLAN | Heap Fetches: 0

-[ RECORD 7 ]----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

QUERY PLAN | -> Index Scan using idx_met_vaer_wisline_nora_bc25_observation_epoch on nora_bc25_observation o (cost=0.58..2911.05 rows=100663 width=16) (actual time=0.014..89.512 rows=97831 loops=248)

-[ RECORD 8 ]----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

QUERY PLAN | Index Cond: (epoch = new_data.epoch)

-[ RECORD 9 ]----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

QUERY PLAN | -> Hash (cost=1016.31..1016.31 rows=202 width=4) (actual time=16.636..16.636 rows=202 loops=1)

-[ RECORD 10 ]---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

QUERY PLAN | Buckets: 1024 Batches: 1 Memory Usage: 8kB

-[ RECORD 11 ]---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

QUERY PLAN | -> Unique (cost=0.29..1014.29 rows=202 width=4) (actual time=0.046..16.544 rows=202 loops=1)

-[ RECORD 12 ]---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

QUERY PLAN | -> Index Only Scan using idx_met_vaer_wisline_new_data_id_point on new_data new_data_1 (cost=0.29..889.29 rows=50000 width=4) (actual time=0.046..11.315 rows=50000 loops=1)

-[ RECORD 13 ]---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

QUERY PLAN | Heap Fetches: 0

-[ RECORD 14 ]---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

QUERY PLAN | Total runtime: 25719.120 ms

Time: 25727.097 ms

select version();

version

--------------------------------------------------------------------------------------------------------------

PostgreSQL 9.3.9 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-9), 64-bit

(1 row)

\d met_vaer_wisline.nora_bc25_observation;

Table "met_vaer_wisline.nora_bc25_observation"

Column | Type | Modifiers

--------------------+---------+-----------

point_uid_ref | integer | not null

epoch | integer | not null

windspeed_10m | real |

air_temperature_2m | real |

Indexes:

"idx_met_vaer_wisline_nora_bc25_observation_epoch" btree (epoch)

"idx_met_vaer_wisline_nora_bc25_observation_point_uid_ref" btree (point_uid_ref)

\d met_vaer_wisline.new_data ;

Unlogged table "met_vaer_wisline.new_data"

Column | Type | Modifiers

--------------------+-------------------+-----------

windspeed_10m | real |

air_temperature_2m | real |

lon | character varying | not null

lat | character varying | not null

epoch | integer |

epoch_as_numeric | numeric | not null

rest | character varying |

id_point | integer |

Indexes:

"idx_met_vaer_wisline_new_data_epoch" btree (epoch)

"idx_met_vaer_wisline_new_data_id_point" btree (id_point)

vacuum analyze met_vaer_wisline.nora_bc25_observation;

vacuum analyze met_vaer_wisline.new_data;

SELECT count(*) from met_vaer_wisline.new_data;

count

-------

50000

(1 row)

SELECT count(*) from met_vaer_wisline.nora_bc25_observation ;

count

------------

4263866304

Thanks .

Lars

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2016-10-24 12:52:08 Re: Fast insert, but slow join and updates for table with 4 billion rows
Previous Message negora 2016-10-19 17:07:13 Re: Performance of a nested loop, whose inner loop uses an index scan.