Performance on Bulk Insert to Partitioned Table

From: Charles Gomes <charlesrg(at)outlook(dot)com>
To: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Performance on Bulk Insert to Partitioned Table
Date: 2012-12-20 17:29:19
Message-ID: BLU002-W2777115565E1CBCF867857AB370@phx.gbl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hello guys

 

I’m doing 1.2 Billion inserts into a table partitioned in
15.

 

When I target the MASTER table on all the inserts and let
the trigger decide what partition to choose from it takes 4 hours.

If I target the partitioned table directly during the
insert I can get 4 times better performance. It takes 1 hour.

 

 

I’m trying to get more performance while still using the
trigger to choose the table, so partitions can be changed without changing the
application that inserts the data.

 

What I noticed that iostat is not showing an I/O bottle
neck.

iostat –xN 1

Device:        
rrqm/s   wrqm/s     r/s    
w/s   rsec/s   wsec/s avgrq-sz avgqu-sz  
await  svctm  %util

Pgresql--data    
0.00     0.00    0.00
8288.00     0.00 66304.00    
8.00    60.92    7.35  
0.01   4.30

 

iostat –m 1

Device:           
tps    MB_read/s    MB_wrtn/s   
MB_read    MB_wrtn

dm-2          
4096.00        
0.00       
16.00         
0         16

 

I also don’t see a CPU bottleneck or context switching
bottle neck.

 

Postgresql does not seem to write more than 16MB/s or 4K
transactions per second unless I target each individual partition.

 

Did anybody have done some studies on partitioning bulk
insert performance?

 

Any suggestions on a way to accelerate it ?

 

 

Running pgsql 9.2.2 on RHEL 6.3

 

My trigger is pretty straight forward:

CREATE OR REPLACE FUNCTION quotes_insert_trigger()

RETURNS trigger AS $$

DECLARE

tablename varchar(24);

bdate varchar(10);

edate varchar(10);

BEGIN

tablename = 'quotes_' ||
to_char(new.received_time,'YYYY_MM_DD');

EXECUTE 'INSERT INTO '|| tablename ||' VALUES (($1).*)'
USING NEW ;

RETURN NULL;

END;

$$

LANGUAGE plpgsql;

 

CREATE TRIGGER quotes_insert_trigger

BEFORE INSERT ON quotes

FOR EACH ROW EXECUTE PROCEDURE quotes_insert_trigger();

 

 

Thanks

Charles

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Scott Marlowe 2012-12-20 17:39:25 Re: Performance on Bulk Insert to Partitioned Table
Previous Message Tom Lane 2012-12-20 15:49:03 Re: Why does the query planner use two full indexes, when a dedicated partial index exists?