From: | Misa Simic <misa(dot)simic(at)gmail(dot)com> |
---|---|
To: | "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Slow concurrent processing |
Date: | 2013-03-12 03:55:26 |
Message-ID: | CAH3i69mZ-e1rQshbimu24GVKHV5pKx-oPHaELqnZYo6Ed_y5TA@mail.gmail.com |
Lists: | pgsql-performance |
Hi all,
We have one table with a list of "records for processing"...
We loop through that table and call one long-running function:
do_the_math_for_record(record_id)
which selects related rows for the input record_id from different tables,
does some calculations, and inserts the results into two tables...
We have also made one function, process_all_records(), which simply does:
SELECT do_the_math_for_record(record_id) FROM
records_for_processing
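Spelled out as a complete function, a minimal sketch of that driver could look like this (assuming do_the_math_for_record returns void):

```sql
CREATE OR REPLACE FUNCTION process_all_records() RETURNS void AS $$
BEGIN
    -- calls do_the_math_for_record once per row, sequentially,
    -- all inside the single transaction of this function call
    PERFORM do_the_math_for_record(record_id)
    FROM records_for_processing;
END;
$$ LANGUAGE plpgsql;
```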
When we run that function, it takes about 4 minutes...
There are about 300 rows in records_for_processing... We have logged the
time at the beginning of do_the_math and the time at the end of do_the_math,
and noticed that processing each row takes between 0.5 and 2 seconds...
so our do_the_math looks like:
PERFORM log_time(record_id, clock_timestamp(), 1);
PERFORM do_the_math_and_save_results(record_id);
PERFORM log_time(record_id, clock_timestamp(), 2);
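As a self-contained PL/pgSQL function, that wrapper would look roughly like this (the parameter name is illustrative):

```sql
CREATE OR REPLACE FUNCTION do_the_math_for_record(p_record_id integer)
RETURNS void AS $$
BEGIN
    PERFORM log_time(p_record_id, clock_timestamp(), 1);  -- start timestamp
    PERFORM do_the_math_and_save_results(p_record_id);    -- the real work
    PERFORM log_time(p_record_id, clock_timestamp(), 2);  -- end timestamp
END;
$$ LANGUAGE plpgsql;
```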
Then we thought: if we take all "records for processing" and process each
in a separate connection, it should finish faster...
but we got a worse result! (using 30 concurrent connections...) - about 7
minutes...
If we reduce the concurrent connections to 10, we get roughly the same
time as sequential processing...
But if we replace do_the_math_and_save_results with pg_sleep(1), to simulate
a long-running function so that processing each row takes 1 second:
Sequential processing takes the expected 300 seconds!
Concurrent processing finishes faster with a higher number of
concurrent connections - about 30 seconds with 30 connections! (much faster
- and as expected...)
However, when we put our real do_the_math_and_save_results back, we can't
get better results from concurrent processing...
With a higher number of concurrent connections the result is worse... We
have also noticed that for some records the difference between end_time and
start_time is even longer than 1 minute - but it is random, not always on
the same id... i.e. in one concurrent run a record takes 1 minute, in the
next run 1 second - but then some other record takes about 1 minute...
Any idea why? :)
This suggests to me that there is a lock somewhere on some tables, so our
concurrent connections wait for the others to finish... but I can't figure
out what or why...
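One way to check that suspicion is to look at pg_locks while the concurrent run is in progress. A sketch (column names assume PostgreSQL 9.2; on 9.1 they are procpid and current_query instead of pid and query):

```sql
-- show sessions that are waiting on an ungranted lock,
-- together with the relation and lock mode they are waiting for
SELECT a.pid,
       a.query,
       l.relation::regclass AS relation,
       l.mode
FROM pg_stat_activity a
JOIN pg_locks l ON l.pid = a.pid
WHERE NOT l.granted;
```

If this returns no rows during the slow concurrent run, the problem is contention for something other than table locks.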
do_the_math_and_save_results selects data from about 10 other tables,
calculates something, and inserts the results into other tables...
There are about 3 tracking tables (record_id plus other data...) and
about 7 settings tables that we join to the tracking tables to get all the
info... then we do the math with that info and insert the results...
We don't do any updates... (so there is no possibility of two connections
wanting to update the same row in the same table)
The data from the tracking tables should be separate sets of rows for two
different record_ids...
(the joined rows from the settings tables could be common to two sets for
different record_ids)
but even if they are the same set - SELECTs should not lock the rows in
those tables...
There are places where we do:
INSERT INTO result_table (columns)
SELECT query (tracking and settings tables joined)
Is there a chance this takes a lock somewhere?
Can a query like this be run "concurrently"?
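Concretely, the pattern is of this shape (table and column names here are made up for illustration):

```sql
-- one INSERT ... SELECT per record; the INSERT takes only a
-- ROW EXCLUSIVE lock on result_table, and the SELECT side takes
-- ACCESS SHARE locks on the joined tables - neither lock mode
-- conflicts with another session running the same statement
INSERT INTO result_table (record_id, computed_value)
SELECT t.record_id,
       t.amount * s.factor
FROM tracking_table t
JOIN settings_table s ON s.setting_id = t.setting_id
WHERE t.record_id = 42;
```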
Many thanks,
Misa