From: | Jorge Arevalo <jorgearevalo(at)libregis(dot)org> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Query optimization |
Date: | 2014-10-29 01:03:22 |
Message-ID: | CAMhtMNNTDmcTXZ8yjyxtaXvaYYaW04_tnqXRqajH=Kz1Dswmtw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Hello,
I'm trying to optimize a query that takes too much time. This is what I have
table1(field1, field2... field14): contains about 8.5 million rows
table2(f1, f2, f3): contains about 2.5 million rows
table3: is empty, and must be filled with data coming from table1 and table2
To fill table3, I'm using a query that looks like this:
WITH
records_to_insert(field1,field2,field3,field4,field5,field6_hstore,field7,field8_array)
AS
(SELECT value1,value2,value3,value4,value5, hstore(ARRAY['field9',
'field10', 'field11', 'field12', 'field13', 'field14'], ARRAY[field9,
field10, field11, field12, field13, field14) as metadata, value7, (select
array((select row(f1, f2) from table2 p where p.field7 = field7))) as
values_array FROM table1)
SELECT
fill_table3_function(field1,field2,field3,field4,field5,field6_hstore,field7,field8_array)
FROM records_to_insert
So, I first generate a common table records_to_insert, using data from
table1 and table2, and then call a function fill_table3_function, in order
to insert the values into table3 (I do more things apart from insert,
that's reason to call a function instead of just raising an insert query).
There are indexes created on all the columns that need them.
I'm having problems because the query takes a lot of time, and the server
returns a timeout error.
I think the bottleneck is the array built for the last argument of my
common table, and maybe the hstore too (not sure if it's a heavy process or
not). First thing I've tried is to limit the query to build the common
table, using LIMIT .. OFFSET after 'FROM table1', and make a loop to seek
table1 (the big one, with 8 million rows). But still getting timeout errors.
The problem is I don't have access to Postgres configuration, in order to
increase the timeout for user queries. And anyway, I don't think that
increasing the timeout is a real solution (It'll just make the server
suffer for more time).
So, is there anything obviously wrong with my query? Any changes to make it
faster?
Many thanks in advance, and best regards,
--
Jorge Arevalo
Freelance developer
From | Date | Subject | |
---|---|---|---|
Next Message | Romu Hu | 2014-10-29 01:44:23 | Re: [GENERAL] Need guidance on regression.diffs |
Previous Message | Thom Brown | 2014-10-28 21:47:34 | Re: How to find earlest possible start times for given duration excluding reservations |