From: | Jorge Arevalo <jorgearevalo(at)libregis(dot)org> |
---|---|
To: | David G Johnston <david(dot)g(dot)johnston(at)gmail(dot)com> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Query optimization |
Date: | 2014-10-29 18:06:03 |
Message-ID: | CAMhtMNOoGMB3fSMdjjYf2ukq=DmhDp4eHsnM8H_cdfATCjTRyQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Hello David, many thanks for your responses,
Sorry for not providing the content of the fill_table3_function, but it
just executes 3 insert queries in 3 different tables. And I've checked the
time consuming operation is in this query (by the way, there was a little
mistake in the name of the fields of the inner select, I've corrected it)
SELECT value1,value2,value3,value4,
value5, hstore(ARRAY['field9', 'field10', 'field11', 'field12', 'field13',
'field14'], ARRAY[field9, field10, field11, field12, field13, field14) as
metadata, value7, (select array((select row(f1, f2) from table2 p where
p.f3 = field7))) as values_array FROM table1
This is the result of EXPLAIN ANALYZE
QUERY
PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using table1_pkey on table1 (cost=67846.38..395773.45
rows=8419127 width=88) (actual time=7122.704..22670.680 rows=8419127
loops=1)
InitPlan 2 (returns $1)
-> Result (cost=67846.29..67846.29 rows=1 width=0) (actual
time=7009.063..7009.065 rows=1 loops=1)
InitPlan 1 (returns $0)
-> Seq Scan on table2 p (cost=0.00..67846.29 rows=12689
width=20) (actual time=14.971..5069.840 rows=2537787 loops=1)
Filter: (f3 = field7)
So, there's a sequential scan over table2 (in the query to get
values_array), instead of a index scan. Could it be because the SELECT
returns more than approximately 5-10% of all rows in the table? (I've heard
that, under those conditions, a sequential scan is faster than index scan,
because the amount of I/O operations required for each row)
Anyway, if I understood well, I should try:
- Avoiding that inner query by using a JOIN instead
- Return a composite type instead of an array
Am I right? What kind of additional context information would you need?
Many thanks!!
--
Jorge
On Wed, Oct 29, 2014 at 2:54 AM, David G Johnston <
david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
> Jorge Arévalo-2 wrote
> > (SELECT value1,value2,value3,value4,value5, hstore(ARRAY['field9',
> > 'field10', 'field11', 'field12', 'field13', 'field14'], ARRAY[field9,
> > field10, field11, field12, field13, field14) as metadata, value7, (select
> > array((select row(f1, f2) from table2 p where p.field7 = field7))) as
> > values_array FROM table1)
>
> You might try seeing whether:
>
> FROM table1 JOIN (
> SELECT field7, array_agg(row(f1, f2)) AS values_array FROM table2 GROUP BY
> field7
> ) tbl2_agg USING (field7)
>
> helps...
>
> I'm also dubious (though this isn't necessarily a performance issue) of:
>
> array[...] AS metadata
>
> Without context I would say this would be better as a composite type
> instead
> of an array. You may find it useful to use named composite types elsewhere
> too...
>
> David J.
>
>
>
>
> --
> View this message in context:
> http://postgresql.1045698.n5.nabble.com/Query-optimization-tp5824739p5824746.html
> Sent from the PostgreSQL - general mailing list archive at Nabble.com.
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>
--
Jorge Arevalo
Freelance developer
From | Date | Subject | |
---|---|---|---|
Next Message | David Johnston | 2014-10-29 18:24:24 | Re: Query optimization |
Previous Message | Adrian Klaver | 2014-10-29 14:07:46 | Re: Appending new data to existing field of Json data type |