Re: PostgreSQL Query Speed Issues

From: Joseph Pravato <joseph(at)pluto(dot)nomagic(dot)com>
To: Douglas J Hunley <doug(dot)hunley(at)gmail(dot)com>
Cc: Joseph Pravato <joseph(dot)pravato(at)nomagic(dot)com>, pgsql-novice(at)postgresql(dot)org
Subject: Re: PostgreSQL Query Speed Issues
Date: 2013-02-21 22:51:22
Message-ID: 5126A4EA.6000407@pluto.nomagic.com
Lists: pgsql-novice

On 2/21/2013 3:00 PM, Douglas J Hunley wrote:
> On Thu, Feb 21, 2013 at 3:56 PM, Joseph Pravato
> <joseph(dot)pravato(at)nomagic(dot)com> wrote:
>> NOTE: All database tests were done without changing or updating any settings after install.
> The defaults are sub-optimal. You really do need to tune them to the
> server in question.
Do you have any recommendations optimized for a database with
600,000 user records and tables of up to 2,000,000 rows?
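To make the question concrete, we assume "tuning" means settings along these lines; the values below are illustrative guesses, not measurements from our server:

```
# postgresql.conf -- illustrative starting points only, not tested values;
# appropriate numbers depend on available RAM and concurrent connections
shared_buffers = 2GB            # often suggested as roughly 25% of RAM
effective_cache_size = 6GB      # planner hint: OS cache + shared_buffers
work_mem = 50MB                 # per sort/hash node, per session
maintenance_work_mem = 512MB    # used by VACUUM, ANALYZE, CREATE INDEX
```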

>
>> # rows in contact: 574939
>> # rows in contact_address_map: 574924
> After loading this data, did you run an ANALYZE?
No, we did not. Why would that be useful?
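If we understand the suggestion, it would be something like the following (table names from our schema):

```sql
-- Collect planner statistics so row estimates are accurate
ANALYZE contact;
ANALYZE contact_address_map;
```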

>
>> select c.*
>> from contact c
>> left join CONTACT_ADDRESS_MAP cam on cam.CONTACT_ID = c.id
>> where cam.ADDRESS_ID is null
> Table definitions would probably help. You might be missing indexes.
Table definitions are at http://pastebin.com/GyCsYpBn. See the index
comment below.

>
>> Result from an 'explain analyze':
>>
>> QUERY PLAN
>> Merge Left Join (cost=83512.87..158059.12 rows=1 width=952) (actual time=5224.171..5224.270 rows=15 loops=1)
>> Merge Cond: (c.id = cam.contact_id)
>> Filter: (cam.address_id IS NULL)
>> Rows Removed by Filter: 574924
>> -> Index Scan using contact_pkey on contact c (cost=0.00..63048.48 rows=574917 width=952) (actual time=0.009..852.708 rows=574939 loops=1)
>> -> Materialize (cost=83512.59..86387.17 rows=574917 width=16) (actual time=1407.387..3107.246 rows=574924 loops=1)
>> -> Sort (cost=83512.59..84949.88 rows=574917 width=16) (actual time=1407.379..2027.946 rows=574924 loops=1)
>> Sort Key: cam.contact_id
>> Sort Method: external sort Disk: 14616kB
> disk sorts imply work_mem isn't big enough.
I was surprised that the default work_mem was so low. We changed it from
1MB to 50MB; here is the 'explain analyze' for:

select * from contact where id not in (select contact_id from
contact_address_map)

QUERY PLAN
Seq Scan on contact  (cost=10294.55..31256.01 rows=287458 width=952) (actual time=1555.473..1582.885 rows=16 loops=1)
  Filter: (NOT (hashed SubPlan 1))
  Rows Removed by Filter: 574924
  SubPlan 1
    ->  Seq Scan on contact_address_map  (cost=0.00..8857.24 rows=574924 width=8) (actual time=0.013..586.107 rows=574924 loops=1)
Total runtime: 1597.773 ms
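For completeness, a third formulation we have not yet timed is NOT EXISTS, which the planner can execute as an anti-join:

```sql
-- Untested alternative to the NOT IN / LEFT JOIN versions above
select c.*
from contact c
where not exists (
    select 1
    from contact_address_map cam
    where cam.contact_id = c.id
);
```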

Increasing work_mem did not improve the speed of an unrelated query; it
still took 26 minutes to return. We had set work_mem to 1000MB for that
test. Tomorrow we will try to produce a simplified version of the query
that does not expose company data.
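For reference, a per-session change (no server restart needed) would be:

```sql
SET work_mem = '1000MB';  -- session-local; reverts on disconnect
```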

>
>> -> Seq Scan on contact_address_map cam (cost=0.00..8857.17 rows=574917 width=16) (actual time=0.018..578.348 rows=574924 loops=1)
> This should probably be using an index
We added indexes to both columns of contact_address_map, but they made
no difference in speed for either query.
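The statements we ran were along these lines (index names are ours):

```sql
CREATE INDEX contact_address_map_contact_id_idx
    ON contact_address_map (contact_id);
CREATE INDEX contact_address_map_address_id_idx
    ON contact_address_map (address_id);
```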

Any additional assistance is appreciated.
