[PERFORMANCE] Performance index and table

From: Oscar Camuendo <oscar(dot)dny86(at)hotmail(dot)com>
To: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: [PERFORMANCE] Performance index and table
Date: 2016-07-22 13:34:55
Message-ID: BN3PR20MB04990FF1BE4ACCBC75924C71F00A0@BN3PR20MB0499.namprd20.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I'm working on Postgresql 9.5.3 and executed a query which takes 5 or 7 seconds and it should not take more than 0.30 milliseconds, the query is:

-----------QUERY--------------------------------------------------------------------------------------

with recursive t(level,parent_id,id) as (
select 0,parent_id,id from parties where parent_id = 105
union
select t.level + 1,c.parent_id,c.id from parties c join t on c.parent_id = t.id
)
select distinct id from t order by id;

--------------------------------------------------------------------------------------------------------------

The parties table has 245512 rows and one index named "index_parties_on_parent_id" , so I added an EXPLAIN ANALYZE VERBOSE to get more details and it was the result:

--------RESULT--------------------------------------------------------------------------------------

Sort (cost=21237260.78..21237261.28 rows=200 width=4) (actual time=6850.338..6850.343 rows=88 loops=1)
Output: t.id
Sort Key: t.id
Sort Method: quicksort Memory: 29kB
CTE t
-> Recursive Union (cost=0.43..20562814.38 rows=29974967 width=12) (actual time=0.072..6850.180 rows=88 loops=1)
-> Index Scan using index_parties_on_parent_id on public.parties (cost=0.43..3091.24 rows=807 width=8) (actual time=0.064..0.154 rows=23 loops=1)
Output: 0, parties.parent_id, parties.id
Index Cond: (parties.parent_id = 105)
-> Hash Join (cost=777279.14..1996022.38 rows=2997416 width=12) (actual time=2245.623..2283.290 rows=22 loops=3)
Output: (t_1.level + 1), c.parent_id, c.id
Hash Cond: (t_1.id = c.parent_id)
-> WorkTable Scan on t t_1 (cost=0.00..161.40 rows=8070 width=8) (actual time=0.002..0.009 rows=29 loops=3)
Output: t_1.level, t_1.id
-> Hash (cost=606642.73..606642.73 rows=10400673 width=8) (actual time=2206.149..2206.149 rows=1742 loops=3)
Output: c.parent_id, c.id
Buckets: 2097152 Batches: 16 Memory Usage: 16388kB
-> Seq Scan on public.parties c (cost=0.00..606642.73 rows=10400673 width=8) (actual time=71.070..2190.318 rows=244249 loops=3)
Output: c.parent_id, c.id
-> HashAggregate (cost=674436.76..674438.76 rows=200 width=4) (actual time=6850.291..6850.305 rows=88 loops=1)
Output: t.id
Group Key: t.id
-> CTE Scan on t (cost=0.00..599499.34 rows=29974967 width=4) (actual time=0.075..6850.236 rows=88 loops=1)
Output: t.id
Planning time: 0.815 ms
Execution time: 7026.026 ms

----------------------------------------------------------------------------------------------------------------------

So, I could see that index_parties_on_parent_id showed 10400673 rows and checking index_parties_on_parent_id index I get this information: num_rows = 10400673 and index_size = 310 MB

Could Anybody explain me why the difference between parties table = 245512 and index_parties_on_parent_id index = 10400673? and How could I improve this index and its response time?

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2016-07-22 14:29:30 Re: [PERFORMANCE] Performance index and table
Previous Message Robert Klemme 2016-07-22 08:37:20 Re: Seeing execution plan of foreign key constraint check?