Re: PostgreSQL OR performance

From: Віталій Тимчишин <tivv00(at)gmail(dot)com>
To:
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: PostgreSQL OR performance
Date: 2008-11-07 09:14:01
Message-ID: 331e40660811070114rd70cf0dy194020a9f9ea8802@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

>
>
> Yes, the query should output exactly same result as in "Union" plan. I will
> run "slow" explain analyze now and will repost after it will complete
> (tomorrow?).
> BTW: I'd say planner should think rows estimated as sum of "ORs" estimation
> minus intersection, but no more then sum or ORs (if intersection is 0). For
> first condition it has rows=525975, for second it has rows=2403 (with other
> plans, of course), so it's strange it has such a high estimation.... It's
> exactly 50% of full cartesian join of merge, so it does think that every
> second pair would succeed, that is not true.
>
>
I am sorry, I've emptied atom_match table, so one part produce 0 result, but
anyway here is explain:

"Merge Join (cost=518771.07..62884559.80 rows=1386158171 width=32) (actual
time=30292.802..755751.242 rows=34749 loops=1)"
" Merge Cond: (production.run.id = (production.company.run_id)::bigint)"
" Join Filter: (((production.company.name)::text =
(production.company.name)::text)
OR (hashed subplan))"
" -> Sort (cost=45474.92..45606.54 rows=52648 width=38) (actual
time=562.928..595.128 rows=15507 loops=1)"
" Sort Key: production.run.id"
" Sort Method: external sort Disk: 880kB"
" -> Nested Loop (cost=1184.82..39904.24 rows=52648 width=38)
(actual time=90.571..530.925 rows=15507 loops=1)"
" -> HashAggregate (cost=1.55..1.56 rows=1 width=8) (actual
time=3.077..3.078 rows=1 loops=1)"
" -> Seq Scan on run (cost=0.00..1.55 rows=1 width=8)
(actual time=3.066..3.068 rows=1 loops=1)"
" Filter: ((name)::text = 'test'::text)"
" -> Nested Loop (cost=1183.27..39376.19 rows=52648 width=30)
(actual time=87.489..484.605 rows=15507 loops=1)"
" -> HashAggregate (cost=1.55..1.56 rows=1 width=8)
(actual time=0.016..0.019 rows=1 loops=1)"
" -> Seq Scan on run (cost=0.00..1.55 rows=1
width=8) (actual time=0.009..0.011 rows=1 loops=1)"
" Filter: ((name)::text = 'test'::text)"
" -> Bitmap Heap Scan on company
(cost=1181.72..38592.03 rows=62608 width=30) (actual time=87.465..441.014
rows=15507 loops=1)"
" Recheck Cond:
((production.company.run_id)::bigint = production.run.id)"
" Filter: ((production.company.status)::text =
'unprocessed'::text)"
" -> Bitmap Index Scan on comp_run
(cost=0.00..1166.07 rows=62608 width=0) (actual time=65.828..65.828
rows=15507 loops=1)"
" Index Cond:
((production.company.run_id)::bigint = production.run.id)"
" -> Materialize (cost=469981.13..498937.42 rows=2316503 width=30)
(actual time=15915.639..391938.338 rows=242752539 loops=1)"
" -> Sort (cost=469981.13..475772.39 rows=2316503 width=30) (actual
time=15915.599..19920.912 rows=2316503 loops=1)"
" Sort Key: production.company.run_id"
" Sort Method: external merge Disk: 104896kB"
" -> Seq Scan on company (cost=0.00..58808.03 rows=2316503
width=30) (actual time=22.244..7476.954 rows=2316503 loops=1)"
" SubPlan"
" -> Nested Loop (cost=2267.65..3314.94 rows=22 width=1038) (actual
time=0.009..0.009 rows=0 loops=1)"
" -> Hash Join (cost=2267.65..3141.36 rows=22 width=523) (actual
time=0.006..0.006 rows=0 loops=1)"
" Hash Cond: ((atom_match.atom1_id)::integer = s1.id)"
" -> Seq Scan on atom_match (cost=0.00..30.38 rows=1630
width=8) (actual time=0.002..0.002 rows=0 loops=1)"
" Filter: ((match_function_id)::integer = 2)"
" -> Hash (cost=1292.04..1292.04 rows=12209 width=523)
(never executed)"
" -> Index Scan using atomstr_typ on atoms_string s1
(cost=0.00..1292.04 rows=12209 width=523) (never executed)"
" Index Cond: ((atom_type_id)::integer = (-1))"
" -> Index Scan using pk_atoms_string on atoms_string s2
(cost=0.00..7.88 rows=1 width=523) (never executed)"
" Index Cond: (s2.id = (atom_match.atom2_id)::integer)"
"Total runtime: 755802.686 ms"

P.S. May be I've chosen wrong list and my Q better belongs to -hackers?

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message David Wilson 2008-11-07 10:07:32 Re: PostgreSQL OR performance
Previous Message David Rees 2008-11-07 02:21:59 Re: Create and drop temp table in 8.3.4