Quick Links

A plan returned by explain doesn't make sense to me

From:	"Nick Fankhauser" <nickf(at)ontko(dot)com>
To:	"pgsql-admin" <pgsql-admin(at)postgresql(dot)org>
Subject:	A plan returned by explain doesn't make sense to me
Date:	2002-04-05 13:48:58
Message-ID:	NEBBLAAHGLEEPCGOBHDGIEFNELAA.nickf@ontko.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-admin

Hi-

I'm trying to squeeze better performance out of a query, but the explain
returns a plan that really puzzles me.

A little background is in order- the table that I think is central to my
problem is what I call an "assignment table" I'm working with court data, so
I have people and cases. A person can be involved in 0 or more cases and a
case can involve 1 or more persons. To represent this, I have the
actor_case_assignment table, which is essentially three fields- a primary
key, an actor_id (person), and a case_id. I have four indexes: One for each
id, and then a *unique* index on (actor_id, case_id) because although either
id on it's own may be duplicated many times in the assignment table, no
person should ever be assigned to a case more than once,

The problem is that I have a view using this table that runs way too slow.
It looks like this:

create view criminal_detail as
select case_data.case_id,
case_data.court_id,
actor_case_assignment.actor_id,
charge.charge_count,
charge.charge_number,
charge.charge_type,
charge.charge_version,
charge.charge_class,
charge.charge_text,
charge.charge_qualifier,
charge.charge_code,
charge.charge_desc,
charge.plea_code,
charge.plea_desc,
charge.charge_date,
charge.statute_reference,
charge.charge_status_desc,
disposition_number,
cr_disposition_date,
cr_disposition_desc,
cr_disposition_code,
cr_disposition_type,
cr_disposition_mode
from (charge left outer join criminal_disposition on
(charge.charge_id = criminal_disposition.charge_id)
),
actor_case_assignment,
case_data
where actor_case_assignment.actor_case_assignment_id =
charge.actor_case_assignment_id
and actor_case_assignment.case_id = case_data.case_id
;

The explain looks like this:

develop=# explain SELECT * FROM CRIMINAL_DETAIL WHERE case_id =
'102SC01353';
NOTICE: QUERY PLAN:

Hash Join (cost=155.06..9263.84 rows=1 width=320)
-> Hash Join (cost=120.53..6034.05 rows=79880 width=260)
-> Seq Scan on charge (cost=0.00..2664.80 rows=79880 width=184)
-> Hash (cost=109.82..109.82 rows=4282 width=76)
-> Seq Scan on criminal_disposition (cost=0.00..109.82
rows=4282 width=76)
-> Hash (cost=34.53..34.53 rows=4 width=60)
-> Nested Loop (cost=0.00..34.53 rows=4 width=60)
-> Index Scan using case_data_case_id on case_data
(cost=0.00..4.01 rows=1 width=24)
-> Index Scan using actor_case_assignment_both on
actor_case_assignment (cost=0.00..30.42 rows=7 width=36)

The part that looks wrong to me is that on the last line of the explain, it
is using the index "actor_case_assignment_both" (the index on
actor_id,case_id) even though actor_id is never referenced in this query and
and index on case_id alone is available.

Looking from the outside in, this query should be very fast- an index scan
on case_id reduces the number of matches in actor_case_assignment to about
5, and then then joins these 5 rows to other tables based on indexed primary
(unique) keys. The question is, how can I re-arrange my query to help the
planner come to the same conclusion, and most puzzling, why does the planner
choose an index that involves actor_id?

Many thanks to those of you who read through all of this! Any suggestions?

-Nick

--------------------------------------------------------------------------
Nick Fankhauser nickf(at)ontko(dot)com Phone 1.765.935.4283 Fax 1.765.962.9788
Ray Ontko & Co. Software Consulting Services http://www.ontko.com/

Responses

Re: A plan returned by explain doesn't make sense to me at 2002-04-05 15:53:42 from Tom Lane

Browse pgsql-admin by date

	From	Date	Subject
Next Message	Nick Fankhauser	2002-04-05 14:33:18	error in my last post
Previous Message	postgresql	2002-04-05 11:43:26	Re: connections