SQL performance help: self join or static var

From: Steve Midgley <public(at)misuse(dot)org>
To: pgsql-sql(at)postgresql(dot)org
Subject: SQL performance help: self join or static var
Date: 2007-09-18 00:42:27
Message-ID: 20070918004246.87CBC9F99EC@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Hi,

Given a table (truncated some real fields for simplicity):

CREATE TABLE city
(
id serial NOT NULL,
muni_city_id integer,
post_code_city_id integer,
alias_city_id integer,
city_type character varying(15),
post_code_type character varying(15),
CONSTRAINT city_pkey PRIMARY KEY (id)
)
WITHOUT OIDS;
CREATE INDEX index_city_on_muni_city_id
ON city
USING btree
(muni_city_id);
CREATE INDEX index_city_on_post_code_type
ON city
USING btree
(post_code_type);

Filled with ~168,000 records

Which of the following SQL statements should I expect better
performance on?

select * from city
where post_code_type in ('P', 'R')

EXPLAIN ANALYZE:
"Seq Scan on city (cost=0.00..4492.82 rows=76172 width=290) (actual
time=0.039..163.564 rows=30358 loops=1)"
" Filter: ((post_code_type)::text = ANY (('{P,R}'::character
varying[])::text[]))"
"Total runtime: 231.947 ms"

OR

select * from city
where id = muni_city_id

EXPLAIN ANALYZE:
"Seq Scan on city (cost=0.00..3535.41 rows=383 width=290) (actual
time=0.022..124.463 rows=30200 loops=1)"
" Filter: (muni_city_id = id)"
"Total runtime: 195.342 ms"

In my case both statements are semantically equivalent and I'm trying
to figure out if I should prefer the search of a varchar field for
static values over the "self join" constraint to an indexed integer
column?

My (almost completely ignorant) eyes say that the latter
(id=muni_city_id) is faster by a little - any voices of support or
other ideas in this regard? Caveats?

Thanks,

Steve

Browse pgsql-sql by date

  From Date Subject
Next Message Sabin Coanda 2007-09-18 10:55:08 error dropping operator
Previous Message Andrew Sullivan 2007-09-17 14:43:24 Re: Extracting hostname from URI column