From: Ivan Sergio Borgonovo <mail(at)webthatworks(dot)it>
To: PGSQL <pgsql-general(at)postgresql(dot)org>
Subject: functional index not used, looping simpler query just faster
Date: 2008-07-10 09:40:40
Message-ID: 20080710114040.33f8c07c@dawn.webthatworks.it
Lists: pgsql-general
I have this:
CREATE TABLE catalog_brands
(
brandid serial NOT NULL,
"name" character varying(64) NOT NULL,
delivery smallint NOT NULL DEFAULT (24 * 15),
deliverymessage character varying(64),
brandtypeid integer,
brandgroupid integer,
CONSTRAINT catalog_brands_pkey PRIMARY KEY (brandid),
CONSTRAINT catalog_brands_brandgroupid_fkey FOREIGN KEY (brandgroupid)
REFERENCES catalog_brandgroup (brandgroupid) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE SET NULL,
CONSTRAINT catalog_brands_brandtypeid_fkey FOREIGN KEY (brandtypeid)
REFERENCES catalog_brandtype (brandtypeid) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE SET NULL
);
CREATE INDEX catalog_brands_name_index
ON catalog_brands
USING btree
(upper(name::text));
CREATE TABLE catalog_items
(
itemid bigint NOT NULL,
brands character varying(64) NOT NULL,
name character varying(256) NOT NULL,
/* snip */
datainserimento timestamp without time zone,
datapub timestamp without time zone,
CONSTRAINT catalog_items_statusid_fkey FOREIGN KEY (statusid)
REFERENCES catalog_item_status (statusid) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION
);
CREATE INDEX catalog_items_brands_index
ON catalog_items
USING btree
(upper(brands::text));
CREATE UNIQUE INDEX catalog_items_itemsid_index
ON catalog_items
USING btree
(itemid);
ALTER TABLE catalog_items CLUSTER ON catalog_items_itemsid_index;
catalog_items contains ~650K records; catalog_brands contains 44.
Now I try this:
explain select i1.brands, i1.name, i1.dataPub, i1.datainserimento
from catalog_items i1
inner join catalog_brands b1 on upper(i1.brands)=upper(b1.name)
where i1.ItemID in (
  select i2.ItemID from catalog_items i2
  inner join catalog_brands b2 on upper(i2.brands)=upper(b2.name)
  where i1.brands=i2.brands
    and i2.dataPub>(now() - interval '8 month')
    and i2.datainserimento>(now() - interval '6 month')
  order by i2.datainserimento desc
  limit 3);
And I got this:
"Nested Loop (cost=0.00..6383568361.87 rows=74378 width=82)"
" -> Seq Scan on catalog_brands b1 (cost=0.00..1.44 rows=44 width=18)"
" -> Index Scan using catalog_items_brands_index on catalog_items i1 (cost=0.00..145081069.53 rows=1690 width=82)"
" Index Cond: (upper((i1.brands)::text) = upper(("outer".name)::text))"
" Filter: (subplan)"
" SubPlan"
" -> Limit (cost=42906.81..42906.82 rows=1 width=16)"
" -> Sort (cost=42906.81..42906.82 rows=1 width=16)"
" Sort Key: i2.datainserimento"
" -> Nested Loop (cost=0.00..42906.80 rows=1 width=16)"
" Join Filter: (upper(("outer".brands)::text) = upper(("inner".name)::text))"
" -> Seq Scan on catalog_items i2 (cost=0.00..42904.59 rows=1 width=34)"
" Filter: ((($0)::text = (brands)::text) AND (datapub > (now() - '8 mons'::interval)) AND (datainserimento > (now() - '6 mons'::interval)))"
" -> Seq Scan on catalog_brands b2 (cost=0.00..1.44 rows=44 width=18)"
I never waited long enough to see results from the above... several
minutes on a 2x Xeon with 4GB of RAM.
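Just to show what the subplan has to do for every outer row, this is
roughly the inner query on its own, with a literal standing in for the
correlated i1.brands (an illustration, not the exact plan node):

explain analyze
select i2.ItemID from catalog_items i2
inner join catalog_brands b2 on upper(i2.brands)=upper(b2.name)
where i2.brands='LARGEST GROUP'
  and i2.dataPub>(now() - interval '8 month')
  and i2.datainserimento>(now() - interval '6 month')
order by i2.datainserimento desc limit 3;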
A simpler
select name, brands from catalog_items
where upper(brands)=upper('LARGEST GROUP')
order by datainserimento desc limit 3;
finishes in a few seconds. Iterating over the 44 groups looks to be
much faster than the more complicated query.
"Limit (cost=9503.62..9503.63 rows=3 width=74)"
" -> Sort (cost=9503.62..9512.08 rows=3381 width=74)"
" Sort Key: datainserimento"
" -> Bitmap Heap Scan on catalog_items (cost=29.84..9305.44 rows=3381 width=74)"
" Recheck Cond: (upper((brands)::text) = 'CAMBRIDGE UNIVERSITY PRESS'::text)"
" -> Bitmap Index Scan on catalog_items_brands_index (cost=0.00..29.84 rows=3381 width=0)"
" Index Cond: (upper((brands)::text) = 'CAMBRIDGE UNIVERSITY PRESS'::text)"
Even
select count(*), i1.brands from catalog_items i1
inner join catalog_brands b1 on upper(b1.name)=upper(i1.brands)
group by i1.brands order by count(*)
takes from a few seconds to less than 1 second.
I could actually loop inside plpgsql, but... well, I'd like to
understand how things work.
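For reference, the kind of loop I have in mind is roughly this sketch
(untested; the function and type names are just placeholders and the
column types are guessed from the rest of the schema):

CREATE TYPE brand_latest_items AS (
  brands character varying(64),
  name character varying(256),
  datapub timestamp without time zone,
  datainserimento timestamp without time zone
);

CREATE OR REPLACE FUNCTION latest_items_per_brand()
RETURNS SETOF brand_latest_items AS $$
DECLARE
  b record;
  r brand_latest_items;
BEGIN
  -- one cheap indexed query per brand instead of one subplan per item row
  FOR b IN SELECT name FROM catalog_brands LOOP
    FOR r IN
      SELECT i.brands, i.name, i.datapub, i.datainserimento
      FROM catalog_items i
      WHERE upper(i.brands) = upper(b.name)
        AND i.datapub > (now() - interval '8 month')
        AND i.datainserimento > (now() - interval '6 month')
      ORDER BY i.datainserimento DESC
      LIMIT 3
    LOOP
      RETURN NEXT r;
    END LOOP;
  END LOOP;
  RETURN;
END;
$$ LANGUAGE plpgsql;

-- select * from latest_items_per_brand();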
--
Ivan Sergio Borgonovo
http://www.webthatworks.it