Performance in subquery

From: "Brian Haney" <brian(at)ibsystems(dot)com>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Performance in subquery
Date: 2000-04-17 19:52:19
Message-ID: 002601bfa8a6$70aa26e0$8101a8c0@specter.fresno.cybernaut.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I have multivendor product catalog and want to be able to search for
arbitrary text in a table of product attributes. Two of the several
tables are:

Table = products
+----------------------------------+----------------------------------
+-------+
| Field | Type
| Length|
+----------------------------------+----------------------------------
+-------+
| prod_id | int4 not null default nextval
| 4 |
| vendor_id | int4
| 4 |
| category_id | int4
| 4 |
| prod_name | text not null
| var |
| prod_description | text not null
| var |
| prod_price_low | numeric
| 30.6 |
| prod_price_high | numeric
| 30.6 |
| prod_promo | text
| var |
| prod_promo_url | text
| var |
| prod_datasht_url | text
| var |
| prod_buynow_url | text
| var |
| prod_asp_url | text
| var |
| prod_active | bool default 'false'
| 1 |
| last_modified | timestamp
| 4 |
| prod_demo_url | text
| var |
| prod_ephone | text
| var |
+----------------------------------+----------------------------------
+-------+
Index: products_pkey

Table = prod_attr
+----------------------------------+----------------------------------
+-------+
| Field | Type
| Length|
+----------------------------------+----------------------------------
+-------+
| attr_id | int4 not null
| 4 |
| prod_id | int4 not null
| 4 |
| prod_attr_text | text not null
| var |
+----------------------------------+----------------------------------
+-------+
Indices: prod_attr_pkey
prod_attr_text_idx

The products table has 538 records and the prod_attr table has 7870.

When I enter the query:

SELECT prod_id FROM prod_attr WHERE prod_attr_text LIKE '%Linux%';

it performs quite well to give me a list of the product IDs almost
instantaneously.

But when I query:

select prod_name from products where prod_id in ( SELECT prod_id FROM
prod_attr
WHERE prod_attr_text LIKE '%Linux%');

It takes over 30 seconds to get the results. Here are the EXPLAINs:

First query:

Seq Scan on prod_attr (cost=317.84 rows=1 width=4)

Second query:

Seq Scan on products (cost=31.75 rows=538 width=12)
SubPlan
-> Seq Scan on prod_attr (cost=317.84 rows=1 width=4)

As you can see, I have created and index for prod_attr
(prod_attr_text), but it has had no discernable effect.

The explains imply to me that the real cost of the second query is
scanning the 7870 records, but then I would expect the second query to
take only slightly longer than the first. I suspect something else is
causing the second query to be so doggone slow.

Any help would be greatly appreciated.

--
Brian Haney VP Engineering/CTO
brian(at)ibsystems(dot)com Internet Business Systems, Inc.

Browse pgsql-general by date

  From Date Subject
Next Message Hal Snyder 2000-04-17 20:41:24 Re: excell to postgres
Previous Message Jeffrey 2000-04-17 19:48:12 Re: excell to postgres