From: | Ian Barwick <barwick(at)gmx(dot)net> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | TEXT column and indexing |
Date: | 2003-11-19 09:18:18 |
Message-ID: | 200311191018.18492.barwick@gmx.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
I have this table:
db=> \d object_property_value
Table "db.object_property_value"
Column | Type | Modifiers
-----------------------+------------------------+--------------------
obj_property_value_id | integer | not null default nextval(...
obj_property_id | integer | not null
value | text |
Indexes:
"object_property_value_pkey" primary key, btree (obj_property_value_id)
"opv_obj_property_id_ix" btree (obj_property_id)
"opv_v_ix" btree (substr(value, 1, 128))
Foreign-key constraints:
"object_property_fkey" FOREIGN KEY (obj_property_id)
REFERENCES object_property(obj_property_id)
ON UPDATE CASCADE ON DELETE CASCADE
(long lines edited for readability).
The table contains about 250,000 records and will grow at regular intervals.
The 'value' column contains text of various lengths. The table is VACUUMed
and ANALYZEd regularly and waxed on Sunday mornings. Database encoding is
Unicode. Server is 7.4RC1 or 7.4RC2 and will be 7.4 ASAP.
I want to query this table to match a specific value along
the lines of:
SELECT obj_property_id
FROM object_property_value opv
WHERE opv.value = 'foo'
There will only be a few (at the moment 2 or 3) rows exactly matching
'foo'. This query will only be performed with values containing less
than around 100 characters, which account for ca. 10% of all rows in the
table.
The performance is of course lousy:
db=> EXPLAIN
db-> SELECT obj_property_id
db-> FROM object_property_value opv
db-> WHERE opv.value = 'foo';
QUERY PLAN
-----------------------------------------------------------------------------
Seq Scan on object_property_value opv (cost=0.00..12258.26 rows=2 width=4)
Filter: (value = 'foo'::text)
(2 rows)
However, if I create a VARCHAR field containing the first 128 characters of
the text field and index that, an index scan is used:
db=> EXPLAIN
db-> SELECT obj_property_id
db-> FROM object_property_value opv
db-> WHERE opv.opv_vc = 'foo';
QUERY PLAN
-------------------------------------------------------------------------------------------
Index Scan using opv_vc_ix on object_property_value opv (cost=0.00..6.84
rows=2 width=4)
Index Cond: ((opv_vc)::text = 'foo'::text)
The question is therefore: can I get an index to work on the TEXT column? It
is currently indexed with:
"opv_v_ix" btree (substr(value, 1, 128))
which doesn't appear to have any effect. I am probably missing something
obvious though. I can live with maintaining an extra VARCHAR column but
would like to keep the table as simple as possible.
(For anyone wondering: yes, I can access the data using tsearch2 - via
a different table in this case - but this is not always appropriate).
Thanks for any hints.
Ian Barwick
barwick(at)gmx(dot)net
From | Date | Subject | |
---|---|---|---|
Next Message | Laurent Martelli | 2003-11-19 09:52:43 | Re: Join on incompatible types |
Previous Message | Shridhar Daithankar | 2003-11-19 06:54:04 | Re: Join on incompatible types |