From: | Chander Ganesan <chander(at)otg-nc(dot)com> |
---|---|
To: | badlydrawnbhoy <badlydrawnbhoy(at)gmail(dot)com> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: How to optimize query that concatenates strings? |
Date: | 2006-07-07 16:37:34 |
Message-ID: | 44AE8DCE.2070301@otg-nc.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
You could build a function-based index that contains the "simplified"
version of each URL (in your case, the field with the '/' stripped).
Then use the same function on the URL going in. In that case PostgreSQL
will use the index that you created already.
Take a look at the PostgreSQL documentation for function-based indexes.
select from ... where simplify(url) <> url_col;
In the example above 'url_col' would have a function-based index that
was based on 'simplify(url_col)'
Chander Ganesan
Open Technology Group, Inc.
One Copley Parkway, Suite 210
Morrisville, NC 27560
Phone: 877-258-8987/919-463-0999
badlydrawnbhoy wrote:
> Hi all,
>
> I've got a database of URLs, and when inserting new data into it I want
> to make sure that there are no functionally equivalent URLs already
> present. For example, 'umist.ac.uk' is functionally the same as
> 'umist.ac.uk/'.
>
> I find that searching for the latter form, using string concatentation
> to append the trailing slash, is much slower than searching for a
> simple string - the index on URL name isn't used to speed up the
> search.
>
> Here's an illustration
>
> url=# explain select exists(select * from url where url = 'umist.ac.uk'
> or url || '/' = 'umist.ac.uk') as present;
> QUERY PLAN
>
> -----------------------------------------------------------------------------------------------
> Result (cost=47664.01..47664.02 rows=1 width=0)
> InitPlan
> -> Seq Scan on url (cost=0.00..47664.01 rows=6532 width=38)
> Filter: ((url = 'umist.ac.uk'::text) OR ((url || '/'::text)
> = 'umist.ac.uk'::text))
> (4 rows)
>
> url=# explain select exists(select * from url where url =
> 'umist.ac.uk') as present;
> QUERY PLAN
> ----------------------------------------------------------------------------
> Result (cost=5.97..5.98 rows=1 width=0)
> InitPlan
> -> Index Scan using url_idx on url (cost=0.00..5.97 rows=1
> width=38)
> Index Cond: (url = 'umist.ac.uk'::text)
> (4 rows)
>
>
> Is there any way I can force postgres to use the index when using the
> string concatenation in the query?
>
> Thanks in advance,
>
> BBB
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster
>
From | Date | Subject | |
---|---|---|---|
Next Message | Parang Saraf | 2006-07-07 16:40:41 | How to insert .xls files into database |
Previous Message | Scott Marlowe | 2006-07-07 16:30:37 | Re: VACUUM and fsm_max_pages |