Re: Regex performance issue

From: "Alexandru Coseru" <alexandru(dot)coseru(at)totaltelecom(dot)ro>
To: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Regex performance issue
Date: 2006-12-02 22:13:27
Message-ID: 022c01c7165f$17f49e80$3cb16956@alex
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hello..

I cannot use the first advice , because i'm not aware of the prefix length
in the database...
This is why i'm ordering after length(prefix)..

On the 2nd one , i'm not sure that i can follow you..

Regards
Alex
----- Original Message -----
From: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
To: "Alexandru Coseru" <alexandru(dot)coseru(at)totaltelecom(dot)ro>
Cc: "Dave Dutcher" <dave(at)tridecap(dot)com>; <pgsql-performance(at)postgresql(dot)org>
Sent: Sunday, December 03, 2006 12:04 AM
Subject: Re: [PERFORM] Regex performance issue

> Alexandru Coseru wrote:
>> I cannot use LIKE , because the order of the match is reversed.
>> The prefix column is containing telephone destinations.
>> IE: ^001 - US , ^0039 Italy , etc..
>
> Maybe you could create a functional index on substr(<minimum length of
> prefix>)? It might restrict the result set prior to applying the regex
> just enough to make the performance acceptable.
>
>> asterisk=> select * from destlist LIMIT 10;
>> id | id_ent | dir | prefix | country | network | tip
>> ----+--------+-----+------------+-------------+--------------------+-----
>> 1 | -1 | 0 | (^0093) | AFGHANISTAN | AFGHANISTAN | 6
>> 2 | -1 | 0 | (^00937) | AFGHANISTAN | AFGHANISTAN Mobile | 5
>> 3 | -1 | 0 | (^00355) | ALBANIA | ALBANIA | 6
>> 4 | -1 | 0 | (^0035538) | ALBANIA | ALBANIA Mobile | 5
>> 5 | -1 | 0 | (^0035568) | ALBANIA | ALBANIA Mobile | 5
>> 6 | -1 | 0 | (^0035569) | ALBANIA | ALBANIA Mobile | 5
>> 7 | -1 | 0 | (^00213) | ALGERIA | ALGERIA | 6
>> 8 | -1 | 0 | (^0021361) | ALGERIA | ALGERIA Mobile | 5
>> 9 | -1 | 0 | (^0021362) | ALGERIA | ALGERIA Mobile | 5
>> 10 | -1 | 0 | (^0021363) | ALGERIA | ALGERIA Mobile | 5
>>
>> Now , I have to match a dialednumber (let's say 00213618833) and find
>> it's destination...(It's algeria mobile).
>> I tried to make with a query of using LIKE , but i was not able to get
>> something..
>
> Another idea would be to add some extra rows so that you could use normal
> inequality searches. For example, let's take the Albanian rows:
>
> 3 | -1 | 0 | 00355
> 4 | -1 | 0 | 0035538
> * 3 | -1 | 0 | 0035539
> 5 | -1 | 0 | 0035568
> 6 | -1 | 0 | 0035569
> * 3 | -1 | 0 | 0035570
>
> Now you can do "SELECT * FROM destlist WHERE ? >= prefix ORDER BY prefix
> LIMIT 1".
>
> --
> Heikki Linnakangas
> EnterpriseDB http://www.enterprisedb.com
>
>
>
> --
> No virus found in this incoming message.
> Checked by AVG Free Edition.
> Version: 7.1.409 / Virus Database: 268.15.4/563 - Release Date: 12/2/2006
>
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Heikki Linnakangas 2006-12-02 22:35:18 Re: Regex performance issue
Previous Message Heikki Linnakangas 2006-12-02 22:04:06 Re: Regex performance issue