Re: looking for a faster way to do that

From: hamann(dot)w(at)t-online(dot)de
To: pgsql-general(at)postgresql(dot)org
Subject: Re: looking for a faster way to do that
Date: 2011-09-26 16:18:07
Message-ID: wolfgang-1110926181807.A0430472@amadeus3.local
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


Eduardo Morras <nec556(at)retena(dot)com> wrote:

>>
>> At 08:04 25/09/2011, hamann(dot)w(at)t-online(dot)de wrote:
>>
>> > >> select * from items where regexp_matches(items.code,'(ABC) (DE1)
>> > >> (any_substring)')<>{};
>> > >>
>> >
>> >Hi Eduardo,
>> >
>> >it is clear that scanning the table once with a list of matches will
>> >outperform
>> >rescanning the table for every string wanted. Now, my problem is
>> >that the patterns are
>> >dynamic as well. So if I could translate a table with one
>> >column and a few thousand rows
>> >into something like
>> >regexp_matches(code,'string1|string2|.....string2781')
>> >would ideally be a performant query. Unfortunately I have no idea
>> >how I could achieve this
>> >transformation inside the database. Doing it externally fails,
>> >because any single query cannot
>> >be more than so few characters.
>>
>> You can create a plsql function and pass a setof text that do it.
>> Sorry but instead saying you What Must You Type, WMYT(c), i prefer
>> the How Should You Do way, HSYD(c). Note that you can get the same
>> results using other approachs (f.ex. using FTS described in chapter 12)
>>
>> Check this topics:
>>
>> Function
>> Creation http://www.postgresql.org/docs/9.0/static/sql-createfunction.html
>>
>> Tutorial about Function
>> Creation http://www.adderpit.com/practical-postgresql/x10374.htm
>>

Hi,

I tried the pl/sql approach to convert the contents of that patterns table into a regex.
Results: 40 seconds runtime for 9500 candidates and 815 patterns
718 seconds for the same set of 9500 candidates, but using 4000 patterns instead.
So it seems that I am reaching limits of pattern match

As for the fulltext index (and the underlying tsquery): this is an exact match rather than prefix
match, so I would need to know match patterns in advance in order to build the index

I am thinking about that anyway (because ABC1234 likely should not match ABC123 pattern
in my context), but I would sort of prefer a system where I can state the rules when I
see the data set, rather than having to pre-create an index.

Thanks for the tutorial link :)
It seems that the responses on my post give all sorts of input that will help me on other
tasks

Regards
Wolfgang Hamann

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Edson Carlos Ericksson Richter 2011-09-26 16:54:56 RES: Does postgresql 9.0.4 use index on PREFIX%SUFFIX queries?
Previous Message hamann.w 2011-09-26 16:18:05 Re: looking for a faster way to do that