Re: grep -f keyword data query

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Hiroyuki Sato <hiroysato(at)gmail(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andreas Kretschmer <andreas(at)a-kretschmer(dot)de>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: grep -f keyword data query
Date: 2015-12-30 01:15:37
Message-ID: CAKJS1f8=Tr2Xo=Azvc9fMUbyxK9w8dxaW4E=9fHPYLBTXUNNmA@mail.gmail.com
Lists: pgsql-general

On 30 December 2015 at 13:56, Hiroyuki Sato <hiroysato(at)gmail(dot)com> wrote:

> On Wed, 30 Dec 2015 at 6:04, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> wrote:
>
>> On 30 December 2015 at 04:21, Hiroyuki Sato <hiroysato(at)gmail(dot)com> wrote:
>>
>>> On Tue, 29 Dec 2015 at 4:35, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
>>>
>>>>
>>>> But, the planner refuses to use this index for your query anyway,
>>>> because it can't see that the patterns are all left-anchored.
>>>>
>>>> Really, your best bet is refactor your url data so it is stored with a
>>>> url_prefix and url_suffix column. Then you can do exact matching
>>>> rather than pattern matching.
>>>>
>>> I see, exact matching is faster than pattern matching.
>>> But I need pattern matching in the path part
>>> (i.e., http://www.yahoo.com/a/b/c/... )
>>> I would like to pattern match the '/a/b/c' part.
>>>
>>
>> If your pattern matching is as simple as that, then why not split the
>> /a/b/c/ part out as mentioned by Jeff? Alternatively you could just write a
>> function which splits that out for you and returns it, then index that
>> function, and then just include a call to that function in the join
>> condition matching with the equality operator. That'll allow hash and merge
>> joins to be possible again.
>>
>
> Could you tell me more about the "Alternatively" part?
>
> It is a good idea to split host and path.
> I'll try it.
>
> My matching patterns are the following:
> 1. http://www.yahoo.com/a/b/% (host equal, path like)
> 2. http://%.yahoo.com/a/b/% (host and path like)
>

It seems I misunderstood your pattern matching. The example you supplied
earlier indicated you only needed to match the document part (/a/b/c/) and
could ignore the protocol://host part. In that case you could have written
a function which takes a text parameter, say "http://www.yahoo.com/a/b/c/",
and returns "/a/b/c", and then run: create index on yourtable
(thatfunction(yourcolumn)); However, that method won't help you, as it seems
your pattern matching is more complex than the example you supplied
earlier.
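
For reference, a rough sketch of what that simpler case could look like.
This only covers the path-only matching described above, not the host
patterns with wildcards. The function body and the "keywords" table with
its "path" column are assumptions made up for illustration; "yourtable",
"yourcolumn" and "thatfunction" are the placeholder names used above.

```sql
-- Hypothetical helper: drops "protocol://host" and returns the path
-- without a trailing slash. It must be IMMUTABLE to be usable in an index.
CREATE FUNCTION thatfunction(url text) RETURNS text AS $$
    -- substring(... from pattern) returns the text matched by the
    -- parenthesized part of the regular expression.
    SELECT rtrim(substring(url from '^[a-z]+://[^/]+(/.*)$'), '/');
$$ LANGUAGE sql IMMUTABLE STRICT;

-- Expression index on the extracted path.
CREATE INDEX ON yourtable (thatfunction(yourcolumn));

-- Join with the equality operator, so hash and merge joins are possible.
-- "keywords" and "path" are assumed names for the table of search terms.
SELECT t.yourcolumn
FROM yourtable t
JOIN keywords k ON thatfunction(t.yourcolumn) = k.path;
```

With this definition, thatfunction('http://www.yahoo.com/a/b/c/') would
return '/a/b/c', and a URL with no path at all would yield NULL.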

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
