Re: efficient way to do "fuzzy" join

From: Andy Colson <andy(at)squeakycode(dot)net>
To: Rémi Cura <remi(dot)cura(at)gmail(dot)com>
Cc: PostgreSQL General <pgsql-general(at)postgresql(dot)org>
Subject: Re: efficient way to do "fuzzy" join
Date: 2014-04-12 13:04:23
Message-ID: 534939D7.4060606@squeakycode.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 04/12/2014 06:29 AM, Rémi Cura wrote:
> (please note that this random string function is NOT the good way to
> do it, i should random int then use it as index to an array
> containing all the letter)
>
> Thanks a lot for this new version! It seems to be slower than your
> first solution (no index use I guess, I gave up after 5 minutes vs 5
> sec for the previous). Morevover, I canno't make assumption about a
> fixed interval (2 sec in your example). But I think I see where you
> are going.
>
>
> After some test, the fastest is using BETWEEN and range. (it is way
> faster than using the <@, strangely)
>
> Here is the code :

Ah, sorry about that. I got pulled away to work on work stuff. I was trying to figure out how to use an index on the range query, but not sure, without adding a new column if it would even work.

I've never had the need for ranges yet, this is the first time I've gotten to play with them.

I would not have thought about between like that, good call. I'd have never guess it would be so fast.

If you can't use the fixed interval, then ranges are out.

I was thinking this could be improved:

select t,
(select t from a where a.t >= b.t order by a.t limit 1) as mint,
(select t from a where a.t < b.t order by a.t desc limit 1) as maxt
from b

It does two selects into a to find the nearest. Given this:

create table a(t float);

insert into a values (1), (5), (6);

could you write a single query to find the number nearest 3.5? If so we might cut the work by 50%.

-Andy

PS: This list prefers you don't top post.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Jan Wieck 2014-04-12 18:27:36 Re: Cancelling of autovacuums considered harmful
Previous Message Rémi Cura 2014-04-12 11:29:50 Re: efficient way to do "fuzzy" join