efficient way to do "fuzzy" join

From:	Rémi Cura <remi(dot)cura(at)gmail(dot)com>
To:	PostgreSQL General <pgsql-general(at)postgresql(dot)org>
Subject:	efficient way to do "fuzzy" join
Date:	2014-04-11 12:50:34
Message-ID:	CAJvUf_sUFAMdsPRPYRT2WNxrFqt0Bs=xYS-pvy=EAdOFyVg3fw@mail.gmail.com
Lists:	pgsql-general

Hey dear List,

I'm looking for some advice about the best way to perform a "fuzzy" join,
that is, joining two tables based on approximate matching.

It is about temporal matching:
given a table A with rows containing data and a control_time (for instance
1; 5; 6; ... sec, not necessarily rounded or evenly spaced)

given another table B with rows that have no precise timing (e.g. control_time =
2.3; 5.8; 6.2)

How to join every row of B to A based on
min(@(A.control_time-B.control_time))
(that is, for every row of B, get the row of A that is temporally the
closest),
in an efficient way?
(to be explicit, 2.3 would match to 1, 5.8 to 6, 6.2 to 6)
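
To make the expected result concrete, here is a minimal sketch of the matching
outside the database, in Python. It assumes the A times are already sorted and
non-empty; the function name nearest_match is only illustrative. It is not a
proposed SQL solution, only a statement of the intended semantics: a binary
search finds the two neighbours of each B time, and the closer one wins.

```python
from bisect import bisect_left

def nearest_match(a_times, b_times):
    """For each time in b_times, return the closest time in a_times.

    Assumes a_times is sorted in ascending order and is not empty.
    """
    matches = []
    for t in b_times:
        # Index of the first A time that is >= t.
        i = bisect_left(a_times, t)
        # Only the neighbour just before and just after t can be the closest.
        candidates = a_times[max(i - 1, 0):i + 1]
        matches.append(min(candidates, key=lambda a: abs(a - t)))
    return matches

a_times = [1, 5, 6]
b_times = [2.3, 5.8, 6.2]
for b, a in zip(b_times, nearest_match(a_times, b_times)):
    print(b, "->", a)
```

With the sample values above this pairs 2.3 with 1, 5.8 with 6 and 6.2 with 6.
Each lookup costs O(log n) on sorted data, which is the kind of behaviour I
would hope to get from an index on control_time.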

Optionally, how to get interpolation efficiently (meaning one has to get
the previous time and next time for 1st order interpolation, 2 before and
2 after for 2nd order interpolation, and so on)?
(to be explicit 5.8 would match to 5 and 6, the weight being 0.2 and 0.8
respectively)
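
The 1st order case can be sketched the same way, again in Python and again
only to pin down the semantics. The function name linear_weights is
illustrative, and clamping to the first or last A time when the B time falls
outside the range of A is an assumption on my part, not a requirement.

```python
from bisect import bisect_right

def linear_weights(a_times, t):
    """Return [(a_time, weight), ...] for 1st order interpolation at t.

    Assumes a_times is sorted in ascending order and is not empty.
    Outside the range of a_times the nearest end point gets weight 1.0
    (an assumed clamping policy).
    """
    # Index of the first A time that is strictly greater than t.
    i = bisect_right(a_times, t)
    if i == 0:
        return [(a_times[0], 1.0)]
    if i == len(a_times):
        return [(a_times[-1], 1.0)]
    prev_t, next_t = a_times[i - 1], a_times[i]
    # The closer neighbour gets the larger weight; the weights sum to 1.
    w_next = (t - prev_t) / (next_t - prev_t)
    return [(prev_t, 1.0 - w_next), (next_t, w_next)]

for a_time, weight in linear_weights([1, 5, 6], 5.8):
    print(a_time, round(weight, 3))
```

For 5.8 this gives the neighbours 5 and 6 with weights of roughly 0.2 and 0.8,
up to floating-point rounding. Higher orders would need more neighbours on
each side, which this sketch does not cover.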

Currently my data is spatial, so I use PostGIS functions to interpolate a
point on a line, but it is far from efficient or general, and I don't have
control over the interpolation (only the spatial values are interpolated).

Cheers,
Rémi-C

Responses

Re: efficient way to do "fuzzy" join at 2014-04-11 15:09:57 from Andy Colson
Re: efficient way to do "fuzzy" join at 2014-04-11 17:16:12 from Andy Colson
