Quick Links

Re: select with "like" from another table

From:	"Anton Maksimenkov" <anton200(at)gmail(dot)com>
To:	"chris smith" <dmagick(at)gmail(dot)com>
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: select with "like" from another table
Date:	2006-05-29 15:39:51
Message-ID:	8cac8dd0605290839j1f1aa30bj124d617012da0009@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

> > I have 2 tables - one with calls numbers and another with calls codes.
> > The structure almost like this:
> ...
> How long does this query take?
>
> SELECT code FROM a_voip_codes c, a_voip v where v.called_station_id
> like c.code ||
> '%' order by code desc limit 1

billing=# explain analyze SELECT code FROM a_voip_codes c, a_voip v
where v.called_station_id like c.code || '%' order by code desc limit
1;

QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..11.24 rows=1 width=4) (actual
time=15809.846..15809.848 rows=1 loops=1)
-> Nested Loop (cost=0.00..35877212.61 rows=3192650 width=4)
(actual time=15809.841..15809.841 rows=1 loops=1)
Join Filter: (("inner".called_station_id)::text ~~
(("outer".code)::text || '%'::text))
-> Index Scan Backward using a_voip_codes_pkey on
a_voip_codes c (cost=0.00..69.87 rows=2078 width=4) (actual
time=0.029..0.106 rows=6 loops=1)
-> Seq Scan on a_voip v (cost=0.00..11887.81 rows=307281
width=13) (actual time=1.696..935.368 rows=254472 loops=6)
Total runtime: 15810.088 ms
(6 rows)

> I wonder if you'll benefit from an index on a_voip(called_station_id)
> to speed up this join.

Yes, it's long. But index gives no help here:

billing=# CREATE INDEX a_voip_called_station_id ON a_voip(called_station_id);
CREATE INDEX
billing=# vacuum analyze;
VACUUM
billing=# explain analyze SELECT code FROM a_voip_codes c, a_voip v
where v.called_station_id like c.code || '%' order by code desc limit
1;

QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..11.27 rows=1 width=4) (actual
time=15254.783..15254.785 rows=1 loops=1)
-> Nested Loop (cost=0.00..35767665.65 rows=3172732 width=4)
(actual time=15254.778..15254.778 rows=1 loops=1)
Join Filter: (("inner".called_station_id)::text ~~
(("outer".code)::text || '%'::text))
-> Index Scan Backward using a_voip_codes_pkey on
a_voip_codes c (cost=0.00..69.87 rows=2078 width=4) (actual
time=0.021..0.097 rows=6 loops=1)
-> Seq Scan on a_voip v (cost=0.00..11868.64 rows=305364
width=13) (actual time=0.006..750.337 rows=254472 loops=6)
Total runtime: 15255.066 ms
(6 rows)

The main problem with first (main) query:

SELECT user_name, called_station_id,
(SELECT code FROM a_voip_codes AS c where v.called_station_id LIKE
c.code || '%' order by code desc limit 1) AS code
FROM a_voip AS v WHERE user_name = 'dixi' AND tm between '2006-04-01'
and '2006-05-01' group by user_name, called_station_id;

is that internal (SELECT... v.called_station_id LIKE c.code || '%'...)
executed for each row, returned by external SELECT user_name... part.
So I looking how to avoid internal (SELECT ...) part of query.

Terrible oracle gives something like "over by (partition by ... order
by code desc) rnum ... where rnum = 1" which works like DISTINCT and
numerate similate rows, then we get just longest (rnum = 1) rows. But
I can't imagine how to implement some kind of this algorithm with
postgres.
--
engineer

In response to

Re: select with "like" from another table at 2006-05-29 10:36:43 from chris smith

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Anthony Ransley	2006-05-30 00:26:43	Split select completes, single select doesn't and becomes IO bound!
Previous Message	Cstdenis	2006-05-29 14:35:14	Re: How can I make this query faster (resend)