index on custom function; explain

From: "jan(dot)aerts(at)bbsrc(dot)ac(dot)uk" <jan(dot)aerts(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: index on custom function; explain
Date: 2005-10-03 15:14:11
Message-ID: 1128352451.312090.37640@g44g2000cwa.googlegroups.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

I'm trying to include a custom function in my SQL-queries, which
unfortunately leaves the server hanging...

I basically search through two tables:
* TABLE_MAPPING: lists that 'abc' is mapped to 'def'
id1 | name1 | id2 | name2
-------------------------
1 | abc | 2 | def
3 | uvw | 4 | xyz
This data means that 'abc' is mapped_to 'def', and 'uvw' is mapped_to
'xyz'. About 1,500,000 records in total.

* TABLE ALIASES: lists different aliases of the same thing
id1 | name1 | id2 | name2
-------------------------
3 | uvw | 2 | def
This data means that 'uvw' and 'def' are essentially the same thing.
About 820,000 records in total.

I have indexes on all columns of the above tables.

Based on the two tables above, 'abc' is indirectly mapped_to 'xyz' as
well (through 'def' also-known-as 'uvw').

I wrote this little function: aliases_of
CREATE FUNCTION aliases_of(INTEGER) RETURNS SETOF integer
AS 'SELECT $1
UNION
SELECT id1 FROM aliases WHERE id2 = $1
UNION
SELECT id2 FROM aliases WHERE id1 = $1
'
LANGUAGE SQL
STABLE;

A simple SELECT aliases_of(2) shows:
aliases_of
----------
2
3

Now, when I want to traverse the aliases, I would write a query as
follows:
SELECT m1.name1, m1.name2, m2.name1, m2.name2
FROM mappings m1, mappings m2
WHERE m1.name1 = 'abc'
AND m2.name1 IN (SELECT aliases_of(m1.name2));

Unfortunately, this query seems to keep running and to never stop...

An EXPLAIN of the above query shows:
QUERY PLAN
-----------------------------------------------------------------------------
Nested Loop (cost=0.00..118379.45 rows=1384837 width=80)
Join Filter: (subplan)
-> Index Scan using ind_cmappings_object1_id on c_mappings m1
(cost=0.00..7.08 rows=2 width=40)
Index Cond: (name1 = 'abc')
-> Seq Scan on c_mappings m2 (cost=0.00..35935.05 rows=1423805
width=40)
SubPlan
-> Result (cost=0.00..0.01 rows=1 width=0)
(7 rows)

Strangely enough, I _do_ get output when I type the following query:
SELECT m1.name1, m1.name2, m2.name1, m2.name2
FROM mappings m1, mappings m2
WHERE m1.name1 = 'abc'
AND m2.name1 IN (
SELECT m1.name2
UNION
SELECT name2 FROM aliases WHERE name1 = m1.name2
UNION
SELECT name1 FROM aliases WHERE name2 = m2.name1
);

The EXPLAIN for this query is:
QUERY PLAN
-------------------------------------------------------------------------------
Nested Loop (cost=0.00..36712030.90 rows=1384837 width=80)
Join Filter: (subplan)
-> Index Scan using ind_cmappings_object1_id on c_mappings m1
(cost=0.00..7.08 rows=2 width=40)
Index Cond: (object1_id = 16575564)
-> Seq Scan on c_mappings m2 (cost=0.00..35935.05 rows=1423805
width=40)
SubPlan
-> Unique (cost=13.21..13.23 rows=1 width=4)
-> Sort (cost=13.21..13.22 rows=3 width=4)
Sort Key: object2_id
-> Append (cost=0.00..13.18 rows=3 width=4)
-> Subquery Scan "*SELECT* 1" (cost=0.00..0.01
rows=1 width=0)
-> Result (cost=0.00..0.01 rows=1
width=0)
-> Subquery Scan "*SELECT* 2" (cost=0.00..5.92
rows=1 width=4)
-> Index Scan using
ind_caliases_object2_id on c_aliases (cost=0.00..5.92 rows=1 width=4)
Index Cond: (object2_id = $0)
-> Subquery Scan "*SELECT* 3" (cost=0.00..7.25
rows=1 width=4)
-> Index Scan using
ind_caliases_object1_id on c_aliases (cost=0.00..7.25 rows=1 width=4)
Index Cond: (object1_id = $0)
(18 rows)

So my questions are:
* Does anyone have any idea how I can integrate a function that lists
all aliases for a given name into such a mapping query?
* Does the STABLE keyword in the function definition make the function
to read all its data into memory?
* Is there a way to let postgres use an "Index scan" on that function
instead of a "seq scan"?

Any help very much appreciated,
Jan Aerts

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Steven Rosenstein 2005-10-03 15:47:52 Alternative to a temporary table
Previous Message Tom Lane 2005-10-03 14:08:14 Re: URGENT: pg_statistic_relid_att_index has gone