Left joining table to setof function

From: Robert Fitzpatrick <lists(at)webtent(dot)net>
To: PostgreSQL <pgsql-general(at)postgresql(dot)org>
Subject: Left joining table to setof function
Date: 2007-09-07 22:43:26
Message-ID: 1189205006.2457.45.camel@columbus.webtent.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I have a function that returns a set of a custom type...

trax=# \d client_search
Composite type "public.client_search"
Column | Type
----------------+------------------------
clientno | bigint
client | character varying(100)
contactno | bigint
city | character varying(50)
state | character varying(50)
contact_info | text
contact_title | character varying(100)
contact_phone | character varying(20)
contact_cell | character varying(20)
clientnamesort | text
lastnamesort | text

...and need to left join this with a table that can tell me if each
record exists in that table depending on certain values. When I get to
tens of thousands of records returned by the function, while my runtime
of the function itself is little changed, I am struggling to lower the
runtime as it gets very high when there are more and more matches in the
join. I have gone through to make sure all my comparisons in the join
are of same type, here is the table I am joining...

trax=# \d tblsearch_selections Table "public.tblsearch_selections"
Column | Type | Modifiers
---------------------+-----------------------+------------------------------------------------------------------------------------
search_selection_id | bigint | not null default nextval('tblsearch_selections_search_selection_id_seq'::regclass)
search_id | integer |
user_id | character varying(12) |
selection_value | text |
selected_value | bigint |
selection_type | character varying(10) | default 'client'::character varying
Indexes:
"tblsearch_selections_pkey" PRIMARY KEY, btree (search_selection_id)
"search_selection_unique_idx" UNIQUE, btree (search_id, user_id, selection_value, selection_type)
Foreign-key constraints:
"tblsearch_selections_search_fk" FOREIGN KEY (search_id) REFERENCES tblclientsearch(search_id) ON UPDATE CASCADE ON DELETE CASCADE
"tblsearch_selections_user_fk" FOREIGN KEY (user_id) REFERENCES tbluser(flduserid) ON UPDATE CASCADE ON DELETE CASCADE

Here is the analysis of the query with over 12000 join matches...

trax=# explain analyze select * from client_search_id_func(62) left join tblsearch_selections ON search_id = 62 and user_id = 'RF' and ((selected_value = clientno and selection_type = 'client') or (selected_value = contactno and selection_type = 'contact'));
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop Left Join (cost=301.98..9434.48 rows=1000 width=426) (actual time=288.130..234578.634 rows=18202 loops=1)
Join Filter: (((tblsearch_selections.selected_value = client_search_id_func.clientno) AND ((tblsearch_selections.selection_type)::text = 'client'::text)) OR ((tblsearch_selections.selected_value = client_search_id_func.contactno) AND ((tblsearch_selections.selection_type)::text = 'contact'::text)))
-> Function Scan on client_search_id_func (cost=0.00..12.50 rows=1000 width=382) (actual time=253.805..265.752 rows=18202 loops=1)
-> Materialize (cost=301.98..305.02 rows=304 width=44) (actual time=0.001..4.931 rows=12747 loops=18202)
-> Seq Scan on tblsearch_selections (cost=0.00..301.68 rows=304 width=44) (actual time=0.040..12.749 rows=12747 loops=1)
Filter: ((search_id = 62) AND ((user_id)::text = 'RF'::text))
Total runtime: 234673.875 ms
(7 rows)

Perhaps I need to use subselects or some other approach?

trax=# explain analyze select *, (select search_selection_id from tblsearch_selections where search_id = 62 and user_id = 'RF' and ((selected_value = clientno and selection_type = 'client') or (selected_value = contactno and selection_type = 'contact'))) from client_search_id_func(62);
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Function Scan on client_search_id_func (cost=0.00..315837.50 rows=1000 width=382) (actual time=259.520..148524.590 rows=18202 loops=1)
SubPlan
-> Seq Scan on tblsearch_selections (cost=0.00..315.82 rows=1 width=8) (actual time=3.555..8.131 rows=1 loops=18202)
Filter: ((search_id = 62) AND ((user_id)::text = 'RF'::text) AND (((selected_value = $0) AND ((selection_type)::text = 'client'::text)) OR ((selected_value = $1) AND ((selection_type)::text = 'contact'::text))))
Total runtime: 148540.593 ms

And with no matches, the winner so far is the left join...

trax=# delete from tblsearch_selections where search_id = 62;
DELETE 12747
trax=# explain analyze select * from client_search_id_func(62) left join tblsearch_selections ON search_id = 62 and user_id = 'RF' and ((selected_value = clientno and selection_type = 'client') or (selected_value = contactno and selection_type = 'contact'));
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop Left Join (cost=513.90..384586.40 rows=1000 width=426) (actual time=253.248..308.741 rows=18202 loops=1)
Join Filter: (((tblsearch_selections.selected_value = client_search_id_func.clientno) AND ((tblsearch_selections.selection_type)::text = 'client'::text)) OR ((tblsearch_selections.selected_value = client_search_id_func.contactno) AND ((tblsearch_selections.selection_type)::text = 'contact'::text)))
-> Function Scan on client_search_id_func (cost=0.00..12.50 rows=1000 width=382) (actual time=251.167..258.764 rows=18202 loops=1)
-> Materialize (cost=513.90..641.92 rows=12802 width=44) (actual time=0.001..0.001 rows=0 loops=18202)
-> Seq Scan on tblsearch_selections (cost=0.00..501.10 rows=12802 width=44) (actual time=2.066..2.066 rows=0 loops=1)
Filter: ((search_id = 62) AND ((user_id)::text = 'RF'::text))
Total runtime: 319.371 ms
(7 rows)

Thanks for any help!

--
Robert

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Paul Mendoza 2007-09-07 22:53:04 Regular expression on a string problem.
Previous Message Alvaro Herrera 2007-09-07 20:50:21 Re: Auto vacuum documentation