| From: | Robert Fitzpatrick <lists(at)webtent(dot)net> | 
|---|---|
| To: | PostgreSQL <pgsql-general(at)postgresql(dot)org> | 
| Subject: | Left joining table to setof function | 
| Date: | 2007-09-07 22:43:26 | 
| Message-ID: | 1189205006.2457.45.camel@columbus.webtent.org | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-general | 
I have a function that returns a set of a custom type...
trax=# \d client_search
  Composite type "public.client_search"
     Column     |          Type          
----------------+------------------------
 clientno       | bigint
 client         | character varying(100)
 contactno      | bigint
 city           | character varying(50)
 state          | character varying(50)
 contact_info   | text
 contact_title  | character varying(100)
 contact_phone  | character varying(20)
 contact_cell   | character varying(20)
 clientnamesort | text
 lastnamesort   | text
...and need to left join this with a table that can tell me if each
record exists in that table depending on certain values. When I get to
tens of thousands of records returned by the function, while my runtime
of the function itself is little changed, I am struggling to lower the
runtime as it gets very high when there are more and more matches in the
join. I have gone through to make sure all my comparisons in the join
are of same type, here is the table I am joining...
trax=# \d tblsearch_selections                                               Table "public.tblsearch_selections"
       Column        |         Type          |                                     Modifiers                                      
---------------------+-----------------------+------------------------------------------------------------------------------------
 search_selection_id | bigint                | not null default nextval('tblsearch_selections_search_selection_id_seq'::regclass)
 search_id           | integer               | 
 user_id             | character varying(12) | 
 selection_value     | text                  | 
 selected_value      | bigint                | 
 selection_type      | character varying(10) | default 'client'::character varying
Indexes:
    "tblsearch_selections_pkey" PRIMARY KEY, btree (search_selection_id)
    "search_selection_unique_idx" UNIQUE, btree (search_id, user_id, selection_value, selection_type)
Foreign-key constraints:
    "tblsearch_selections_search_fk" FOREIGN KEY (search_id) REFERENCES tblclientsearch(search_id) ON UPDATE CASCADE ON DELETE CASCADE
    "tblsearch_selections_user_fk" FOREIGN KEY (user_id) REFERENCES tbluser(flduserid) ON UPDATE CASCADE ON DELETE CASCADE
Here is the analysis of the query with over 12000 join matches...
trax=# explain analyze select * from client_search_id_func(62) left join tblsearch_selections ON search_id = 62 and user_id = 'RF' and ((selected_value = clientno and selection_type = 'client') or (selected_value = contactno and selection_type = 'contact'));
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=301.98..9434.48 rows=1000 width=426) (actual time=288.130..234578.634 rows=18202 loops=1)
   Join Filter: (((tblsearch_selections.selected_value = client_search_id_func.clientno) AND ((tblsearch_selections.selection_type)::text = 'client'::text)) OR ((tblsearch_selections.selected_value = client_search_id_func.contactno) AND ((tblsearch_selections.selection_type)::text = 'contact'::text)))
   ->  Function Scan on client_search_id_func  (cost=0.00..12.50 rows=1000 width=382) (actual time=253.805..265.752 rows=18202 loops=1)
   ->  Materialize  (cost=301.98..305.02 rows=304 width=44) (actual time=0.001..4.931 rows=12747 loops=18202)
         ->  Seq Scan on tblsearch_selections  (cost=0.00..301.68 rows=304 width=44) (actual time=0.040..12.749 rows=12747 loops=1)
               Filter: ((search_id = 62) AND ((user_id)::text = 'RF'::text))
 Total runtime: 234673.875 ms
(7 rows)
Perhaps I need to use subselects or some other approach?
trax=# explain analyze select *, (select search_selection_id from tblsearch_selections where search_id = 62 and user_id = 'RF' and ((selected_value = clientno and selection_type = 'client') or (selected_value = contactno and selection_type = 'contact'))) from client_search_id_func(62);
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Function Scan on client_search_id_func  (cost=0.00..315837.50 rows=1000 width=382) (actual time=259.520..148524.590 rows=18202 loops=1)
   SubPlan
     ->  Seq Scan on tblsearch_selections  (cost=0.00..315.82 rows=1 width=8) (actual time=3.555..8.131 rows=1 loops=18202)
           Filter: ((search_id = 62) AND ((user_id)::text = 'RF'::text) AND (((selected_value = $0) AND ((selection_type)::text = 'client'::text)) OR ((selected_value = $1) AND ((selection_type)::text = 'contact'::text))))
 Total runtime: 148540.593 ms
And with no matches, the winner so far is the left join...
trax=# delete from tblsearch_selections where search_id = 62;
DELETE 12747
trax=# explain analyze select * from client_search_id_func(62) left join tblsearch_selections ON search_id = 62 and user_id = 'RF' and ((selected_value = clientno and selection_type = 'client') or (selected_value = contactno and selection_type = 'contact'));
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=513.90..384586.40 rows=1000 width=426) (actual time=253.248..308.741 rows=18202 loops=1)
   Join Filter: (((tblsearch_selections.selected_value = client_search_id_func.clientno) AND ((tblsearch_selections.selection_type)::text = 'client'::text)) OR ((tblsearch_selections.selected_value = client_search_id_func.contactno) AND ((tblsearch_selections.selection_type)::text = 'contact'::text)))
   ->  Function Scan on client_search_id_func  (cost=0.00..12.50 rows=1000 width=382) (actual time=251.167..258.764 rows=18202 loops=1)
   ->  Materialize  (cost=513.90..641.92 rows=12802 width=44) (actual time=0.001..0.001 rows=0 loops=18202)
         ->  Seq Scan on tblsearch_selections  (cost=0.00..501.10 rows=12802 width=44) (actual time=2.066..2.066 rows=0 loops=1)
               Filter: ((search_id = 62) AND ((user_id)::text = 'RF'::text))
 Total runtime: 319.371 ms
(7 rows)
Thanks for any help!
-- 
Robert
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Paul Mendoza | 2007-09-07 22:53:04 | Regular expression on a string problem. | 
| Previous Message | Alvaro Herrera | 2007-09-07 20:50:21 | Re: Auto vacuum documentation |