From: | Merlin Moncure <mmoncure(at)gmail(dot)com> |
---|---|
To: | Julius Tuskenis <julius(at)nsoft(dot)lt> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: unexpected stable function behavior |
Date: | 2011-03-10 21:14:50 |
Message-ID: | AANLkTimTR_f98H4+h57iW6-S=BjZaMTSQ6VznRgXU5Ly@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On Thu, Mar 10, 2011 at 10:26 AM, Julius Tuskenis <julius(at)nsoft(dot)lt> wrote:
> Hello, list
>
> Our company is creating a ticketing system. Of course the performance issues
> are very important to us (as to all of you I guess). To increase speed of
> some queries stable functions are used, but somehow they don't act exactly
> as I expect, so would you please explain what am I doing (or expecting)
> wrong...
>
> First of all I have the stable function witch runs fast and I have no
> problems with it at all.
> CREATE OR REPLACE FUNCTION web_select_extra_price(prm_price_id integer,
> prm_event_id integer, prm_cashier_id integer)
> RETURNS numeric AS
> '
> ........ some code here
> '
> LANGUAGE plpgsql STABLE
> COST 100;
>
> Now the test:
>
> 1) query without using the function
> explain analyze
> SELECT thtp_tick_id, price_id,
> price_price,
> price_color
> FROM ticket_price
> JOIN ticket_has_ticket_price ON (price_id = thtp_price_id)
> WHERE price_event_id = 7820 and (current_timestamp >= price_date AND
> current_timestamp <= price_date_till)
> ORDER BY price_id;
>
> Result:
> "Sort (cost=132.47..133.77 rows=518 width=25) (actual time=5.125..5.842
> rows=4335 loops=1)"
> " Sort Key: ticket_price.price_id"
> " Sort Method: quicksort Memory: 433kB"
> " -> Nested Loop (cost=0.00..109.12 rows=518 width=25) (actual
> time=0.037..3.148 rows=4335 loops=1)"
> " -> Index Scan using index_price_event_id on ticket_price
> (cost=0.00..8.52 rows=2 width=21) (actual time=0.014..0.026 rows=7
> loops=1)"
> " Index Cond: (price_event_id = 7820)"
> " Filter: ((now() >= price_date) AND (now() <=
> price_date_till))"
> " -> Index Scan using idx_thtp_price_id on ticket_has_ticket_price
> (cost=0.00..47.06 rows=259 width=8) (actual time=0.013..0.211 rows=619
> loops=7)"
> " Index Cond: (ticket_has_ticket_price.thtp_price_id =
> ticket_price.price_id)"
> "Total runtime: 6.425 ms"
>
>
> 2) Query using the function
> explain analyze
> SELECT thtp_tick_id, price_id,
> price_price, web_select_extra_price(price_id, price_event_id, 1),
> price_color
> FROM ticket_price
> JOIN ticket_has_ticket_price ON (price_id = thtp_price_id)
> WHERE price_event_id = 7820 and (current_timestamp >= price_date AND
> current_timestamp <= price_date_till)
> ORDER BY price_id;
>
> Result:
> "Sort (cost=261.97..263.27 rows=518 width=29) (actual time=704.224..704.927
> rows=4335 loops=1)"
> " Sort Key: ticket_price.price_id"
> " Sort Method: quicksort Memory: 433kB"
> " -> Nested Loop (cost=0.00..238.62 rows=518 width=29) (actual
> time=0.272..699.073 rows=4335 loops=1)"
> " -> Index Scan using index_price_event_id on ticket_price
> (cost=0.00..8.52 rows=2 width=25) (actual time=0.011..0.052 rows=7
> loops=1)"
> " Index Cond: (price_event_id = 7820)"
> " Filter: ((now() >= price_date) AND (now() <=
> price_date_till))"
> " -> Index Scan using idx_thtp_price_id on ticket_has_ticket_price
> (cost=0.00..47.06 rows=259 width=8) (actual time=0.017..0.582 rows=619
> loops=7)"
> " Index Cond: (ticket_has_ticket_price.thtp_price_id =
> ticket_price.price_id)"
> "Total runtime: 705.531 ms"
>
>
> Now what you can think is that executing web_select_extra_price takes the
> difference, but
> 3) As STABLE function should be executed once for every different set of
> parameters I do
> SELECT web_select_extra_price(price_id, 7820, 1) FROM (
>
> SELECT distinct price_id
> FROM ticket_price
> JOIN ticket_has_ticket_price ON (price_id = thtp_price_id)
> WHERE price_event_id = 7820 and (current_timestamp >= price_date AND
> current_timestamp <= price_date_till)
> ) as qq;
>
> Result:
> "Subquery Scan on qq (cost=110.34..110.88 rows=2 width=4) (actual
> time=7.265..8.907 rows=7 loops=1)"
> " -> HashAggregate (cost=110.34..110.36 rows=2 width=4) (actual
> time=6.866..6.873 rows=7 loops=1)"
> " -> Nested Loop (cost=0.00..109.05 rows=517 width=4) (actual
> time=0.037..4.643 rows=4335 loops=1)"
> " -> Index Scan using index_price_event_id on ticket_price
> (cost=0.00..8.52 rows=2 width=4) (actual time=0.014..0.038 rows=7 loops=1)"
> " Index Cond: (price_event_id = 7820)"
> " Filter: ((now() >= price_date) AND (now() <=
> price_date_till))"
> " -> Index Scan using idx_thtp_price_id on
> ticket_has_ticket_price (cost=0.00..47.04 rows=258 width=4) (actual
> time=0.019..0.336 rows=619 loops=7)"
> " Index Cond: (ticket_has_ticket_price.thtp_price_id =
> ticket_price.price_id)"
> "Total runtime: 8.966 ms"
>
>
> You can see the query has only 7 distinct parameter sets to pass to the
> function but...
> 4) Explain analyze
> SELECT web_select_extra_price(price_id, 7820, 1)
> FROM ticket_price
> JOIN ticket_has_ticket_price ON (price_id = thtp_price_id)
> WHERE price_event_id = 7820 and (current_timestamp >= price_date AND
> current_timestamp <= price_date_till)
>
> Result:
> "Nested Loop (cost=0.00..238.30 rows=517 width=4) (actual
> time=0.365..808.537 rows=4335 loops=1)"
> " -> Index Scan using index_price_event_id on ticket_price
> (cost=0.00..8.52 rows=2 width=4) (actual time=0.014..0.040 rows=7 loops=1)"
> " Index Cond: (price_event_id = 7820)"
> " Filter: ((now() >= price_date) AND (now() <= price_date_till))"
> " -> Index Scan using idx_thtp_price_id on ticket_has_ticket_price
> (cost=0.00..47.04 rows=258 width=4) (actual time=0.016..0.655 rows=619
> loops=7)"
> " Index Cond: (ticket_has_ticket_price.thtp_price_id =
> ticket_price.price_id)"
> "Total runtime: 810.143 ms"
>
>
> So I am totally confused... It seems that selecting 4335 rows is a joke for
> Postgresql, but the great job is done then adding one of 7 possible values
> to the result set... Please help me understand what I am missing here?...
>
> Finally the system:
> Server
> PG: Version string PostgreSQL 9.0.3 on i486-pc-linux-gnu, compiled by GCC
> gcc-4.4.real (Debian 4.4.5-10) 4.4.5, 32-bit
>
> Client
> Win XP SP3 with pgAdmin 1.12.2.
This is a huge problem with non trivial functions in the select list.
Pushing the result into and a subquery does NOT guarantee that the
inner result is materialized first. Try a CTE.
with foo as
(
select yadda;
)
select func(foo.a), foo.* from foo;
merlin
From | Date | Subject | |
---|---|---|---|
Next Message | Dan Ancona | 2011-03-10 21:25:24 | big joins not converging |
Previous Message | fork | 2011-03-10 18:04:39 | Re: Tuning massive UPDATES and GROUP BY's? |