From: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> |
---|---|
To: | Glenn Maynard <glennfmaynard(at)gmail(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Nested query performance issue |
Date: | 2009-04-09 11:29:11 |
Message-ID: | 49DDDC07.7080608@enterprisedb.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Glenn Maynard wrote:
> This rewrite allows getting the top N scores. Unfortunately, this one
> takes 950ms for the same data. With 1000000 scores, it takes 14800ms.
>
> SELECT s.* FROM score s, game g
> WHERE s.game_id = g.id AND
> s.id IN (
> SELECT s2.id FROM score s2 WHERE s2.game_id=g.id ORDER BY s2.score
> DESC LIMIT 1
> );
You don't really need the join with game here, simplifying this into:
SELECT s.* FROM score s
WHERE s.id IN (
SELECT s2.id FROM score s2 WHERE s2.game_id=s.game_id ORDER BY
s2.score
DESC LIMIT 1
);
I don't think it makes it any faster, though.
You can also do this in a very nice and clean fashion using the upcoming
PG 8.4 window functions:
SELECT * FROM (
SELECT s.*, rank() OVER (PARTITION BY s.game_id ORDER BY score DESC)
AS rank FROM score s
) AS sub WHERE rank <= 5;
but I'm not sure how much faster it is. At least here on my laptop it
does a full index scan on score, which may or may not be faster than
just picking the top N values for each game using the index.
> This seems simple: for each game, search for the highest score, and
> then scan the tree to get the next N-1 highest scores. The first
> version does just that, but the second one is doing a seq scan over
> score.
You can do that approach with a SQL function:
CREATE FUNCTION topnscores(game_id int , n int) RETURNS SETOF score
LANGUAGE SQL AS $$
SELECT * FROM score s WHERE s.game_id = $1 ORDER BY score DESC LIMIT $2
$$;
SELECT (sub.ts).id, (sub.ts).score, (sub.ts).game_id
FROM (SELECT topnscores(g.id, 5) ts FROM game g) sub;
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff | 2009-04-09 12:19:22 | Re: Best replication solution? |
Previous Message | Віталій Тимчишин | 2009-04-09 09:25:26 | Re: Nested query performance issue |