Re: Need help optimizing this query

From: Michael Glaesemann <grzm(at)seespotcode(dot)net>
To: "Pat Maddox" <pergesu(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Need help optimizing this query
Date: 2007-07-18 21:01:19
Message-ID: FA3650C1-1B9E-43E4-8241-A9E9A2412AB3@seespotcode.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


On 7/18/07, Pat Maddox <pergesu(at)gmail(dot)com> wrote:
> I've got a query that's taking forever (as will be obvious when you
> see it and the explain output). I can't figure out what indexes to
> add to make this run faster. I'd appreciate any help.
>

I'm curious why it's choosing to use hash joins rather than taking
advantage of the indexes you have on the foreign key columns. What
are the table definitions? Are hit, logged_in, played, downloaded all
columns of videos_views?

Have you ANALYZEd these tables? What is the EXPLAIN ANALYZE output
for this query? You only provided the EXPLAIN output, which doesn't
compare the plan with the actual query cost.

I found it quite difficult to read you query. I reformatted it and
also used some SQL functions to abstract away the CASE expressions.
These SQL functions will probably be inlined so there should be very
little overhead. If you have a boolean column, you don't need to test
IS TRUE or IS FALSE: you can just use the value itself. I also find
it helpful to separate the join conditions (in the JOIN clause) from
the restrictions (the WHERE clause), which I've done below.

CREATE OR REPLACE FUNCTION ternary(BOOLEAN, INTEGER, INTEGER)
RETURNS INTEGER
LANGUAGE SQL AS $_$SELECT CASE WHEN $1 THEN $2 ELSE $3 END$_$;

CREATE OR REPLACE FUNCTION value_when(BOOLEAN, INTEGER)
RETURNS INTEGER
LANGUAGE SQL AS $_$SELECT ternary($1,$2,0)$_$;

CREATE OR REPLACE FUNCTION one_when(BOOLEAN)
RETURNS INTEGER
LANGUAGE SQL as $_$SELECT value_when($1,1)$_$;

SELECT
SUM (one_when(hit AND logged_in)) AS count_hits_console
, SUM (one_when(hit AND NOT logged_in)) AS count_hits_remote
, SUM (one_when(played AND logged_in)) AS count_played_console
, SUM (one_when(played AND NOT logged_in)) AS count_played_remote
, SUM (one_when(downloaded AND logged_in)) AS
count_downloaded_console
, SUM (one_when(downloaded AND NOT logged_in)) AS
count_downloaded_remote
, SUM (value_when((played OR downloaded) AND logged_in,
assets.size))
as download_size_console
, SUM (value_when((played OR downloaded) AND NOT logged_in),
assets.size)
as download_size_remote
, videos.id
, videos.title
, videos.guid
FROM video_views
JOIN assets ON (video_views.video_id=videos.id)
JOIN videos ON (video_views.asset_id=assets.id)
WHERE videos.company_id=1
GROUP BY videos.id
, videos.title
, videos.guid
ORDER BY count_hits_remote
DESC LIMIT 100

Michael Glaesemann
grzm seespotcode net

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Andrej Ricnik-Bay 2007-07-18 21:07:40 Re: Sylph-Searcher 1.0.0 released
Previous Message Francisco Reyes 2007-07-18 20:46:21 Feature request: Per database search_path