Re: Need help optimizing this query

From: "Pat Maddox" <pergesu(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Need help optimizing this query
Date: 2007-07-18 21:12:14
Message-ID: 810a540e0707181412x5f520a8bg1aa3b9d003dc0c46@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 7/18/07, Michael Glaesemann <grzm(at)seespotcode(dot)net> wrote:
>
> On 7/18/07, Pat Maddox <pergesu(at)gmail(dot)com> wrote:
> > I've got a query that's taking forever (as will be obvious when you
> > see it and the explain output). I can't figure out what indexes to
> > add to make this run faster. I'd appreciate any help.
> >
>
> I'm curious why it's choosing to use hash joins rather than taking
> advantage of the indexes you have on the foreign key columns. What
> are the table definitions? Are hit, logged_in, played, downloaded all
> columns of videos_views?
>
> Have you ANALYZEd these tables? What is the EXPLAIN ANALYZE output
> for this query? You only provided the EXPLAIN output, which doesn't
> compare the plan with the actual query cost.
>
> I found it quite difficult to read you query. I reformatted it and
> also used some SQL functions to abstract away the CASE expressions.
> These SQL functions will probably be inlined so there should be very
> little overhead. If you have a boolean column, you don't need to test
> IS TRUE or IS FALSE: you can just use the value itself. I also find
> it helpful to separate the join conditions (in the JOIN clause) from
> the restrictions (the WHERE clause), which I've done below.
>
> CREATE OR REPLACE FUNCTION ternary(BOOLEAN, INTEGER, INTEGER)
> RETURNS INTEGER
> LANGUAGE SQL AS $_$SELECT CASE WHEN $1 THEN $2 ELSE $3 END$_$;
>
> CREATE OR REPLACE FUNCTION value_when(BOOLEAN, INTEGER)
> RETURNS INTEGER
> LANGUAGE SQL AS $_$SELECT ternary($1,$2,0)$_$;
>
> CREATE OR REPLACE FUNCTION one_when(BOOLEAN)
> RETURNS INTEGER
> LANGUAGE SQL as $_$SELECT value_when($1,1)$_$;
>
> SELECT
> SUM (one_when(hit AND logged_in)) AS count_hits_console
> , SUM (one_when(hit AND NOT logged_in)) AS count_hits_remote
> , SUM (one_when(played AND logged_in)) AS count_played_console
> , SUM (one_when(played AND NOT logged_in)) AS count_played_remote
> , SUM (one_when(downloaded AND logged_in)) AS
> count_downloaded_console
> , SUM (one_when(downloaded AND NOT logged_in)) AS
> count_downloaded_remote
> , SUM (value_when((played OR downloaded) AND logged_in,
> assets.size))
> as download_size_console
> , SUM (value_when((played OR downloaded) AND NOT logged_in),
> assets.size)
> as download_size_remote
> , videos.id
> , videos.title
> , videos.guid
> FROM video_views
> JOIN assets ON (video_views.video_id=videos.id)
> JOIN videos ON (video_views.asset_id=assets.id)
> WHERE videos.company_id=1
> GROUP BY videos.id
> , videos.title
> , videos.guid
> ORDER BY count_hits_remote
> DESC LIMIT 100
>
>
>
> Michael Glaesemann
> grzm seespotcode net
>
>
>

Michael,

I tried your SQL but it didn't work - it was missing the videos table
in the FROM clause. But when I add it, I get the error:

ERROR: invalid reference to FROM-clause entry for table "video_views"
LINE 20: JOIN assets ON (video_views.video_id=videos.id)
^
HINT: There is an entry for table "video_views", but it cannot be
referenced from this part of the query.

Not really sure what that means.

Here are the table definitions:

twistage_development=# \d video_views
Table "public.video_views"
Column | Type |
Modifiers
------------+-----------------------------+----------------------------------------------------------
id | integer | not null default
nextval('video_views_id_seq'::regclass)
video_id | integer |
created_at | timestamp without time zone |
asset_id | integer |
played | boolean | default false
downloaded | boolean | default false
hit | boolean | default false
logged_in | boolean | default false
Indexes:
"video_views_pkey" PRIMARY KEY, btree (id)
"index_video_views_on_asset_id" btree (asset_id)
"index_video_views_on_video_id" btree (video_id)
"index_video_views_on_video_id_and_asset_id_and_created_at" btree
(video_id, created_at, asset_id)

twistage_development=# \d videos
Table "public.videos"
Column | Type | Modi
fiers
-----------------------+-----------------------------+--------------------------
---------------------------
id | integer | not null default nextval(
'videos_id_seq'::regclass)
title | character varying(255) |
duration | double precision |
description | text |
status | character varying(255) |
user_id | integer |
created_at | timestamp without time zone |
upload_finished | boolean | default false
publisher_name | character varying(255) |
company_id | integer |
available_for_display | boolean | default true
guid | character varying(255) |
main_asset_id | integer |
container_type | character varying(255) |
codec | character varying(255) |
site_id | integer |
deleted_at | timestamp without time zone |
purged_at | timestamp without time zone |
remote_hits_count | integer | default 0
Indexes:
"videos_pkey" PRIMARY KEY, btree (id)
"complete_videos_without_deleted_at" btree (company_id, status)
"index_complete_videos" btree (deleted_at, purged_at, status, created_at, co
mpany_id)
"index_videos_on_company_id" btree (company_id)
"index_videos_on_deleted_at_and_status_and_site_id" btree (status, deleted_a
t, site_id)
"index_videos_on_guid" btree (guid)
"index_videos_on_publisher_name" btree (publisher_name)
"index_videos_on_site_id" btree (site_id)
"index_videos_on_user_id" btree (user_id)

twistage_development=# \d assets
Table "public.assets"
Column | Type |
Modifiers
-----------------+-----------------------------+-----------------------------------------------------
id | integer | not null default
nextval('assets_id_seq'::regclass)
video_id | integer |
video_format_id | integer |
guid | character varying(255) |
source_path | character varying(255) |
size | bigint |
vresolution | integer |
hresolution | integer |
video_bitrate | integer |
frame_rate | integer |
container | character varying(255) |
vcodec | character varying(255) |
status | character varying(255) |
deleted_at | timestamp without time zone |
audio_bitrate | integer |
acodec | character varying(255) |
duration | double precision |
Indexes:
"assets_pkey" PRIMARY KEY, btree (id)
"index_assets_on_video_format_id" btree (video_format_id)
"index_assets_on_video_id" btree (video_id)

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Michael Glaesemann 2007-07-18 21:14:24 Re: Need help optimizing this query
Previous Message Andrej Ricnik-Bay 2007-07-18 21:07:40 Re: Sylph-Searcher 1.0.0 released