Quick Links

Re: poor execution plan because column dependence

From:	Bob Lunney <bob_lunney(at)yahoo(dot)com>
To:	Václav Ovsík <vaclav(dot)ovsik(at)i(dot)cz>
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: poor execution plan because column dependence
Date:	2011-04-13 00:14:29
Message-ID:	867169.80414.qm@web39703.mail.mud.yahoo.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

Zito,

Using psql log in as the database owner and run "analyze verbose". Happiness will ensue.

Also, when requesting help with a query its important to state the database version ("select version();") and what, if any, configuration changes you have made in postgresql.conf. Listing ony the ones that have changed is sufficient.

Finally, the wiki has some good information on the care and feeding of a PostgreSQL database:

http://wiki.postgresql.org/wiki/Introduction_to_VACUUM,_ANALYZE,_EXPLAIN,_and_COUNT

Bob Lunney

--- On Tue, 4/12/11, Václav Ovsík <vaclav(dot)ovsik(at)i(dot)cz> wrote:

> From: Václav Ovsík <vaclav(dot)ovsik(at)i(dot)cz>
> Subject: [PERFORM] poor execution plan because column dependence
> To: pgsql-performance(at)postgresql(dot)org
> Date: Tuesday, April 12, 2011, 7:23 PM
> Hi,
> I have done migration of the Request Tracker 3.8.9
> (http://requesttracker.wikia.com/wiki/HomePage) from
> Mysql to
> PostgreSQL in testing environment.
> The RT schema used can be viewed at
> https://github.com/bestpractical/rt/blob/3.8-trunk/etc/schema.Pg.
> I have added full text search on table Attachments based on
> trigrams
> (and still experimenting with it), but is is not
> interesting for the
> problem (the problem is not caused by it directly).
> The full text search alone works quite good. A user testing
> a new RT instance
> reported a poor performance problem with a bit more complex
> query (more
> conditions resulting in table joins).
> Queries are constructed by module DBIx::SearchBuilder.
> The problematic query logged:
>
> rt=# EXPLAIN ANALYZE SELECT DISTINCT main.* FROM
> Tickets main JOIN Transactions Transactions_1 ON (
> Transactions_1.ObjectId = main.id ) JOIN Attachments
> Attachments_2 ON ( Attachments_2.TransactionId =
> Transactions_1.id ) WHERE (Transactions_1.ObjectType =
> 'RT::Ticket') AND (main.Status != 'deleted') AND
> (main.Status = 'resolved' AND main.LastUpdated >
> '2008-12-31 23:00:00' AND main.Created > '2005-12-31
> 23:00:00' AND main.Queue = '15' AND (
> Attachments_2.trigrams @@ text_to_trgm_tsquery('uir') AND
> Attachments_2.Content ILIKE '%uir%' ) ) AND (main.Type =
> 'ticket') AND (main.EffectiveId = main.id) ORDER BY
> main.id ASC;
>
>
>
>
>
>
>
>
>
>
>    QUERY
> PLAN
>
>
>
>
>
>
>
>
>
>
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> Unique (cost=23928.60..23928.67 rows=1 width=162)
> (actual time=5201.139..5207.965 rows=649 loops=1)
>    -> Sort
> (cost=23928.60..23928.61 rows=1 width=162) (actual
> time=5201.137..5201.983 rows=5280 loops=1)
>    Sort Key:
> main.effectiveid, main.issuestatement, main.resolution,
> main.owner, main.subject, main.initialpriority,
> main.finalpriority, main.priority, main.timeestimated,
> main.timeworked, main.timeleft, main.told, main.starts,
> main.started, main.due, main.resolved, main.lastupdatedby,
> main.lastupdated, main.creator, main.created, main.disabled
>    Sort Method:
> quicksort Memory: 1598kB
>    -> Nested
> Loop (cost=0.00..23928.59 rows=1 width=162) (actual
> time=10.060..5120.834 rows=5280 loops=1)
>
>    -> Nested Loop
> (cost=0.00..10222.38 rows=1734 width=166) (actual
> time=8.702..1328.970 rows=417711 loops=1)
>
>    -> Seq Scan on tickets
> main (cost=0.00..5687.88 rows=85 width=162) (actual
> time=8.258..94.012 rows=25410 loops=1)
>
>    Filter:
> (((status)::text <> 'deleted'::text) AND (lastupdated
> > '2008-12-31 23:00:00'::timestamp without time zone) AND
> (created > '2005-12-31 23:00:00'::timestamp without time
> zone) AND (effectiveid = id) AND (queue = 15) AND
> ((type)::text = 'ticket'::text) AND ((status)::text =
> 'resolved'::text))
>
>    -> Index Scan using
> transactions1 on transactions transactions_1
> (cost=0.00..53.01 rows=27 width=8) (actual time=0.030..0.039
> rows=16 loops=25410)
>
>    Index Cond:
> (((transactions_1.objecttype)::text = 'RT::Ticket'::text)
> AND (transactions_1.objectid = main.effectiveid))
>
>    -> Index Scan using attachments2
> on attachments attachments_2 (cost=0.00..7.89 rows=1
> width=4) (actual time=0.008..0.009 rows=0 loops=417711)
>
>    Index Cond:
> (attachments_2.transactionid = transactions_1.id)
>
>    Filter: ((attachments_2.trigrams @@
> '''uir'''::tsquery) AND (attachments_2.content ~~*
> '%uir%'::text))
> Total runtime: 5208.149 ms
> (14 rows)
>
> The above times are for already cached data (repeated
> query).
> I think the execution plan is poor. Better would be to
> filter table attachments
> at first and then join the rest. The reason is a bad
> estimate on number of rows
> returned from table tickets (85 estimated -> 25410 in
> the reality).
> Eliminating sub-condition...
>
>
> rt=# explain analyze select * from tickets where
> effectiveid = id;
>
>
>
> QUERY PLAN
>
>
>
> --------------------------------------------------------------------------------------------------------------
> Seq Scan on tickets (cost=0.00..4097.40 rows=530
> width=162) (actual time=0.019..38.130 rows=101869 loops=1)
>    Filter: (effectiveid = id)
> Total runtime: 54.318 ms
> (3 rows)
>
> Estimated 530 rows, but reality is 101869 rows.
>
> The problem is the strong dependance between id and
> effectiveid. The RT
> documentation says:
>
> EffectiveId:
> By default, a ticket's EffectiveId is the
> same as its ID. RT supports the
> ability to merge tickets together. When you
> merge a ticket into
> another one, RT sets the first ticket's
> EffectiveId to the second
> ticket's ID. RT uses this data to quickly
> look up which ticket
> you're really talking about when you
> reference a merged ticket.
>
>
> I googled the page http://wiki.postgresql.org/wiki/Cross_Columns_Stats
>
> Maybe I identified the already documented problem. What I
> can do with this
> situation? Some workaround?
>
> Thanks in advance for any suggestions.
> Best Regards
> --
> Zito
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>

In response to

poor execution plan because column dependence at 2011-04-12 23:23:43 from Václav Ovsík

Responses

Re: poor execution plan because column dependence at 2011-04-13 07:55:37 from Václav Ovsík

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Tom Lane	2011-04-13 00:52:15	Re: poor execution plan because column dependence
Previous Message	Václav Ovsík	2011-04-12 23:23:43	poor execution plan because column dependence