Re: Performance of query

From: Roman Konoval <rkonoval(at)gmail(dot)com>
To: Misa Simic <misa(dot)simic(at)gmail(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Cindy Makarowsky <cindymakarowsky(at)gmail(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Performance of query
Date: 2013-03-24 06:45:00
Message-ID: CABcZEEBDCaMmPWftiNZa1d6n4=Q85Ai3mNQ2VAQhBG2YQCQNoA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I assume there are reasons not to throw away join to state. May be it still
can be done as the last thing. This should help further:
SELECT counts.* FROM (
SELECT busbase.state AS state, count(busbase.id) AS m0 FROM busbase
GROUP BY busbase.state ) AS counts
INNER JOIN state USING (state)

Regards,
Roman Konoval

On Sun, Mar 24, 2013 at 12:27 AM, Misa Simic <misa(dot)simic(at)gmail(dot)com> wrote:

> Hi Jeff,
>
> It seems my previous mail has not showed up in the list... copied/pasted
> again belloew
>
> However, you said something important:
>
> "The join to the "state" table is not necessary. Between the foreign key
> and the primary key, you know that every state exists, and that every state
> exists only once. But, that will not solve your problem, as the join to
> the state table is not where the time goes."
>
> I think it is something what planner could/should be "aware off"... and
> discard the join
>
> " Merge Join (cost=0.00..2310285.02 rows=60057056 width=3) (actual
> time=38.424..41992.070 rows=60057057 loops=1)"
> " Merge Cond: (state.state = busbase.state)"
>
> this part from bellow plan would save significant time if planner didn't
> decide to take this step at all ....
>
> Kind regards,
>
> Misa
>
>
>
>
> "
> Hi Cindy
>
> TBH - I don't know...
>
> I have added this to list so maybe someone else can help...
>
> To recap:
>
> from start situation (table structure and indexes are in the first mail in
> this thread)
>
> EXPLAIN ANALYZE
> SELECT busbase.state AS c0, count(busbase.id) AS m0 FROM busbase INNER
> JOIN state USING (state)
> GROUP BY busbase.state
>
> says:
> "HashAggregate (cost=7416975.58..7416976.09 rows=51 width=7) (actual
> time=285339.465..285339.473 rows=51 loops=1)"
> " -> Hash Join (cost=2.15..7139961.94 rows=55402728 width=7) (actual
> time=0.066..269527.934 rows=60057057 loops=1)"
> " Hash Cond: (busbase.state = state.state)"
> " -> Seq Scan on busbase (cost=0.00..6378172.28 rows=55402728
> width=7) (actual time=0.022..251029.307 rows=60057057 loops=1)"
> " -> Hash (cost=1.51..1.51 rows=51 width=3) (actual
> time=0.028..0.028 rows=51 loops=1)"
> " Buckets: 1024 Batches: 1 Memory Usage: 2kB"
> " -> Seq Scan on state (cost=0.00..1.51 rows=51 width=3)
> (actual time=0.003..0.019 rows=51 loops=1)"
> "Total runtime: 285339.516 ms"
>
> on created composite index
> CREATE INDEX comp_statidx2
> ON busbase
> USING btree
> (state, id );
>
>
> we got:
>
> "GroupAggregate (cost=0.00..2610570.81 rows=51 width=3) (actual
> time=98.923..51033.888 rows=51 loops=1)"
> " -> Merge Join (cost=0.00..2310285.02 rows=60057056 width=3) (actual
> time=38.424..41992.070 rows=60057057 loops=1)"
> " Merge Cond: (state.state = busbase.state)"
> " -> Index Only Scan using state_pkey on state (cost=0.00..13.02
> rows=51 width=3) (actual time=0.008..0.148 rows=51 loops=1)"
> " Heap Fetches: 51"
> " -> Index Only Scan using comp_statidx2 on busbase
> (cost=0.00..1559558.68 rows=60057056 width=3) (actual
> time=38.408..12883.575 rows=60057057 loops=1)"
> " Heap Fetches: 0"
> "Total runtime: 51045.648 ms"
>
>
> Question is - is it possible to improve it more?
> "
> ​
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Colin Currie 2013-03-25 09:57:34 9.2.3 upgrade reduced pgbench performance by 60%
Previous Message Misa Simic 2013-03-24 00:12:12 PostgreSQL planner