Re: Largely inconsistent query execution speed, involving psql_tmp

From: Andy Colson <andy(at)squeakycode(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Largely inconsistent query execution speed, involving psql_tmp
Date: 2014-07-08 15:10:49
Message-ID: 53BC09F9.1070103@squeakycode.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 7/8/2014 4:47 AM, Spiros Ioannou wrote:
> While executing the following query through psql :
>
> SELECT me.* FROM measurement_events me JOIN msrcs_timestamps mt ON
> me.measurement_source_id=mt.measurement_source_id WHERE
> measurement_time > last_update_time
>
> there are two behaviors observed by postgresql (8.4):
> 1) Either the query performs lots of reads on the database and completes
> in about 4 hours (that is the normal-expected behavior)
> 2) Either the query starts filling-up pgsql_tmp and this causes large
> write I/O on the server, and the query never actually completes on a
> reasonable time (we stop it after 10h).
>
> For some strange reason, behaviour 2 is always observed when running
> psql through a bash script, while behavior 1 is only observed while
> running psql interactively from command line (but not always).
>
> explain:
> # explain SELECT me.* FROM measurement_events me JOIN msrcs_timestamps
> mt ON me.measurement_source_id=mt.measurement_source_id WHERE
> measurement_time > last_update_time;
> QUERY PLAN
> ----------------------------------------------------------------------------------------------
> Hash Join (cost=10111.78..422893652.69 rows=2958929695 width=103)
> Hash Cond: (me.measurement_source_id = mt.measurement_source_id)
> Join Filter: (me.measurement_time > mt.last_update_time)
> -> Seq Scan on measurement_events me (cost=0.00..234251772.85
> rows=8876789085 width=103)
> -> Hash (cost=5733.57..5733.57 rows=350257 width=24)
> -> Seq Scan on msrcs_timestamps mt (cost=0.00..5733.57
> rows=350257 width=24)
> (6 rows)
>
>
> We have tried so far fiddling with work_mem up to 512M - no difference.
> Any suggestions?
>
>
>
> Thanks for any help,
> -Spiros Ioannou
> inaccess
>

Is there any reason you don't have an index?

One, or both, of these will help:

create index measurement_events_pk on
measurement_events(measurement_source_id);

create index msrcs_timestamps_pk on msrcs_timestamps(measurement_source_id);

measurement_events has 8 billion rows, so expect it to take a while, but
its a one time cost and should _dramatically_ increase your query
performance.

-Andy

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Jeff Janes 2014-07-08 16:21:27 Re: Largely inconsistent query execution speed, involving psql_tmp
Previous Message Kevin Grittner 2014-07-08 14:22:39 Re: [ADMIN] WARNING: database must be vacuumed within 8439472 transactions