From: Tomas Vondra <tv(at)fuzzy(dot)cz>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Sanity checking big select performance
Date: 2014-10-28 21:42:47
Message-ID: 54500DD7.9040900@fuzzy.cz
Lists: pgsql-performance
On 28.10.2014 22:15, Jeff Chen wrote:
> Hi friends!
>
> I'd love to get a sanity check on whether a fat select query I'm doing
> makes sense given the infrastructure that we have.
>
> We have 3 big tables that we typically join together for certain
> queries: a ~40 million row photos table, a ~20 million row users table,
> and a ~50 million row photo_to_album table that maps photos to albums.
So how much data is it? Does it fit within RAM (after loading into DB,
with all the indexes)?
> We like to display real time analytics, which often results in a query like:
>
> select (random aggregations )
> from
> photo_to_album join photos on photos.id = photo_to_album.photo_id
> join users on users.id = photos.user_id
> where
> photo_to_album.album_id = <something>
> and
> photos.created_at between <some dates>
> and <other junk>
>
> We have indexes on all of the joins, and the where clauses.
Can we get EXPLAIN (and ideally EXPLAIN ANALYZE) for such queries?
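For reference, something along these lines would show where the time goes; the table names are taken from the query above, and the album_id / date values are placeholders. The BUFFERS option also shows how much was served from cache vs. read from disk:

```sql
-- A sketch only -- substitute your real filter values.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM photo_to_album
JOIN photos ON photos.id = photo_to_album.photo_id
JOIN users  ON users.id  = photos.user_id
WHERE photo_to_album.album_id = 123
  AND photos.created_at BETWEEN '2014-01-01' AND '2014-10-01';
```

Running it once cold and once warm would also nicely document the caching effect you describe below.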
> One of these queries that should be targeting something like 300K
> photos takes 38 seconds to run (with an aggregate/nested loop taking
> effectively all of that time), and then upon second execution with a
> warm cache, 4 seconds.
Well, if you're hitting disk, it's going to be slow. As you observed,
after loading it into page cache, it's much faster.
> Also worryingly, it spikes read IOPS to almost 1500/sec during the time
> and write IOPS 200/sec. When not running the query, steady level read
> iops basically nil, write hovers around 50-100.
>
> This also increases the queue depth from basically 0 up to 6. Keeping
> the queue depth high seems to cause timeouts in other queries. The CPU
> is barely if at all affected, hovering around 20%. Memory also barely
> affected.
20% is ~2 CPU cores (as you have 8 of them).
> We have a RDS Postgres database, m3.2xlarge with 2000 Provisioned IOPS
> and 400GB storage. This translates to 8 virtual CPUs, 30GiB memory, and
> all SSD drives.
AFAIK there are two PostgreSQL major versions supported on RDS - 9.1 and
9.3. Which one are you using?
Also, can you list the values of some basic parameters (shared_buffers,
work_mem)? We don't know what the default values on RDS are, nor whether
you have modified them.
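You can read the effective values straight from the server (this works the same on RDS), e.g.:

```sql
-- Current values as the server sees them:
SHOW shared_buffers;
SHOW work_mem;

-- Or several at once, including where each value came from:
SELECT name, setting, unit, source
  FROM pg_settings
 WHERE name IN ('shared_buffers', 'work_mem', 'effective_cache_size');
```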
> Several questions here:
>
> 1) Is that level of IOPS normal?
Ummmm, why wouldn't it be? Each I/O request works with up to 16 KB (on
EBS), and you're reading/writing a certain amount of data.
> 2) Is it bad that the level of iops can queue requests that screw up the
> whole database even if it's just select queries? Especially when the
> CPU and Memory are still plentiful?
You're saturating a particular resource. Once you hit the I/O wall, you
can't make use of the spare CPU / memory. The fact that it slows down
your queries is somewhat expected.
Is it bad? Well, if you need to minimize impact on other queries, then
probably yes.
> 3) What is up with the huge difference between cold and warm cache?
I don't understand why you're surprised by this. With EBS-Optimized
networking on m3.2xlarge (i.e. 1 Gbit/s dedicated to EBS) you get up to
~120 MB/s from EBS, except that you provisioned 2000 IOPS, which caps
you at ~32 MB/s. Memory is orders of magnitude faster, hence the
difference.
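The arithmetic behind that figure, assuming the 16 KB-per-request I/O size mentioned above:

```sql
-- 2000 provisioned IOPS * 16 KB per request, in MB/s:
SELECT 2000 * 16 / 1024.0 AS ebs_mb_per_sec;  -- 31.25, i.e. ~32 MB/s
```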
regards
Tomas