Re: How to reduce impact of a query.

From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: Howard Cole <howardnews(at)selestial(dot)com>, "pg >> PostgreSQL" <pgsql-general(at)postgresql(dot)org>
Subject: Re: How to reduce impact of a query.
Date: 2008-11-19 00:02:33
Message-ID: 49235799.9080802@postnewspapers.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Howard Cole wrote:

> Unfortunately I am on a windows platform. Plus I am running windows
> software raid so there is little tweaking allowed.

Don't write the possibility off too quickly. The driver may well accept
parameters for readahead settings etc, either through a custom
configuration applet (might be a separate download), the driver
properties interface in Device Manager, or through manual registry settings.

Obviously you should be really careful with messing with anything like
this, especially the latter. Your backups are up to date and well tested
anyway, though, RIGHT?

I do suspect that your tuning options will be severely limited by your
I/O system. It's quite likely that the software RAID implementation
you're using is optimized to benchmark well on measurements of
throughput (I/Os per second and bytes per second) at the cost of request
latency. This won't help with your problems with queries interfering
with each other.

That doesn't mean it's not worth some research into tuning Windows
systems for I/O request latency. I don't know much about this, but I'd
be surprised if there wasn't info out there.

Personally I'd also consider getting a database server with a bit more
I/O grunt and room for tuning - even a Linux box with a four fast-ish
NCQ-capable disks in RAID 10 on a modern AHCI SATA controller would
probably help. It'd give you room to use tools like `ionice', for one
thing, plus you'd be using the `md' or `dmraid' drivers for software
RAID and would be able to tweak their behaviour significantly more than
you're likely to be able to with your RAID system under Windows. Then
again, I have lots more experience with Linux in a server role, so I'm
naturally inclined toward using it in preference to Windows. PostgreSQL
is much better tested, much more mature, and has much more community
support on UNIX/Linux platforms though, which is not a trivial factor to
consider either.

> I didn't even know you could do that! I can do this on a system wide
> basis for all of my read only queries so I shall see if it makes a
> difference. I'll check the locking issues but I was under the impression
> that postgres was excellent for this?

It generally is. You still need to think about how your concurrent
queries interact, though, as there are some operations that really must
be ordered. For example, if a query tries to UPDATE a record that a
concurrent query has already UPDATEd PostgreSQL will make the second
query block until the first one commits or rolls back. Doing otherwise
would cause exciting issues if the second query's UPDATE was in any way
dependent on the results of the first.

If you're only INSERTing and SELECTing then there's not much that'll
make queries interfere with each other with locks, unless you have
trigger-maintained materialized views, summary tables or the like
issuing UPDATEs behind the scenes.

> One of the reqular, smaller
> queries does however use the same table so I shall check if this is
> having a major impact. - If I set them both to read-only then that might
> have the desired impact?

I don't think it'll change anything much, but it might tell you (by
causing a query to fail) if it's trying to write anything. This might
help you detect a point at which the queries are interacting that you
might not have expected, such as an UPDATE being issued within a trigger
or by a rule.

> On another point, I tried setting up a scheduled query to force the
> tables into cache and this had some strange effects... As I mentioned in
> an earlier post I have multiple databases running on the same server so
> I ran a select queries for all of them. This speeded up the queries as
> expected with the cached data. However, two of the databases seemingly
> refused to speed up - They always seemed to take 30+ seconds (again
> eating up the machine IO resource). Even if I ran the query-as-a-job on
> only one of these databases, it didn't seem to speed up. Perhaps there
> is something wrong with these databases? The explain analyse seems to
> come back with identical plans on these. Any ideas? (p.s. I am running
> autovacuum)

I can't really help you on this one. I'd start by comparing table sizes
across the various databases, but beyond that ...

--
Craig Ringer

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Ravi Chemudugunta 2008-11-19 01:38:04 Code Organisation
Previous Message Erik Jones 2008-11-18 23:38:34 Re: No serial type