Re: Massive performance issues

From: Arjen van der Meijden <acmmailing(at)tweakers(dot)net>
To: Matthew Sackman <matthew(at)lshift(dot)net>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Massive performance issues
Date: 2005-09-01 20:54:45
Message-ID: 43176A95.70506@tweakers.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 1-9-2005 19:42, Matthew Sackman wrote:
> Obviously, to me, this is a problem, I need these queries to be under a
> second to complete. Is this unreasonable? What can I do to make this "go
> faster"? I've considered normalising the table but I can't work out
> whether the slowness is in dereferencing the pointers from the index
> into the table or in scanning the index in the first place. And
> normalising the table is going to cause much pain when inserting values
> and I'm not entirely sure if I see why normalising it should cause a
> massive performance improvement.

In this case, I think normalising will give a major decrease in on-disk
table-size of this large table and the indexes you have. If that's the
case, that in itself will speed-up all i/o-bound queries quite a bit.

locality_1, _2, city and county can probably be normalised away without
much problem, but going from varchar's to integers will probably safe
you quite a bit of (disk)space.

But since it won't change the selectivity of indexes, so you won't get
more index-scans instead of sequential scans, I suppose.
I think its not that hard to create a normalized set of tables from this
data-set (using insert into tablename select distinct ... from address
and such, insert into address_new (..., city) select ... (select cityid
from cities where city = address.city) from address)
So its at least relatively easy to figure out the performance
improvement from normalizing the dataset a bit.

If you want to improve your hardware, have a look at the Western Digital
Raptor-series SATA disks, they are fast scsi-like SATA drives. You may
also have a look at the amount of memory available, to allow caching
this (entire) table.

Best regards,

Arjen

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Ron 2005-09-01 21:05:33 Re: Massive performance issues
Previous Message Sebastian Hennebrueder 2005-09-01 20:39:13 Re: Massive performance issues