From: | Aaron Turner <synfinatic(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, "Matthew T(dot) O'Connor" <matthew(at)zeut(dot)net>, pgsql-performance(at)postgresql(dot)org |
Subject: | Re: 10+hrs vs 15min because of just one index |
Date: | 2006-02-12 19:33:57 |
Message-ID: | 1ca1c1410602121133y279ea488nc38f049ac6ebc7f4@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On 2/12/06, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Aaron Turner <synfinatic(at)gmail(dot)com> writes:
> > Well before I go about re-architecting things, it would be good to
> > have a strong understanding of just what is going on. Obviously, the
> > unique index on the char(48) is the killer. What I don't know is:
>
> You have another unique index on the integer primary key, so it's not
> the mere fact of a unique index that's hurting you.
Understood. I just wasn't sure if in general unique indexes are some
how more expensive then non-unique indexes.
> > 1) Is this because the column is so long?
>
> Possibly. Allowing for 12 bytes index-entry overhead, the char keys
> would be 60 bytes vs 16 for the integer column, so this index is
> physically almost 4x larger than the other. You might say "but that
> should only cause 4x more I/O" but it's not necessarily so. What's
> hard to tell is whether you are running out of RAM disk cache space,
> resulting in re-reads of pages that could have stayed in memory when
> dealing with one-fifth as much index data. You did not show us the
> iostat numbers for the two cases, but it'd be interesting to look at
> the proportion of writes to reads on the data drive in both cases.
Sounds a lot like what Marc mentioned.
> > 2) Is this because PG is not optimized for char(48) (maybe it wants
> > powers of 2? or doesn't like even numbers... I don't know, just
> > throwing it out there)
>
> Are the key values really all 48 chars long? If not, you made a
> bad datatype choice: varchar(n) (or even text) would be a lot
> smarter. char(n) wastes space on blank-padding.
Yep, everything exactly 48. Looks like I'll be storing it as a bytea
in the near future though.
> The only one of these effects that looks to me like it could result in
> worse-than-linear degradation of I/O demand is maxing out the available
> RAM for disk cache. So while improving the datatype choice would
> probably be worth your while, you should first see if fooling with
> shared_buffers helps, and if not it's time to buy RAM not disk.
Yeah, that's what it's beginning to sound like. Thanks Tom.
--
Aaron Turner
http://synfin.net/
From | Date | Subject | |
---|---|---|---|
Next Message | Adnan DURSUN | 2006-02-12 20:25:28 | SQL Function Performance |
Previous Message | Aaron Turner | 2006-02-12 19:04:37 | Re: 10+hrs vs 15min because of just one index |