From: | Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com> |
---|---|
To: | Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com> |
Cc: | Merlin Moncure <mmoncure(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org |
Subject: | Re: REINDEX takes half a day (and still not complete!) |
Date: | 2011-04-18 05:19:03 |
Message-ID: | BANLkTim4e8PQKHofUSOcQHsPsLgcPmnEVw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Btw, hardware is not an issue. My db has been working fine for a
while. Smaller poorer systems around the web run InnoDB databases. I
wouldn't touch that with a barge pole.
I have a hardware RAID controller, not "fake". It's a good quality
battery-backed 3Ware:
http://192.19.193.26/products/serial_ata2-9000.asp
On Mon, Apr 18, 2011 at 1:14 PM, Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com> wrote:
> Thanks for these suggestions.
>
> I am beginning to wonder if the issue is deeper.
>
> I set autovacuum to off, then turned off all the connections to the
> database, and did a manual vacuum just to see how long it takes.
>
> This was last night my time. I woke up this morning and it has still
> not finished.
>
> The maintenance_men given to the DB for this process was 2GB.
>
> There is nothing else going on on the server! Now, even REINDEX is
> just failing in the middle:
>
>
> # REINDEX INDEX new_idx_userid;
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
>
>
> What else could be wrong?
>
>
>
>
> On Mon, Apr 18, 2011 at 2:38 AM, Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com> wrote:
>> On Sun, Apr 17, 2011 at 10:59 AM, Phoenix <phoenix(dot)kiula(at)gmail(dot)com> wrote:
>>> TOP does not show much beyond "postmaster". How should I use TOP and
>>> what info can I give you? This is what it looks like:
>>
>> We're basically looking to see if the postmaster process doing the
>> vacuuming or reindexing is stuck in a D state, which means it's
>> waiting on IO.
>> hot the c key while it's running and you should get a little more info
>> on which processes are what.
>>
>>> 4799 postgres 15 0 532m 94m 93m D 0.7 1.2 0:00.14
>>> postmaster
>>
>> That is likely the postmaster that is waiting on IO.
>>
>>> VMSTAT 10 shows this:
>>>
>>> r b swpd free buff cache si so bi bo in cs us sy id wa
>>> 3 14 99552 17900 41108 7201712 0 0 42 11 0 0 8 34 41 16
>>> 2 17 99552 16468 41628 7203012 0 0 1326 84 1437 154810 7 66 12 15
>>> 3 7 99476 16796 41056 7198976 0 0 1398 96 1453 156211 7 66 21 6
>>> 3 17 99476 17228 39132 7177240 0 0 1325 68 1529 156111 8 65 16 11
>>
>> So, we're at 11 to 15% io wait. I'm gonna guess you have 8 cores /
>> threads in your CPUs, and 1/8th ot 100% is 12% so looks like you're
>> probably IO bound here. iostat tells us more:
>>
>>> The results of "iostat -xd 10" is:
>>> Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s
>>> avgrq-sz avgqu-sz await svctm %util
>>> sda 0.00 7.41 0.30 3.50 2.40 87.29 1.20 43.64
>>> 23.58 0.13 32.92 10.03 3.81
>>> sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
>>> 0.00 0.00 0.00 0.00 0.00
>>> sdc 0.00 18.32 158.26 4.10 2519.32 180.98 1259.66
>>> 90.49 16.63 13.04 79.91 6.17 100.11
>>
>> 100% IO utilization, so yea, it's likely that your sdc drive is your
>> bottleneck. Given our little data is actually moving through the sdc
>> drive, it's not very fast.
>>
>>> Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s rkB/s wkB/s
>>
>>> 8GB memory in total. 1GB devoted to PGSQL during these operations.
>>> Otherwise, my settings are as follows (and yes I did make the vacuum
>>> settings more aggressive based on your email, which has had no
>>> apparent impact) --
>>
>> Yeah, as it gets more aggressive it can use more of your IO bandwidth.
>> Since you
>>
>>> What else can I share?
>>
>> That's a lot of help. I'm assuming you're running software or
>> motherboard fake-raid on this RAID-1 set? I'd suggest buying a $500
>> or so battery backed caching RAID controller first, the improvements
>> in performance are huge with such a card. You might wanna try testing
>> the current RAID-1 set with bonnie++ to get an idea of how fast it is.
>>
>
From | Date | Subject | |
---|---|---|---|
Next Message | Claudio Freire | 2011-04-18 06:39:58 | Re: REINDEX takes half a day (and still not complete!) |
Previous Message | Phoenix Kiula | 2011-04-18 05:14:34 | Re: REINDEX takes half a day (and still not complete!) |