Re: 10+hrs vs 15min because of just one index

From: Aaron Turner <synfinatic(at)gmail(dot)com>
To: Marc Morin <marc(at)sandvine(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: 10+hrs vs 15min because of just one index
Date: 2006-02-12 19:04:37
Message-ID: 1ca1c1410602121104p2c231d77u883871d7e2bddbfa@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 2/12/06, Marc Morin <marc(at)sandvine(dot)com> wrote:
> From your config, a check point will be forced when
>
> (checkpoint_segments * 16 M) < rows * (8K/N*h + (1-h)*8K) * B
>
> Where h is the "hitrate" or correlation between the update scan and the
> index. Do you have a sense of what this is?

I know my checkpoints happen > 30 secs apart, since PG isn't
complaining in my log. I have no clue what the correlation is.

> In the limits, we have 100%
> correlation or 0% correlation. N is the lower cost of putting the
> change in the WAL entry, not sure what this is, but small, I am
> assuming, say N=100. B is the average number of blocks changed per
> updated row (assume B=1.1 for your case, heap,serial index have very
> high correlation)
>
> In the 0% correlation case, each updated row will cause the index update
> to read/modify the block. The modified block will be entirely written to
> the WAL log. After (30 * 16M) / (8K) / 1.1 ~ 55k rows, a checkpoint
> will be forced and all modified blocks in shared buffers will be written
> out.
>
> Increasing checkpoint_segments to 300 and seeing if that makes a
> difference. If so, the excessive WAL checkpoints are your issue. If
> performance is exactly the same, then I would assume that you have close
> to 0% correlation between the rows in the heap and index.

Ok, i'll have to give that a try.

> Can you increase shared_buffers? With a low correlation index, the only
> solution is to hold the working set of blocks in memory. Also, make
> sure that the checkpoint segments are big enough for you to modify them
> in place, don't want checkpoints occurring....

I'll have to look at my memory usage on this server... with only 2GB
and a bunch of other processes running around I'm not sure if I can go
up much more without causing swapping. Of course RAM is cheap...

> Note that the more updates you do, the larger the tables/index become
> and the worse the problem becomes. Vacuuming the table is an "answer"
> but unfortunately, it tends to decrease correlation from our
> observations. :-(

Good to know.

> From our observations, dropping index and rebuilding them is not always
> practical, depends on your application; table will be exclusively locked
> during the transaction due to drop index.

Yep. In my case it's not a huge problem right now, but I know it will
become a serious one sooner or later.

Thanks a lot Marc. Lots of useful info.

--
Aaron Turner
http://synfin.net/

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Aaron Turner 2006-02-12 19:33:57 Re: 10+hrs vs 15min because of just one index
Previous Message Marc Morin 2006-02-12 17:37:13 Re: 10+hrs vs 15min because of just one index