Re: vacuum, performance, and MVCC

From: Hannu Krosing <hannu(at)skype(dot)net>
To: "Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>
Cc: Mark Woodward <pgsql(at)mohawksoft(dot)com>, Christopher Browne <cbbrowne(at)acm(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: vacuum, performance, and MVCC
Date: 2006-06-22 15:38:19
Message-ID: 1150990699.4370.33.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Ühel kenal päeval, N, 2006-06-22 kell 10:20, kirjutas Jonah H. Harris:
> On 6/22/06, Alvaro Herrera <alvherre(at)commandprompt(dot)com> wrote:
> > > Hmm, OK, then the problem is more serious than I suspected.
> > > This means that every index on a row has to be updated on every
> > > transaction that modifies that row. Is that correct?
> >
> > Add an index entry, yes.
>
> Again, this is a case for update-in-place. No need to write an extra
> index entry and incur the WAL associated with it.

I guess that MySQL on its original storage does that, but they allow
only one concurrent update per table and no transactions.

> Imagine a table
> with 3 indexes on it... I would estimate that we perform at least 3 to
> 6 times more overhead than any commercial database on such an update.

One way to describe what "commercial databases" do to keep constant
update rates is saying that they do either vacuuming as part of
update, or they just use locks anf force some transactions to wait or
fail/retry.

Depending on exact details and optimisations done, this can be either
slower or faster than postgresql's way, but they still need to do
something to get transactional visibility rules implemented.

> > > There has to be a more linear way of handling this scenario.
> >
> > So vacuum the table often.
>
> It's easy to say VACUUM often... but I'd bet that vacuuming is going
> to lessen the throughput in his tests even more; no matter how it's
> tuned.

Running VACUUM often/continuously will likely keep his update rate
fluctuatons within a corridor of maybe 5-10%, at the cost of 1-2% extra
load. At least if vacuum is configured right and the server is not
already running at 100% IO saturation, in which case it will be worse.

The max throughput figure is not something you actually need very often
in production. What is interesting is setting up the server so that you
can service your loads comfortably. Running the server at 100% lead is
not anything you want to do on production server. There will be things
you need to do anyway and you need some headroom for that.

--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Chris Browne 2006-06-22 16:08:53 Re: vacuum, performance, and MVCC
Previous Message Tom Lane 2006-06-22 15:37:08 Re: [CORE] GPL Source and Copyright Questions