Re: Frequent Update Project: Design Overview ofHOTUpdates

From: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
To: "Robert Treat" <xzilla(at)users(dot)sourceforge(dot)net>
Cc: <pgsql-hackers(at)postgresql(dot)org>, "Zeugswetter Andreas ADI SD" <ZeugswetterA(at)spardat(dot)at>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: Frequent Update Project: Design Overview ofHOTUpdates
Date: 2006-11-12 21:23:01
Message-ID: 1163366582.3634.1114.camel@silverbirch.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, 2006-11-12 at 13:01 -0500, Robert Treat wrote:
> On Friday 10 November 2006 08:53, Simon Riggs wrote:
> > On Fri, 2006-11-10 at 12:32 +0100, Zeugswetter Andreas ADI SD wrote:
> > > 4. although at first it might seem so I see no advantage for vacuum with
> > > overflow
> >
> > No need to VACUUM the indexes, which is the most expensive part. The
> > more indexes you have, the more VACUUM costs, not so with HOT.
> >
>
> This isn't exactly true though right?

The above statement is completely true; please don't say I aim to
mislead. I've been clear about the pre-conditions for the optimization.
This is a straight-up attempt to improve some important use cases.

> Since the more indexes you have, the
> more likely it is that your updating an indexed column, which means HOT isn't
> going to work for you.

Well its not a chance thing is it? It's clear that the pre-conditions
could in some circumstances be an annoyance, but that in itself isn't an
argument against it. I'm especially keen to hear of an optimisation that
would work in all cases for heavy updates. (It was I that originally
suggested the fillfactor approach to optimising UPDATEs, but regret that
although it applies no matter how many indexes you have its not very
effective and even that reduces after the first batch of UPDATEs have
happened).

> One common use case that seems problematic is the
> indexed, frequently updated timestamp field.

Not sure of the use case for that? I understand using a timestamp field
for optimistic locking; why would you index that rather than the PK?

Locating things via coordinates was a use-case that would be non-HOT,
are you thinking of something similar? It's important to understand
which types of things HOT would optimize/not.

HOT probably would change the way you design if you need such a thing.
Rather than indexing the co-ordinate you'd end up binning the values so
the index value would change less often, so most would be HOT with a few
non-HOT UPDATEs. Maybe the same would be true with the timestamp, I'm
not sure.

--
Simon Riggs
EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Chris Mair 2006-11-12 21:31:17 Re: error compiling 8.2 in debian sarge
Previous Message Oliver Jowett 2006-11-12 21:22:42 Re: [HACKERS] server process (PID 1188) exited with exit code