Quick Links

Re: logical changeset generation v6.4

From:	Andres Freund <andres(at)2ndquadrant(dot)com>
To:	Hannu Krosing <hannu(at)2ndQuadrant(dot)com>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: logical changeset generation v6.4
Date:	2013-10-21 15:06:02
Message-ID:	20131021150602.GD2968@awork2.anarazel.de
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On 2013-10-21 16:40:43 +0200, Hannu Krosing wrote:
> On 10/18/2013 08:50 PM, Andres Freund wrote:
> > On 2013-10-18 08:11:29 -0400, Robert Haas wrote:
> ...
> >> 2. If that seems too complicated, how about just logging the whole old
> >> tuple for version 1?
> > I think that'd make the patch much less useful because it bloats WAL
> > unnecessarily for the primary user (replication) of it. I'd rather go
> > for primary keys only if that proves to be the contentious point.
> >
> > How about modifying the selection to go from:
> > * all rows if ALTER TABLE ... REPLICA IDENTITY NOTHING|FULL;
> > * index chosen by ALTER TABLE ... REPLICA IDENTITY USING indexname
> > * [later, maybe] ALTER TABLE ... REPLICA IDENTITY (cola, colb)
> > * primary key
> > * candidate key with the smallest oid
> >
> > Including the candidate key will help people using changeset extration
> > for auditing that do not have primary key. That really isn't an
> > infrequent usecase.

> As I understand it for a table with *no* unique index,
> the "candidate key" is the full tuple, so if we get an UPDATE for
> it then this should be replicated as
> "UPDATE first row matching (NOT DISTINCT FROM) all columns"
> which on replay side will be equivalent to
> CREATE CURSOR ...; FETCH 1 ...; UPDATE ... WHERE CURRENT...'

No, it's not a candidate key since it's not uniquely identifying a
row. You can play tricks as you describe, but that still doesn't make
the whole row a candidate key.

But anyway, I suggest allowing for logging all columns above...

> I know that this will slow down replication, as you can not use direct
> index updates internally - at least not easily - but need to let postgreSQL
> actually plan this, but such single row update is no faster on origin
> either.

That's not actually true. Consider somebody doing something like:
UPDATE big_table_without_indexes SET column = ...;
On the source side that's essentialy O(n). If you replicate on a
row-by-row basis it will be O(n^2) on the replay side.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Re: logical changeset generation v6.4 at 2013-10-21 14:40:43 from Hannu Krosing

Responses

Re: logical changeset generation v6.4 at 2013-10-21 15:26:47 from Hannu Krosing

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Tom Lane	2013-10-21 15:10:09	Re: Commitfest II CLosed
Previous Message	Hannu Krosing	2013-10-21 14:45:01	Re: Commitfest II CLosed