Re: [PATCH 10/16] Introduce the concept that wal has a 'origin' node

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Daniel Farina <daniel(at)heroku(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: [PATCH 10/16] Introduce the concept that wal has a 'origin' node
Date: 2012-06-20 00:40:45
Message-ID: CA+TgmoYFiDM5vGhzcd0ou-Q-X6pxta5JEm_CKXUeYfDKoNrZDA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jun 19, 2012 at 6:14 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> I definitely agree that low-level apply is possible as a module. Sure change
> extraction needs core support but I was talking about what you need to
> implement it reusing the "plain" logical support...
>
> What I do not understand is how you want to prevent loops in a simple manner
> without in core support:
>
> A generates a HEAP_INSERT record. Gets decoded into the lcr stream as an
> INSERT action.
> B reads the lcr stream from A and applies the changes. A new HEAP_INSERT
> record. Gets decoded into the lcr stream as an INSERT action.
> A reads the lcr stream from B and ???
>
> At this point you need to prevent a loop. If you have the information where a
> change originally happened (xl_origin_id = A in this case) you can have a
> simple filter on A which ignores change records if (lcr_origin_id ==
> local_replication_origin_id).

See my email to Chris Browne, which I think covers this. It needs a
bit in WAL (per txn, or, heck, if it's one bit, maybe per record) but
not a whole node ID.
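
To make the loop in the quoted scenario concrete, here is a minimal
sketch in Python. It is a toy model, not PostgreSQL code: Change, Node,
apply_from and replicate are hypothetical names, and each change carries
a full origin value as in Andres's filter rather than the single bit
suggested above. It only shows that, without some origin information,
two nodes keep re-applying each other's changes, and that dropping
changes which started locally stops that.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Change:
    origin: str   # node where the change first happened (hypothetical field)
    payload: str


@dataclass
class Node:
    node_id: str
    wal: list = field(default_factory=list)
    filter_own: bool = True

    def local_insert(self, payload):
        # A change made directly on this node originates here.
        self.wal.append(Change(self.node_id, payload))

    def apply_from(self, peer, start):
        """Apply peer's changes from index `start`; return the new read position."""
        for change in peer.wal[start:]:
            if self.filter_own and change.origin == self.node_id:
                continue  # this change started here; re-applying it would loop
            self.wal.append(change)  # applying re-logs the change, origin preserved
        return len(peer.wal)


def replicate(filter_own, rounds=5):
    """Run `rounds` exchanges between A and B; return the two WAL lengths."""
    a = Node("A", filter_own=filter_own)
    b = Node("B", filter_own=filter_own)
    a.local_insert("row 1")
    a_pos = b_pos = 0  # how far A has read B's stream, and B has read A's
    for _ in range(rounds):
        b_pos = b.apply_from(a, b_pos)
        a_pos = a.apply_from(b, a_pos)
    return len(a.wal), len(b.wal)


print(replicate(filter_own=True))   # each node holds the row once
print(replicate(filter_own=False))  # the same row keeps bouncing back and forth
```

With the filter on, both logs stay at one entry no matter how many
rounds run; with it off, each round adds another copy on both sides.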

>> You need a backend-local hash table inside the wal reader process, and
>> that hash table needs to map XIDs to node IDs.  And you occasionally
>> need to prune it, so that it doesn't eat too much memory.  None of
>> that sounds very hard.
> It's not very hard. It's just more complex than what I propose(d).

True, but not a whole lot more complex, and a moderate amount of
complexity to save bit-space is a good trade. Especially when Tom has
come down against eating up the bit space. And I agree with him. If
we've only got 16 bits of padding to work with, we should surely be
judicious in burning them when that can be avoided at the expense of a
few hundred lines of code.
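
As a rough idea of the bookkeeping involved, here is a minimal sketch
of such a table in Python. Everything in it is an assumption made for
illustration: the class and method names are hypothetical, how the WAL
reader learns that a transaction is remote is left out, the pruning
rule (drop entries older than some XID horizon) is just one possible
policy, and XID wraparound is ignored.

```python
class XidOriginMap:
    """Backend-local map from transaction ID to originating node ID."""

    def __init__(self, local_node_id):
        self.local_node_id = local_node_id
        self._origin_by_xid = {}

    def note_remote_xact(self, xid, node_id):
        # Called when the reader learns that `xid` was applied on behalf
        # of another node (the mechanism for learning this is assumed).
        self._origin_by_xid[xid] = node_id

    def origin_of(self, xid):
        # Transactions with no entry are treated as locally originated.
        return self._origin_by_xid.get(xid, self.local_node_id)

    def xact_finished(self, xid):
        # Drop the entry once the transaction's commit or abort was seen.
        self._origin_by_xid.pop(xid, None)

    def prune(self, oldest_xid_of_interest):
        # Occasional cleanup so the table cannot grow without bound.
        # Simplification: plain integer comparison, no wraparound handling.
        stale = [x for x in self._origin_by_xid if x < oldest_xid_of_interest]
        for x in stale:
            del self._origin_by_xid[x]
        return len(stale)

    def __len__(self):
        return len(self._origin_by_xid)


origins = XidOriginMap(local_node_id="A")
origins.note_remote_xact(100, "B")
origins.note_remote_xact(105, "C")
print(origins.origin_of(100), origins.origin_of(101))  # remote vs. local xact
print(origins.prune(103), len(origins))                # entries dropped, entries left
```

The real thing would be a C hash table inside the reader process, but
the moving parts are the same: insert, look up, remove at transaction
end, and prune now and then.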

>> > Btw, what do you mean with "conflating" the stream? I don't really see
>> > that being proposed.
>> It seems to me that you are intent on using the WAL stream as the
>> logical change stream.  I think that's a bad design.  Instead, you
>> should extract changes from WAL and then ship them around in a format
>> that is specific to logical replication.
> No, I don't want that. I think we will need some different format once we have
> agreed how changeset extraction works.

I think you are saying that you agree with me that the formats should
be different, but that the LCR format is undecided as yet. If that is
in fact what you are saying, great. We'll need to decide that, of
course, but I think there is a lot of cool stuff that can be done that
way.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
