Re: Implementing incremental client updates

From: Glyn Astill <glynastill(at)yahoo(dot)co(dot)uk>
To: Nii Gii <nkgm(dot)realm(at)gmail(dot)com>, "pgsql-sql(at)postgresql(dot)org" <pgsql-sql(at)postgresql(dot)org>
Subject: Re: Implementing incremental client updates
Date: 2015-02-10 13:05:50
Message-ID: 521810732.2592114.1423573550866.JavaMail.yahoo@mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

> From: Nii Gii <nkgm(dot)realm(at)gmail(dot)com>
>To: pgsql-sql(at)postgresql(dot)org
>Sent: Tuesday, 10 February 2015, 11:39
>Subject: [SQL] Implementing incremental client updates
>
>
>
>Dear all,
>
>
>I am a newcomer to postgres and love it so far. I've given this problem a lot of thought already, RTFM to the best of my ability, but hit a dead end, so I need a nudge in the right direction.
>
>
>I'm designing a database where each entity of interest has a "rowversion" column that gets assigned a value from a global sequence. So, in the simplest scenario, if I have two rows in table "emps": emp1 with rowversion(at)3 and emp2 with rowversion(at)5, then I know emp2 was modified after emp1.
>
>
>This is to form the foundation of a data sync scenario, where a client knows they have everything up until @3 and now they need the latest updates (SELECT * FROM emps WHERE rowversion>3 and rowversion<=new_anchor). The problem here is, there's no way to interrogate the database for the "latest committed @rowversion that has none pending before it". An example scenario:
>
>
>@3 - committed
>@4 - committed
>@5 - committed
>@6 - in progress - not committed yet
>@7 - in progress - not committed yet
>@8 - committed
>@9 - committed
>
>
>When client asks for updated records, we ask the database for an appropriate new_anchor. Since the rows with rowversion @6 and @7 are still in progress, new_anchor has to be @5, so that our range query doesn't miss any uncommitted updates. Now the client can be confident it has everything up until @5.
>
>
>So the actual problem distilled: how can this new_anchor be safely determined each time?
>
>
>As you can probably tell I've borrowed this idea from SQL Server, where this problem is trivially solved by the min_active_rowversion() function. This function would return @6 in the above scenario, so your new_anchor is always going to be "min_active_rowversion()-1". I sort of had an idea how this could be implemented in postgres using an "active_rowversions" table, and a "SELECT min(id) FROM active_rowversions" but that would require READ UNCOMMITTED isolation, which is not available in postgres.
>
>
>I would really appreciate any help or ideas.

>

I guess one way to tackle it would be to try and make the assignment of rowversions from the sequence transactional in some way. I'm not sure if that's possible without reinventing some sort of sequence behaviour with locking, but you might be able to achieve something workable by using a deferred trigger on the table to assign the sequence after the initial modification at transaction commit. There could be a race condition with this approach, but something like:

CREATE OR REPLACE FUNCTION update_rowversion()
RETURNS trigger AS $$
BEGIN
UPDATE emps SET rowversion = nextval('rowversion_seq') WHERE <some pk> = NEW.<some pk>;
RETURN NEW;
END;
$$ LANGUAGE plpgsql VOLATILE;

CREATE CONSTRAINT TRIGGER emps_insert_rowversion
AFTER INSERT ON emps
DEFERRABLE INITIALLY DEFERRED
FOR EACH ROW
EXECUTE PROCEDURE update_rowversion();

CREATE CONSTRAINT TRIGGER emps_update_rowversion
AFTER UPDATE ON emps
DEFERRABLE INITIALLY DEFERRED
FOR EACH ROW
WHEN (OLD.rowversion IS NOT DISTINCT FROM NEW.rowversion)
EXECUTE PROCEDURE update_rowversion();

Obviously the downsides are that you're now doing extra work; an extra update, and the overhead of maintaining the deferred-trigger action queue.


Another way to keep track of what is committed is to track the current snapshot, but there are complications there such as freezing of rows and xids wrapping around every 4 billion transcations.

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Tom Paynter 2015-02-11 10:32:13 Advisory locks
Previous Message Nii Gii 2015-02-10 11:39:28 Implementing incremental client updates