Re: BUG #13846: INSERT ON CONFLICT consumes sequencers on conflicts

From: Paul Hester <paul(at)salesintel(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #13846: INSERT ON CONFLICT consumes sequencers on conflicts
Date: 2016-01-06 05:04:41
Message-ID: CAOwSVr_9FARiegvC54VqZiTz9A3ryuaAbi8XHYha+y=NHmeQpw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

No peek-ahead necessary, when the sequenced column would not be used in the
conflict_target. UPSERTS to dimension tables in a star schema as part of an
OLAP system are very handy, but typically the records in the dimension
tables use a surrogate key based on sequences for maintenance & performance
reasons, and the 'natural' key/value is the only column that would be used
to detect a conflict. In this case, the 'normal' path would be the
conflict_action nearly all the time, and not the insert. If the typical
path was the INSERT, I'd bet the data would be from some event source, and
I guess in that case if there were a key column defaulted with a sequence,
like an order#, it would have to be used to resolve conflicts. But it's
hard to imagine a case in a process flow where you wouldn't already know if
the event existed or not, limiting the need for an UPSERT on event data.

This limitation, of consuming sequencers used for defaults on columns not
necessary to resolve conflict, diminishes the viability for using Postges
UPSERTS for large data warehouses, is all I'm saying (or requires surrogate
keys being 64 bit). Just caught me by surprise in comparison to other
RDBMSs offering some form of UPSERTs that would not consume a sequencer if
its values weren't required to resolve conflicts.

On Tue, Jan 5, 2016 at 12:27 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:

> On Tue, Jan 5, 2016 at 8:13 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> >> When using an INSERT statement with an ON CONFLICT clause, if there is a
> >> conflict, and the table being inserted into has a column defaulted to
> >> nextval('seq'), the sequencer 'seq' is always incremented. This can
> quickly
> >> and artificially consume all values of the sequencer; it behaves as if
> >> conflict detection happens after inserts are attempted, rather than
> before.
> >
> > Yes. That's by design. You can't reliably do conflict detection before
> > evaluating column default values.
>
> Right. If you didn't consume a sequence value, but just did a
> "peek-ahead", then several concurrently inserting sessions would all
> "peek-ahead" and see the same value. There'd then be a race condition
> that broke the useful guarantees that ON CONFLICT DO UPDATE makes.
>
> --
> Peter Geoghegan
>

--
*Paul Hester*
Co-founder & Chief Architect
www.salesintel.com

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Michael Paquier 2016-01-06 05:33:05 Re: Failure with regression test largeobject if pg_regress invoked from external paths
Previous Message Michael Paquier 2016-01-06 04:26:46 Re: Failure with regression test largeobject if pg_regress invoked from external paths