From: | "Daniel Verite" <daniel(at)manitou-mail(dot)org> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: INSERT ... ON CONFLICT DO UPDATE |
Date: | 2015-07-19 18:08:45 |
Message-ID: | 928925e4-d55b-4144-b994-5004e4b5a0e3@mm |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Rafal Pietrak wrote:
> CREATE TABLE vouchers (a_voucher bigint PRIMARY KEY default
> (random()*1000000000)::bigint, issued date default now(), .....);
Generators of truly unique pseudo-random values provide a
better ground for this. Consider for example:
https://wiki.postgresql.org/wiki/Pseudo_encrypt
You may replace the round function with your own secret function,
so you'll get the required randomness, secrecy and uniqueness.
No need to deal with collisions on insertion as there are none.
> 2. with current (as of 9.5) implementation I think I can always "ON
> CONFLICT DO NOTHING", and retry the INSERT from application level.
Yes, but retrying is now easy, let's not underappreciate that.
As a test, with 9.5alpha1, I create a table with 100k unique
random numbers:
CREATE TABLE vouchers(id int primary key);
Then try to populate it immediately with 100k rows:
INSERT INTO vouchers
select (random()*1000000000)::int from
generate_series(1,100000)
ON CONFLICT DO NOTHING;
psql result: INSERT 0 99995
Note how 5 values conflicted right from the beginning,
even though we're claiming only 10^5 out of the 10^9 output
range (or 0.01%).
The probability of at least one collision is pretty high,
see the "birthday paradox" for the theory on that.
Anyway the collisions got eliminated without any effort
from me and that's quite useful already.
Now trying to insert 10k rows at a time:
INSERT INTO vouchers
SELECT (random()*1000000000)::int
FROM generate_series(1,10000)
ON CONFLICT DO NOTHING
RETURNING id;
when run repeatedly, it tends to return between 9995 and 10000 values.
If we want exactly N rows and we get back N-epsilon, then we need to
re-ask for epsilon rows, but this will converge fast to completion.
(that is, until you have enough values that the birthday paradox effect
really kicks in).
My point is that we can now achieve that without any exception handling
or transaction retry, and no plpgsql function to create, so it's really
a significant improvement in ease of use. And presumably in
performance too.
Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @ManitouMail
From | Date | Subject | |
---|---|---|---|
Next Message | Daniel Verite | 2015-07-19 19:03:07 | Re: INSERT ... ON CONFLICT DO UPDATE |
Previous Message | Amitabh Kant | 2015-07-19 17:35:07 | Upgrade postgres cluster on FreeBSD using pg_upgrade |