Re: Generating random unique alphanumeric IDs

From: Ivan Sergio Borgonovo <mail(at)webthatworks(dot)it>
To: Thom Brown <thombrown(at)gmail(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Generating random unique alphanumeric IDs
Date: 2009-08-20 16:03:49
Message-ID: 20090820180349.5ac146f5@dawn.webthatworks.it
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Thu, 20 Aug 2009 13:34:51 +0100
Thom Brown <thombrown(at)gmail(dot)com> wrote:

Correcting myself.
a) it is a bad idea to pad an hex with an hex... so I should still
find a quick way to change representation to [g-z] for the padding
characters... or just pad with a constant string.
select lpad(
to_hex(feistel_encrypt(10)),8 , 'mjkitlh')
);

b) this if from int (signed) to int (signed).

begin;
create or replace function feistel_encrypt(value int)
returns int as
$$
declare
l1 int;
l2 int;
r1 int;
r2 int;
i int:=0;
begin
l1:= (value >> 16) & 65535;
r1:= value & 65535;
while i<3 loop
l2:=r1;
r2:=l1#((((1366.0
*r1+150889)%714025)/714025.0)*32767)::int;
l1:=l2;
r1:=r2;
i:=i+1;
end loop;
return ((l1 << 16) | r1);
end;
$$ language plpgsql strict immutable;
create or replace function feistel_decrypt(value int)
returns int as
$$
declare
l1 int;
l2 int;
r1 int;
r2 int;
i int:=0;
begin
l2:= (value >> 16) & 65535;
r2:= value & 65535;
while i<3 loop
r1=l2;
l1:=r2#((((1366.0*l2+150889)%714025)/714025.0)*32767)::int;
l2:=l1;
r2:=r1;
i:=i+1;
end loop;
return ((l2 << 16) | r2);
end;
$$ language plpgsql strict immutable;
commit;

select * from feistel_decrypt(feistel_encrypt((2^31-1)::int))
union
select * from feistel_decrypt(feistel_encrypt((-2^31)::int))
union
select * from feistel_decrypt(feistel_encrypt((0)::int))
union
select * from feistel_decrypt(feistel_encrypt((-1)::int))
;

> This appears a lot more tricky than I had originally anticipated!
> I may be misunderstanding your example, but by alphanumeric, I
> mean beyond hex (i.e. a-z and possibly uppcase too).

me too... but to_hex was there and a quick trick to shorten the
string and get rid of a sign.

> I've looked into LFSR, but I'm afraid it goes over my head. But

There is too much dust on my copy of "Concrete Mathematics" still by
popular culture (read wikipedia) it is said that LFSR are not
cryptographically safe, while making 4 loops and choosing a suitable
F, Feistel cypher is.

Then it is just a problem of "shrinking the string" or
representing it in another base... and that may result in some
"waste".
5 bits are 32 char... you actually have more chars available even
just considering a subset of ASCII.

Picking 5 bits from LFSR algo isn't that different than converting
to hex feistel cipher as I see it. The main advantage of hex over
ASCII is that ints map very well to hex (no waste) and that to_hex
has good chance to perform better than any plpgsql function.

Since I'm generating "gift codes" It wouldn't look nice if I present
the user with

A

as a gift code...

And that's going to happen as soon as I'll have generated 232798491
gift codes. (/me wondering which is the smaller number with a
corresponding one digit hex(fiestel()) transform.)[1].
So just to make gift codes look nicer I thought about padding them
with some furter random noise... but the way initially described is
not going to work. Variants could be to concat with something
[^a-f0-9] (eg '-') and then padding with hex random noise

A -> -A -> (random noise)-A

I don't know if it is worth since it is another round of lpad.

Even if I'm currently not overly concerned by performances I'm
working with plpgsql and while I think that writing something to
change base representation to an int can be done... it will be slow
and ugly.
If I was working with pgperl (?) I'd just google for some perl
receipt.
Given the premises I'll just embellish the hex with some padding.
But if you really need to use letters and be compact and such... I
think you're just looking for changing the base of your
wathever-pseudo-random algorithm.

That's a common problem you may just have to adapt to plpgsql.

[1]
select s.i, feistel_decrypt(s.i) from generate_series(0,16) as s(i)
order by feistel_decrypt(s.i)
did it in a hurry... didn't check

--
Ivan Sergio Borgonovo
http://www.webthatworks.it

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Martijn van Oosterhout 2009-08-20 16:07:28 Re: Temp table or normal table for performance?
Previous Message Greg Stark 2009-08-20 15:56:38 Re: unique index for periods