From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | "Gavin M(dot) Roy" <gmr(at)justsportsusa(dot)com> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: ORDER BY random() LIMIT 1 slowness |
Date: | 2002-12-17 05:44:13 |
Message-ID: | 4144.1040103853@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
"Gavin M. Roy" <gmr(at)justsportsusa(dot)com> writes:
> SELECT * FROM poetry ORDER BY random() LIMIT 1;
> [ is slow for 35000 rows ]
Yeah. Basically this query is implemented as
(a) select all 35000 rows of "poetry";
(b) compute a random() value for each row;
(c) sort by the random() values;
(d) take the first row, discard the rest.
The good news: this gives you a pretty-durn-random selection. The bad
news: you didn't really care about choosing a random ordering of the
other 34999 rows, but it computed one anyway.
This problem's been discussed before, but I've not seen any really
ideal answer. SQL is not designed to offer unpredictable results ;-)
If you have a reasonably-uniformly-distributed index key on the rows,
you can try something like
select * from poetry where key > (scale * random() + min)
order by key limit 1;
(where scale and min are chosen to make the result cover the range of
index keys). Given an index on "key" this should pick a result quickly
via an indexscan+limit plan. However, if the distribution of keys isn't
real uniform then you won't get real random results --- keys just above
larger gaps will be selected with unfairly high probability. You also
have to know the min and max keys accurately.
I recall some folks have suggested generating an actual random column
(ie, something initialized with "default random()" and suitably indexed)
but this seems like a lot of overhead ... you'd have to be mighty
concerned with the exactness of your random selection to put up with
that, I think.
There are some other suggestions in the archives, IIRC.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Martijn van Oosterhout | 2002-12-17 05:57:44 | Re: How to access deleted but not vacuum'ed tuples? |
Previous Message | Lamar Owen | 2002-12-17 05:33:20 | Re: Upgrading 6.5.1 |