From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | rj(at)last(dot)fm |
Cc: | PgSQL Performance ML <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Selecting random rows efficiently |
Date: | 2003-08-30 14:47:42 |
Message-ID: | 23400.1062254862@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers pgsql-performance |
Richard Jones <rj(at)last(dot)fm> writes:
>>> i have a table of around 3 million rows from which i regularly (twice a
>>> second at the moment) need to select a random row from
> i was hoping there was some trickery with sequences that would allow me to
> easily pick a random valid sequence number..?
There is no magic bullet here, but if you expect that requirement to
persist then it is worth your trouble to expend effort on a real
solution. A real solution in my mind would look like
1. Add a column "random_id float8 default random()". The idea here
is that you assign a random ID to each row as it is created.
2. Add an index on the above column.
3. Your query now looks like
SELECT * FROM table WHERE random_id >= random()
ORDER BY random_id LIMIT 1;
This gives you a plan on the order of
Limit (cost=0.00..0.17 rows=1 width=8)
-> Index Scan using fooi on foo (cost=0.00..57.00 rows=334 width=8)
Filter: (random_id >= random())
which is fast and gives a genuinely random result row. At least up
until you have enough rows that there start being duplicate random_ids,
which AFAIK would be 2 billion rows with a decent random()
implementation. If you're concerned about that, you could periodically
re-randomize with
UPDATE table SET random_id = random();
so that any rows that were "hidden" because they had a duplicate
random_id have another shot at being choosable. But with only a few mil
rows I don't think you need to worry.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Bruce Momjian | 2003-08-30 15:02:50 | Re: Linux2.6 overcommit behaviour |
Previous Message | Richard Jones | 2003-08-30 14:25:51 | Re: Selecting random rows efficiently |
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2003-08-30 15:02:01 | Re: How to force Nested Loop plan? |
Previous Message | Tom Lane | 2003-08-30 14:35:52 | Re: bad estimates |