From: | Alban Hertroys <alban(at)magproductions(dot)nl> |
---|---|
To: | Alban Hertroys <alban(at)magproductions(dot)nl> |
Cc: | Martijn van Oosterhout <kleptog(at)svana(dot)org>, Postgres General <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Obtaining random rows from a result set |
Date: | 2007-09-04 15:41:51 |
Message-ID: | 46DD7CBF.7040609@magproductions.nl |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
To follow up on my own post, I came up with a workable solution based on
scrolling cursors. The SP approach didn't work out for me, I didn't
manage to declare a cursor in PL/pgSQL that could be positioned
absolutely (maybe that's due to us still using PG 8.1.something?).
A solution to that would be appreciated.
Anyway, I solved the problem in our application (PHP). I even got a
workable solution to prevent returning the same record more than once.
Here goes:
function randomSet($query, $limit, $uniqueColumn) {
// queries; depends on your DB connector
DECLARE _cur SCROLL CURSOR WITHOUT HOLD FOR $query;
MOVE FORWARD ALL IN _cur;
//GET DIAGNOSTICS _count := ROW_COUNT;
$count = pg_affected_rows();
$uniques = array();
$resultSet = array();
while ($limit > 0 && count($uniques) < $count) {
$idx = random(1, $count);
//query
$record = FETCH ABSOLUTE $idx FROM _cur;
// Skip records with a column value we want to be unique
if (in_array($record[$uniqueColumn], $uniques)
continue;
$uniques[] = $record[$uniqueColumn];
$resultSet[] = $record;
$limit--;
}
// query
CLOSE _cur;
return $resultSet;
}
I hope this is useful to anyone. It worked for us; it is definitely
faster than order by random(), and more random than precalculated column
values. Plus it translates directly to what we are requesting :)
Alban Hertroys wrote:
> I thought of another solution (with only a few calculations of random())
> that can be deployed in existing versions of PG, using a set-returning
> function with a scrolling cursor that accepts the query string as input
> like this (in pseudoish-code):
>
> ----
> create function random(text _query, integer _limit)
> returns set
> volatile
> as $$
> DECLARE
> _cur cursor;
> _cnt bigint;
> _idx integer;
> _rowpos bigint;
>
> _rec record;
> BEGIN
> open _cur for execute query;
> fetch forward all into _rec;
> -- select total nr of records into _cnt
>
> for _idx in 1.._limit loop
> _rowpos := random() * _cnt;
>
> fetch absolute _rowpos into _rec;
> return next _rec;
> end loop;
>
> return;
> END;
> $$
> language 'plpgsql';
> ----
--
Alban Hertroys
alban(at)magproductions(dot)nl
magproductions b.v.
T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
7500 AK Enschede
// Integrate Your World //
From | Date | Subject | |
---|---|---|---|
Next Message | Alban Hertroys | 2007-09-04 15:44:52 | Re: Obtaining random rows from a result set |
Previous Message | Nik | 2007-09-04 15:38:59 | Partition Reindexing |