From: | Kaloyan Iliev <kaloyan(at)digsys(dot)bg> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Obtaining random rows from a result set |
Date: | 2007-08-31 13:34:48 |
Message-ID: | 46D818F8.9080006@digsys.bg |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Hi,
Why not generate a random number in your application and then:
SELECT *
FROM table_x
WHERE condition = true
OFFSET generated_random_number
LIMIT xx
Kaloyan Iliev
Alban Hertroys wrote:
>Hello,
>
>I've recently been busy improving a query that yields a fixed number of
>random records matching certain conditions. I have tried all the usual
>approaches, and although they do work, they're all limited in some way
>and don't translate really well to what you "want". They're kludges, IMHO.
>
>The methods I've tried are explained quite well on
>http://people.planetpostgresql.org/greg/index.php?/archives/40-Getting-random-rows-from-a-database-table.html
>
>All these methods involve calculating a random number for every record
>in the result set at some point in time, which is really not what I'm
>trying to model. I think the database should provide some means to get
>those records, so...
>
>Dear Santa,
>
>I'd like my database to have functionality analogue to how LIMIT works,
>but for other - non-sequential - algorithms.
>
>I was thinking along the lines of:
>
> SELECT *
> FROM table
> WHERE condition = true
> RANDOM 5;
>
>Which would (up to) return 5 random rows from the result set, just as
>LIMIT 5 returns (up to) the first 5 records in the result set.
>
>
>Or maybe even with a custom function, so that you could get non-linear
>distributions:
>
> SELECT *
> FROM table
> WHERE condition = true
> LIMIT 5 USING my_func();
>
> Where my_func() could be a user definable function accepting a number
>that should be (an estimate of?) the number of results being returned so
>that it can provide pointers to which rows in the resultset will be
>returned from the query.
>
>Examples:
>* random(maxrows) would return random rows from the resultset.
>* median() would return the rows in the middle of the result set (this
>would require ordering to be meaningful).
>
>What do people think, is this feasable? Desirable? Necessary?
>
>If I'd have time I'd volunteer for at least looking into this, but I'm
>working on three projects simultaneously already. Alas...
>
>Regards,
>Alban Hertroys.
>
>
>
From | Date | Subject | |
---|---|---|---|
Next Message | Phoenix Kiula | 2007-08-31 13:35:09 | Re: URGENT: Whole DB down ("no space left on device") |
Previous Message | Josh Tolley | 2007-08-31 13:33:36 | Re: Obtaining random rows from a result set |