Re: random record from small set

From: Jan Poslusny <pajout(at)gingerall(dot)cz>
To: Jeff Davis <jdavis-pgsql(at)empires(dot)org>
Cc: PgSQL General List <pgsql-general(at)postgresql(dot)org>
Subject: Re: random record from small set
Date: 2005-02-15 16:15:06
Message-ID: 4212200A.1010606@gingerall.cz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

And what about another data representation like

create table r1 (
i int,
chance_from numeric,
chance_to numeric
)
, you can select one random row in one select, for instance
select * from r1 where chance_from <= $rnd and chance_to > $rnd;

I see these advantages
- Only one select.
- Indices can improve performance if r1 has many rows.

and disadvantage
- Tricky update

Jeff Davis wrote:

>I am trying to retrieve a random record (according to a chance
>attribute) from a small set of records, each with a "chance" attribute.
>This may eventually be somwhat of a performance concern, so I'd like to
>make sure I'm doing this right.
>
>Here's what I have so far:
>
>create table r1 (
> i int,
> chance numeric
>)
>create or replace function randrec() returns int as $$
> $res = spi_exec_query('select i,chance from r1');
> $r = rand;
> $accum = 0;
> $i = 0;
> while($accum < $r) {
> $accum += $res->{rows}[$i++]->{chance}
> }
> return $res->{rows}[$i-1]->{i};
>$$ language plperl;
>
>test=# select * from r1;
> i | chance
>---+--------
> 1 | 0.25
> 2 | 0.20
> 3 | 0.15
> 4 | 0.10
> 5 | 0.30
>
>
>That seems to work, in that out of 10k times, I got the following
>numbers of each:
>1 2479
>2 1959
>3 1522
>4 950
>5 3090
>
>But I have a few questions:
>* Am I right to use NUMERIC for the chance attribute?
>* Does perl's arithmetic leave me with the chance that those numeric
>values don't add up to 1.00 (and in this case that could mean an
>infinite loop)?
>* In my design I'll need a constraint trigger making sure that the
>numbers add up to 1.00. Will that be a performance problem for
>operations on the table that don't modify the chance attribute?
>* Is there a better way?
>* Does spi_exec_query pull the entire result set into memory at once? Is
>there a point at which performance could be a serious problem if there
>are a large number of items to select among?
>
>Regards,
> Jeff Davis
>
>
>
>---------------------------(end of broadcast)---------------------------
>TIP 2: you can get off all lists at once with the unregister command
> (send "unregister YourEmailAddressHere" to majordomo(at)postgresql(dot)org)
>
>
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Jan Poslusny 2005-02-15 16:25:02 Re: random record from small set
Previous Message Ruben Oliveira 2005-02-15 16:04:42 problem with dots in order by