From: | Jan Poslusny <pajout(at)gingerall(dot)cz> |
---|---|
To: | Jeff Davis <jdavis-pgsql(at)empires(dot)org> |
Cc: | PgSQL General List <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: random record from small set |
Date: | 2005-02-15 16:15:06 |
Message-ID: | 4212200A.1010606@gingerall.cz |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
And what about another data representation like
create table r1 (
i int,
chance_from numeric,
chance_to numeric
)
, you can select one random row in one select, for instance
select * from r1 where chance_from <= $rnd and chance_to > $rnd;
I see these advantages
- Only one select.
- Indices can improve performance if r1 has many rows.
and disadvantage
- Tricky update
Jeff Davis wrote:
>I am trying to retrieve a random record (according to a chance
>attribute) from a small set of records, each with a "chance" attribute.
>This may eventually be somwhat of a performance concern, so I'd like to
>make sure I'm doing this right.
>
>Here's what I have so far:
>
>create table r1 (
> i int,
> chance numeric
>)
>create or replace function randrec() returns int as $$
> $res = spi_exec_query('select i,chance from r1');
> $r = rand;
> $accum = 0;
> $i = 0;
> while($accum < $r) {
> $accum += $res->{rows}[$i++]->{chance}
> }
> return $res->{rows}[$i-1]->{i};
>$$ language plperl;
>
>test=# select * from r1;
> i | chance
>---+--------
> 1 | 0.25
> 2 | 0.20
> 3 | 0.15
> 4 | 0.10
> 5 | 0.30
>
>
>That seems to work, in that out of 10k times, I got the following
>numbers of each:
>1 2479
>2 1959
>3 1522
>4 950
>5 3090
>
>But I have a few questions:
>* Am I right to use NUMERIC for the chance attribute?
>* Does perl's arithmetic leave me with the chance that those numeric
>values don't add up to 1.00 (and in this case that could mean an
>infinite loop)?
>* In my design I'll need a constraint trigger making sure that the
>numbers add up to 1.00. Will that be a performance problem for
>operations on the table that don't modify the chance attribute?
>* Is there a better way?
>* Does spi_exec_query pull the entire result set into memory at once? Is
>there a point at which performance could be a serious problem if there
>are a large number of items to select among?
>
>Regards,
> Jeff Davis
>
>
>
>---------------------------(end of broadcast)---------------------------
>TIP 2: you can get off all lists at once with the unregister command
> (send "unregister YourEmailAddressHere" to majordomo(at)postgresql(dot)org)
>
>
>
From | Date | Subject | |
---|---|---|---|
Next Message | Jan Poslusny | 2005-02-15 16:25:02 | Re: random record from small set |
Previous Message | Ruben Oliveira | 2005-02-15 16:04:42 | problem with dots in order by |