Random Weighted Result Ordering

From: Eliot Gable <egable+pgsql-general(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Random Weighted Result Ordering
Date: 2010-06-07 02:51:46
Message-ID: AANLkTimF6IgK7EHhv01W_hljsUcbuNY2O8MjzoC31-b_@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I have a set of results that I am selecting from a set of tables which I
want to return in a random weighted order for each priority group returned.
Each row has a priority column and a weight column. I sort by the priority
column with 1 being highest priority. Then, for each distinct priority, I
want to do a weighted random ordering of all rows that have that same
priority. I select the set of rows and pass it to a custom-built function
that does the ordering. I have tested both the prioritize and the random
weighted ordering functions and they do exactly what I want them to do for
ordering the data that I send them.

The problem comes from the fact that I tried to make them generalized. They
take an array of a small complex type which holds just an arbitrary ID, the
priority, and the weight. The output is the same information but the rows
are in the correct order. I thought I could take that output and just INNER
JOIN it back to my original data rows from which I obtained the ID,
priority, and weight values. However, when I do that, the ordering no longer
seems to be preserved, even though there is no ORDER BY clause on the query
that joins the data back to the original rows. Basically, I thought that if
the original data was:

50, 1, 5, 'data1'
55, 1, 4, 'data2'
34, 2, 0, 'data3'
90, 2, 1, 'data4'
95, 2, 1, 'data5

And the input to the functions was:

50, 1, 5
55, 1, 4
34, 2, 0
90, 2, 1
95, 2, 1

And the prioritized and weighted order came back:

50, 1, 5
55, 1, 4
95, 2, 1
90, 2, 1
34, 2, 0

Then, if I INNER JOINED them like:

(
50, 1, 5
55, 1, 4
95, 2, 1
90, 2, 1
34, 2, 0
) AS randomized INNER JOIN (
50, 1, 5, 'data1'
55, 1, 4, 'data2'
34, 2, 0, 'data3'
90, 2, 1, 'data4'
95, 2, 1, 'data5
) AS data ON (
randomized.id = data.id
)

Then the rows would come back as:

50, 1, 5, 'data1'
55, 1, 4, 'data2'
95, 2, 1, 'data5'
90, 2, 1, 'data4'
34, 2, 0, 'data3

Unfortunately, that does not seem to be happening. Before I spend a ton of
time digging into this issue, I thought I would pose the questions here:

Does anyone know for certain why this would not work? Or, does anyone know
why this should work?

I assumed that the order of the joins would preserve the ordering of the
first set of data. However, I am worried about how the planner might
re-arrange the joins on me, and I am wondering whether the order is
guaranteed to be preserved like this in the first place... Does anyone know
for sure about these assumptions?

Thanks in advance for any assistance.

--
Eliot Gable

"We do not inherit the Earth from our ancestors: we borrow it from our
children." ~David Brower

"I decided the words were too conservative for me. We're not borrowing from
our children, we're stealing from them--and it's not even considered to be a
crime." ~David Brower

"Esse oportet ut vivas, non vivere ut edas." (Thou shouldst eat to live; not
live to eat.) ~Marcus Tullius Cicero

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Lew 2010-06-07 05:01:25 Re: Random Weighted Result Ordering
Previous Message John T. Dow 2010-06-07 02:29:31 Error on Windows server could not open relation base/xxx/xxx Permission denied