From: | "Picavet Vincent" <Vincent(dot)Picavet(at)mediapost(dot)fr> |
---|---|
To: | <pgsql-sql(at)postgresql(dot)org> |
Subject: | enumerate groups given a certain value |
Date: | 2008-08-07 17:31:16 |
Message-ID: | EB18254270D1FD429047C987937D4A12031E6665@s92e07497.ad.mediapost.fr |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-sql |
Hello,
Here is a sql problem, which I thought simple at first, but for which I
ended up with a solution I find surprisingly complicated.
I really think I could have achieved a much easier way of handling this,
but I do not manage to find the trick allowing a very simple and
efficient query to solve the problem.
Let's present it with a sample case. The initial table is the following
one :
--------------------
drop table if exists test_gen ;
create table
test_gen as
select *
from (
select
chr((round(random()* 25) +65)::int) as id
, random()* 100 as val
from
generate_series(1,200) as g
order
by id
) as foo
select * from test_gen;
-------------------
What I want to do is to enumerate lines for each group of id, following
the order of val.
For example :
id val gen
A 2.65105138532817 1
A 38.9289360493422 2
A 74.6089164167643 3
B 2.01512188650668 1
B 11.4642047323287 2
B 31.2643219716847 3
B 65.8427979797125 4
C 0.759994331747293 1
C 11.8905796203762 2
C 13.7388648930937 3
C 49.1934351157397 4
C 83.1861903425306 5
D 45.8268967922777 1
D 57.1161589119583 2
E 9.72125697880983 1
E 61.324825277552 2
E 70.3348958399147 3
F 0.49891234234237 1
Here is the solution I ended up with :
---------------------------
-- first count number of ids per group
drop table test_gen2 ;
create table test_gen2 as
select t1.*, t2.nb
from
test_gen as t1,
(
SELECT
id, count(*) as nb
FROM
test_gen
GROUP BY id
) as t2
WHERE
t1.id =t2.id
ORDER BY
t1.id;
create sequence seq_test_gen start with 1;
create sequence seq_test_gen2 start with 1;
-- get the table with the order set (gen is our order)
select
*
from
(
select
foo1.*,
nextval('seq_test_gen') as serial
from (
select
*
from
test_gen2
order by
id, val
) as foo1
) as t1,
(
select
foo.*,
nextval('seq_test_gen2') as serial
from (
select
gb1.*,
generate_series(1, gb1.nb) as gen
from (
select
id, nb
from
test_gen2
group by
id, nb
) as gb1
order by
gb1.id, gen
) as foo
) as t2
where
t1.serial = t2.serial
;
-----------------------------------
The problem seems to be as easy as : <sort my two sets and put them side
to side>. But I could not find a better way to do that than putting a
serial on left and right side and do a join on this serial.
I also tried to find a solution using a modulo but could not manage to
get it work.
Anybody for a ray of light on a different approach ? This look like a
recurrent problem, isn't there an experienced sql programmer here who
tackled this issued a couple of time ?
Thanks for any help,
Vincent
From | Date | Subject | |
---|---|---|---|
Next Message | Steve Midgley | 2008-08-07 17:40:22 | Re: DELETE with JOIN |
Previous Message | Ragnar | 2008-08-07 17:05:38 | Re: DELETE with JOIN |