From: | Kenichiro Tanaka <ketanaka(at)ashisuto(dot)co(dot)jp> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Need some help with a query (uniq -c) |
Date: | 2010-04-13 11:12:40 |
Message-ID: | 4BC451A8.3010900@ashisuto.co.jp |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Hello.
I try with "With Query".
http://www.postgresql.org/docs/8.4/static/queries-with.html
#We can use "With Queries" > v8.4
#That'll only work if the time values are contiguous, but there's probably a
#similar trick for non-contiguous ,too.
--create data
drop table foo;
create table foo( time int,message text);
insert into foo values(1,'a');
insert into foo values(2,'b');
insert into foo values(3,'b');
insert into foo values(4,'c');
insert into foo values(5,'a');
insert into foo values(6,'c');
insert into foo values(7,'c');
insert into foo values(8,'a');
insert into foo values(9,'a');
insert into foo values(10,'a');
--begin Answer
with recursive r as (
select foo.time,foo.message,1 as dummy from foo
union all
select foo.time,foo.message,r.dummy+1 from foo , r
where foo.time=r.time-1 and foo.message=r.message
)
,rr as (
select foo.time,foo.message,'OLID' as flag from foo
union all
select foo.time,foo.message,'DUP' as flag from foo , rr
where foo.time-1=rr.time-2 and foo.message=rr.message
)
select time min,time+max(dummy)-1 max,message,max(dummy) counts
from r where time not in (select distinct (time+1) times from rr
where flag='DUP') group by time,message order by time;
--result
postgres(# where flag='DUP') group by time,message order by time;
min | max | message | counts
-----+-----+---------+--------
1 | 1 | a | 1
2 | 3 | b | 2
4 | 4 | c | 1
5 | 5 | a | 1
6 | 7 | c | 2
8 | 10 | a | 3
(6 rows)
--end
But I think some one can provide more simple SQL.
Thank you.
> On Apr 12, 2010, at 11:31 AM, Scott Marlowe wrote:
>
>
>> On Mon, Apr 12, 2010 at 12:22 PM, A B<gentosaker(at)gmail(dot)com> wrote:
>>
>>> Hello!
>>>
>>> I have a table (think of it as a table of log messages)
>>>
>>> time | message
>>> -----------------------
>>> 1 | a
>>> 2 | b
>>> 3 | b
>>> 4 | b
>>> 5 | a
>>>
>>> the three 'b' are the same message, so I would like to write a query
>>> that would give me a result that is similar to what the unix command
>>> "uniq -c" would give:
>>>
>>> first | message | last | count
>>> --------------------------------------
>>> 1 | a | 1 | 1
>>> 2 | b | 4 | 3<--- here it squeezes
>>> similar consecutive messages into a single row
>>> 5 | a | 5 | 1
>>>
>>> How do I write such a command?
>>>
>> Pretty straight ahead:
>>
>> select min(t), message, max(t), count(*) from table group by message.
>>
> That was my first though too, but it combines everything not just adjacent messages.
>
> Something like this, maybe
>
> select t1.message, t1.time as first, t2.time as last, t2.time-t1.time+1 as count
> from foo as t1, foo as t2
> where t1.time<= t2.time and t1.message = t2.message
> and not exists
> (select * from foo as t3
> where (t3.time between t1.time and t2.time and t3.message<> t1.message)
> or (t3.time = t2.time + 1 and t3.message = t1.message)
> or (t3.time = t1.time - 1 and t3.message = t1.message));
>
> message | first | last | count
> ---------+-------+------+-------
> a | 1 | 1 | 1
> b | 2 | 4 | 3
> a | 5 | 5 | 1
>
> That'll only work if the time values are contiguous, but there's probably a
> similar trick for non-contiguous.
>
> Cheers,
> Steve
>
>
>
--
================================================
Kenichiro Tanaka
K.K.Ashisuto
http://www.ashisuto.co.jp/english/index.html
================================================
From | Date | Subject | |
---|---|---|---|
Next Message | Andrus | 2010-04-13 11:49:59 | Email address column verification for address list |
Previous Message | tv | 2010-04-13 11:05:40 | Re: Dynamic Catalog Views |