Keeping top N records of a group

From: Alex Magnum <magnum11200(at)gmail(dot)com>
To: Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Keeping top N records of a group
Date: 2016-05-14 11:33:29
Message-ID: CA+cR4zey3qZC2TA1cEYAnhzA70q5TchxairB1JV89G0eUqYNVw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi,
i want to archive data and am looking for a query to replace a rather slow
function i am currently using.

The idea is that for every sym and doc_key I want to keep the records of
the top 2 sources. Eg. in below table I want to archive
sym 1022 of source 3000 but keep the 4 records from sources 4045 and 4081.

Any help on this would be appreciated.

source | sym | doc_id | doc_key
-----------+--------------+---------+--------------
3000 | 1022 | 551008 | 23834363 <-- Archive
3000 | 1022 | 551008 | 23834363 <-- Archive
4045 | 1022 | 3699290 | 23834363
4045 | 1022 | 3699290 | 23834363
4081 | 1022 | 3811196 | 23834363
4081 | 1022 | 3811196 | 23834363
3000 | 1028 | 550997 | 23834363
3468 | 1085 | 1740526 | 23834363
3000 | 1149 | 551017 | 23834363
4045 | 1149 | 3699291 | 23834363
3000 | 12 | 551015 | 23834363
3000 | 12 | 551015 | 23834363
3951 | 12 | 3147700 | 23834363
3951 | 12 | 3147700 | 23834363
3000 | 13 | 551000 | 23834363
4045 | 13 | 3699283 | 23834363
3000 | 1327 | 551010 | 23834363
3971 | 1327 | 3394469 | 23834363
3000 | 15 | 551001 | 23834363
4045 | 15 | 3699284 | 23834363
3000 | 16 | 551002 | 23834363
4045 | 16 | 3699285 | 23834363
3000 | 18 | 551013 | 23834363
4045 | 18 | 3699286 | 23834363
3000 | 257 | 551005 | 23834363
3951 | 257 | 3147701 | 23834363
3000 | 2795 | 551011 | 23834363 <-- Archive
3459 | 2795 | 1710571 | 23834363 <-- Archive
3905 | 2795 | 2994791 | 23834363
4045 | 2795 | 3699292 | 23834363
3060 | 2913 | 856199 | 23834363
3000 | 2954 | 551012 | 23834363 <-- Archive
3971 | 2954 | 3394470 | 23834363
4212 | 2954 | 4650870 | 23834363
3183 | 3427 | 1055492 | 23834363 <-- Archive
3971 | 3427 | 3394471 | 23834363
4248 | 3427 | 4763105 | 23834363
3188 | 594 | 1062642 | 23834363 <-- Archive
3188 | 594 | 1062642 | 23834363 <-- Archive
3191 | 594 | 1067501 | 23834363 <-- Archive
3191 | 594 | 1067501 | 23834363 <-- Archive
3192 | 594 | 1068391 | 23834363 <-- Archive
3192 | 594 | 1068391 | 23834363 <-- Archive
3199 | 594 | 1096070 | 23834363 <-- Archive
3199 | 594 | 1096070 | 23834363 <-- Archive
3303 | 594 | 1305467 | 23834363
3303 | 594 | 1305467 | 23834363
4117 | 594 | 4000987 | 23834363

Thanks a lot for any ideas.
Alex

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Andreas Kretschmer 2016-05-14 12:34:34 Re: Keeping top N records of a group
Previous Message Venkata Balaji N 2016-05-14 07:38:51 Re: Streaming replication, master recycling