| From: | Stephen Frost <sfrost(at)snowman(dot)net> |
|---|---|
| To: | Roger Mason <rmason(at)mun(dot)ca>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
| Cc: | pgsql-novice <pgsql-novice(at)postgresql(dot)org> |
| Subject: | Re: select from grouped data |
| Date: | 2021-08-25 20:42:56 |
| Message-ID: | 20210825204256.GX17906@tamriel.snowman.net |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-novice |
Greetings,
* Roger Mason (rmason(at)mun(dot)ca) wrote:
> I have written this function to process some multiline text data into a
> table with both text & numeric columns:
>
> CREATE OR REPLACE FUNCTION get_final_energy ()
> RETURNS TABLE (
> jid text,
> "timestamp" text,
> scf integer,
> energy double precision
> )
> AS $$
> WITH a AS (
> SELECT
> jid,
> regexp_split_to_table(totenergy_out, '\n') AS teo
> FROM
> results
> ),
> b AS (
> SELECT
> results.jid,
> results. "timestamp",
> cast(
> CASE WHEN split_part(a.teo, ' ', 2) = '' THEN
> '0'
> ELSE
> split_part(a.teo, ' ', 2)
> END AS integer) AS scf,
> cast(
> CASE WHEN split_part(a.teo, ' ', 3) = '' THEN
> '0.0'
> ELSE
> split_part(a.teo, ' ', 3)
> END AS double precision) AS energy
> FROM
> results,
> a
> WHERE
> results.jid = a.jid
> GROUP BY
> results.jid,
> results. "timestamp",
> a.teo
> --HAVING
> -- scf = max(scf)
> ORDER BY
> timestamp ASC,
> scf DESC
> )
> SELECT
> *
> FROM
> b;
>
> $$
> LANGUAGE sql;
>
> The output looks like:
>
> jid | timestamp | scf | energy
> ------------+-----------------+-----+----------------
> 1250_1 | 20210805-114634 | 18 | -1316.43700819
> 1250_1 | 20210805-114634 | 17 | -1316.43700825
> 1250_1 | 20210805-114634 | 16 | -1316.4370097
> 1250_1 | 20210805-114634 | 15 | -1316.43700991
> 1250_1 | 20210805-114634 | 14 | -1316.43699775
> 1250_1 | 20210805-114634 | 13 | -1316.43699117
> 1250_1 | 20210805-114634 | 12 | -1316.43750771
> 1250_1 | 20210805-114634 | 11 | -1316.43805358
> 1250_1 | 20210805-114634 | 10 | -1316.43857192
> 1250_1 | 20210805-114634 | 9 | -1316.43070942
> 1251_1 | 20210806-062539 | 18 | -1316.43700819
> 1251_1 | 20210806-062539 | 17 | -1316.43700826
> ....
>
> What I want is to get (for each group) the energy corresponding to the
> maximum value of scf.
* David G. Johnston (david(dot)g(dot)johnston(at)gmail(dot)com) wrote:
> >
> >
> > The output looks like:
> >
> > jid | timestamp | scf | energy
> > ------------+-----------------+-----+----------------
> > 1250_1 | 20210805-114634 | 18 | -1316.43700819
> > 1250_1 | 20210805-114634 | 17 | -1316.43700825
> > 1250_1 | 20210805-114634 | 16 | -1316.4370097
> > 1250_1 | 20210805-114634 | 15 | -1316.43700991
> > 1250_1 | 20210805-114634 | 14 | -1316.43699775
> > 1250_1 | 20210805-114634 | 13 | -1316.43699117
> > 1250_1 | 20210805-114634 | 12 | -1316.43750771
> > 1250_1 | 20210805-114634 | 11 | -1316.43805358
> > 1250_1 | 20210805-114634 | 10 | -1316.43857192
> > 1250_1 | 20210805-114634 | 9 | -1316.43070942
> > 1251_1 | 20210806-062539 | 18 | -1316.43700819
> > 1251_1 | 20210806-062539 | 17 | -1316.43700826
> > ....
> >
> > What I want is to get (for each group) the energy corresponding to the
> > maximum value of scf.
> >
> >
> SELECT DISTINCT ON (jid) jid, timestamp, scf, energy [...] ORDER BY jid,
> scf DESC
While this works, it's generally better to use a LATERAL join as that's
part of the SQL standard while DISTINCT ON isn't. Using a LATERAL join
also would allow you to have multiple rows (top-N) if you wanted. You'd
do that using:
WITH jids AS (SELECT jid FROM results GROUP BY jid)
SELECT jids.jid, t.ts, t.scf, t.energy
FROM jids CROSS JOIN LATERAL
(SELECT ts, scf, energy FROM results
WHERE results.jid = jids.jid
ORDER BY scf DESC
LIMIT 1) AS t
;
A couple of notes on this: don't name a column "timestamp" and when it's
a timestamp, use the 'timestamptz' data type, not text. Your jids sure
look like they should just be numbers instead of text too. If you
already have a distinct set of jids somewhere (like in another table),
you could use that as the source table instead of the CTE that I'm using
above.
Thanks,
Stephen
| From | Date | Subject | |
|---|---|---|---|
| Next Message | James Keener | 2021-08-25 21:37:42 | Re: select from grouped data |
| Previous Message | Roger Mason | 2021-08-25 13:59:10 | Re: select from grouped data |