From: | Morris de Oryx <morrisdeoryx(at)gmail(dot)com> |
---|---|
To: | Rich Shepard <rshepard(at)appl-ecosys(dot)com> |
Cc: | pgsql-general <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Aggregate functions on groups |
Date: | 2019-08-31 02:46:12 |
Message-ID: | CAKqncciB=yzfCpcbANb20NBxxVuQcmWOkUi8=2fYuaKuP-w9fQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Your tributaries and fish master tables make sense. If I read your code
right, you're grouping by too many columns. I flattened the data into a
survey table for this simple example:
select tributary,
common_name,
scientific_name,
sum(count_value) as fish_seen,
count(count_value) as observations_made
from survey
group by 1,2,3 -- The GROUP BY clause can use positions on the select
list, if you feel like typing less.
tributary common_name scientific_name
fish_seen observations_made
Anderson Creek trib to Nehalem River Black crappie Pomoxis
nigromaculatus 3 2
Anderson Creek trib to Nehalem River Brook trout Salvelinus
fontinalis 3 2
Anderson Creek trib to Nehalem River Bluegill Lepomis macrochirus
3 2
Anderson Creek trib to Nehalem River Brown bullhead Ameiurus nebulosus
3 2
But this is not why I'm answering. I'm responding as I wanted to make sure
that you're aware of the pg-similarity extension:
https://salsa.debian.org/postgresql/pg-similarity
This tool implements a *lot* of similarity measures for fuzzy cmparisons.
Some are sting-oriented algorithms (Jaro-Winkler, Soundex, Levenshtein,
etc.), and others derive from and/or apply to field population comparisons,
like the Jaccard and Dice Coefficients. There's a lot of great stuff in the
package.
On Sat, Aug 31, 2019 at 3:14 AM Rich Shepard <rshepard(at)appl-ecosys(dot)com>
wrote:
> Tables hold data on fish counts by stream name, species, and (unreported)
> collection dates. I'm trying to write a query that returns the total number
> of each species in each stream.
>
> The latest attempt is (lines wrapped by alpine; submitted as one line):
>
> \copy (select f.stream_tribs, f.count_value, sum(f.count_value),
> i.common_name, i.sci_name from fish_counts as f, itis as i where
> f.stream_tribs like '%Nehalem River%' group by f.stream_tribs,
> i.common_name, i.sci_name, f.count_value order by f.stream_tribs,
> i.common_name, i.sci_name, f.count_value) to
> '/home/rshepard/projects/oregon/mohler_sand/data/fish/fishes.dat';
>
> The returned set starts this way:
>
> Anderson Creek trib to Nehalem River 0 0 Black crappie
> Pomoxis nigromaculatus
> Anderson Creek trib to Nehalem River 3 3 Black crappie
> Pomoxis nigromaculatus
> Anderson Creek trib to Nehalem River 0 0 Bluegill
> Lepomis macrochirus
> Anderson Creek trib to Nehalem River 3 3 Bluegill
> Lepomis macrochirus
> Anderson Creek trib to Nehalem River 0 0 Brook trout
> Salvelinus fontinalis
> Anderson Creek trib to Nehalem River 3 3 Brook trout
> Salvelinus fontinalis
> Anderson Creek trib to Nehalem River 0 0 Brown bullhead
> Ameiurus nebulosus
> Anderson Creek trib to Nehalem River 3 3 Brown bullhead
> Ameiurus nebulosus
>
> What I want returned would look like this:
>
> Anderson Creek trib to Nehalem River Black crappie Pomoxis
> nigromaculatus 3
> Anderson Creek trib to Nehalem River Bluegill Lepomis macrochirus
> 3
> Anderson Creek trib to Nehalem River Brook trout Salvelinus
> fontinalis 3
> Anderson Creek trib to Nehalem River Brown bullhead Ameiurus nebulosus
> 3
>
> I've read the manual yet must have not seen the section explaining how to
> apply aggregate functions to groups.
>
> Thanks in advance,
>
> Rich
>
>
>
From | Date | Subject | |
---|---|---|---|
Next Message | Francisco Olarte | 2019-08-31 08:11:38 | Re: SQL equivalint of #incude directive ? |
Previous Message | Ken Tanzer | 2019-08-31 00:03:28 | Re: Exclusion constraints on overlapping text arrays? |