From: | David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> |
---|---|
To: | David Nelson <dlnelson77808(at)outlook(dot)com> |
Cc: | "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Count of non-null values per table column |
Date: | 2015-08-14 14:51:57 |
Message-ID: | CAKJS1f9rqzSe8M-kRL2kkm6NqJXNGn0udBJo3cnYVE25_kinvg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On 15 August 2015 at 02:32, David Nelson <dlnelson77808(at)outlook(dot)com> wrote:
> Hello list,<br><br>Apologies if this has been asked before. My search only
> turned up ways to list the total non-null values for all columns as a
> single number. I want the count for each column by column.<br><br>I have
> inherited a database consisting of two related huge monolithic tables that
> lack referential integrity between them, or even basic data constraints.
> One of the problems these tables have is every single non-PK column is
> NULLable. I am trying to understand the information that is actually stored
> and used so that I can implement a (hopefully) better design. Towards that
> end I would like to know the count of non-null values in each column per
> column. In other words I would like to get the following output from a
> table (the numbers are totally made up):<br><br>column_name |
> num_values<br>------------+-----------<br>col1
> |
> 5787<br>------------+-----------<br>col2
> |
> 17<br>------------+-----------<br>col3
> |
> 567<br>------------+-----------<br>col4
> |
> 5787<br>------------+-----------<br>col5
> |
> 143<br>------------+-----------<br>col6
> |
> 1<br>------------+-----------<br>...<br><br>Is this possible through one or
> more of the system views, or will I need to write a function to do this?
> Obviously I can just issue multiple SELECT COUNT(column)... statements, but
> I'd rather not.<br><br>Thanks,<br>David
>
>
I didn't quite catch a Postgres version number in that mess :)
I assume the tables are quite large if you don't want to just issue a:
SELECT COUNT(a) FILTER(WHERE a IS NULL),COUNT(b) FILTER(WHERE b IS NULL)
... (assuming you're on a version new enough to support agg FILTER)
On the other hand if you were happy with just an approximation then you
could look at pg_stats;
create table abc(a int, b int,c int);
insert into abc values(1, 1, NULL),(1, NULL, NULL),(NULL, NULL, NULL);
analyze abc;
select attname, null_frac from pg_stats where tablename = 'abc';
attname | null_frac
---------+-----------
a | 0.333333
b | 0.666667
c | 1
Keep in mind though that this is an *approximation* and possibly could be
inaccurate. If you want exact do the COUNT(col).
If you're not on a version new enough for COUNT(col) FILTER WHERE ..., you
could just SUM(CASE WHEN col IS NULL THEN 1 ELSE 0 END)
Regards
David Rowley
--
David Rowley http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | John McKown | 2015-08-14 14:59:44 | Re: Count of non-null values per table column |
Previous Message | David Nelson | 2015-08-14 14:38:21 | Re: Count of non-null values per table column |