Re: Count of non-null values per table column

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: David Nelson <dlnelson77808(at)outlook(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Count of non-null values per table column
Date: 2015-08-14 14:51:57
Message-ID: CAKJS1f9rqzSe8M-kRL2kkm6NqJXNGn0udBJo3cnYVE25_kinvg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 15 August 2015 at 02:32, David Nelson <dlnelson77808(at)outlook(dot)com> wrote:

> Hello list,<br><br>Apologies if this has been asked before. My search only
> turned up ways to list the total non-null values for all columns as a
> single number. I want the count for each column by column.<br><br>I have
> inherited a database consisting of two related huge monolithic tables that
> lack referential integrity between them, or even basic data constraints.
> One of the problems these tables have is every single non-PK column is
> NULLable. I am trying to understand the information that is actually stored
> and used so that I can implement a (hopefully) better design. Towards that
> end I would like to know the count of non-null values in each column per
> column. In other words I would like to get the following output from a
> table (the numbers are totally made up):<br><br>column_name |
> num_values<br>------------+-----------<br>col1&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
> |&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
> 5787<br>------------+-----------<br>col2&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
> |&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
> 17<br>------------+-----------<br>col3&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
> |&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
> 567<br>------------+-----------<br>col4&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
> |&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
> 5787<br>------------+-----------<br>col5&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
> |&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
> 143<br>------------+-----------<br>col6&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
> |&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
> 1<br>------------+-----------<br>...<br><br>Is this possible through one or
> more of the system views, or will I need to write a function to do this?
> Obviously I can just issue multiple SELECT COUNT(column)... statements, but
> I'd rather not.<br><br>Thanks,<br>David
>
>
I didn't quite catch a Postgres version number in that mess :)

I assume the tables are quite large if you don't want to just issue a:
SELECT COUNT(a) FILTER(WHERE a IS NULL),COUNT(b) FILTER(WHERE b IS NULL)
... (assuming you're on a version new enough to support agg FILTER)

On the other hand if you were happy with just an approximation then you
could look at pg_stats;

create table abc(a int, b int,c int);
insert into abc values(1, 1, NULL),(1, NULL, NULL),(NULL, NULL, NULL);
analyze abc;

select attname, null_frac from pg_stats where tablename = 'abc';

attname | null_frac
---------+-----------
a | 0.333333
b | 0.666667
c | 1

Keep in mind though that this is an *approximation* and possibly could be
inaccurate. If you want exact do the COUNT(col).

If you're not on a version new enough for COUNT(col) FILTER WHERE ..., you
could just SUM(CASE WHEN col IS NULL THEN 1 ELSE 0 END)

Regards

David Rowley

--
David Rowley http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message John McKown 2015-08-14 14:59:44 Re: Count of non-null values per table column
Previous Message David Nelson 2015-08-14 14:38:21 Re: Count of non-null values per table column