Re: counting distinct rows on more than one column

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Dirk Lutzebaeck <lutzeb(at)aeccom(dot)com>
Cc: Michael Fork <mfork(at)toledolink(dot)com>, pgsql-sql(at)postgresql(dot)org
Subject: Re: counting distinct rows on more than one column
Date: 2001-03-28 19:51:39
Message-ID: 19360.985809099@sss.pgh.pa.us
Lists: pgsql-sql

Dirk Lutzebaeck <lutzeb(at)aeccom(dot)com> writes:
> Michael Fork writes:
>>> In 7.0.3, I believe the following would work:
>>>
>>> SELECT count(distinct(a || b)) FROM t;

> Great, this works! I don't quite get it why...

Michael really should not have proposed that solution without mentioning
its limitations: it's not actually counting distinct values of the column
pair a,b, but only of their textual concatenation. For example a = 'xy'
and b = 'z' will look the same as a = 'x' and b = 'yz'.
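
A minimal sketch of that collision. It runs the SQL through Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption made so the example is self-contained; it has not been tested against any PostgreSQL release). The table t and its two rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the real table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a TEXT, b TEXT)")

# Two different (a, b) pairs whose concatenations are both 'xyz'.
conn.executemany("INSERT INTO t VALUES (?, ?)", [("xy", "z"), ("x", "yz")])

# Counts distinct concatenated strings, not distinct pairs.
concat_count = conn.execute(
    "SELECT count(DISTINCT a || b) FROM t"
).fetchone()[0]

# Counts distinct (a, b) pairs by de-duplicating rows in a subquery first.
pair_count = conn.execute(
    "SELECT count(*) FROM (SELECT DISTINCT a, b FROM t) AS s"
).fetchone()[0]

print(concat_count, pair_count)  # prints "1 2"
```

The concatenation form reports one distinct value although the table holds two distinct pairs.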

If there is some character you never use in column A, say '|', you
could do count(distinct(a || '|' || b)) with some safety, but this
strikes me as still a pretty fragile approach.
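
To illustrate why the separator trick is only as safe as the assumption behind it, here is a rough sketch, again using Python's sqlite3 module as a stand-in for PostgreSQL (an assumption; untested on PostgreSQL itself). The helper name counts and the sample rows are hypothetical.

```python
import sqlite3

def counts(rows):
    """Return (distinct concatenations using a '|' separator, distinct pairs)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (a TEXT, b TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    with_sep = conn.execute(
        "SELECT count(DISTINCT a || '|' || b) FROM t"
    ).fetchone()[0]
    pairs = conn.execute(
        "SELECT count(*) FROM (SELECT DISTINCT a, b FROM t) AS s"
    ).fetchone()[0]
    conn.close()
    return with_sep, pairs

# '|' never appears in column a: the separator keeps the pairs apart.
clean = counts([("xy", "z"), ("x", "yz")])

# '|' does appear in column a: both rows concatenate to 'x||y'.
dirty = counts([("x|", "y"), ("x", "|y")])

print(clean, dirty)  # prints "(2, 2) (1, 2)"
```

As soon as a value in column a contains the separator, the two counts disagree again.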

regards, tom lane
