From: Greg Stark <gsstark(at)mit(dot)edu>
To: John Seberg <johnseberg(at)yahoo(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Duplicate Values or Not?!
Date: 2005-09-17 05:36:50
Message-ID: 87r7boqmq5.fsf@stark.xeocode.com
Lists: pgsql-general
John Seberg <johnseberg(at)yahoo(dot)com> writes:
> I recently tried to CREATE a UNIQUE INDEX and could
> not, due to duplicate values:
>
> CREATE UNIQUE INDEX usr_login ON usr (login);
>
> To try to find the offending row(s), I then executed
> the following:
>
> SELECT count(*), login FROM usr GROUP BY login ORDER
> BY 1 DESC;
>
> The GROUP BY didn't group anything, indicating to me
> that there were no duplicate values. There were the
> same number of rows in this query as a simple SELECT
> count(*) FROM usr.
>
> This tells me that Postgresql is not using the same
> method for determining duplicates when GROUPING and
> INDEXing.
You might try running the GROUP BY query after doing:
set enable_hashagg = false;
select ...
With that set to false it would have to sort the results, which should go through
exactly the same code path the index build uses. I think.
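For instance, here's a rough sketch (untested, using the table and column names from
your message) that, with hash aggregation off, should list only the logins the index
build would treat as duplicates:

set enable_hashagg = false;
SELECT login, count(*) FROM usr GROUP BY login HAVING count(*) > 1 ORDER BY 2 DESC;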
That doesn't really answer the rest of your questions. The short of it is that
setting the encoding doesn't magically make your data encoded in that
encoding. If your client sends data in one encoding but claims it's Unicode, then
Postgres will happily store it in a UNICODE database and it'll be garbage.
Maybe someone else will have more specific advice on that front.
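One thing worth checking, as a rough sketch from a psql session (the LATIN1 value
below is just an example, not necessarily your client's actual encoding):

SHOW client_encoding;            -- what the client claims to be sending
SHOW server_encoding;            -- how the database stores text
SET client_encoding = 'LATIN1';  -- tell the server what the client is really sending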
--
greg