Re: Encoding/collation question

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Rich Shepard <rshepard(at)appl-ecosys(dot)com>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Encoding/collation question
Date: 2019-12-11 19:25:17
Message-ID: 24866.1576092317@sss.pgh.pa.us
Lists: pgsql-general

Rich Shepard <rshepard(at)appl-ecosys(dot)com> writes:
> My older databases have LATIN1 encoding and C collation; the newer ones have
> UTF8 encoding and en_US.UTF-8 collation. A web search taught me that I can
> change each old database by dumping it and restoring it with the desired
> encoding and collation types. My question is whether the older types make
> any difference in a single-user environment.

String comparisons in non-C collations tend to be a lot slower than
they are in C collation. Whether this makes a noticeable difference
to you depends on your workload, but certainly we've seen performance
gripes that trace to that.
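
To illustrate why C collation is cheap: it orders strings by their raw
byte values, whereas a linguistic collation such as en_US.UTF-8 has to
apply locale rules to every comparison. The following is only a rough
Python sketch of the byte-order behavior, not PostgreSQL code; the helper
name c_collation_sort and the sample words are made up for illustration.

```python
def c_collation_sort(strings, encoding="utf-8"):
    """Sort strings the way a byte-wise (C-style) comparison would.

    Hypothetical helper: each string is encoded and the encoded bytes
    are compared directly, with no locale rules involved.
    """
    return sorted(strings, key=lambda s: s.encode(encoding))


words = ["apple", "Banana", "cherry", "Zebra"]
c_sorted = c_collation_sort(words)

# Uppercase ASCII letters have smaller byte values than lowercase ones,
# so they sort first under byte-wise comparison.
print(c_sorted)  # ['Banana', 'Zebra', 'apple', 'cherry']
```

A linguistic collation would typically interleave the upper- and
lower-case words instead, and the extra work needed to do so is where
the comparison cost comes from.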

If your data doesn't require the larger character set of UTF8, then
using one of the single-byte LATINn encodings is going to offer some
space savings (non-ASCII characters take one byte each instead of two
or more) plus minor performance benefits due to the lack of
variable-width characters. This is less significant than the
collation issue, though, for most people.
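
The space difference can be checked outside the database. This is a
minimal Python sketch, assuming Python's "latin-1" and "utf-8" codecs
correspond to PostgreSQL's LATIN1 and UTF8 encodings; the helper name
encoded_sizes and the sample strings are invented for the example.

```python
def encoded_sizes(text):
    """Return (latin1_bytes, utf8_bytes) for a string.

    Hypothetical helper; raises UnicodeEncodeError if the text contains
    characters that LATIN1 cannot represent.
    """
    return len(text.encode("latin-1")), len(text.encode("utf-8"))


# Escapes are used so the non-ASCII characters are unambiguous:
# \u00e9 is e-acute, \u00c5 is A-ring, \u00f6 is o-umlaut.
samples = ["plain ascii", "caf\u00e9", "\u00c5ngstr\u00f6m"]

for s in samples:
    latin1_len, utf8_len = encoded_sizes(s)
    print(f"{s!r}: LATIN1={latin1_len} bytes, UTF8={utf8_len} bytes")
```

Pure-ASCII text is the same size in both encodings; only the non-ASCII
characters cost an extra byte in UTF8 here, which is why the savings
depend on how much accented text the data actually holds.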

regards, tom lane
