Re: Internationalisation of database content (text columns)

From: Karsten Hilbert <Karsten(dot)Hilbert(at)gmx(dot)net>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Internationalisation of database content (text columns)
Date: 2010-10-13 19:26:41
Message-ID: 20101013192641.GC2485@hermes.hilbert.loc
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Fri, Oct 08, 2010 at 12:12:54PM +0200, Wolfgang Keller wrote:

> I'm working on a database schema which contains lots of
> "type code lookup" tables. The entries of these tables are
> also hierarchically related among themselves
> (subtype/supertype), to store rather large and quite complex
> taxonomies.

.From my experience it depends. Either you've got a "coding
system" which you need to provide localized "translations"
for or else you've got arbitrary type code lookups.

With coding systems it is typically not really a translation
of the coded term but rather *another* term people attach to
the same code - incidentally when using another language.
Terms in one language change while they don't change in
another. Think of the code as defining a class with all the
local language terms being ever-so-slightly different things
all belonging into that class (eg. while "back pain" and
"Kreuzschmerz" aren't considered translations of each other
*medically* they can well be considered to group under the
same ICD-10 code). Thus I've found this general scheme to
work well:

create table coded_term (
pk serial primary key,
code text,
term text,
lang text,
fk_coding_system integer
references coding_system(pk),
unique(code, term, lang, fk_coding_system)
);

(it can be argued whether lang should fold into coding_system)

If it's about arbitrary lookup values for codes I am using a
gettext version rewritten in pgsql similar to this:

create table lut_colors (
pk serial primary key
color text
);

create view v_lut_colors as
select
pk
as pk_lut_color,
color
as color,
_(color)
as l10n_color
from
lut_colors;

(you don't need the view or you don't need it in this way but
it's useful)

Now you guessed it: _() is a plpgsql function which does a
translation table lookup based on the database account (or a
passed in user name) and a pre-configured (or passed in)
language per said account/user name. It falls back from,
say, "de_DE" to, say, "de" to returning the original string.

The translation table is filled this way:

select i18n_upd_tx('de_DE', 'blue', 'blau');
select i18n_upd_tx('de_DE', 'grey', 'grau');

It doesn't really matter which language is used as the
"original" lookup language as long as a translation exists
for the desired target language:

select i18n_upd_tx('en', '1ö34kjafg8', 'yellow');

will properly make

select _('1ö34kjafg8', 'en');

return "yellow".

All the code for this is to be found in the git repository
for GNUmed at gitorious:

http://gitorious.org/gnumed

> BTW: Methods that use a single table to hold all
> translations for all strings in all tables of the entire
> schema are not very useful in this case, since it can't be
> excluded that depending on the context (i.e. the specific
> semantics of the specific "type code lookup" table) the
> translation of one and the same string in one language will
> be different in other languages.

Well, either add in a context field to the _()/i18n_upd_tx()
approach or consider using the coding system approach. You
might even figure out a way to use the tableoid in the
translation function:

create view v_lut_colors as
select
pk
as pk_lut_color,
color
as color,
_(color, lut_colors.tableoid)
as l10n_color
from
lut_colors;

This would require applying the tableoid when adding
translations though.

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Pavel Stehule 2010-10-13 20:53:04 Re: How to search ignoring spaces and minus signs
Previous Message Dmitriy Igrishin 2010-10-13 18:58:30 How to determine failed connection attempt due to invalid authorization (libpq)?