Character Conversions Handling

From:	Volkan YAZICI <volkan(dot)yazici(at)gmail(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Character Conversions Handling
Date:	2005-10-18 19:29:30
Message-ID:	7104a7370510181229p72d4d340j348d8e5f04a3ea34@mail.gmail.com
Lists:	pgsql-hackers

Hi,

I'm trying to understand the scheme behind how
backend/utils/adt/like.c downcases letters [1]. Looking at the other
tolower() implementations, there are a lot of them spread around the tree
(in interfaces/libpq, backend/regex, backend/utils/adt/like.c, and so on).
For example, although regc_locale.c already has a pg_wc_tolower() function,
iwchareq() in like.c achieves the same thing manually.

I'd appreciate it if somebody could point me to the places where I should
start looking to understand how characters are handled across different
encodings. Also, I wonder why we don't use functions like
btowc/mbsrtowcs/wctomb. Is this for portability across compilers?
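
For illustration, here is a minimal sketch of what downcasing through the
standard C wide-character functions could look like. The helper name
lower_via_wchar() is hypothetical and is not PostgreSQL code; it relies on
mbstowcs() and towlower(), so its result depends on the process's current
LC_CTYPE locale rather than on a per-database encoding.

```c
#include <stdlib.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Hypothetical helper (an assumption for illustration, not PostgreSQL code).
 * Converts the multibyte string src to wide characters using the current
 * LC_CTYPE locale, then lowercases each wide character with towlower().
 * Returns the number of wide characters stored in dst, not counting the
 * terminator, or (size_t) -1 if src contains an invalid multibyte sequence.
 */
size_t
lower_via_wchar(const char *src, wchar_t *dst, size_t dstlen)
{
    size_t n = mbstowcs(dst, src, dstlen);
    size_t i;

    if (n == (size_t) -1)
        return n;               /* invalid multibyte sequence */

    for (i = 0; i < n; i++)
        dst[i] = (wchar_t) towlower((wint_t) dst[i]);

    /* mbstowcs() only writes the terminator when there is room for it */
    if (n < dstlen)
        dst[n] = L'\0';

    return n;
}
```

Because the conversion follows whatever locale setlocale() last selected,
the same bytes can be decoded differently from one process to the next.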

[1] iwchareq() uses pg_mb2wchar_with_len(), which picks the right
mb2wchar function from pg_wchar_table. Looking at
backend/mb/wchar.c, there are other encoding-specific mblen and
mb2wchar routines. For example, EUC_KR is handled by the
pg_euc2wchar_with_len() function, but LATIN5 is handled by the
pg_latin12wchar_with_len() function. If we run into a new problem that is
specific to LATIN5, will we write a new function for it, such as
pg_latin52wchar_with_len()?
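
The table-driven dispatch described in [1] can be sketched roughly as
follows. This is a simplified illustration, not the actual contents of
wchar.c: every name prefixed with my_ is hypothetical, and the two-byte
converter is a stand-in that treats any byte with the high bit set as the
start of a two-byte character. It shows how several single-byte encodings
can share one converter while a multibyte encoding gets its own.

```c
#include <stddef.h>

typedef unsigned int my_wchar;      /* hypothetical stand-in for pg_wchar */

typedef int (*my_mb2wchar_fn) (const unsigned char *from, my_wchar *to, int len);

/* Single-byte encodings: each byte is one character, so one routine fits all. */
static int
my_single_byte_to_wchar(const unsigned char *from, my_wchar *to, int len)
{
    int cnt = 0;

    while (len > 0 && *from)
    {
        *to++ = *from++;
        len--;
        cnt++;
    }
    *to = 0;
    return cnt;
}

/* Simplified two-byte scheme: a high-bit byte starts a two-byte character. */
static int
my_double_byte_to_wchar(const unsigned char *from, my_wchar *to, int len)
{
    int cnt = 0;

    while (len > 0 && *from)
    {
        if ((*from & 0x80) && len >= 2)
        {
            *to = ((my_wchar) from[0] << 8) | from[1];
            from += 2;
            len -= 2;
        }
        else
        {
            *to = *from++;
            len--;
        }
        to++;
        cnt++;
    }
    *to = 0;
    return cnt;
}

enum my_encoding { MY_LATIN1, MY_LATIN5, MY_EUC_KR, MY_ENCODING_COUNT };

/* One entry per encoding; both LATIN entries point at the same routine. */
static const my_mb2wchar_fn my_wchar_table[MY_ENCODING_COUNT] = {
    [MY_LATIN1] = my_single_byte_to_wchar,
    [MY_LATIN5] = my_single_byte_to_wchar,
    [MY_EUC_KR] = my_double_byte_to_wchar
};

/* Look up the converter for enc and run it; returns the character count. */
int
my_mb2wchar_with_len(enum my_encoding enc, const unsigned char *from,
                     my_wchar *to, int len)
{
    return my_wchar_table[enc] (from, to, len);
}
```

In a layout like this, giving LATIN5 its own converter would only mean
pointing its table entry at a different function; callers of
my_mb2wchar_with_len() would not change.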

Regards.

Responses

Re: Character Conversions Handling at 2005-10-18 21:39:29 from Martijn van Oosterhout
