Quick Links

Re: Per-column collation, work in progress

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc:	Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Per-column collation, work in progress
Date:	2010-10-15 02:54:40
Message-ID:	AANLkTikzfXX1jt4Noyi_PEQdE=vUvvnPhjW+hdMeg9xY@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Thu, Oct 14, 2010 at 12:53 PM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
> On ons, 2010-10-13 at 19:15 -0400, Robert Haas wrote:
>> What's the status of this patch?
>
> I would appreciate some more review of the basic architecture.

Wow, what a patch. On the whole, I think this looks pretty good. Of
course, the obvious thing to be dismayed about is how many parts of
the system this touches. To some extent, that's probably inevitable
and maybe not that bad, but I wonder if there is some preparatory
refactoring that could be done to trim it down a bit. I notice, for
example, that a lot of places that looked at <asc/desc, nulls
first/last> now look at <asc/desc, nulls first/last, collation>. In
particular, all the pathkey stuff is like this. And similarly places
that used to care about <type, typmod> now have to care about <type,
tymod, collation>. There might be ways to restructure some of this
code so that these things can be changed without having to touch quite
so many places. If we're extending these lists from two items to
three, do we need to worry about what happens when they grow to four
or five or six? I particularly think this is in issue for the type
information. We are still finding bugs where typemod isn't carried
through properly; this kind of thing is only going to make it much
worse. We need to encapsulate it in some future-proof way.

It seems you've falsified the header comment in
pathkeys_useful_for_merging(), although I guess it's already false
because it doesn't seem to have been updated for the NULLS ASC/DESC
stuff, and the interior comment in right_merge_direction() also needs
adjusting. But this might be more than a documentation problem,
because the choice of merge direction really *is* arbitrary in the
case of ASC/DESC and NULLS FIRST/LAST, but I'm not sure whether that's
actually true for collation. If collation affects the definition of
equality then it certainly isn't true.

It looks like you've define collations as objects that exist within
particular namespaces, but there's no CREATE COLLATION statement, so I
don't see what purpose this serves. I suppose we could leave that to
be added later, but is there actually a use case for having collations
in individual schemas, or should we treat them more like we do casts -
i.e. as database-global objects?

Why does the executor ever need to see collate clauses?

In the department of minor nits, the use of the word "respectively" in
the CREATE INDEX documentation doesn't make sense to me. The message
about "has a collation conflict" is not self-explanatory.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Re: Per-column collation, work in progress at 2010-10-14 16:53:50 from Peter Eisentraut

Responses

Re: Per-column collation, work in progress at 2010-10-21 18:44:36 from Peter Eisentraut
Re: Per-column collation, work in progress at 2010-11-24 20:37:54 from Peter Eisentraut

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Alvaro Herrera	2010-10-15 02:56:04	Re: Re: starting to review the Extend NOT NULL representation to pg_constraint patch
Previous Message	Shigeru HANADA	2010-10-15 01:33:14	Re: patch: SQL/MED(FDW) DDL