From: | "David E(dot) Wheeler" <david(at)kineticode(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: PATCH: CITEXT 2.0 v3 |
Date: | 2008-07-12 22:13:13 |
Message-ID: | A2AA4B32-A916-497C-9C2B-DC69B8BF952A@kineticode.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Jul 12, 2008, at 14:57, Tom Lane wrote:
> There was some discussion earlier about whether the proposed
> regression
> tests for citext are suitable for use in contrib or not. After
> playing
> with them for awhile, I have to come down very firmly on the side of
> "not". I have these gripes:
You're nothing if not thorough, Tom.
> 1. The style is gratuitously different from every other regression
> test
> in the system. This is not a good thing. If it were an amazingly
> better way to do things, then maybe, but as far as I can tell the
> style
> pgTAP forces on you is really pretty darn poorly suited for SQL tests.
> You have to contort what could naturally be expressed in SQL as a
> table
> result into a scalar. Plus it's redundant with the expected-output
> file.
Sure. I wasn't aware of the existing regression test methodology when
I wrote pgTAP and those tests. I'm fine to change them and use pgTAP
for testing my app code, rather than PostgreSQL itself.
> 2. It's ridiculously slow; at least a factor of ten slower than doing
> equivalent tests directly in SQL. This is a very bad thing. Speed of
> regression tests matters a lot to those of us who run them a dozen
> times
> per day --- and I do not wish to discourage any developers who don't
> work that way from learning better habits ;-)
Hrm. I'm wonder why it's so slow? The test functions don't really do a
lot. Anyway, I agree that they should perform well.
> Because of #1 and #2 I find the use of pgTAP to be a nonstarter.
> I have a couple of gripes about the substance of the tests as well:
>
> 3. What you are mostly testing is not the behavior of citext itself,
> but the behavior of the underlying strcoll function. This is pretty
> much doomed to failure, first because the regression tests are
> intended
> to work in multiple locales (and we are *not* giving that up for
> citext), and second because the behavior of strcoll isn't all that
> portable across platforms even given allegedly similar locale settings
> (as we already found out yesterday).
Yes, I *just* ran the tests on Ubuntu and opened my mail to ask about
the bizarre differences when I saw this message. Thanks for answering
my question before I asked it. :-)
> Sadly, I think you have to give up
> attempts to check the interesting multibyte cases and confine yourself
> to tests using ASCII strings.
Grr. Kind of defeats the purpose. Is there no infrastructure for
testing multibyte functionality? Are test database clusters all built
using SQL_ASCII and the C locale?
> 4. A lot of the later test cases are equally uselessly testing whether
> piggybacking over text functions works. The odds of ever finding
> anything with those tests are not distinguishable from zero (unless
> the
> underlying text function is busted, which is not your responsibility
> to
> test). So I don't see any point in putting them into the standard
> regression package. (What maybe *would* be useful to test, but you
> didn't, is whether the result of a function is considered citext
> rather
> than text.)
Well, I was doing test-driven development: I wrote the tests to ensure
that I had added the functions for CITEXT properly, and when they
passed, I could move on. As a unit tester, it'd feel weird for me to
drop these. I'm not testing the underlying functions; I'm making sure
that they work properly with CITEXT.
> I suggest something more like the attached as a suitable regression
> test. I got bored about halfway through and started to skim, so there
> might be a few tests toward the end of the original set that would be
> worth transposing into this one, but nothing jumped out at me.
Thanks! I'll work this in.
Best,
David
PS: I confirmed late yesterday that the memory leak I saw was with my
version for 8.3, where I had my own str_tolower(). It clearly has a
leak that I'll have to fix, but it has no bearing on the contribution
to 8.4, which has no such leak. Thanks for running the test yourself
to confirm.
From | Date | Subject | |
---|---|---|---|
Next Message | David E. Wheeler | 2008-07-13 03:30:33 | Re: PATCH: CITEXT 2.0 v3 |
Previous Message | Tom Lane | 2008-07-12 21:57:27 | Re: PATCH: CITEXT 2.0 v3 |