From: | "Erik Rijkers" <er(at)xs4all(dot)nl> |
---|---|
To: | "Jeff Davis" <pgsql(at)j-davis(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: REVIEW Range Types |
Date: | 2011-02-08 19:43:18 |
Message-ID: | a0804f90179263648c96f89d481cd85e.squirrel@webmail.xs4all.nl |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sun, February 6, 2011 07:41, Jeff Davis wrote:
> New patch. All known TODO items are closed, although I should do a
> cleanup pass over the code and docs.
>
> Fixed in this patch:
>
> * Many documentation improvements
> * Added INT8RANGE
> * Renamed PERIOD[TZ] -> TS[TZ]RANGE
> * Renamed INTRANGE -> INT4RANGE
> * Improved parser's handling of whitespace and quotes
> * Support for PL/pgSQL functions with ANYRANGE arguments/returns
> * Make "subtype_float" function no longer a requirement for GiST,
> but it should still be supplied for the penalty function to be
> useful.
>
I'm afraid even the review is WIP, but I thought I'd post what I have.
Context: At the moment we use postbio (see below) range functionality, to search ranges and
overlap in large DNA databases ('genomics'). We would be happy if a core data type could replace
that. It does look like the present patch is ready to do those same tasks, of which the main one
for us is gist-indexed ranges. We also use btree_gist with that, so to include that in core would
make sense in this regard.
test config:
./ configure \
--prefix=/var/data1/pg_stuff/pg_installations/pgsql.range_types \
--with-pgport=6563 \
--enable-depend \
--enable-cassert \
--enable-debug \
--with-perl \
--with-openssl \
--with-libxml \
--enable-dtrace
compile, make, check, install all OK.
------------------
Submission review:
------------------
* Is the patch in context diff format?
Yes.
* Does it apply cleanly to the current git master?
It applied cleanly. (after the large serialisation commit 2011.02.08 it will need some
changes/rebase)
* Does it include reasonable tests, necessary doc patches, etc?
Yes, there are many tests; the documentation is good. Small improvements below.
-----------------
Usability review
-----------------
Read what the patch is supposed to do, and consider:
* Does the patch actually implement that?
Yes.
* Do we want that?
Yes.
* Do we already have it?
contrib/seg has some similar functionalities: "seg is a data type for representing line segments,
or floating point intervals".
And on pgFoundry there is a seg spin-off "postbio", tailored to genomic data (with gist-indexing).
(see postbio manual: http://postbio.projects.postgresql.org/ )
* Does it follow SQL spec, or the community-agreed behavior?
I don't know - I couldn't find much in the SQL-spec on a range datatype.
The ranges behaviour has been discussed on -hackers.
* Does it include pg_dump support (if applicable)?
dump/restore were fine in the handful of range-tables
which I moved between machines.
* Are there dangers?
Not that I could find.
* Have all the bases been covered?
I think the functionality looks fairly complete.
-------------
Feature test:
-------------
* Does the feature work as advertised?
The patch seems very stable. My focus has been mainly on the intranges. I tested by parsing
documentation- and regression examples, and parametrising them in a perl harness, to generate many
thousands of range combinations. I found only a single small problem (posted earlier - Jeff Davis
solved it already apparently).
see: http://archives.postgresql.org/pgsql-hackers/2011-02/msg00387.php
* Are there corner cases the author has failed to consider?
No.
* Are there any assertion failures or crashes?
I haven't seen a single one.
--------------
Documentation:
--------------
Section 9.18:
table 9-42. range functions:
The following functions are missing (I encountered them in the regression tests):
contained_by()
range_eq()
section 'Constructing Ranges' (8.16.6):
In the code example, remove the following line:
"-- the int4range result will appear in the canonical format"
it doesn't make sense there. At this place "canonical format" has not been discussed;
maybe it is not even discussed anywhere.
also (same place):
'where "_" is used to mean "exclusive" and "" is used to mean "inclusive".'
should be:
'where "_" is used to mean "exclusive" and "i" is used to mean "inclusive".'
And btw: it should mention here that the range*inf* functions,
an underscore to separate 'range' from the rest of the function name, e.g.:
range_linfi_() => infinite lower bound, inclusive upper bound
I still want to do Performance review and Coding review.
FWIW, I would like to repeat that my impression is that the patch is very stable, especially with
regard to the intranges (tested extensively).
regards,
Erik Rijkers
From | Date | Subject | |
---|---|---|---|
Next Message | Heikki Linnakangas | 2011-02-08 19:48:53 | Re: MVCC doc typo fix |
Previous Message | Magnus Hagander | 2011-02-08 19:34:15 | Re: Sync Rep for 2011CF1 |