Hierarchical numeric data type

From: Derek Poon <derekp+pgsql(at)ece(dot)ubc(dot)ca>
To: pgsql-general(at)postgresql(dot)org
Subject: Hierarchical numeric data type
Date: 2013-08-06 21:36:00
Message-ID: 8D6FFCD7-12C8-4536-B2DB-FC9B8BA0C46A@ece.ubc.ca
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi,

I'm looking for a data type to store numerically labelled hierarchical data, such as section.subsection.paragraph numbers (e.g. '1.5.3', '1.10.2', '6.30').

The closest data type that I have found is ltree. However, the collation order is inappropriate: it would put '1.10.2' before '1.5.3', since it performs a naïve memcmp() at each level.[1]

One way to get the desired sort order would be to use the semver extension. However, that's not really appropriate, as I don't want to store version numbers, and my data do not fit server's mandatory X.Y.Z three-level scheme.

Of course, I could define a hierarchy-of-integers data type and implement my own comparison functions. I'm reluctant to cause a proliferation of data types, though, as ltree is semantically the type I want. I'm just unhappy with its sort order.

Therefore, I would like to suggest that ltree be modified to use a smart comparator that recognizes numbers within strings and sorts them in a human-friendly way. Apple[2] and recent versions of Windows[3] handle filenames this way. One sample implementation of such a comparator is natsort.[4]

The performance impact of the enhanced comparator would probably be negligible, compared to I/O bottlenecks. A bigger issue would be backwards compatibility, especially for ltrees with existing btree indexes.

Feedback? Suggestions?

Derek

[1]: http://doxygen.postgresql.org/ltree__op_8c.html#a635600ad7aad78addf3c14a6e2d67fed

[2]: https://developer.apple.com/LIBRARY/IOS/#documentation/FileManagement/Conceptual/FileSystemProgrammingGUide/FileSystemDetails/FileSystemDetails.html

[3]: http://www.codinghorror.com/blog/2007/12/sorting-for-humans-natural-sort-order.html

[4]: http://sourcefrog.net/projects/natsort/

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Carlos Henrique Reimer 2013-08-06 23:17:16 Re: Exit code -1073741819
Previous Message John R Pierce 2013-08-06 19:52:55 Re: Fwd: Sharing data directories between machines