From: | "ktm(at)rice(dot)edu" <ktm(at)rice(dot)edu> |
---|---|
To: | "Ross J(dot) Reedstrom" <reedstrm(at)rice(dot)edu> |
Cc: | Peter Eisentraut <peter_e(at)gmx(dot)net>, Marko Kreen <markokr(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: sha1, sha2 functions into core? |
Date: | 2011-09-03 18:59:39 |
Message-ID: | 20110903185939.GX19360@staff-mud-56-27.rice.edu |
Lists: | pgsql-hackers |
On Fri, Sep 02, 2011 at 04:27:46PM -0500, Ross J. Reedstrom wrote:
> On Fri, Sep 02, 2011 at 02:05:45PM -0500, ktm(at)rice(dot)edu wrote:
> > On Fri, Sep 02, 2011 at 09:54:07PM +0300, Peter Eisentraut wrote:
> > > On ons, 2011-08-31 at 13:12 -0500, Ross J. Reedstrom wrote:
> > > > Hmm, this thread seems to have petered out without a conclusion. Just
> > > > wanted to comment that there _are_ non-password storage uses for these
> > > > digests: I use them in a context of storing large files in a bytea
> > > > column, as a means to doing data deduplication, and avoiding pushing
> > > > files from clients to server and back.
> > >
> > > But I suppose you don't need the hash function in the database system
> > > for that.
> > >
> >
> > It is very useful to have the same hash function used internally by
> > PostgreSQL exposed externally. I know you can get the code and add an
> > equivalent one of your own...
> >
> Thanks for the support Ken, but Peter's right: the only backend use in
> my particular case is to let the backend do the hash calc during bulk
> loads: in the production code path, having the hash in two places
> doesn't save any work, since the client code has to calculate the hash
> in order to test for its existence in the backend. I suppose if the
> network cost was negligible, I could just push the files anyway, and
> have a before-insert trigger calculate the hash and do the dedup: then
> it'd be hidden in the backend completely. But as is, I can do all the
> work in the client.
>
While it is true that it doesn't save any work, my motivation for having
it exposed is that "good" hash functions are non-trivial to find. I have
dealt with computational artifacts produced by hash functions that seemed
at first to be good. We use a very well-behaved function within the
database, and exposing it will help prevent bad user hash function
implementations.
Regards,
Ken