From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | Stephen Frost <sfrost(at)snowman(dot)net> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Marko Kreen <markokr(at)gmail(dot)com>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: MD5 aggregate |
Date: | 2013-06-14 13:59:01 |
Message-ID: | 51BB21A5.3060300@dunslane.net |
Lists: | pgsql-hackers |
On 06/14/2013 09:40 AM, Stephen Frost wrote:
> * Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
>> Marko Kreen <markokr(at)gmail(dot)com> writes:
>>> On Thu, Jun 13, 2013 at 12:35 PM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
>>>> Attached is a patch implementing a new aggregate function md5_agg() to
>>>> compute the aggregate MD5 sum across a number of rows.
>>> It's more efficient to calculate per-row md5, and then sum() them.
>>> This avoids the need for ORDER BY.
>> Good point. The aggregate md5 function also fails to distinguish the
>> case where we have 'xyzzy' followed by 'xyz' in two adjacent rows
>> from the case where they contain 'xyz' followed by 'zyxyz'.
>>
>> Now, as against that, you lose any sensitivity to the ordering of the
>> values.
>>
>> Personally I'd be a bit inclined to xor the per-row md5's rather than
>> sum them, but that's a small matter.
> Where I'd take this is actually in a completely different direction..
> I'd like the aggregate to be able to match the results of running the
> 'md5sum' unix utility on a file that's been COPY'd out. Yes, that means
> we'd need a way to get back "what would this row look like if it was
> sent through COPY with these parameters", but I've long wanted that
> also.
>
> No, no clue about how to put all that together. Yes, having this would
> be better than nothing, so I'm still for adding this even if we can't
> make it match COPY output. :)
>
>
I'd rather go the other way: hash the records as they are, without having to
convert them into anything else first. Turning things into text must slow
things down, surely.
cheers
andrew
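
For readers following the quoted discussion, here is a minimal Python sketch of the two points raised above: the concatenation ambiguity Tom describes, and the order-insensitive alternatives of summing or xor-ing per-row digests. It is an illustration only, not the code from Dean's patch. The function names `md5_concat`, `md5_sum`, and `md5_xor` are hypothetical, and it assumes rows are plain strings hashed as UTF-8 with the sum taken modulo 2^128.

```python
import hashlib


def _row_digest(row: str) -> int:
    """MD5 of a single row, as a 128-bit integer."""
    return int.from_bytes(hashlib.md5(row.encode("utf-8")).digest(), "big")


def md5_concat(rows):
    """MD5 over the rows fed in sequence, with no separator between them.

    Assumption: this mimics an aggregate that simply streams each value
    into one running MD5 state. The result depends on row order.
    """
    h = hashlib.md5()
    for row in rows:
        h.update(row.encode("utf-8"))
    return h.hexdigest()


def md5_sum(rows):
    """Sum of per-row MD5 digests modulo 2**128. Independent of row order."""
    acc = 0
    for row in rows:
        acc = (acc + _row_digest(row)) % (1 << 128)
    return f"{acc:032x}"


def md5_xor(rows):
    """XOR of per-row MD5 digests. Independent of row order."""
    acc = 0
    for row in rows:
        acc ^= _row_digest(row)
    return f"{acc:032x}"


if __name__ == "__main__":
    a = ["xyzzy", "xyz"]
    b = ["xyz", "zyxyz"]
    # Both row sets concatenate to 'xyzzyxyz', so the streaming hash collides.
    print(md5_concat(a) == md5_concat(b))
    # Per-row digests keep the row boundaries, so these differ.
    print(md5_sum(a) == md5_sum(b), md5_xor(a) == md5_xor(b))
    # Neither combiner notices a reordering of the same rows.
    print(md5_sum(a) == md5_sum(a[::-1]), md5_xor(a) == md5_xor(a[::-1]))
```

One design difference between the two combiners in this sketch: under xor, a row that appears twice cancels itself out, whereas the modular sum still reflects the duplicate.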