From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Marko Kreen <markokr(at)gmail(dot)com> |
Cc: | Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Noah Misch <noah(at)leadboat(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, david(at)fetter(dot)org |
Subject: | Re: MD5 aggregate |
Date: | 2013-06-27 15:44:28 |
Message-ID: | CA+Tgmoaa5kMEVZRoGQkUmZ8ykBN9Qv3iqUxyZi8o7U8Y_V_9YA@mail.gmail.com |
Lists: | pgsql-hackers |

On Thu, Jun 27, 2013 at 7:29 AM, Marko Kreen <markokr(at)gmail(dot)com> wrote:
> On Thu, Jun 27, 2013 at 11:28 AM, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> wrote:
>> On 26 June 2013 21:46, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
>>> On 6/26/13 4:04 PM, Dean Rasheed wrote:
>>>> A quick google search reveals several people asking for something like
>>>> this, and people recommending md5(string_agg(...)) or
>>>> md5(string_agg(md5(...))) based solutions, which are doomed to failure
>>>> on larger tables.
>>>
>>> The thread discussed several other options of checksumming tables that
>>> did not have the air of a cryptographic offering, as Noah put it.
>>>
>>
>> True but md5 has the advantage of being directly comparable with the
>> output of Unix md5sum, which would be useful if you loaded data from
>> external files and wanted to confirm that your import process didn't
>> mangle it.
>
> The problem with md5_agg() is that it's only useful in toy scenarios.
>
> It's more useful give people script that does same sum(hash(row))
> on dump file than try to run MD5 on ordered rows.
>
> Also, I don't think anybody actually cares about MD5(table-as-bytes), instead
> people want way to check if 2 tables or table and dump are same.

I think you're trying to tell Dean to write the patch that you want
instead of the patch that he wants. There are certainly other things
that could be done that some people might sometimes prefer, but that
doesn't mean what he did isn't useful.

That having been said, I basically agree with Noah: I think this would
be a useful extension (perhaps even in contrib?) but I don't think we
need to install it by default. It's useful, but it's also narrow.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
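
To make the two approaches discussed in this thread concrete, here is a minimal sketch in Python. It is not the md5_agg patch or any PostgreSQL code. It assumes rows are serialized as newline-terminated text lines, as in a dump file, and the function names md5_ordered and sum_of_row_hashes are hypothetical. The first function mimics running md5sum over rows in a fixed order; the second mimics the sum(hash(row)) idea, which does not depend on row order.

```python
import hashlib


def md5_ordered(rows):
    """Streaming MD5 over rows in the order given (order-sensitive).

    Assumption: each row is one text line terminated by a newline,
    so the result matches md5sum on a file with the same lines.
    """
    h = hashlib.md5()
    for row in rows:
        h.update(row.encode("utf-8"))
        h.update(b"\n")
    return h.hexdigest()


def sum_of_row_hashes(rows, bits=128):
    """Order-independent checksum: sum of per-row MD5 values mod 2**bits."""
    total = 0
    for row in rows:
        digest = hashlib.md5(row.encode("utf-8")).digest()
        total += int.from_bytes(digest, "big")
    return total % (1 << bits)


# Hypothetical sample data: tab-separated rows, as a dump file might hold.
rows = ["1\talice", "2\tbob", "3\tcarol"]
shuffled = [rows[2], rows[0], rows[1]]

# The ordered digest changes when rows are reordered; the sum does not.
print(md5_ordered(rows) == md5_ordered(shuffled))
print(sum_of_row_hashes(rows) == sum_of_row_hashes(shuffled))
```

The ordered variant only agrees between a table and a file if both sides emit rows in the same order and with the same text encoding, which is why it needs an ORDER BY on the database side. The summed variant avoids that requirement, but it is not comparable with md5sum output.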