From: | "Cimode" <cimode(at)hotmail(dot)com> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: cyclical redundancy checksum algorithm(s)? |
Date: | 2006-09-28 12:22:38 |
Message-ID: | 1159446158.259610.222230@i3g2000cwc.googlegroups.com |
Lists: | pgsql-general |
Karen Hill wrote:
> Tom Lane wrote:
> > "Karen Hill" <karen_hill22(at)yahoo(dot)com> writes:
> > > Ralph Kimball states that this is a way to check for changes. You just
> > > have an extra column for the crc checksum. When you go to update data,
> > > generate a crc checksum and compare it to the one in the crc column.
> > > If they are same, your data has not changed.
> >
> > You sure that's actually what he said? A change in CRC proves the data
> > changed, but lack of a change does not prove it didn't.
>
>
> On page 100 in the book, "The Data Warehouse Toolkit" Second Edition,
> Ralph Kimball writes the following:
>
> "Rather than checking each field to see if something has changed, we
> instead compute a checksum for the entire row all at once. A cyclic
> redundancy checksum (CRC) algorithm helps us quickly recognize that a
> wide messy row has changed without looking at each of its constituent
> fields."
>
> On page 360 he writes:
>
> "To quickly determine if rows have changed, we rely on a cyclic
> redundancy checksum (CRC) algorithm. If the CRC is identical for the
> extracted record and the most recent row in the master table, then we
> ignore the extracted record. We don't need to check every column to be
> certain that the two rows match exactly."
> > People do sometimes use this logic in connection with much wider
> > "summary" functions, such as an MD5 hash. I wouldn't trust it at all
> > with a 32-bit CRC, and not much with a 64-bit CRC. Too much risk of
> > collision.
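The comparison described in the quoted passages can be sketched as follows. This is only an illustration, not Kimball's or PostgreSQL's implementation: the helper names (`row_bytes`, `row_crc32`, `row_md5`, `probably_unchanged`) and the length-prefixed serialization are assumptions made for the example. It shows both the 32-bit CRC and the wider MD5 digest; with either one, equal digests only suggest that the row is unchanged, and the chance of a false match for a 32-bit value is on the order of 1 in 2^32 per comparison.

```python
import hashlib
import zlib


def row_bytes(row):
    """Serialize a row (a sequence of column values) to bytes.

    Each field is length-prefixed so that ("ab", "c") and ("a", "bc")
    produce different byte strings; None (SQL NULL) gets its own marker.
    Values are rendered with str(), so column types are assumed fixed.
    """
    parts = []
    for value in row:
        if value is None:
            parts.append(b"N")
        else:
            data = str(value).encode("utf-8")
            parts.append(b"V" + str(len(data)).encode("ascii") + b":" + data)
    return b"".join(parts)


def row_crc32(row):
    """32-bit CRC of the serialized row (small, so collisions are likely at scale)."""
    return zlib.crc32(row_bytes(row)) & 0xFFFFFFFF


def row_md5(row):
    """128-bit MD5 digest of the serialized row, as a hex string."""
    return hashlib.md5(row_bytes(row)).hexdigest()


def probably_unchanged(stored_digest, extracted_row):
    """True if the extracted row hashes to the stored digest.

    A mismatch proves the row changed; a match does not prove it did not.
    """
    return stored_digest == row_md5(extracted_row)
```

A different digest is proof of a change, so the cheap path is safe in that direction; if a missed update is unacceptable, a match can still be confirmed with a column-by-column comparison.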
How do you calculate a checksum on a binary file stored in a database?
What part of the file are you going to use for computations?
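
One possible answer, as a minimal sketch: a checksum is normally computed over every byte of the value, fed to the algorithm in chunks so the whole file never has to sit in memory. The function name `stream_checksums` and the 64 KiB chunk size are assumptions made for this example; how the bytes are read out of the database (a bytea column, a large object, and so on) is left open, and the sketch only assumes a file-like object with a `read()` method.

```python
import hashlib
import io
import zlib


def stream_checksums(stream, chunk_size=64 * 1024):
    """Return (crc32, md5_hex) computed over all bytes of a binary stream.

    The stream is consumed chunk by chunk; both checksums are updated
    incrementally, so the result equals a single pass over the full data.
    """
    crc = 0
    md5 = hashlib.md5()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        crc = zlib.crc32(chunk, crc)  # continue the running CRC
        md5.update(chunk)
    return crc & 0xFFFFFFFF, md5.hexdigest()


if __name__ == "__main__":
    # Stand-in for a binary value fetched from the database.
    payload = io.BytesIO(b"\x00\x01\x02binary file contents\xff")
    print(stream_checksums(payload))
```

Hashing only part of the file would be faster, but any change outside that part would go undetected, which is why the sketch covers the whole content.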