Dhaval Shah wrote:
> 2. Most of the streamed rows are very similar. Think syslog rows,
> where for most cases only the timestamp changes. Of course, if the
> data can be compressed, it will result in improved savings in terms of
> disk size.
If it really is usually just the timestamp that changes, one way to
"compress" such data might be to split your logical row into two
tables. First table has all the original columns but the timestamp,
plus an ID. Second table has the timestamp and a foreign key into
the first table. Depending on how wide your original row is, and how
often it's only the timestamp that changes, this could result in
decent "compression".
Of course, now you need referential integrity.
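To make the split concrete, here is a rough sketch. It uses Python's built-in sqlite3 module only so that it runs self-contained; it is not tuned for any particular database. The table names (log_message, log_event), the columns (host, facility, message) and the helper insert_row are all made up for illustration, so substitute whatever your real row contains.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: everything except the timestamp goes in log_message,
# and each occurrence is just a timestamp plus a foreign key.
conn.executescript("""
CREATE TABLE log_message (
    id       INTEGER PRIMARY KEY,
    host     TEXT NOT NULL,
    facility TEXT NOT NULL,
    message  TEXT NOT NULL,
    UNIQUE (host, facility, message)
);
CREATE TABLE log_event (
    ts         TEXT NOT NULL,
    message_id INTEGER NOT NULL REFERENCES log_message(id)
);
""")

def insert_row(conn, ts, host, facility, message):
    """Store one logical row, reusing the wide part if it was seen before."""
    # The UNIQUE constraint makes this a no-op for repeated content.
    conn.execute(
        "INSERT OR IGNORE INTO log_message (host, facility, message) "
        "VALUES (?, ?, ?)",
        (host, facility, message),
    )
    (message_id,) = conn.execute(
        "SELECT id FROM log_message "
        "WHERE host = ? AND facility = ? AND message = ?",
        (host, facility, message),
    ).fetchone()
    conn.execute(
        "INSERT INTO log_event (ts, message_id) VALUES (?, ?)",
        (ts, message_id),
    )
    return message_id

# Two rows that differ only in timestamp, plus one different row.
insert_row(conn, "2008-01-01 00:00:01", "web1", "cron", "job started")
insert_row(conn, "2008-01-01 00:05:01", "web1", "cron", "job started")
insert_row(conn, "2008-01-01 00:05:02", "web1", "auth", "login failed")

# Joining the two tables gives back the original logical rows.
rows = conn.execute(
    "SELECT e.ts, m.host, m.facility, m.message "
    "FROM log_event e JOIN log_message m ON m.id = e.message_id "
    "ORDER BY e.ts"
).fetchall()
```

In this sketch the wide part of the repeated row is stored once and referenced twice, and the foreign key rejects a timestamp row that points at a nonexistent message. The lookup on every insert is the price you pay; whether it is worth it depends on row width and on how often rows repeat.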
- John D. Burger
MITRE