Re: Call for Google Summer of Code (GSoC) 2012: Project ideas?

From: Andy Colson <andy(at)squeakycode(dot)net>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Stefan Keller <sfkeller(at)gmail(dot)com>, pgsql-general List <pgsql-general(at)postgresql(dot)org>
Subject: Re: Call for Google Summer of Code (GSoC) 2012: Project ideas?
Date: 2012-03-09 16:19:54
Message-ID: 4F5A2DAA.3020100@squeakycode.net
Lists: pgsql-general

On 3/9/2012 9:47 AM, Merlin Moncure wrote:
> On Thu, Mar 8, 2012 at 2:01 PM, Andy Colson <andy(at)squeakycode(dot)net> wrote:
>> I know toast compresses, but I believe it's only one row. page level would
>> compress better because there is more data, and it would also decrease the
>> amount of IO, so it might speed up disk access.
>
> er, but when data is toasted it's spanning pages. page level
> compression is a super complicated problem.
>
> something that is maybe more attainable on the compression side of
> things is a userland api for compression -- like pgcrypto is for
> encryption. even if it didn't make it into core, it could live on
> reasonably as a pgfoundry project.
>
> merlin

Agreed, it's probably too difficult for a GSoC project. But a userland API
would still be row level, which, in my opinion, is useless. Consider
rows from my Apache log that I'm dumping into a database:

date, url, status
2012-3-9 10:15:00, '/index.php?id=4', 202
2012-3-9 10:15:01, '/index.php?id=5', 202
2012-3-9 10:15:02, '/index.php?id=6', 202

That won't compress at all on a row level. But it'll compress roughly 99% on a
"larger" (page/multirow/whatever) level.
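
A rough way to check this is to compress some synthetic rows shaped like the
sample above, once row by row and once as a single block. This is only a
sketch: it uses Python's zlib as a stand-in (not the pglz compressor that
TOAST uses), the rows are made up, and the helper names make_rows and
compare are hypothetical. The exact ratio will depend on the data and the
compressor.

```python
import zlib
from datetime import datetime, timedelta


def make_rows(n):
    """Build n synthetic log rows shaped like the sample (made-up data)."""
    start = datetime(2012, 3, 9, 10, 15, 0)
    rows = []
    for i in range(n):
        t = start + timedelta(seconds=i)
        # Match the unpadded date style of the sample: 2012-3-9 10:15:00
        stamp = f"{t.year}-{t.month}-{t.day} {t:%H:%M:%S}"
        rows.append(f"{stamp}, '/index.php?id={i + 4}', 202".encode("ascii"))
    return rows


def compare(rows):
    """Return (raw, per_row, block) sizes in bytes.

    raw     -- total size of the uncompressed rows
    per_row -- total size when each row is compressed on its own
    block   -- size when all rows are joined and compressed together
               (the newline separators are included in this figure)
    """
    raw = sum(len(r) for r in rows)
    per_row = sum(len(zlib.compress(r)) for r in rows)
    block = len(zlib.compress(b"\n".join(rows)))
    return raw, per_row, block


if __name__ == "__main__":
    raw, per_row, block = compare(make_rows(1000))
    print(f"raw bytes:            {raw}")
    print(f"compressed per row:   {per_row}")
    print(f"compressed as block:  {block}")
```

Each row is too short and too unique for the compressor to find repeats
inside it, so the per-row total does not shrink. The repetition only shows
up across rows, which is what the block figure captures.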

-Andy
