From: | Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Joachim Wieland <joe(at)mcknight(dot)de>, Greg Stark <gsstark(at)mit(dot)edu>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: a faster compression algorithm for pg_dump |
Date: | 2010-04-14 08:25:17 |
Message-ID: | 4BC57BED.1090705@kaltenbrunner.cc |
Lists: | pgsql-hackers |
Tom Lane wrote:
> Joachim Wieland <joe(at)mcknight(dot)de> writes:
>> If we still cannot do this, then what I am asking is: What does the
>> project need to be able to at least link against such a compression
>> algorithm?
>
> Well, what we *really* need is a convincing argument that it's worth
> taking some risk for. I find that not obvious. You can pipe the output
> of pg_dump into your-choice-of-compressor, for example, and that gets
> you the ability to spread the work across multiple CPUs in addition to
> eliminating legal risk to the PG project. And in any case the general
> impression seems to be that the main dump-speed bottleneck is on the
> backend side not in pg_dump's compression.
Legal risks aside (I'm not a lawyer, so I cannot comment on that), the
current situation imho is:
* for a plain pg_dump the backend is the bottleneck
* for a pg_dump -Fc with compression, compression is a huge bottleneck
* for pg_dump | gzip, it is usually compression (or bytea and some other
datatypes in <9.0)
* for a parallel dump you can dump uncompressed and compress
afterwards, which increases disk space requirements (and if you need
parallel dump you usually have a large database) and complexity (because
you would have to think about how to manually parallelize the compression)
* for a parallel dump that compresses inline you are limited by the
compression algorithm on a per-core basis, and given that the current
inline compression overhead is huge you lose a lot of the benefits of
parallel dump
Stefan
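
The trade-off in the last two points (one compression stream bound to a single core versus compression work split up by hand) can be illustrated with a minimal sketch. This is not how pg_dump works internally; it only uses Python's standard zlib module, and the function names, the chunk size, and the worker count are hypothetical choices made for the illustration.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from typing import List

CHUNK_SIZE = 1 << 20  # 1 MiB per chunk; an arbitrary value chosen for this sketch


def compress_serial(data: bytes, level: int = 6) -> bytes:
    """Compress the whole buffer as one zlib stream, i.e. on a single core."""
    return zlib.compress(data, level)


def compress_chunks_parallel(data: bytes, workers: int = 4, level: int = 6) -> List[bytes]:
    """Compress fixed-size chunks independently so several cores can share the work.

    Each chunk becomes its own zlib stream, so the result is a list of
    streams rather than one continuous stream.
    """
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # CPython's zlib releases the GIL while compressing, so the
        # worker threads can run on separate cores.
        return list(pool.map(lambda chunk: zlib.compress(chunk, level), chunks))


def decompress_chunks(streams: List[bytes]) -> bytes:
    """Reverse compress_chunks_parallel by decompressing and joining the chunks."""
    return b"".join(zlib.decompress(stream) for stream in streams)
```

Splitting the input this way is the "manual" parallelization mentioned above: it adds bookkeeping (the chunk boundaries have to be tracked) and may cost some compression ratio, because each chunk is compressed without knowledge of the others.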