From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Chris Travers <chris(dot)travers(at)adjust(dot)com>
Cc: Ildus Kurbangaliev <i(dot)kurbangaliev(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [HACKERS] Custom compression methods
Date: 2019-03-21 19:59:56
Message-ID: e02b7f01-ad9a-8b4c-609e-092a37d75926@2ndquadrant.com
Lists: pgsql-hackers
On 3/19/19 4:44 PM, Chris Travers wrote:
>
>
> On Tue, Mar 19, 2019 at 12:19 PM Tomas Vondra
> <tomas(dot)vondra(at)2ndquadrant(dot)com <mailto:tomas(dot)vondra(at)2ndquadrant(dot)com>> wrote:
>
>
> On 3/19/19 10:59 AM, Chris Travers wrote:
> >
> >
> > Not discussing whether any particular committer should pick this up
> > but I want to discuss an important use case we have at Adjust for
> > this sort of patch.
> >
> > The PostgreSQL compression strategy is something we find inadequate
> > for at least one of our large deployments (a large debug log spanning
> > 10PB+). Our current solution is to set storage so that it does not
> > compress and then run on ZFS to get compression speedups on spinning
> > disks.
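[For reference, the "set storage so that it does not compress" workaround
described above is done with ALTER TABLE ... SET STORAGE; a minimal
sketch, where the debug_log table and payload column names are
hypothetical:

```sql
-- Keep large values out-of-line (TOASTed) but uncompressed,
-- leaving compression to the underlying filesystem (e.g. ZFS).
ALTER TABLE debug_log
    ALTER COLUMN payload SET STORAGE EXTERNAL;
```

EXTERNAL stores the column out-of-line without PostgreSQL's built-in
compression, which is what makes filesystem-level compression effective.]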
> >
> > But running PostgreSQL on ZFS has some annoying costs because we have
> > copy-on-write on copy-on-write, and when you add file fragmentation...
> > I would really like to be able to get away from having to do ZFS as
> > an underlying filesystem. While we have good write throughput, read
> > throughput is not as good as I would like.
> >
> > An approach that would give us better row-level compression would
> > allow us to ditch the COW-filesystem-under-PostgreSQL approach.
> >
> > So I think the benefits are actually quite high, particularly for
> > those dealing with volume/variety problems where things like JSONB
> > might be a go-to solution. Similarly I could totally see systems
> > which handle large amounts of specialized text having extensions for
> > dealing with these.
> >
>
> Sure, I don't disagree - the proposed compression approach may be a big
> win for some deployments further down the road, no doubt about it. But
> as I said, it's unclear when we get there (or if the interesting stuff
> will be in some sort of extension, which I don't oppose in principle).
>
>
> I would assume that if extensions are particularly stable and useful
> they could be moved into core.
>
> But I would also assume that at first, this area would be sufficiently
> experimental that folks (like us) would write our own extensions for it.
>
>
>
> >
> > But hey, I think there are committers working for postgrespro, who
> > might have the motivation to get this over the line. Of course,
> > assuming that there are no serious objections to having this
> > functionality or how it's implemented ... But I don't think that was
> > the case.
> >
> >
> > While I am not currently able to speak for questions of how it is
> > implemented, I can say with very little doubt that we would almost
> > certainly use this functionality if it were there, and I could see
> > plenty of other cases where this would be a very appropriate
> > direction for some other projects as well.
> >
> Well, I guess the best thing you can do to move this patch forward is to
> actually try that on your real-world use case, and report your results
> and possibly do a review of the patch.
>
>
> Yeah, I expect to do this within the next month or two.
>
>
>
> IIRC there was an extension [1] leveraging this custom compression
> interface for better jsonb compression, so perhaps that would work for
> you (not sure if it's up to date with the current patch, though).
>
> [1]
> https://www.postgresql.org/message-id/20171130182009.1b492eb2%40wp.localdomain
>
> Yeah I will be looking at a couple different approaches here and
> reporting back. I don't expect it will be a full production workload but
> I do expect to be able to report on benchmarks in both storage and
> performance.
>
FWIW I was a bit curious how that jsonb compression would affect the
data set I'm using for testing the jsonpath patches, so I spent a bit
of time getting it to work with master. The attached patch gets it to
compile, but unfortunately it then fails like this:

    ERROR: jsonbd: worker has detached

It seems there's some bug in how shm_mq is used, but I don't have time
to investigate that further.
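[For anyone looking into this: the error is consistent with a receive on
a PostgreSQL shm_mq whose worker endpoint has gone away. A minimal
sketch of the relevant pattern, assuming a backend holding a queue
handle to a compression worker -- this is illustrative, not jsonbd's
actual code, and receive_from_worker is a hypothetical name:

```c
#include "postgres.h"
#include "storage/shm_mq.h"

static void
receive_from_worker(shm_mq_handle *mqh)
{
    shm_mq_result res;
    Size          nbytes;
    void         *data;

    /* Block until the worker sends a message (nowait = false). */
    res = shm_mq_receive(mqh, &nbytes, &data, false);

    /*
     * If the worker process exits or crashes, its end of the queue is
     * detached and shm_mq_receive returns SHM_MQ_DETACHED -- the
     * condition a "worker has detached" error would report.
     */
    if (res == SHM_MQ_DETACHED)
        ereport(ERROR,
                (errmsg("jsonbd: worker has detached")));
}
```

So the bug is presumably either the worker dying unexpectedly or the
queue being attached/detached in the wrong order, rather than data
corruption in the queue itself.]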
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment: jsonbd-master-fix.patch (text/x-patch, 7.6 KB)