From: | Christian Convey <christian(dot)convey(at)gmail(dot)com> |
---|---|
To: | Craig Ringer <craig(at)2ndquadrant(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: alternative back-end block formats |
Date: | 2014-01-27 18:42:29 |
Message-ID: | CAPfS4ZzwxnQuYjEBnmd0eiYW3t85o4YOvGXfqK=AcNOgKc77rQ@mail.gmail.com |
Lists: | pgsql-hackers |
Hi Craig,
On Sun, Jan 26, 2014 at 5:47 AM, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:
> On 01/21/2014 07:43 PM, Christian Convey wrote:
> > Hi all,
> >
> > I'm playing around with Postgres, and I thought it might be fun to
> > experiment with alternative formats for relation blocks, to see if I can
> > get smaller files and/or faster server performance.
>
> It's not clear how you'd do this without massively rewriting the guts of
> Pg.
>
> Per the docs on internal structure, Pg has a block header, then tuples
> within the blocks, each with a tuple header and list of Datum values for
> the tuple. Each Datum has a generic Datum header (handling varlena vs
> fixed length values etc) then a type-specific on-disk representation
> controlled by the type output function for that type.
>
I'm still in the process of getting familiar with the pg backend code, so I
don't have a concrete plan yet. However, I'm working on the assumption
that some set of macros and functions encapsulates the page layout.
If/when I tackle this, I expect to add a layer of indirection somewhere
around that boundary, so that some non-catalog tables, whose schemas meet
certain simplifying assumptions, are read and modified using specialized
code.
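
As a rough illustration of the kind of indirection described above, here is a minimal C sketch. It is not PostgreSQL code and does not use any PostgreSQL API: `BlockFormatOps`, `choose_format`, the `fixed_*` functions, and the toy fixed-width layout are all hypothetical names invented for this example. The assumption is simply that callers reach block contents through a small table of function pointers, and that only non-catalog tables with all fixed-width columns are routed to the specialized code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch only: none of these names exist in PostgreSQL. */

#define BLOCK_SIZE 8192

/* One table of operations per block format; callers go through this. */
typedef struct BlockFormatOps {
    const char *name;
    size_t (*record_count)(const uint8_t *block);
    /* Copies record idx into out; returns bytes copied, or 0 on failure. */
    size_t (*read_record)(const uint8_t *block, size_t idx,
                          uint8_t *out, size_t out_len);
    /* Returns 1 on success, 0 if the record does not fit the format. */
    int (*append_record)(uint8_t *block, const uint8_t *rec, size_t len);
} BlockFormatOps;

/* Toy specialized layout: a tiny header followed by fixed-width records. */
typedef struct FixedHeader {
    uint16_t count;  /* number of records stored */
    uint16_t width;  /* size of every record, in bytes */
} FixedHeader;

static FixedHeader fixed_header(const uint8_t *block)
{
    FixedHeader h;
    memcpy(&h, block, sizeof h);
    return h;
}

static void fixed_init(uint8_t *block, uint16_t width)
{
    FixedHeader h = {0, width};
    assert(width > 0);
    memset(block, 0, BLOCK_SIZE);
    memcpy(block, &h, sizeof h);
}

static size_t fixed_record_count(const uint8_t *block)
{
    return fixed_header(block).count;
}

static size_t fixed_read_record(const uint8_t *block, size_t idx,
                                uint8_t *out, size_t out_len)
{
    FixedHeader h = fixed_header(block);
    if (idx >= h.count || out_len < h.width)
        return 0;
    memcpy(out, block + sizeof h + idx * h.width, h.width);
    return h.width;
}

static int fixed_append_record(uint8_t *block, const uint8_t *rec, size_t len)
{
    FixedHeader h = fixed_header(block);
    size_t used = sizeof h + (size_t)h.count * h.width;
    if (len != h.width || used + h.width > BLOCK_SIZE)
        return 0;
    memcpy(block + used, rec, h.width);
    h.count++;
    memcpy(block, &h, sizeof h);
    return 1;
}

static const BlockFormatOps fixed_width_format = {
    "fixed-width", fixed_record_count, fixed_read_record, fixed_append_record
};

/*
 * The indirection point: pick the specialized format only for non-catalog
 * tables whose schema meets the simplifying assumption (here: every column
 * is fixed width). NULL means "use the standard layout code path".
 */
static const BlockFormatOps *choose_format(int is_catalog,
                                           int all_columns_fixed_width)
{
    if (!is_catalog && all_columns_fixed_width)
        return &fixed_width_format;
    return NULL;
}
```

A caller would initialize a block with `fixed_init(block, width)`, ask `choose_format()` for the operations table, and then call `ops->append_record()` and `ops->read_record()` without knowing which layout is underneath. Where such a boundary would actually sit in the backend is exactly the open question.
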
I'd rather not get into the specific optimizations I'd like to try yet: I
haven't fully studied the code, and I don't want to put my foot in my
mouth.
> What concrete problem do you mean to tackle? What idea do you want to
> explore or implement?
>
My real motivation is that I'd like to get more familiar with the pg
backend codebase, and tilting at this windmill seemed like an interesting
way to accomplish that.
If I were focused on solving a real-world problem, I'd say that this
lays the groundwork for table-schema-specific storage optimizations and
optimized record-filtering code. But I'd only make that argument if I
planned to (a) perform a careful study with statistically significant
benchmarks, and/or (b) produce a merge-worthy patch. At this point I have
no intention of doing either. My main goal really is just to have fun with
the code.
> > Does anyone know if this has been done before with Postgres? I would
> > have assumed yes, but I'm not finding anything in Google about people
> > having done this.
>
> AFAIK (and I don't know much in this area) the storage manager isn't
> very pluggable compared to the rest of Pg.
>
Thanks for the warning. Duly noted.
Kind regards,
Christian