From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, David Steele <david(at)pgmasters(dot)net>, Ildus Kurbangaliev <i(dot)kurbangaliev(at)gmail(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [HACKERS] Custom compression methods
Date: 2021-02-09 08:38:07
Message-ID: CAFiTN-tZgTTejUb1tU+=LFUr5LrCbLw=x8Bxz8O0ZjN+U=Cu4w@mail.gmail.com
Lists: pgsql-hackers
On Sun, Feb 7, 2021 at 5:15 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Fri, Feb 5, 2021 at 8:11 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > On Wed, Feb 3, 2021 at 2:07 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > >
> > > Even more review comments, still looking mostly at 0001:
> > >
> > > If there's a reason why parallel_schedule is arranging to run the
> > > compression test in parallel with nothing else, the comment in that
> > > file should explain the reason. If there isn't, it should be added to
> > > a parallel group that doesn't have the maximum number of tests yet,
> > > probably the last such group in the file.
> > >
> > > serial_schedule should add the test in a position that roughly
> > > corresponds to where it appears in parallel_schedule.
> >
> > Done
> >
> > > I believe it's relatively standard practice to put variable
> > > declarations at the top of the file. compress_lz4.c and
> > > compress_pglz.c instead put those declarations nearer to the point of
> > > use.
> >
> > Do you mean pglz_compress_methods and lz4_compress_methods? I
> > followed that style from heapam_handler.c. If you think that doesn't
> > look good then I can move it up.
> >
> > > compressamapi.c has an awful lot of #include directives for the code
> > > it actually contains. I believe that we should cut that down to what
> > > is required by 0001, and other patches can add more later as required.
> > > In fact, it's tempting to just get rid of this .c file altogether and
> > > make the two functions it contains static inline functions in the
> > > header, but I'm not 100% sure that's a good idea.
> >
> > I think it looks better to move them to compressamapi.h, so I have done that.
> >
> > > The copyright dates in a number of the file headers are out of date.
> >
> > Fixed
> >
> > > binary_upgrade_next_pg_am_oid and the related changes to
> > > CreateAccessMethod don't belong in 0001, because it doesn't support
> > > non-built-in compression methods. These changes and the related
> > > pg_dump change should be moved to the patch that adds support for
> > > that.
> >
> > Fixed
> >
> > > The comments added to dumpTableSchema() say that "compression is
> > > assigned by ALTER" but don't give a reason. I think they should. I
> > > don't know how much they need to explain about what the code does, but
> > > they definitely need to explain why it does it. Also, isn't this bad?
> > > If we create the column with the wrong compression setting initially
> > > and then ALTER it, we have to rewrite the table. If it's empty, that's
> > > cheap, but it'd still be better not to do it at all.
> >
> > Yeah, actually that part should go in the 0003 patch where we
> > implement the custom compression method. In that patch we need to
> > alter and set because we want to keep the preserved method as well,
> > so I will add it there.
> >
> > > I'm not sure it's a good idea for dumpTableSchema() to leave out
> > > specifying the compression method if it happens to be pglz. I think we
> > > definitely shouldn't do it in binary-upgrade mode. What if we changed
> > > the default in a future release? For that matter, even 0002 could make
> > > the current approach unsafe.... I think, anyway.
> >
> > Fixed
> >
> >
> > > The changes to pg_dump.h look like they haven't had a visit from
> > > pgindent. You should probably try to do that for the whole patch,
> > > though it's a bit annoying since you'll have to manually remove
> > > unrelated changes to the same files that are being modified by the
> > > patch. Also, why the extra blank line here?
> >
> > Fixed, ran pgindent for other files as well.
> >
> > > GetAttributeCompression() is hard to understand. I suggest changing
> > > the comment to "resolve column compression specification to an OID"
> > > and somehow rejigger the code so that you aren't using one not-NULL
> > > test and one NULL test on the same variable. Like maybe change the
> > > first part to if (!IsStorageCompressible(typstorage)) { if
> > > (compression == NULL) return InvalidOid; ereport(ERROR, ...); }
> >
> > Done
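> >
> > For reference, the resolved shape now looks roughly like this (a
> > sketch, not the exact patch code; DefaultCompressionOid and
> > get_compression_am_oid here are stand-ins for whatever the patch
> > actually names them):
> > ---
> > /* Resolve column compression specification to an OID. */
> > static Oid
> > GetAttributeCompression(Form_pg_attribute att, char *compression)
> > {
> >     /* Reject a specification for non-compressible storage up front. */
> >     if (!IsStorageCompressible(att->attstorage))
> >     {
> >         if (compression == NULL)
> >             return InvalidOid;
> >         ereport(ERROR,
> >                 (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> >                  errmsg("column data type cannot use compression")));
> >     }
> >
> >     /* No specification given: fall back to the default method. */
> >     if (compression == NULL)
> >         return DefaultCompressionOid;
> >
> >     return get_compression_am_oid(compression, false);
> > }
> > ---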
> >
> > > It puzzles me that CompareCompressionMethodAndDecompress() calls
> > > slot_getallattrs() just before clearing the slot. It seems like this
> > > ought to happen before we loop over the attributes, so that we don't
> > > need to call slot_getattr() every time. See the comment for that
> > > function. But even if we didn't do that for some reason, why would we
> > > do it here? If it's already been done, it shouldn't do anything, and
> > > if it hasn't been done, it might overwrite some of the values we just
> > > poked into tts_values. It also seems suspicious that we can get away
> > > with clearing the slot and then again marking it valid. I'm not sure
> > > it really works like that. Like, can't clearing the slot invalidate
> > > pointers stored in tts_values[]? For instance, if they are pointers
> > > into an in-memory heap tuple, tts_heap_clear() is going to free the
> > > tuple; if they are pointers into a buffer, tts_buffer_heap_clear() is
> > > going to unpin it. I think the supported procedure for this sort of
> > > thing is to have a second slot, set tts_values, tts_isnull etc. and
> > > then materialize the slot. After materializing the new slot, it's
> > > independent of the old slot, which can then be cleared. See for
> > > example tts_virtual_materialize(). The whole approach you've taken
> > > here might need to be rethought a bit. I think you are right to want
> > > to avoid copying everything over into a new slot if nothing needs to
> > > be done, and I think we should definitely keep that optimization, but
> > > I think if you need to copy stuff, you have to do the above procedure
> > > and then continue using the other slot instead of the original one.
> > > Some places I think we have functions that return either the original
> > > slot or a different one depending on how it goes; that might be a
> > > useful idea here. But, you also can't just spam-create slots; it's
> > > important that whatever ones we end up with get reused for every
> > > tuple.
> >
> > I have changed this algorithm, so now if we have to decompress
> > anything we will use the new slot, and we will stick that new slot to
> > the ModifyTableState, DR_transientrel for matviews, and DR_intorel for
> > CTAS. Does this look okay, or do we need to do something else? If this
> > logic looks fine then maybe we can think of some more optimization and
> > cleanup in this function.
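> >
> > To make the new flow concrete, the copy-and-materialize step follows
> > the pattern below (a simplified sketch: the real code reuses the
> > cached slot instead of creating one per tuple, and
> > decompress_into_slot is just an illustrative name):
> > ---
> > static TupleTableSlot *
> > decompress_into_slot(TupleTableSlot *slot, TupleTableSlot *newslot)
> > {
> >     /*
> >      * Caller has done slot_getallattrs(slot) and replaced any
> >      * compressed datums in slot->tts_values.
> >      */
> >     ExecClearTuple(newslot);
> >     memcpy(newslot->tts_values, slot->tts_values,
> >            sizeof(Datum) * slot->tts_tupleDescriptor->natts);
> >     memcpy(newslot->tts_isnull, slot->tts_isnull,
> >            sizeof(bool) * slot->tts_tupleDescriptor->natts);
> >     ExecStoreVirtualTuple(newslot);
> >
> >     /* Copy out any pointers into memory the old slot owns. */
> >     ExecMaterializeSlot(newslot);
> >
> >     return newslot;         /* caller continues with newslot */
> > }
> > ---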
> >
> >
> > > Doesn't the change to describeOneTableDetails() require changing
> > > the declaration of char *headers[11] to char *headers[12]?
> > > How does this not fail Assert(cols <= lengthof(headers))?
> >
> > Fixed
> >
> > > Why does describeOneTableDetails() arrange to truncate the printed
> > > value? We don't seem to do that for the other column properties, and
> > > it's not like this one is particularly long.
> >
> > Not required, fixed.
> >
> > > Perhaps the changes to pg_am.dat shouldn't remove the blank line?
> >
> > Fixed
> >
> > > I think the comment in pg_attribute.h could be rephrased to say
> > > something like: "OID of compression AM. Must be InvalidOid if and only
> > > if typstorage is 'a' or 'b'," replacing 'a' and 'b' with whatever the
> > > right letters are. This would be shorter and I think also clearer than
> > > what you have.
> >
> > Fixed
> >
> > > The first comment change in postgres.h is wrong. You changed
> > > va_extsize to "size in va_extinfo" but the associated structure
> > > definition is unchanged, so the comment shouldn't be changed either.
> >
> > Yup, not required.
> >
> > > In toast_internals.h, you end using 30 as a constant several times but
> > > have no #define for it. You do have a #define for RAWSIZEMASK, but
> > > that's really a derived value from 30. Also, it's not a great name
> > > because it's kind of generic. So how about something like:
> > >
> > > #define TOAST_RAWSIZE_BITS 30
> > > #define TOAST_RAWSIZE_MASK ((1 << (TOAST_RAWSIZE_BITS + 1)) - 1)
> > >
> > > But then again on second thought, this 30 seems to be the same 30 that
> > > shows up in the changes to postgres.h, and there again 0x3FFFFFFF
> > > shows up too. So maybe we should actually be defining these constants
> > > there, using names like VARLENA_RAWSIZE_BITS and VARLENA_RAWSIZE_MASK
> > > and then having toast_internals.h use those constants as well.
> >
> > Done. IMHO it should be
> > #define VARLENA_RAWSIZE_BITS 30
> > #define VARLENA_RAWSIZE_MASK ((1 << VARLENA_RAWSIZE_BITS) - 1)
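> >
> > With those, the packing in va_extinfo reads naturally. Illustration
> > only (the *_EXT_* macro names below are made up, not from the patch):
> > ---
> > /* low 30 bits: raw (uncompressed) size; top 2 bits: method */
> > #define VARLENA_EXT_SIZE(info)    ((info) & VARLENA_RAWSIZE_MASK)
> > #define VARLENA_EXT_METHOD(info)  ((info) >> VARLENA_RAWSIZE_BITS)
> > ---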
> >
> >
> > > Taken with the email I sent yesterday, I think this is a more or less
> > > complete review of 0001. Although there are a bunch of things to fix
> > > here still, I don't think this is that far from being committable. I
> > > don't at this point see too much in terms of big design problems.
> > > Probably the CompareCompressionMethodAndDecompress() is the closest to
> > > a design-level problem, and certainly something needs to be done about
> > > it, but even that is a fairly localized problem in the context of the
> > > entire patch.
> >
> > 0001 is attached, now pending parts are
> >
> > - Confirm the new design of CompareCompressionMethodAndDecompress
> > - Performance test, especially lz4 with small varlena
>
> I have tested the performance, pglz vs lz4
>
> Test1: With a small, simple (compressible) string, pglz doesn't
> attempt compression but lz4 does, as lz4 has no minimum size limit.
> Table: 100 varchar columns
> Test: Insert 1000 tuples, each column a 25-byte string (32 bytes is
> the minimum limit for pglz)
> Result:
> pglz: 1030 ms (doesn't attempt compression, so externalizes)
> lz4: 212 ms
>
> Test2: With a small incompressible string, pglz doesn't attempt
> compression; lz4 attempts it but cannot compress.
> Table: 100 varchar columns
> Test: Insert 1000 tuples, each column a 25-byte string (32 bytes is
> the minimum limit for pglz)
> Result:
> pglz: 1030 ms (doesn't attempt compression, so externalizes)
> lz4: 1090 ms (attempts to compress, but still externalizes)
>
> Test3: Test a few columns with large random data
> Table: 3 varchar columns
> Test: Insert 1000 tuples, 3 columns of sizes (3500 bytes, 4200 bytes, 4900 bytes)
> pglz: 150 ms (compression ratio: 3.02%)
> lz4: 30 ms (compression ratio: 2.3%)
>
> Test4: Test3 with different large random, slightly compressible data
> that needs compress + externalize:
> Table: 3 varchar columns
> Insert: Insert 1000 tuples, 3 columns of size (8192, 8192, 8192)
> CREATE OR REPLACE FUNCTION large_val() RETURNS TEXT LANGUAGE SQL AS
> 'select array_agg(md5(g::text))::text from generate_series(1, 256) g';
> Test: insert into t1 select large_val(), large_val(), large_val() from
> generate_series(1,1000);
> pglz: 2000 ms
> lz4: 1500 ms
>
> Conclusion:
> 1. In most cases lz4 is faster, and it compresses better as well.
> 2. In Test2, when small data is incompressible, lz4 tries to compress
> whereas pglz doesn't try at all, so there is some performance loss.
> But if we want, we can fix it by setting some minimum size limit for
> lz4 as well, maybe the same size as pglz?
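>
> If we do add a floor, the lz4 path could mirror pglz's min_input_size
> of 32; a hypothetical sketch (ignoring the compressed-varlena header
> bookkeeping, and the function name is only an approximation):
> ---
> static struct varlena *
> lz4_cmcompress(const struct varlena *value)
> {
>     int32       valsize = VARSIZE_ANY_EXHDR(value);
>     int32       max_size;
>     int32       len;
>     struct varlena *compressed;
>
>     if (valsize < 32)           /* assumed floor, same as pglz */
>         return NULL;            /* caller stores it uncompressed */
>
>     max_size = LZ4_compressBound(valsize);
>     compressed = (struct varlena *) palloc(max_size + VARHDRSZ);
>     len = LZ4_compress_default(VARDATA_ANY(value), VARDATA(compressed),
>                                valsize, max_size);
>     if (len <= 0 || len >= valsize)
>     {
>         pfree(compressed);      /* incompressible, give up */
>         return NULL;
>     }
>     SET_VARSIZE(compressed, len + VARHDRSZ);
>     return compressed;
> }
> ---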
>
> > - Rebase other patches atop this patch
> > - comment in ddl.sgml
>
> Other changes in the patch:
> - Now we are dumping the default compression method in the
> binary-upgrade mode, so the pg_dump test needed some change; fixed
> that.
> - In compress_pglz.c and compress_lz4.c, we were using
> toast_internals.h macros, so I removed those and used the varlena
> macros instead.
>
> While testing, I noticed that if the compressed data are externalized
> then pg_column_compression() doesn't fetch the compression method
> from the toast chunk. I think we should do that. I will analyze this
> and fix it in the next version.
While trying to fix this, I have realized this problem exists in
CompareCompressionMethodAndDecompress as well; see the code below.
---
+ new_value = (struct varlena *)
+ DatumGetPointer(slot->tts_values[attnum - 1]);
+
+ /* nothing to be done, if it is not compressed */
+ if (!VARATT_IS_COMPRESSED(new_value))
+ continue;
---
Basically, we are just checking whether the stored value is compressed
or not, but we are clearly ignoring the fact that it might be
compressed and stored externally on disk. So if the value is stored
externally we cannot know whether the external data were compressed or
not without fetching them from the toast table. Instead of fetching the
complete datum from the toast table, I think we can just fetch the
header using 'toast_fetch_datum_slice'.
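Roughly what I have in mind (untested sketch: toast_fetch_datum_slice
might need to be exported for this, and both the function name and
lookup_compression_am_oid below are made up for illustration):
---
static Oid
get_compression_method_of_external(struct varlena *attr)
{
    struct varlena *slice;
    uint32      info;

    Assert(VARATT_IS_EXTERNAL_ONDISK(attr));

    /* Fetch just the first word of the toasted data. */
    slice = toast_fetch_datum_slice(attr, 0, sizeof(uint32));
    memcpy(&info, VARDATA(slice), sizeof(uint32));
    pfree(slice);

    /* Top two bits of the size word identify the method. */
    return lookup_compression_am_oid(info >> VARLENA_RAWSIZE_BITS);
}
---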
Any other thoughts on this?
--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com