Re: Compress ReorderBuffer spill files using LZ4

From: Julien Tachoires <julmon(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Compress ReorderBuffer spill files using LZ4
Date: 2024-06-06 19:31:25
Message-ID: CAFEQCbGfmWUxOQUYQUCCOzuv=fDJwyvi590Ay=gd+Cnj+DHQxg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jun 6, 2024 at 06:40, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Thu, Jun 6, 2024 at 6:22 PM Julien Tachoires <julmon(at)gmail(dot)com> wrote:
> >
> > On Thu, Jun 6, 2024 at 04:13, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Thu, Jun 6, 2024 at 4:28 PM Julien Tachoires <julmon(at)gmail(dot)com> wrote:
> > > >
> > > > When the content of a large transaction (size exceeding
> > > > logical_decoding_work_mem) and its sub-transactions has to be
> > > > reordered during logical decoding, then, all the changes are written
> > > > on disk in temporary files located in pg_replslot/<slot_name>.
> > > > Decoding very large transactions by multiple replication slots can
> > > > lead to disk space saturation and high I/O utilization.
> > > >
> > >
> > > Why can't one use 'streaming' option to send changes to the client
> > > once it reaches the configured limit of 'logical_decoding_work_mem'?
> >
> > That's right, setting subscription's option 'streaming' to 'on' moves
> > the problem away from the publisher to the subscribers. This patch
> > tries to improve the default situation when 'streaming' is set to
> > 'off'.
> >
>
> Can we think of changing the default to 'parallel'? BTW, it would be
> better to use 'parallel' for the 'streaming' option, if the workload
> has large transactions. Is there a reason to use a default value in
> this case?

You're certainly right: if using the streaming API helps to avoid bad
situations and has no downside, it could be made the default.

> > > > 2. Do we want a GUC to switch compression on/off?
> > > >
> > >
> > > It depends on the overhead of decoding. Did you try to measure the
> > > decoding overhead of decompression when reading compressed files?
> >
> > Quick benchmarking executed on my laptop shows 1% overhead.
> >
>
> Thanks. We probably need different types of data (say random data in
> bytea column, etc.) for this.

Yes, good idea. I will run new tests along those lines.
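
As a rough illustration of why the type of data matters for such tests,
here is a minimal sketch that compares random bytes (similar to random
data in a bytea column) with repetitive text. It is not the patch's code
and not the benchmark mentioned above: it uses Python's zlib at level 1
as a stand-in for LZ4, and the sample size, sample contents and the
measure() helper are assumptions made up for the example.

```python
import os
import time
import zlib


def measure(payload: bytes) -> dict:
    """Compress and decompress payload once; return ratio and timings."""
    t0 = time.perf_counter()
    # zlib level 1 is only a stand-in for LZ4 (assumption, not the patch).
    compressed = zlib.compress(payload, 1)
    t1 = time.perf_counter()
    restored = zlib.decompress(compressed)
    t2 = time.perf_counter()
    assert restored == payload  # round trip must be lossless
    return {
        "ratio": len(payload) / len(compressed),
        "compress_s": t1 - t0,
        "decompress_s": t2 - t1,
    }


SIZE = 1 << 20  # 1 MiB per sample, an arbitrary choice

# Hypothetical sample data: incompressible bytes vs. highly repetitive text.
samples = {
    "random (bytea-like)": os.urandom(SIZE),
    "repetitive text": (b"INSERT INTO t VALUES (1, 'abc');\n" * (SIZE // 32 + 1))[:SIZE],
}

results = {name: measure(data) for name, data in samples.items()}
for name, r in results.items():
    print(f"{name}: ratio={r['ratio']:.2f} "
          f"compress={r['compress_s'] * 1e3:.2f} ms "
          f"decompress={r['decompress_s'] * 1e3:.2f} ms")
```

Random data should show a ratio close to 1, so it pays the CPU cost
without saving disk space, which is the case worth measuring separately.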

Thank you!

Regards,

JT
