Re: Speeding up COPY TO for uuids and arrays

From: Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Speeding up COPY TO for uuids and arrays
Date: 2024-02-26 14:26:27
Message-ID: CAEudQApk7pe1JqDEGXeJZFT_kaY44kLE8THcpE8xgZWP2BVxsw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Feb 26, 2024 at 02:28, Michael Paquier <michael(at)paquier(dot)xyz>
wrote:

> On Thu, Feb 22, 2024 at 04:42:37PM -0300, Ranier Vilela wrote:
> > Can you share the exact script used to create the table?
>
> Stressing the internals of array_out() for the area of the patch is
> not that difficult, as we want to quote each element that's returned
> in output.
>
> The trick is to have the following to stress the second quoting loop
> to the maximum:
> - a high number of rows.
> - a high number of items in the arrays.
> - a *minimum* number of characters in each element of the array, with
> characters that require quoting.
>
> The best test case I can think of to demonstrate the patch would be
> something like this (adjust rows and elts as you see fit):
> -- Number of rows
> \set rows 6
> -- Number of elements
> \set elts 4
> create table tab as
>   with data as (
>     select array_agg(a) as array
>       from (
>         select '{'::text
>           from generate_series(1, :elts) as int(a)) as index(a))
>   select data.array from data, generate_series(1,:rows);
>
> Then I get:
>        array
> -------------------
>  {"{","{","{","{"}
>  {"{","{","{","{"}
>  {"{","{","{","{"}
>  {"{","{","{","{"}
>  {"{","{","{","{"}
>  {"{","{","{","{"}
> (6 rows)
>
> With "\set rows 100000" and "\set elts 10000", giving 100MB of data
> with 100k rows with 10k elements each, I get for HEAD when data is in
> shared buffers:
> =# copy tab to '/dev/null';
> COPY 100000
> Time: 48620.927 ms (00:48.621)
> And with v3:
> =# copy tab to '/dev/null';
> COPY 100000
> Time: 47993.183 ms (00:47.993)
>
Thanks for the script, Michael.

Using the exact same script makes it easier to compare results.
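
For anyone following along, here is a rough sketch of why a lone '{' works well as the stress element: it is a single character, yet it forces the element to be quoted. The Python below is only a simplified model of the text that array_out() produces for a one-dimensional text array, not the server's C code. The function names are hypothetical, and NULL elements, dimension decoration, and non-default delimiters are ignored.

```python
# Simplified model of array_out() output for a one-dimensional text array.
# Assumption: names here are illustrative; this is not PostgreSQL's C code.

# Characters that force an element to be double-quoted in this model:
# quote, backslash, braces, the default delimiter, and whitespace.
NEEDS_QUOTE = set('"\\{}, \t\n\r\v\f')


def quote_element(elem):
    """Return elem as it would appear inside the array literal."""
    must_quote = (
        elem == ""
        or elem.upper() == "NULL"
        or any(ch in NEEDS_QUOTE for ch in elem)
    )
    if not must_quote:
        return elem
    # Copy pass: backslash-escape double quotes and backslashes.
    escaped = "".join("\\" + ch if ch in '"\\' else ch for ch in elem)
    return '"' + escaped + '"'


def array_out_model(elems):
    """Join the rendered elements with commas inside braces."""
    return "{" + ",".join(quote_element(e) for e in elems) + "}"


def row_text_length(elts):
    """Length of one output row when every element is '{'.

    Each element renders as "{" (3 chars), plus elts - 1 commas
    and the 2 enclosing braces.
    """
    return 4 * elts + 1


if __name__ == "__main__":
    print(array_out_model(["{"] * 4))
    print(row_text_length(10000))
```

Under this model, four '{' elements render as {"{","{","{","{"}, which matches the rows shown in the quoted output, and almost every character of each row comes from the quoting path.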

best regards,
Ranier Vilela
