Re: How batch processing works

From: Lok P <loknath(dot)73(at)gmail(dot)com>
To: Michał Kłeczek <michal(at)kleczek(dot)org>
Cc: pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: How batch processing works
Date: 2024-09-21 11:14:08
Message-ID: CAKna9Vbt1VJu7Oa8FTWasgby+-kJn7omOhbfmWzkdpVwBiqNzQ@mail.gmail.com
Lists: pgsql-general

On Sat, Sep 21, 2024 at 9:51 AM Michał Kłeczek <michal(at)kleczek(dot)org> wrote:

> Hi,
>
> > On 19 Sep 2024, at 07:30, Lok P <loknath(dot)73(at)gmail(dot)com> wrote:
> >
> [snip]
> >
> > Method-4
> >
> > INSERT INTO parent_table VALUES (1, 'a'), (2, 'a');
> > INSERT INTO child_table VALUES (1,1, 'a'), (1,2, 'a');
> > commit;
>
> I’ve done some batch processing of JSON messages from Kafka in Java.
> By far the most performant way was to:
>
> 1. Use prepared statements
> 2. Parse JSON messages in Postgres
> 3. Process messages in batches
>
> All three can be achieved by using arrays to pass batches:
>
> WITH parsed AS (
>     SELECT msg::json FROM unnest(?) AS t(msg)
> ),
> parents AS (
>     INSERT INTO parent SELECT … FROM parsed RETURNING ...
> )
> INSERT INTO child SELECT … FROM parsed…
>
> Note the single parameter that you can bind to a String[].
>
> Hope that helps.
>
>
Got your point.
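Just to confirm I follow the array-binding idea: something like the JDBC
sketch below, where the whole batch is bound as a single text[] parameter?
(The connection details, table columns and JSON field names here are
placeholders, not my real schema.)

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class ArrayBatchInsert {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details -- adjust to your environment.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost:5432/testdb", "user", "secret")) {
            conn.setAutoCommit(false);

            // One prepared statement, one parameter: the whole batch as text[].
            // Parent and child are inserted in a single statement via CTEs;
            // the FK check runs at end of statement, so this is valid.
            String sql =
                  "WITH parsed AS ("
                + "  SELECT msg::json AS j FROM unnest(?::text[]) AS t(msg)"
                + "), parents AS ("
                + "  INSERT INTO parent_table"
                + "  SELECT (j->>'id')::int, j->>'val' FROM parsed"
                + ") "
                + "INSERT INTO child_table "
                + "SELECT (j->>'id')::int, (j->>'child_id')::int, j->>'child_val' "
                + "FROM parsed";

            // The batch: one JSON message per element (field names are made up).
            String[] batch = {
                "{\"id\": 1, \"val\": \"a\", \"child_id\": 1, \"child_val\": \"a\"}",
                "{\"id\": 2, \"val\": \"a\", \"child_id\": 2, \"child_val\": \"a\"}"
            };

            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                // Bind the whole String[] as a single text[] parameter.
                ps.setArray(1, conn.createArrayOf("text", batch));
                ps.executeUpdate();
            }
            conn.commit();
        }
    }
}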
But I am still wondering why we don't see any difference in performance
between method-2 and method-3 above. Does that mean I am testing this in
the wrong way, or is it the expected behaviour? In other words, is there
no point in converting the row-by-row inserts into a bulk insert, and
will just changing the commit frequency do the same job in a row-by-row
insert approach?
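To be concrete, these are the two variants I am comparing (reusing the
parent_table example from above):

-- Row-by-row inserts, but committed once per batch:
BEGIN;
INSERT INTO parent_table VALUES (1, 'a');
INSERT INTO parent_table VALUES (2, 'a');
COMMIT;

-- Single multi-row (bulk) insert with the same commit scope:
BEGIN;
INSERT INTO parent_table VALUES (1, 'a'), (2, 'a');
COMMIT;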
