From: me nefcanto <sn(dot)1361(at)gmail(dot)com>
To: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
Cc: Zhang Mingli <zmlpostgres(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: Bug in copy
Date: 2025-02-24 14:09:14
Message-ID: CAEHBEODGy0j4=reeVEqxBrdGk3ym_D3TQW+dS4JbGk9Go4sm4Q@mail.gmail.com
Lists: pgsql-bugs
Sorry for the long delay.
Let's analyze the scenario of fake data insertion. I want to create a
million fake products, sometimes even 100 million (we're on MariaDB now and
plan to migrate to Postgres). My team uses fake data for performance tests
and other use cases. Now, there may be literally no way to sanitize those
records in advance.
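
Today the only workaround I know of is a per-row loop with an exception
handler, something like the sketch below (the staging_products and products
tables are hypothetical; every exception block costs a subtransaction, which
is painfully slow at a million rows):

    DO $$
    DECLARE
        r record;
        failed bigint := 0;
    BEGIN
        -- insert each staged row in its own subtransaction so that one
        -- bad row does not abort the whole batch
        FOR r IN SELECT * FROM staging_products LOOP
            BEGIN
                INSERT INTO products (sku, name) VALUES (r.sku, r.name);
            EXCEPTION WHEN others THEN
                failed := failed + 1;
                RAISE NOTICE 'row % rejected: %', r.sku, SQLERRM;
            END;
        END LOOP;
        RAISE NOTICE '% staged rows failed', failed;
    END;
    $$;
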
Another scenario is translations. Even in production we have translation
files for more than 20 languages and more than 2 thousand keys. That means
we need to insert 40 thousand translation records in production.
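
For the duplicate-key part alone, an upsert works, but it only covers that
one class of errors; a check constraint or foreign key violation still
aborts the whole statement (sketch, with a hypothetical translations table
keyed on (language, key)):

    INSERT INTO translations (language, key, value)
    SELECT language, key, value
    FROM staging_translations
    ON CONFLICT (language, key)
        DO UPDATE SET value = EXCLUDED.value;
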
Another scenario is updating nested model values for a large hierarchical
table, for example the categories table. Any time a user changes a record
in that table, we need to recalculate the nested model for the entire table
and bulk update the results.
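
The bulk update itself is a single statement (sketch, assuming hypothetical
categories and staging_category_bounds tables holding the recalculated
nested-set bounds), but again, one bad row rolls back all of it:

    UPDATE categories AS c
    SET lft = s.lft,
        rgt = s.rgt
    FROM staging_category_bounds AS s
    WHERE c.id = s.id;
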
The point is, the database schema is not in our hands. We don't know what
rules exist on each table or when those rules change. And it's neither
practical nor feasible to spend resources on keeping our bulk insertion
logic in sync with database changes.
It would be good design for Postgres to add a catch-all handler for each
row and report accordingly: give it 1 million records, and it should give
you back 1 million results.
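
As far as I can tell, PostgreSQL 17 already took a first step with COPY's
ON_ERROR option, but it only skips rows with malformed input, not rows that
violate constraints (file and table names below are hypothetical):

    -- skips rows that fail input conversion; a unique or check
    -- constraint violation still aborts the whole COPY
    COPY products (sku, name)
    FROM '/tmp/fake_products.csv'
    WITH (FORMAT csv, ON_ERROR ignore);
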
Is there a problem in implementing this? After all, one expects the most
advanced open source database to support this real-world requirement.
Regards
Saeed
On Sun, Feb 9, 2025 at 8:09 PM Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
wrote:
> On Sun, 2025-02-09 at 16:00 +0330, me nefcanto wrote:
> > @laurenz if I use `insert into` or the `merge` would I be able to bypass
> > records with errors? Or would I fail there too? I mean there are lots of
> > ways a record can be limited. Unique indexes, check constraints, foreign
> > key constraints, etc. What happens in those cases?
>
> With INSERT ... ON CONFLICT, you can only handle primary and unique key
> violations.
> MERGE allows some more freedom, but it also only checks for rows that
> match existing rows.
>
> You won't find a command that ignores or handles arbitrary kinds of errors.
> You have to figure out what kinds of errors you expect and handle them
> explicitly by running queries against the data.
>
> I don't think that a catch-it-all handler that handles all errors would be
> very useful. Normally, there are certain errors you want to tolerate, while
> others should be considered unrecoverable and lead to errors.
>
> Yours,
> Laurenz Albe
>