Re: pgarchives: Bug report + Patches: loader can't handle message in multiple lists

From: Célestin Matte <celestin(dot)matte(at)cmatte(dot)me>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: PostgreSQL WWW <pgsql-www(at)lists(dot)postgresql(dot)org>
Subject: Re: pgarchives: Bug report + Patches: loader can't handle message in multiple lists
Date: 2023-06-12 15:30:29
Message-ID: f4782e00-6c92-642a-df64-1061b1df32eb@cmatte.me
Lists: pgsql-www

> OK, I think I need to take a step back to try to figure out what's
> actually wrong here.
>
> What is it you are actually trying to accomplish? Are you trying to
> remove the single-store functionality for emails? As mentioned, the
> whole design of the system is done around that a single email should
> only be stored once - but it sounds like you're removing that for some
> reason? Or am I misunderstanding?

Yes, because during my tests the import script crashed when importing a message that belongs to two different lists at once.
I'm having trouble reproducing the crash, as properly cleaning the database would require some work. I'm not sure it's the same crash I had back then, but I now get something like this:
Traceback (most recent call last):
  File "/path/pgarchives/local//loader/load_message.py", line 158, in <module>
    ap.store(conn, listid, opt.overwrite, opt.overwrite)
  File "/path/pgarchives/local/loader/lib/storage.py", line 216, in store
    'listid': listid,
psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "list_threads_pkey"
DETAIL:  Key (threadid)=(21) already exists.
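
For illustration only, here is a minimal sketch of how this kind of error can arise. It assumes a hypothetical, stripped-down list_threads table whose primary key is threadid alone (the only detail taken from the error message above), and it uses Python's built-in sqlite3 as a stand-in for PostgreSQL/psycopg2. The function name and column layout are made up; the real pgarchives schema may well differ.

```python
import sqlite3


def attach_thread_to_lists(listids, threadid=21):
    """Try to register one thread under several lists.

    Returns the list ids whose insert was rejected by the primary key.
    """
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: only "primary key on threadid" comes from
    # the error message; everything else is an assumption.
    conn.execute(
        "CREATE TABLE list_threads ("
        " threadid INTEGER PRIMARY KEY,"
        " listid INTEGER NOT NULL)"
    )
    rejected = []
    for listid in listids:
        try:
            conn.execute(
                "INSERT INTO list_threads (threadid, listid) VALUES (?, ?)",
                (threadid, listid),
            )
        except sqlite3.IntegrityError:
            # sqlite3 counterpart of psycopg2.errors.UniqueViolation
            rejected.append(listid)
    conn.close()
    return rejected
```

With a key on threadid only, the second list that tries to register the same thread is rejected, which matches the shape of the traceback above.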

After a bit of digging, the workaround I came up with was to store each email once *for each list* instead of just once (which did not work).

I just ran more tests, and this does not seem to affect the regular import of incoming messages. I guess the impact is limited.

> It sounds to me like your patch that bypasses the check is either
> broken or misguided, since that's what actually causes the problem?
>
>
>>> I'm not sure I like the idea of committing after every message and
>>> then rolling back in the event of an error. If nothing else, this
>>> should be done using a savepoint instead of a complete transaction.
>>
>> Attached
>
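
As a rough illustration of the savepoint approach suggested in the quoted text (this is not the attached patch), the sketch below wraps each message insert in a savepoint inside one outer transaction. It uses Python's built-in sqlite3 in place of psycopg2 so that it runs standalone; the messages table, its columns, and the function name are hypothetical.

```python
import sqlite3


def load_all(messages):
    """Insert (messageid, body) pairs in a single transaction.

    A failing message is undone with a savepoint rollback and skipped;
    the rest of the batch is kept and committed once at the end.
    """
    # isolation_level=None: no implicit BEGIN, transactions are explicit.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    cur = conn.cursor()
    # Hypothetical table, just enough to trigger a unique violation.
    cur.execute("CREATE TABLE messages (messageid TEXT PRIMARY KEY, body TEXT)")

    skipped = []
    cur.execute("BEGIN")
    for messageid, body in messages:
        cur.execute("SAVEPOINT msg")
        try:
            cur.execute(
                "INSERT INTO messages (messageid, body) VALUES (?, ?)",
                (messageid, body),
            )
        except sqlite3.IntegrityError:
            # Undo only this message, not the whole transaction.
            cur.execute("ROLLBACK TO SAVEPOINT msg")
            skipped.append(messageid)
        finally:
            cur.execute("RELEASE SAVEPOINT msg")
    cur.execute("COMMIT")

    stored = [
        row[0]
        for row in cur.execute("SELECT messageid FROM messages ORDER BY messageid")
    ]
    conn.close()
    return stored, skipped
```

The same SAVEPOINT / ROLLBACK TO SAVEPOINT / RELEASE SAVEPOINT statements exist in PostgreSQL, where the rollback to the savepoint is what makes the transaction usable again after an error.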

--
Célestin Matte
