Re: Massive parallel queue table causes index deterioration, but REINDEX fails with deadlocks.

From: MichaelDBA <MichaelDBA(at)sqlexec(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Gunther <raj(at)gusw(dot)net>, "pgsql-performa(dot)" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Massive parallel queue table causes index deterioration, but REINDEX fails with deadlocks.
Date: 2019-02-25 21:07:30
Message-ID: 5C745912.1080200@sqlexec.com
Lists: pgsql-performance

Was wondering when that would come up: taking queuing logic outside the
database. Architecting queuing logic in relational databases can be
overly painful, IMHO.

Regards,
Michael Vitale

> Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
> Monday, February 25, 2019 3:30 PM
> On Sat, Feb 23, 2019 at 4:06 PM Gunther <raj(at)gusw(dot)net> wrote:
>
> Hi,
>
> I am using an SQL queue for distributing work to massively
> parallel workers.
>
> You should look into specialized queueing software.
>
> ...
>
> I figured I might just pause all workers briefly to schedule a
> REINDEX on the Queue table, but the problem with this is that while
> the transaction volume is large, some jobs may take minutes to
> process, and in that case we would need to wait minutes to quiesce
> the database, with 47 workers sitting idle while waiting for the
> 48th to finish so that the index can be rebuilt!
>
> The jobs that take minutes are themselves the problem. They prevent
> tuples from being cleaned up, meaning all the other jobs need to
> grovel through the detritus every time they need to claim a new row.
> If you got those long-running jobs to end, you probably wouldn't even
> need to reindex--the problem would go away on its own as the
> dead-to-all tuples get cleaned up. [A query for watching that
> dead-tuple buildup appears below the quoted message.]
>
> Locking a tuple and leaving the transaction open for minutes is going
> to cause no end of trouble on a highly active system. You should look
> at a three-state method where the tuple can be
> pending/claimed/finished, rather than pending/locked/finished. That
> way the process commits immediately after claiming the tuple, and then
> records the outcome in another transaction once it is done
> processing. You will need a way to detect processes that failed after
> claiming a row but before finishing, but implementing that is going to
> be easier than all of this re-indexing stuff you are trying to do
> now. You would claim the row by updating a field in it to have
> something distinctive about the process, like its hostname and pid, so
> you can figure out if it is still running when it comes time to clean
> up apparently forgotten entries. [A sketch of this pattern appears
> below the quoted message.]
>
> Cheers,
>
> Jeff
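
For reference, a minimal sketch (not from the thread itself) of how to
watch the dead-tuple buildup Jeff describes above; the table name
"queue" is a hypothetical stand-in for the actual work table:

    -- Dead tuples accumulated and last vacuum times for the queue table.
    -- 'queue' is a placeholder name, not from the original thread.
    SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
    FROM pg_stat_user_tables
    WHERE relname = 'queue';

If n_dead_tup stays high even though autovacuum has run recently, some
old transaction's snapshot is preventing vacuum from removing those
tuples--which is exactly the bloat the REINDEX was meant to fight.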
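
And a minimal sketch of the pending/claimed/finished pattern Jeff
suggests. All names here are illustrative assumptions, and FOR UPDATE
SKIP LOCKED requires PostgreSQL 9.5 or later:

    -- Hypothetical queue table; all names are illustrative.
    CREATE TABLE queue (
        id         bigserial PRIMARY KEY,
        state      text NOT NULL DEFAULT 'pending',  -- pending | claimed | finished
        claimed_by text,          -- e.g. 'hostname:pid' of the claiming worker
        claimed_at timestamptz,
        payload    jsonb
    );

    -- Claim one pending row and commit immediately.  SKIP LOCKED lets
    -- concurrent workers pass over rows that are already being claimed.
    UPDATE queue
    SET    state = 'claimed', claimed_by = 'host42:12345', claimed_at = now()
    WHERE  id = (SELECT id FROM queue
                 WHERE  state = 'pending'
                 ORDER  BY id
                 LIMIT  1
                 FOR UPDATE SKIP LOCKED)
    RETURNING id, payload;

    -- Once processing is done, record the outcome in a new transaction.
    UPDATE queue SET state = 'finished' WHERE id = 42;

    -- Cleanup of apparently forgotten entries: ideally check whether the
    -- claimed_by host:pid is still alive, as Jeff suggests; a simple age
    -- cutoff is shown here as a stand-in for that check.
    UPDATE queue
    SET    state = 'pending', claimed_by = NULL, claimed_at = NULL
    WHERE  state = 'claimed'
    AND    claimed_at < now() - interval '10 minutes';

The point is that the claiming transaction commits right away, so no
row lock is held while a job runs for minutes.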
