Re: Read table rows in chunks

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Sushrut Shivaswamy <sushrut(dot)shivaswamy(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Read table rows in chunks
Date: 2024-04-27 17:46:16
Message-ID: CAKFQuwbRbO3qAQis113P+WnemfGF_pPNHi9TzZTs0A3qu=_ETw@mail.gmail.com
Lists: pgsql-hackers

On Sat, Apr 27, 2024 at 12:47 AM Sushrut Shivaswamy <
sushrut(dot)shivaswamy(at)gmail(dot)com> wrote:

>
> I"m trying to read the rows of a table in chunks to process them in a
> background worker.
>

This list really isn't the place for this kind of discussion. You are
doing application-level stuff, not working on patches for core. General
discussions and questions like this should be directed to the -general
mailing list.

> I want to ensure that each row is processed only once.
>

Is failure during processing possible?

> I was thinking of using the `SELECT * ... OFFSET {offset_size} LIMIT
> {limit_size}` functionality for this but I'm running into issues.
>

The FOR UPDATE and SKIP LOCKED clauses usually come into play for this use
case.
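As a minimal sketch (the table and column names here are hypothetical),
each worker can claim a chunk inside a transaction so that concurrent
workers never receive the same rows:

    BEGIN;

    -- Lock up to 1000 rows, skipping any already locked by another worker.
    SELECT id, payload
    FROM my_table
    ORDER BY id
    LIMIT 1000
    FOR UPDATE SKIP LOCKED;

    -- ... process the returned rows ...

    COMMIT;

Note that the locks only last for the duration of the transaction; on their
own they do not remember which rows were processed in earlier runs.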

> Can you please suggest any alternative to periodically read rows from a
> table in chunks while processing each row exactly once?
>
I think you are fooling yourself if you think you can do this without
writing back to the row the fact that it has been processed. At that point,
ensuring that you only retrieve and process unprocessed rows is trivial:
just don't select rows with a status of processed.
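For example, assuming a hypothetical boolean column named processed, a
single statement can claim a chunk and mark it in one step:

    WITH chunk AS (
        SELECT id
        FROM my_table
        WHERE processed = false
        ORDER BY id
        LIMIT 1000
        FOR UPDATE SKIP LOCKED
    )
    UPDATE my_table t
    SET processed = true
    FROM chunk
    WHERE t.id = chunk.id
    RETURNING t.*;

If processing can fail after the transaction commits, a status column with
an intermediate value (e.g. 'in_progress') is safer than a plain boolean,
since rows marked processed here are never revisited.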

If adding a column to the data is not possible, record processed row
identifiers in a separate table and check against it.
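A sketch of that approach (again, the names are illustrative): record each
processed row's primary key, and anti-join against the tracking table when
fetching the next chunk:

    CREATE TABLE processed_rows (id bigint PRIMARY KEY);

    -- Fetch a chunk of rows not yet recorded as processed.
    SELECT t.*
    FROM my_table t
    WHERE NOT EXISTS (
        SELECT 1 FROM processed_rows p WHERE p.id = t.id
    )
    ORDER BY t.id
    LIMIT 1000;

    -- After successfully processing a row, record its id.
    INSERT INTO processed_rows (id) VALUES ($1)
    ON CONFLICT (id) DO NOTHING;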

David J.
