Quick Links

Re: plpgsql function with offset - Postgres 9.1

From:	Patrick B <patrickbakerbr(at)gmail(dot)com>
To:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc:	pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject:	Re: plpgsql function with offset - Postgres 9.1
Date:	2017-06-15 22:19:45
Message-ID:	CAJNY3iud3BehfEBCamyn_ou9pPT=1UUGai0b5ore1Q-VPt5OpA@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

2017-05-29 19:27 GMT+12:00 Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>:

> Patrick B wrote:
> > I am running a background task on my DB, which will copy data from
> tableA to tableB. For
> > that, I'm writing a PL/PGSQL function which basically needs to do the
> following:
> >
> >
> > 1. Select the data from tableA
> > 2. The limit will be put when calling the function
> > 3. insert the selected data on Step 1 onto new table
> >
> > Question:
> >
> > * When I stop it and start it again, how can the query "know" that
> it has already
> > processed some rows so it won't do it twice on the same rows? If it
> stopped on row number
> > 100, I need it to continue on row number 101, for example.
> >
> > * How can I ask the function to return the number of processed rows?
> >
> >
> > I can add a column on TableB if needed, but not on tableA.
> >
> > This is what I've done so far:
>
> > CREATE or REPLACE FUNCTION data_copy(rows integer)
> > RETURNS SETOF bigint AS $$
>
> CREATE or REPLACE FUNCTION data_copy(p_limit integer, p_offset integer)
> RETURNS integer;
>
> > declare
> > row record;
> > offset_num integer;
>
> num_rows integer := 0;
>
> > BEGIN
> > FOR row IN EXECUTE '
> > SELECT
> > id,
> > path,
> > name,
> > name_last,
> > created_at
> > FROM
> > tablea
> > WHERE
> > ready = true
> > ORDER BY 1 LIMIT ' || rows || ' OFFSET ' ||
> rows || ''
>
> '... LIMIT ' || p_limit || ' OFFSET ' || p_offset
>
> > LOOP
>
> num_rows := num_rows + 1;
>
> > INSERT INTO tableB (id,path,name,name_last,created_at)
> > VALUES (row.id,row.path,row.name,row.
> name_last,row.created_at);
> >
> > END LOOP;
>
> RETURN num_rows;
>
> > END
> > $$ language 'plpgsql';
>
> There are two problems with this approach:
>
> 1. It will do the wrong thing if rows are added or deleted in "tablea"
> while
> you process it.
>

There will be actually records being inserted in tablea while processing
the migration.... Any ideas here?

I can add another column in tablea, like example: row_migrated boolean -->
if that helps

>
> 2. Queries with hight OFFSET values have bad performance.
>

No problem. The plan is to perform 2k rows at once, which is not much.

>
> The solution is to avoid OFFSET and to use "keyset pagination":
> http://use-the-index-luke.com/no-offset
>
>
>

Thanks
Patrick

In response to

Re: plpgsql function with offset - Postgres 9.1 at 2017-05-29 07:27:39 from Albe Laurenz

Responses

Re: plpgsql function with offset - Postgres 9.1 at 2017-06-15 22:35:00 from David G. Johnston
Re: plpgsql function with offset - Postgres 9.1 at 2017-06-16 05:29:33 from Peter J. Holzer

Browse pgsql-general by date

	From	Date	Subject
Next Message	David G. Johnston	2017-06-15 22:35:00	Re: plpgsql function with offset - Postgres 9.1
Previous Message	Tom Lane	2017-06-15 22:16:14	Re: Connection options