Re: pg_upgrade with large pg_largeobject table

From: Mate Varga <m(at)matevarga(dot)net>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: pg_upgrade with large pg_largeobject table
Date: 2018-08-14 18:06:36
Message-ID: CAK4GaZ782vw5vDSV0BEhGVu62GVyTsnHeRctYNzrtMiBwmJ4Dw@mail.gmail.com

Thanks.

> You mean 250M rows in pg_largeobject itself, or 250M large objects
> (that is, 250M rows in pg_largeobject_metadata)?

250M large objects.
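
In other words (just to illustrate the distinction, not how I counted):

    -- one row per large object: this is the ~250M figure
    SELECT count(*) FROM pg_largeobject_metadata;

    -- one row per data chunk (up to ~2 kB each), so considerably more
    SELECT count(*) FROM pg_largeobject;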

> Are you sure you're using a 64-bit build of pg_dump?

file /usr/lib/postgresql/10/bin/pg_dump
/usr/lib/postgresql/10/bin/pg_dump: ELF 64-bit LSB shared object, x86-64,
version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux
2.6.24, BuildID[sha1]=4ff6bf0dd628d151481256723994dd50785e54e9, stripped
Also, it really does eat all the memory.

> Using the large-object API for things that tend to not actually be very
> large (which they aren't, if you've got hundreds of millions of 'em) is an
> antipattern, I'm afraid.

I know :( So maybe I need to do some refactoring in the application and
inline the LOBs. The data is binary with very high entropy (encrypted
content). Would you recommend bytea for that?

Thanks again.
Mate

On Tue, Aug 14, 2018 at 7:58 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Mate Varga <m(at)matevarga(dot)net> writes:
> > We have a PSQL 9.5 DB with 16G physical RAM and ~ 1 TB data mostly stored
> > in the pg_largeobject system table. This table has 250M rows at the
> > moment.
>
> You mean 250M rows in pg_largeobject itself, or 250M large objects
> (that is, 250M rows in pg_largeobject_metadata)?
>
> > This command
> > fails because of an OOM. Logs say:
> > pg_dump: [archiver (db)] query failed: out of memory for query result
> > pg_dump: [archiver (db)] query was: SELECT oid, (SELECT rolname FROM
> > pg_catalog.pg_roles WHERE oid = lomowner) AS rolname, lomacl, NULL AS
> > rlomacl, NULL AS initlomacl, NULL AS initrlomacl FROM
> > pg_largeobject_metadata
>
> The selected rows shouldn't be very wide here, so I'm a bit surprised
> you are hitting OOM, even if there are 250M rows to read. Are you sure
> you're using a 64-bit build of pg_dump?
>
> > Would it help if we'd inline these largeobjects as e.g. text cols (instead
> > of storing them as lobs)?
>
> Yes, because then they'd not have their own ownership and permissions
> for pg_dump to keep track of. (You might want bytea instead of text,
> depending on what the data is.)
>
> Using the large-object API for things that tend to not actually be very
> large (which they aren't, if you've got hundreds of millions of 'em)
> is an antipattern, I'm afraid. You could get away with it before we
> added per-largeobject permissions, but now it's a problem for pg_dump.
>
> regards, tom lane
>
