RE: pg_dump out of memory for large table with LOB

From: Jean-Marc Lessard <Jean-Marc(dot)Lessard(at)ultra-ft(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: RE: pg_dump out of memory for large table with LOB
Date: 2018-11-21 18:45:28
Message-ID: 8FC5F25FF3EC4744ADFCF20CBA3F44BE016811CABE@SRV-CAMTL-EXCH2.Forensictech.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Thanks Tom Lane for your answer
Same issue for pg_dump and LOB was also reported in https://postgrespro.com/list/thread-id/2211186

Concerning the 25% waste of space we experienced with LOB:
We are using LOB because we save jpeg2000 images into the DB.
We display them as thumbnails in a 6x10 multi-viewer (60 images displayed at a time) that the user can scroll to next ones.
When retrieving images into the multi-viewer, we only read the first 30% of the jpeg2000.
Even if jpeg2000 images are relatively small (about 20MB), retrieving thousands of images partially greatly speedup the display.

src/include/storage/large_object.h:
/*
* Each "page" (tuple) of a large object can hold this much data
*
* We could set this as high as BLCKSZ less some overhead, but it seems
* better to make it a smaller value, so that not as much space is used
* up when a page-tuple is updated. Note that the value is deliberately
* chosen large enough to trigger the tuple toaster, so that we will
* attempt to compress page tuples in-line. (But they won't be moved off
* unless the user creates a toast-table for pg_largeobject...)
*
* Also, it seems to be a smart move to make the page size be a power of 2,
* since clients will often be written to send data in power-of-2 blocks.
* This avoids unnecessary tuple updates caused by partial-page writes.
*
* NB: Changing LOBLKSIZE requires an initdb.
*/
#define LOBLKSIZE (BLCKSZ / 4)

Here is my understanding, please correct me if I am wrong.
If the data is compressible, each 2k (2048) uncompressed pieces should be compressed to pieces smaller than 2000 bytes, so that 4 rows and more can fit in 8k data block.
In our case as jpeg2000 images are already compressed, the 2K pieces remain 2048 bytes after LO compression and only 3 pieces can fit in a block.
Can we change the LOBLKSIZE to 2000 bytes to fit 4 rows in a block as follow?
#define LOBLKSIZE 2000
Must the LOBLKSIZE be a power-of-2?
Is there any optimization expecting a power-of-2 value?

Thank you.

Jean-Marc Lessard
Administrateur de base de données / Database Administrator
Ultra Electronics Forensic Technology Inc.
T +1 514 489 4247 x4164
www.ultra-forensictechnology.com<http://www.ultra-forensictechnology.com>

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Rene Romero Benavides 2018-11-21 19:18:08 Re: replication lag despite corrective config
Previous Message Alessandro Aste 2018-11-21 17:45:57 Parallel query and number of connections.