Re: Processing very large TEXT columns (300MB+) using C/libpq

From: Geoff Winkless <pgsqladmin(at)geoff(dot)dj>
To: Cory Nemelka <cnemelka(at)gmail(dot)com>
Cc: "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org>
Subject: Re: Processing very large TEXT columns (300MB+) using C/libpq
Date: 2017-10-20 15:43:42
Message-ID: CAEzk6fct1KZouJAt4jdLseo+aeQGkAXfh=kZzE0Vz+8mMt7_Eg@mail.gmail.com
Lists: pgsql-admin

It's probably worth removing the iterating code Just In Case.

Apologies for egg-suck-education, but I assume you're not doing something
silly like

for (i=0; i < strlen(bigtextstring); i++) {
....
}

I know it sounds stupid, but you'd be amazed how often that crops up.
Because strlen() rescans the whole string on every iteration, the loop
becomes O(n^2); for small strings it doesn't matter, but for large
strings it's catastrophic.
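If that pattern is what's there, the usual fix is to compute the length once before the loop (and with libpq you can skip strlen() entirely, since PQgetlength() already knows the field's length). A minimal sketch, with a hypothetical count_newlines() helper standing in for whatever the real per-character work is:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper standing in for the real per-character work.
   strlen() is hoisted out of the loop condition, so the scan is O(n)
   instead of the O(n^2) you get with strlen() in the loop test. */
static size_t count_newlines(const char *s)
{
    size_t len = strlen(s);          /* computed once, not per iteration */
    size_t n = 0;

    for (size_t i = 0; i < len; i++) {
        if (s[i] == '\n')
            n++;
    }
    return n;
}
```

With a PGresult in hand, PQgetlength(res, row, col) returns the value's length directly, so even the single up-front strlen() can be avoided for a 300MB field.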

Geoff

On 20 October 2017 at 16:16, Cory Nemelka <cnemelka(at)gmail(dot)com> wrote:

> All I am doing is iterating through the characters, so I know it isn't
> my code.
>
> --cnemelka
>
> On Fri, Oct 20, 2017 at 9:14 AM, Cory Nemelka <cnemelka(at)gmail(dot)com> wrote:
>
>> Yes, but I should be able to read them much faster. The psql client can
>> display an 11MB column in a little over a minute, while in C using the
>> libpq library it takes over an hour.
>>
>> Does anyone have experience with the same issue who can help me resolve it?
>>
>> --cnemelka
>>
>> On Thu, Oct 19, 2017 at 5:20 PM, Aldo Sarmiento <aldo(at)bigpurpledot(dot)com>
>> wrote:
>>
>>> I believe large columns get put into a TOAST table. Max page size is 8k.
>>> So you'll have lots of pages per row that need to be joined with a size
>>> like that: https://www.postgresql.org/docs/9.5/static/storage-toast.html
>>>
>>> *Aldo Sarmiento*
>>> President & CTO
>>>
>>>
>>>
>>> 8687 Research Dr, Irvine, CA 92618
>>> *O*: (949) 223-0900 - *F: *(949) 727-4265
>>> aldo(at)bigpurpledot(dot)com | www.bigpurpledot.com
>>>
>>> On Thu, Oct 19, 2017 at 2:03 PM, Cory Nemelka <cnemelka(at)gmail(dot)com>
>>> wrote:
>>>
>>>> I have been getting very poor performance using libpq to process very
>>>> large TEXT columns (300MB+). I suspect it is IO related but can't be sure.
>>>>
>>>> Has anyone had experience with the same issue who can help me resolve it?
>>>>
>>>> --cnemelka
>>>>
>>>
>>>
>>
>
