Re: very slow largeobject transfers through JDBC

From: Dave Cramer <pg(at)fastcrypt(dot)com>
To: Mate Varga <m(at)matevarga(dot)net>
Cc: Dmitry Igrishin <dmitigr(at)gmail(dot)com>, "pgsql-generallists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: very slow largeobject transfers through JDBC
Date: 2018-09-06 08:55:43
Message-ID: CADK3HHL+=dVadXZRNXRO69Ge=-boFV2Dfqh7Yse=Aasp+30=XQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi

Can you be more explicit how you fixed the problem ?

Thanks
Dave Cramer

davec(at)postgresintl(dot)com
www.postgresintl.com

On Thu, 6 Sep 2018 at 03:46, Mate Varga <m(at)matevarga(dot)net> wrote:

> After inlining the data, performance issues have been solved. Thanks for
> the help.
>
> On Mon, Sep 3, 2018 at 9:57 PM Mate Varga <m(at)matevarga(dot)net> wrote:
>
>> Thanks,
>> 1) we'll try to move stuff out from LOBs
>> 2) we might raise a PR for the JDBC driver
>>
>> Mate
>>
>> On Mon, 3 Sep 2018, 19:35 Dave Cramer, <pg(at)fastcrypt(dot)com> wrote:
>>
>>>
>>>
>>> On Mon, 3 Sep 2018 at 13:00, Mate Varga <m(at)matevarga(dot)net> wrote:
>>>
>>>> More precisely: when fetching 10k rows, JDBC driver just does a large
>>>> bunch of socket reads. With lobs, it's ping-pong: one read, one write per
>>>> lob...
>>>>
>>>>
>>> Ok, this is making more sense. In theory we could fetch them all but
>>> since they are LOB's we could run out of memory.
>>>
>>> Not sure what to tell you at this point. I'd entertain a PR if you were
>>> motivated.
>>>
>>> Dave Cramer
>>>
>>> davec(at)postgresintl(dot)com
>>> www.postgresintl.com
>>>
>>>
>>>
>>>>
>>>> On Mon, Sep 3, 2018 at 6:54 PM Mate Varga <m(at)matevarga(dot)net> wrote:
>>>>
>>>>> So I have detailed profiling results now. Basically it takes very long
>>>>> that for each blob, the JDBC driver reads from the socket then it creates
>>>>> the byte array on the Java side. Then it reads the next blob, etc. I guess
>>>>> this takes many network roundtrips.
>>>>>
>>>>> On Mon, Sep 3, 2018 at 5:58 PM Dave Cramer <pg(at)fastcrypt(dot)com> wrote:
>>>>>
>>>>>>
>>>>>> On Mon, 3 Sep 2018 at 10:48, Mate Varga <m(at)matevarga(dot)net> wrote:
>>>>>>
>>>>>>> That's 1690 msec (1.69 seconds, and that is how long it takes to
>>>>>>> fetch 20k (small-ish) rows without LOBs (LOBs are a few lines below on the
>>>>>>> screenshot)
>>>>>>>
>>>>>>
>>>>>> that sound high as well!
>>>>>>
>>>>>> Something isn't adding up..
>>>>>>
>>>>>>
>>>>>> Dave Cramer
>>>>>>
>>>>>> davec(at)postgresintl(dot)com
>>>>>> www.postgresintl.com
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> On Mon, Sep 3, 2018 at 4:40 PM Dave Cramer <pg(at)fastcrypt(dot)com> wrote:
>>>>>>>
>>>>>>>> the one you have highlighted ~1.69ms
>>>>>>>>
>>>>>>>> Dave Cramer
>>>>>>>>
>>>>>>>> davec(at)postgresintl(dot)com
>>>>>>>> www.postgresintl.com
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, 3 Sep 2018 at 10:38, Mate Varga <m(at)matevarga(dot)net> wrote:
>>>>>>>>
>>>>>>>>> Which frame do you refer to?
>>>>>>>>>
>>>>>>>>> On Mon, Sep 3, 2018 at 3:57 PM Dave Cramer <pg(at)fastcrypt(dot)com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Not sure why reading from a socket is taking 1ms ?
>>>>>>>>>>
>>>>>>>>>> Dave Cramer
>>>>>>>>>>
>>>>>>>>>> davec(at)postgresintl(dot)com
>>>>>>>>>> www.postgresintl.com
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, 3 Sep 2018 at 09:39, Mate Varga <m(at)matevarga(dot)net> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> https://imgur.com/a/ovsJPRv -- I've uploaded the profiling info
>>>>>>>>>>> (as an image, sorry). It seems this is a JDBC-level problem. I understand
>>>>>>>>>>> that the absolute timing is not meaningful at all because you don't know
>>>>>>>>>>> how large the resultset is, but I can tell that this is only a few
>>>>>>>>>>> thousands rows + few thousand largeobjects, each largeobject is around 1
>>>>>>>>>>> kByte. (Yes I know this is not a proper use of LOBs -- it's a legacy db
>>>>>>>>>>> structure that's hard to change.)
>>>>>>>>>>>
>>>>>>>>>>> Thanks.
>>>>>>>>>>> Mate
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Sep 3, 2018 at 11:52 AM Mate Varga <m(at)matevarga(dot)net>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hey,
>>>>>>>>>>>>
>>>>>>>>>>>> we'll try to test this with pure JDBC versus hibernate. Thanks!
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Sep 3, 2018 at 11:48 AM Dave Cramer <pg(at)fastcrypt(dot)com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, 3 Sep 2018 at 03:55, Mate Varga <m(at)matevarga(dot)net>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Basically there's a class with a byte[] field, the class is
>>>>>>>>>>>>>> mapped to table T and the byte field is annotated with @Lob so it goes to
>>>>>>>>>>>>>> the pg_largeobject table.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Ah, so hibernate is in the mix. I wonder if that is causing
>>>>>>>>>>>>> some challenges ?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> The DB is on separate host but relatively close to the app,
>>>>>>>>>>>>>> and I can reproduce the problem locally as well. One interesting bit is
>>>>>>>>>>>>>> that turning of SSL between the app and PSQL speeds up things by at least
>>>>>>>>>>>>>> 50%.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Ah, one addition -- the binary objects are encrypted, so
>>>>>>>>>>>>>> their entropy is very high.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Any chance you could write a simple non-hibernate test code
>>>>>>>>>>>>> to time the code ?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Dave Cramer
>>>>>>>>>>>>>
>>>>>>>>>>>>> dave(dot)cramer(at)crunchydata(dot)ca
>>>>>>>>>>>>> www.crunchydata.ca
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Mate
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sun, Sep 2, 2018 at 12:55 AM Dave Cramer <pg(at)fastcrypt(dot)com>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, 31 Aug 2018 at 10:15, Mate Varga <m(at)matevarga(dot)net>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I see -- we could try that, though we're mostly using an
>>>>>>>>>>>>>>>> ORM (Hibernate) to do this. Thanks!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Aug 31, 2018 at 3:57 PM Dmitry Igrishin <
>>>>>>>>>>>>>>>> dmitigr(at)gmail(dot)com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> пт, 31 авг. 2018 г. в 16:35, Mate Varga <m(at)matevarga(dot)net>:
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > Hi,
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > we're fetching binary data from pg_largeobject table.
>>>>>>>>>>>>>>>>> The data is not very large, but we ended up storing it there. If I'm
>>>>>>>>>>>>>>>>> copying the data to a file from the psql console, then it takes X time
>>>>>>>>>>>>>>>>> (e.g. a second), fetching it through the JDBC driver takes at least 10x
>>>>>>>>>>>>>>>>> more. We don't see this difference between JDBC and 'native' performance
>>>>>>>>>>>>>>>>> for anything except largeobjects (and bytea columns, for the record).
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > Does anyone have any advice about whether this can be
>>>>>>>>>>>>>>>>> tuned or what the cause is?
>>>>>>>>>>>>>>>>> I don't know what a reason of that, but I think it's
>>>>>>>>>>>>>>>>> reasonable and
>>>>>>>>>>>>>>>>> quite simple to call lo_import()/lo_export() via JNI.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Can't imagine that's any faster. The driver simply
>>>>>>>>>>>>>>> implements the protocol
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Do you have any code to share ? Any other information ?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Is the JDBC connection significantly further away network
>>>>>>>>>>>>>>> wise ?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Dave Cramer
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> davec(at)postgresintl(dot)com
>>>>>>>>>>>>>>> www.postgresintl.com
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Dave Page 2018-09-06 08:59:22 Re: Re: How to install pgAgent on windows for postresql-bigsql-10.5
Previous Message jimmy 2018-09-06 08:41:43 Re:Re: How to install pgAgent on windows for postresql-bigsql-10.5