From: | Dave Cramer <pg(at)fastcrypt(dot)com> |
---|---|
To: | m(at)matevarga(dot)net |
Cc: | dmitigr(at)gmail(dot)com, "pgsql-generallists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org> |
Subject: | Re: very slow largeobject transfers through JDBC |
Date: | 2018-09-03 17:35:30 |
Message-ID: | CADK3HH+ebzr05+_PZqdR3t4=15VVRk8Haw=a=hArocfBYgbKEQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Mon, 3 Sep 2018 at 13:00, Mate Varga <m(at)matevarga(dot)net> wrote:
> More precisely: when fetching 10k rows, JDBC driver just does a large
> bunch of socket reads. With lobs, it's ping-pong: one read, one write per
> lob...
>
>
Ok, this is making more sense. In theory we could fetch them all but since
they are LOB's we could run out of memory.
Not sure what to tell you at this point. I'd entertain a PR if you were
motivated.
Dave Cramer
davec(at)postgresintl(dot)com
www.postgresintl.com
>
> On Mon, Sep 3, 2018 at 6:54 PM Mate Varga <m(at)matevarga(dot)net> wrote:
>
>> So I have detailed profiling results now. Basically it takes very long
>> that for each blob, the JDBC driver reads from the socket then it creates
>> the byte array on the Java side. Then it reads the next blob, etc. I guess
>> this takes many network roundtrips.
>>
>> On Mon, Sep 3, 2018 at 5:58 PM Dave Cramer <pg(at)fastcrypt(dot)com> wrote:
>>
>>>
>>> On Mon, 3 Sep 2018 at 10:48, Mate Varga <m(at)matevarga(dot)net> wrote:
>>>
>>>> That's 1690 msec (1.69 seconds, and that is how long it takes to fetch
>>>> 20k (small-ish) rows without LOBs (LOBs are a few lines below on the
>>>> screenshot)
>>>>
>>>
>>> that sound high as well!
>>>
>>> Something isn't adding up..
>>>
>>>
>>> Dave Cramer
>>>
>>> davec(at)postgresintl(dot)com
>>> www.postgresintl.com
>>>
>>>
>>>
>>>>
>>>> On Mon, Sep 3, 2018 at 4:40 PM Dave Cramer <pg(at)fastcrypt(dot)com> wrote:
>>>>
>>>>> the one you have highlighted ~1.69ms
>>>>>
>>>>> Dave Cramer
>>>>>
>>>>> davec(at)postgresintl(dot)com
>>>>> www.postgresintl.com
>>>>>
>>>>>
>>>>> On Mon, 3 Sep 2018 at 10:38, Mate Varga <m(at)matevarga(dot)net> wrote:
>>>>>
>>>>>> Which frame do you refer to?
>>>>>>
>>>>>> On Mon, Sep 3, 2018 at 3:57 PM Dave Cramer <pg(at)fastcrypt(dot)com> wrote:
>>>>>>
>>>>>>> Not sure why reading from a socket is taking 1ms ?
>>>>>>>
>>>>>>> Dave Cramer
>>>>>>>
>>>>>>> davec(at)postgresintl(dot)com
>>>>>>> www.postgresintl.com
>>>>>>>
>>>>>>>
>>>>>>> On Mon, 3 Sep 2018 at 09:39, Mate Varga <m(at)matevarga(dot)net> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> https://imgur.com/a/ovsJPRv -- I've uploaded the profiling info
>>>>>>>> (as an image, sorry). It seems this is a JDBC-level problem. I understand
>>>>>>>> that the absolute timing is not meaningful at all because you don't know
>>>>>>>> how large the resultset is, but I can tell that this is only a few
>>>>>>>> thousands rows + few thousand largeobjects, each largeobject is around 1
>>>>>>>> kByte. (Yes I know this is not a proper use of LOBs -- it's a legacy db
>>>>>>>> structure that's hard to change.)
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>> Mate
>>>>>>>>
>>>>>>>> On Mon, Sep 3, 2018 at 11:52 AM Mate Varga <m(at)matevarga(dot)net> wrote:
>>>>>>>>
>>>>>>>>> Hey,
>>>>>>>>>
>>>>>>>>> we'll try to test this with pure JDBC versus hibernate. Thanks!
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Sep 3, 2018 at 11:48 AM Dave Cramer <pg(at)fastcrypt(dot)com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, 3 Sep 2018 at 03:55, Mate Varga <m(at)matevarga(dot)net> wrote:
>>>>>>>>>>
>>>>>>>>>>> Basically there's a class with a byte[] field, the class is
>>>>>>>>>>> mapped to table T and the byte field is annotated with @Lob so it goes to
>>>>>>>>>>> the pg_largeobject table.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Ah, so hibernate is in the mix. I wonder if that is causing some
>>>>>>>>>> challenges ?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> The DB is on separate host but relatively close to the app, and
>>>>>>>>>>> I can reproduce the problem locally as well. One interesting bit is that
>>>>>>>>>>> turning of SSL between the app and PSQL speeds up things by at least 50%.
>>>>>>>>>>>
>>>>>>>>>>> Ah, one addition -- the binary objects are encrypted, so their
>>>>>>>>>>> entropy is very high.
>>>>>>>>>>>
>>>>>>>>>>> Any chance you could write a simple non-hibernate test code to
>>>>>>>>>> time the code ?
>>>>>>>>>>
>>>>>>>>>> Dave Cramer
>>>>>>>>>>
>>>>>>>>>> dave(dot)cramer(at)crunchydata(dot)ca
>>>>>>>>>> www.crunchydata.ca
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Mate
>>>>>>>>>>>
>>>>>>>>>>> On Sun, Sep 2, 2018 at 12:55 AM Dave Cramer <pg(at)fastcrypt(dot)com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, 31 Aug 2018 at 10:15, Mate Varga <m(at)matevarga(dot)net>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I see -- we could try that, though we're mostly using an ORM
>>>>>>>>>>>>> (Hibernate) to do this. Thanks!
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Aug 31, 2018 at 3:57 PM Dmitry Igrishin <
>>>>>>>>>>>>> dmitigr(at)gmail(dot)com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> пт, 31 авг. 2018 г. в 16:35, Mate Varga <m(at)matevarga(dot)net>:
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > Hi,
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > we're fetching binary data from pg_largeobject table. The
>>>>>>>>>>>>>> data is not very large, but we ended up storing it there. If I'm copying
>>>>>>>>>>>>>> the data to a file from the psql console, then it takes X time (e.g. a
>>>>>>>>>>>>>> second), fetching it through the JDBC driver takes at least 10x more. We
>>>>>>>>>>>>>> don't see this difference between JDBC and 'native' performance for
>>>>>>>>>>>>>> anything except largeobjects (and bytea columns, for the record).
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > Does anyone have any advice about whether this can be tuned
>>>>>>>>>>>>>> or what the cause is?
>>>>>>>>>>>>>> I don't know what a reason of that, but I think it's
>>>>>>>>>>>>>> reasonable and
>>>>>>>>>>>>>> quite simple to call lo_import()/lo_export() via JNI.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>> Can't imagine that's any faster. The driver simply implements
>>>>>>>>>>>> the protocol
>>>>>>>>>>>>
>>>>>>>>>>>> Do you have any code to share ? Any other information ?
>>>>>>>>>>>>
>>>>>>>>>>>> Is the JDBC connection significantly further away network wise ?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Dave Cramer
>>>>>>>>>>>>
>>>>>>>>>>>> davec(at)postgresintl(dot)com
>>>>>>>>>>>> www.postgresintl.com
>>>>>>>>>>>>
>>>>>>>>>>>
From | Date | Subject | |
---|---|---|---|
Next Message | Austin Drenski | 2018-09-03 18:41:35 | Re: WARNING: could not flush dirty data: Function not implemented |
Previous Message | Dmitri Maziuk | 2018-09-03 17:32:32 | Re: WARNING: could not flush dirty data: Function not implemented |