From: | Rob Sargent <robjsargent(at)gmail(dot)com> |
---|---|
To: | pgsql-general(at)lists(dot)postgresql(dot)org |
Subject: | Re: How different is AWS-RDS postgres? |
Date: | 2021-05-27 22:23:49 |
Message-ID: | dbd962e7-5dd4-eeab-aca0-3a0fd39ed36a@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On 5/27/21 4:10 PM, Ron wrote:
> On 5/27/21 4:58 PM, Rob Sargent wrote:
>> On 5/27/21 3:08 PM, Sam Gendler wrote:
>>> The same JDBC connection that is resulting in lost data? Sounds to
>>> me like you aren't connecting to the DB you think you are connecting to.
>>>
>> I almost wish that were true.
>>
>> However, looking at AWS "Performance Insights" is see the sql
>> statements generate by my app begin executed on the server. Not
>> coincidentally this is from the "Top SQL (10)/Load by waits" view.
>> Now that view does NOT, in point of fact, name the database in which
>> the sql is running, but the rest of my environment pretty much rules
>> out silently switching tracks on myself. I have to read from the
>> correct database, using a UUID, to get data to analyze, then save the
>> results of the analysis back. I'm using my wonderful webapp to run
>> this and I've successfully analyzed and stored result for small
>> starting data sets.
>>
>> I just notice the "Top database" tab on that screen: there is only
>> one and it's the correct one.
>>
>> I've reproduced the behaviour. I'm pretty convinced it a size thing,
>> but which part of the system is being max'd out is not clear. Not
>> CPU, but that's the only resource the "Performance Insight" mentions
>> (suggesting this UI wasn't designed by a database person).
>>
>> The loss of the staging table is most spectacular. It filled from a
>> file which has 7.5M records. It's clear in the tomcat logs that is
>> was created and written to, one record read. Then the "top sql"
>> happens:
>>
>> insert into segment select * from
>> bulk."rjs_GEV15_01_c9e224ca_85d2_40b4_ad46_327cfb9f0ac6" as s
>> where s.probandset_id >= ? and s.probandset_id < ?
>>
>> the "bulk" table is the staging table. The params are filled in with
>> a quasi-uuid which grab roughly 1/16th of the data. In the stack
>> trace on my tomcat server I get
>>
>> Caused by: org.jooq.exception.DataAccessException: SQL [insert
>> into segment select * from
>> bulk."rjs_GEV15_01_c9e224ca_85d2_40b4_ad46_327cfb9f0ac6" as s
>> where s.probandset_id >= '30000000-0000-0000-0000-00000000\
>> 0000' and s.probandset_id <
>> '40000000-0000-0000-0000-000000000000' ]; An I/O error occurred
>> while sending to the backend.
>>
>> So this would have been the fourth such insert statement read from
>> that staging table.
>>
>> That table is not deleted by my code. It is renamed after the last
>> insert into segment, by appending "_done" to the name. But that
>> table, by either name, is nowhere to be seen on the server.
>>
>> For those scoring at home, the trouble in the tomcat log start with
>>
>> 0:02:11.452 [https-jsse-nio-10.0.2.28-15002-exec-7] INFO
>> edu.utah.camplab.jx.PayloadFromMux -
>>
>
> Is there any way to replicate this in psql?
>
I have the json file which feeds the staging table and I have code which
is designed to load such files. I suppose an sql file with those 16
insert statements would approximate the the apps call. Let me give that
a shot.
> --
> Angular momentum makes the world go 'round.
From | Date | Subject | |
---|---|---|---|
Next Message | Sam Gendler | 2021-05-27 22:25:43 | Re: How different is AWS-RDS postgres? |
Previous Message | Ron | 2021-05-27 22:10:15 | Re: How different is AWS-RDS postgres? |