Re: How different is AWS-RDS postgres?

From: Rob Sargent <robjsargent(at)gmail(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: How different is AWS-RDS postgres?
Date: 2021-05-27 22:23:49
Message-ID: dbd962e7-5dd4-eeab-aca0-3a0fd39ed36a@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 5/27/21 4:10 PM, Ron wrote:
> On 5/27/21 4:58 PM, Rob Sargent wrote:
>> On 5/27/21 3:08 PM, Sam Gendler wrote:
>>> The same JDBC connection that is resulting in lost data?  Sounds to
>>> me like you aren't connecting to the DB you think you are connecting to.
>>>
>> I almost wish that were true.
>>
>> However, looking at AWS "Performance Insights" is see the sql
>> statements generate by my app begin executed on the server.  Not
>> coincidentally this is from the "Top SQL (10)/Load by waits" view. 
>> Now that view does NOT, in point of fact, name the database in which
>> the sql is running, but the rest of my environment pretty much rules
>> out silently switching tracks on myself.  I have to read from the
>> correct database, using a UUID, to get data to analyze, then save the
>> results of the analysis back. I'm using my wonderful webapp to run
>> this and I've successfully analyzed and stored result for small
>> starting data sets.
>>
>> I just notice the "Top database" tab on that screen:  there is only
>> one and it's the correct one.
>>
>> I've reproduced the behaviour.  I'm pretty convinced it a size thing,
>> but which part of the system is being max'd out is not clear.  Not
>> CPU, but that's the only resource the "Performance Insight" mentions
>> (suggesting this UI wasn't designed by a database person).
>>
>> The loss of the staging table is most spectacular.  It filled from a
>> file which has 7.5M records. It's clear in the tomcat logs that is
>> was created and written to, one record read.  Then the "top sql"
>> happens:
>>
>> insert into segment select * from
>> bulk."rjs_GEV15_01_c9e224ca_85d2_40b4_ad46_327cfb9f0ac6" as s
>> where s.probandset_id >= ? and s.probandset_id < ?
>>
>> the "bulk" table is the staging table. The params are filled in with
>> a quasi-uuid which grab roughly 1/16th of the data.  In the stack
>> trace on my tomcat server I get
>>
>> Caused by: org.jooq.exception.DataAccessException: SQL [insert
>> into segment select * from
>> bulk."rjs_GEV15_01_c9e224ca_85d2_40b4_ad46_327cfb9f0ac6" as s
>> where  s.probandset_id >= '30000000-0000-0000-0000-00000000\
>> 0000' and s.probandset_id <
>> '40000000-0000-0000-0000-000000000000' ]; An I/O error occurred
>> while sending to the backend.
>>
>> So this would have been the fourth such insert statement read from
>> that staging table.
>>
>> That table is not deleted by my code.  It is renamed after the last
>> insert into segment, by appending "_done" to the name.  But that
>> table, by either name, is nowhere to be seen on the server.
>>
>> For those scoring at home, the trouble in the tomcat log start with
>>
>> 0:02:11.452 [https-jsse-nio-10.0.2.28-15002-exec-7] INFO
>> edu.utah.camplab.jx.PayloadFromMux -
>>
>
> Is there any way to replicate this in psql?
>
I have the json file which feeds the staging table and I have code which
is designed to load such files.  I suppose an sql file with those 16
insert statements would approximate the the apps call.  Let me give that
a shot.

> --
> Angular momentum makes the world go 'round.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Sam Gendler 2021-05-27 22:25:43 Re: How different is AWS-RDS postgres?
Previous Message Ron 2021-05-27 22:10:15 Re: How different is AWS-RDS postgres?