On 06/07/2018 12:55 PM, Robert Creager wrote:
>
>
>> On Jun 7, 2018, at 12:40 PM, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> wrote:
>>
>> On 06/07/2018 11:17 AM, Robert Creager wrote:
>>> I have a system running FreeBSD 11.1-STABLE, PostgreSQL 9.6.8, Java
>>> OpenJDK 1.8.0_131, jdbc 9.3-1104-jdbc41 which is exhibiting very
>>> bizarre behavior. A query is executing against a couple of tables
>>> that have 1 and 0 records in them. ds3.job_entry has 0 rows,
>>> ds3.blob has 1 row. If I execute the query manually via command
>>> line, it executes fine. There are no other active queries, no
>>> locks. The system is idle, in our Dev Test group, so I can leave it
>>> this way for a bit of time. The general software setup is in
>>> production and I’ve seen nothing like this before. Even a system
>>> with 300M ds3.blob entries executes this query fine.
>>
>> So I am assuming the problem query is being run through Java/jdbc,
>> correct?
>
> Yes.
>
>>
>> There is also the below in the log:
>>
>> " ... execute fetch from S_2037436/C_2037437 …"
>
> So, that means nothing to me. Something to you?
>
>>
>> My guess is that we will need to see the Java code that sets up and
>> runs the query. Is that possible?
>
> OMG, you had to ask ;-) Not really, for two reasons. It’s an actual
> shipping product, and I’d have to send you so much code to get from
> the source of the query down through the execute…. Now, the execute,
> on its own, is below. m_statement is a java.sql.PreparedStatement.
> Keep in mind this code is literally executing millions of times a day
> on a busy system. Except for this special snowflake system...
>
> private Executor executeQuery()
> {
>     final MonitoredWork monitoredWork = new MonitoredWork(
>             StackTraceLogging.NONE, m_readableSql.getLogMessagePreExecution() );
>     try
>     {
>         m_closeTransactionUponClose = ( null == m_transactionNumber );
>         m_statement.setFetchSize( 10000 );
>         final Duration duration = new Duration();
>         m_resultSet = m_statement.executeQuery();
>         m_readableSql.log( duration, null );
>         return this;
>     }
>     catch ( final SQLException ex )
>     {
>         throw new DaoException(
>                 "Failed to execute: " + m_readableSql.getLogMessagePreExecution(),
>                 ex );
>     }
>     finally
>     {
>         monitoredWork.completed();
>     }
> }
>
>>
>>> Jun 7 17:24:21 blackpearl postgres[10670]: [7737-1]
>>> db=tapesystem,user=Administrator,app=[unknown],client=127.0.0.1
>>> LOG: duration: *2903612.206* ms execute fetch
>>> from S_2037436/C_2037437: SELECT * FROM ds3.blob WHERE EXISTS
>>> (SELECT * FROM ds3.job_entry WHERE blob_id = ds3.blob.id
>>> <http://ds3.blob.id> AND (job_id = $1))
>>> tapesystem=# explain analyze SELECT * FROM ds3.blob WHERE EXISTS
>>> (SELECT * FROM ds3.job_entry WHERE blob_id = ds3.blob.id
>>> <http://ds3.blob.id> AND (job_id =
>>> 'b51357cd-e07a-4c87-a50b-999c347a5c71'));
>>> QUERY PLAN
>>> ---------------------------------------------------------------------------------------------------------------------------------------
>>> Nested Loop (cost=0.84..6.89 rows=1 width=77) (actual
>>> time=0.044..0.044 rows=0 loops=1)
>>> -> Index Scan using job_entry_job_id_idx on
>>> job_entry (cost=0.42..2.44 rows=1 width=16) (actual
>>> time=0.042..0.042 rows=0 loops=1)
>>> Index Cond: (job_id =
>>> 'b51357cd-e07a-4c87-a50b-999c347a5c71'::uuid)
>>> -> Index Scan using blob_pkey on blob (cost=0.42..4.44 rows=1
>>> width=77) (never executed)
>>> Index Cond: (id = job_entry.blob_id)
>>> Planning time: 0.388 ms
>>> Execution time: 0.118 ms
>>> pid:         10670
>>> client_port: 11211
>>> runtime:     0 years 0 mons 0 days 0 hours 43 mins 28.852273 secs
>>> query_start: 2018-06-07 17:24:22.026384
>>> datname:     tapesystem
>>> state:       active
>>> query:       SELECT * FROM ds3.blob WHERE EXISTS (SELECT * FROM
>>>              ds3.job_entry WHERE blob_id = ds3.blob.id
>>>              <http://ds3.blob.id> AND (job_id = $1))
>>> usename:     Administrator
>>> From the system with 300M ds3.blob entries, which works fine (along
>>> with every other system in house):
>>> QUERY PLAN
>>> Nested Loop (cost=0.57..738052.90 rows=164379 width=75) (actual
>>> time=1.001..1947.029 rows=50000 loops=1)
>>> -> Seq Scan on job_entry (cost=0.00..10039.50 rows=164379
>>> width=16) (actual time=0.871..56.442 rows=50000 loops=1)
>>> Filter: (job_id = 'ef27d2fa-2727-424e-8f44-da9e33a5ca05'::uuid)
>>> Rows Removed by Filter: 60001
>>> -> Index Scan using blob_pkey on blob (cost=0.57..4.42 rows=1
>>> width=75) (actual time=0.037..0.037 rows=1 loops=50000)
>>> Index Cond: (id = job_entry.blob_id)
>>> Planning time: 6.330 ms
>>> Execution time: 1951.564 ms
>>> Please keep my CC of my work e-mail present.
>>> Best,
>>> Robert
>>
>>
>> --
>> Adrian Klaver
>> adrian(dot)klaver(at)aklaver(dot)com
>
What's the URL doing in "blob_id = ds3.blob.id <http://ds3.blob.id>"?
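
As for the "execute fetch from S_2037436/C_2037437" wording in the log: that is the JDBC driver at work. With autocommit off and a non-zero fetch size, the PostgreSQL JDBC driver runs the query through a server-side portal (the C_nnnnnnn) and pulls rows in batches, and once a PreparedStatement has been reused often enough (prepareThreshold, 5 by default) it also switches to a named server-side statement (the S_nnnnnnn). Purely for illustration, here is a minimal standalone sketch of that execution path; the connection URL, credentials and the job_id below are placeholders, not values taken from the product:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.UUID;

public class PortalFetchSketch
{
    public static void main( final String[] args ) throws Exception
    {
        // Placeholder connection settings; adjust for the real environment.
        try ( final Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://127.0.0.1:5432/tapesystem", "Administrator", "changeme" ) )
        {
            // With autocommit off and a fetch size set, pgjdbc opens a named
            // portal and issues batched fetches, which is what shows up in the
            // server log as "execute fetch from S_xxx/C_xxx".
            conn.setAutoCommit( false );

            final String sql = "SELECT * FROM ds3.blob WHERE EXISTS "
                    + "(SELECT * FROM ds3.job_entry WHERE blob_id = ds3.blob.id AND (job_id = ?))";

            try ( final PreparedStatement stmt = conn.prepareStatement( sql ) )
            {
                stmt.setFetchSize( 10000 );
                // Placeholder job id; pgjdbc maps java.util.UUID to the uuid type.
                stmt.setObject( 1, UUID.fromString( "b51357cd-e07a-4c87-a50b-999c347a5c71" ) );

                try ( final ResultSet rs = stmt.executeQuery() )
                {
                    while ( rs.next() )
                    {
                        System.out.println( rs.getObject( "id" ) );
                    }
                }
            }
            conn.commit();
        }
    }
}

One thing worth keeping in mind when comparing the two plans: the slow statement in the log is the parameterized form (job_id = $1), while the fast EXPLAIN ANALYZE was run with a literal uuid, so the two are not necessarily planned identically.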