Re: bug report: slow getColumnTypeName

From: Dave Cramer <pg(at)fastcrypt(dot)com>
To: David Johnston <polobo(at)yahoo(dot)com>
Cc: Luis Flores <luiscamposflores(at)gmail(dot)com>, Eyal Wilde <eyal(dot)wilde(at)gmail(dot)com>, "pgsql-jdbc(at)postgresql(dot)org" <pgsql-jdbc(at)postgresql(dot)org>, dmp <danap(at)ttc-cmc(dot)net>
Subject: Re: bug report: slow getColumnTypeName
Date: 2012-10-11 13:40:16
Message-ID: CADK3HHJFR39tFnDdB2ATXFMih40hUVNU1jc5_Z7T-SMBji2SYA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-jdbc

David,

I don't think we can determine that from the server meta data, so I
would think the user is on their own in that case.

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca

On Thu, Oct 11, 2012 at 9:34 AM, David Johnston <polobo(at)yahoo(dot)com> wrote:
> What about a field that concatenates the serial value with a varchar to generate an auto-incrementing but non-numeric primary key? How should that be classified?
>
> David J.
>
> On Oct 11, 2012, at 3:55, Luis Flores <luiscamposflores(at)gmail(dot)com> wrote:
>
>> I'm still checking the fetchFieldMetadata(), and I will try to
>> identify a way of getting the serial info without the full penalty we
>> are having, or to understand if loading metadata one field at a time
>> is much slower than the current method.
>>
>> And I understand, and agree, it is strange to have irregular performance.
>>
>> But, for now, we can understand and justify the need to call
>> isAutoIncrement in int4 and int8 columns, so that we detect if they
>> are in reality serial columns. We can't justify the isAutoIncrement
>> call in all other cases. Aside from the huge performance hit, it's the
>> wrong algorithm, we shouldn't call a method if we don't need the
>> answer.
>>
>> The irregular performance comes from added complexity for some data
>> types, not from a buggy algorithm. To me it seems it would be easier
>> to understand the performance hit cause if it only happened on serial
>> columns.
>>
>>
>> Luis Flores
>>
>> On Thu, Oct 11, 2012 at 5:12 AM, Eyal Wilde <eyal(dot)wilde(at)gmail(dot)com> wrote:
>>> If i may... after we spent two days tracking this strange behavior, we
>>> decided to avoid getColumnTypeName at all. The optimization suggested would
>>> cause even more strange and un-solid performance, from the everage user's
>>> point of view.
>>>
>>> I believe this extra info of serial/bigserial is not available as a result
>>> of optimization. so maybe it's time to ask the pg-server guys, to arrange
>>> something in thier area?
>>>
>>> Regards.
>>>
>>> On Oct 11, 2012 12:46 AM, "Luis Flores" <luiscamposflores(at)gmail(dot)com> wrote:
>>>>
>>>> I'm sorry, but I'm a bit sleepy ...
>>>>
>>>> I believe the code is more readable/better like this:
>>>>
>>>> public String getColumnTypeName(int column) throws SQLException
>>>> {
>>>> String type = getPGType(column);
>>>> if ( ( "int4".equals(type) && isAutoIncrement(column)) {
>>>> type = "serial";
>>>> } else if ("int8".equals(type) && isAutoIncrement(column)) {
>>>> type "bigserial";
>>>> }
>>>>
>>>> return type;
>>>> }
>>>>
>>>> Luis Flores
>>>>
>>>> On Wed, Oct 10, 2012 at 11:39 PM, Luis Flores
>>>> <luiscamposflores(at)gmail(dot)com> wrote:
>>>>> I've looked at the AbstractJdbc2ResultSetMetadata getColumnTypeName,
>>>>> for version 1000 and 802.
>>>>>
>>>>> The change is simple, there is an extra call to isAutoIncrement, to be
>>>>> able to return correct values in the serial and bigserial cases.
>>>>>
>>>>> The isAutoIncrement call is slow because it triggers
>>>>> fetchFieldMetadata, witch get all metada for all fields.
>>>>>
>>>>> One simple optimization is to change the current method to:
>>>>>
>>>>> public String getColumnTypeName(int column) throws SQLException
>>>>> {
>>>>> String type = getPGType(column);
>>>>> if ( ( "int4".equals(type) || "int8".equals(type) ) &&
>>>>> isAutoIncrement(column)) {
>>>>> if ("int4".equals(type)) {
>>>>> return "serial";
>>>>> } else if ("int8".equals(type)) {
>>>>> return "bigserial";
>>>>> }
>>>>> }
>>>>>
>>>>> return type;
>>>>> }
>>>>>
>>>>> In this case, the isAutoIncrement is only called on int4 and int8
>>>>> columns, causing the performance for all the other column types to
>>>>> remain the same.
>>>>>
>>>>> May they are better options, but I think this change is good, it
>>>>> delays fetching metadata, and speeds up the method, without side
>>>>> effects.
>>>>>
>>>>> Luis Flores
>>>>>
>>>>> On Wed, Oct 10, 2012 at 11:10 PM, Luis Flores
>>>>> <luiscamposflores(at)gmail(dot)com> wrote:
>>>>>> I also confirm the performance regression.
>>>>>>
>>>>>> Testing on Fedora17 64bit + PostgreSQL 9.1 + jdk 1.6.0_33, localhost
>>>>>> server, I get these numbers:
>>>>>> driver 802
>>>>>> getColumnType time (ms): 2777
>>>>>> getColumnTypeName time (ms): 1847
>>>>>> both time (ms): 1948
>>>>>>
>>>>>> driver 903
>>>>>> getColumnType time (ms): 3044
>>>>>> getColumnTypeName time (ms): 27123
>>>>>> both time (ms): 25142
>>>>>>
>>>>>> driver 1000
>>>>>> getColumnType time (ms): 2928
>>>>>> getColumnTypeName time (ms): 27214
>>>>>> both time (ms): 26407
>>>>>>
>>>>>> During the getColumnTypeName tests postgresql daemon used 100% cpu
>>>>>> time.
>>>>>>
>>>>>> Here is the full test class:
>>>>>>
>>>>>> import java.sql.Connection;
>>>>>> import java.sql.DriverManager;
>>>>>> import java.sql.ResultSet;
>>>>>> import java.sql.ResultSetMetaData;
>>>>>> import java.sql.Statement;
>>>>>>
>>>>>>
>>>>>> public class Main
>>>>>> {
>>>>>> public static void main( String arg[] )
>>>>>> throws Exception
>>>>>> {
>>>>>> Class.forName( "org.postgresql.Driver" );
>>>>>> Connection conn = DriverManager.getConnection(
>>>>>> "jdbc:postgresql://localhost/test", "postgres", "" );
>>>>>> Statement create = conn.createStatement();
>>>>>> create.execute( "CREATE TABLE t(id SERIAL, name
>>>>>> VARCHAR);" );
>>>>>> create.close();
>>>>>> long l = System.currentTimeMillis();
>>>>>> for( int i = 0; i < 10000; ++i )
>>>>>> {
>>>>>> Statement stm = conn.createStatement();
>>>>>> stm.execute( "SELECT id, name FROM t;" );
>>>>>> ResultSet rs = stm.getResultSet();
>>>>>> ResultSetMetaData rsmd = rs.getMetaData();
>>>>>> rsmd.getColumnType( 1 );
>>>>>> // rsmd.getColumnTypeName( 1 );
>>>>>> rs.close();
>>>>>> stm.close();
>>>>>> }
>>>>>> System.out.println( "getColumnType time (ms): " + (
>>>>>> System.currentTimeMillis() - l ) );
>>>>>> l = System.currentTimeMillis();
>>>>>> for( int i = 0; i < 10000; ++i )
>>>>>> {
>>>>>> Statement stm = conn.createStatement();
>>>>>> stm.execute( "SELECT id, name FROM t;" );
>>>>>> ResultSet rs = stm.getResultSet();
>>>>>> ResultSetMetaData rsmd = rs.getMetaData();
>>>>>> // rsmd.getColumnType( 1 );
>>>>>> rsmd.getColumnTypeName( 1 );
>>>>>> rs.close();
>>>>>> stm.close();
>>>>>> }
>>>>>> System.out.println( "getColumnTypeName time (ms): " + (
>>>>>> System.currentTimeMillis() - l ) );
>>>>>> l = System.currentTimeMillis();
>>>>>> for( int i = 0; i < 10000; ++i )
>>>>>> {
>>>>>> Statement stm = conn.createStatement();
>>>>>> stm.execute( "SELECT id, name FROM t;" );
>>>>>> ResultSet rs = stm.getResultSet();
>>>>>> ResultSetMetaData rsmd = rs.getMetaData();
>>>>>> rsmd.getColumnType( 1 );
>>>>>> rsmd.getColumnTypeName( 1 );
>>>>>> rs.close();
>>>>>> stm.close();
>>>>>> }
>>>>>> System.out.println( "both time (ms): " + (
>>>>>> System.currentTimeMillis() - l ) );
>>>>>> Statement drop = conn.createStatement();
>>>>>> drop.execute( "DROP TABLE t;" );
>>>>>> drop.close();
>>>>>> conn.close();
>>>>>> }
>>>>>> }
>>>>>>
>>>>>>
>>>>>> Luis Flores
>>>>>>
>>>>>> On Tue, Oct 9, 2012 at 5:48 PM, dmp <danap(at)ttc-cmc(dot)net> wrote:
>>>>>>> Eyal Wilde wrote:
>>>>>>>>
>>>>>>>> I verified with wireshark that getColumnTypeName indeed do a request
>>>>>>>> to
>>>>>>>> the postgres server. This happens with 9.1-901 and 9.1-903 BUT NOT
>>>>>>>> WITH
>>>>>>>> 9.0-801!!
>>>>>>>>
>>>>>>>> So, the reason for the slowness seems to be obviouse now.
>>>>>>>>
>>>>>>>> it will take me some time to wrap up it into a test case. I would
>>>>>>>> realy
>>>>>>>> like to help, but please let me know if it's realy necessary.
>>>>>>>>
>>>>>>>> On Oct 5, 2012 1:07 AM, "Eyal Wilde" <eyal(at)impactsoft(dot)co(dot)il
>>>>>>>> <mailto:eyal(at)impactsoft(dot)co(dot)il>> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> ResultSetMetaData __md = __rs.getMetaData();
>>>>>>>> //this is fine
>>>>>>>> String __sf1name = __md.getColumnName(1);
>>>>>>>> //this is fine
>>>>>>>> int __if1type = __md.getColumnType(1);
>>>>>>>> //this is fine
>>>>>>>> String __sf1type =
>>>>>>>> __md.getColumnTypeName(1); //this is SLOW!! ~15msec
>>>>>>>>
>>>>>>>> postgres server version is 9.1.5
>>>>>>>> jdbc version, i checked both 9.1-901 and 9.1-903
>>>>>>>> and... BTW, 9.0-801 works good (while connected pg9.1)!
>>>>>>>
>>>>>>>
>>>>>>> There does appear to be a change in the code that may have created the
>>>>>>> slowness that you are observing. Please try a a test case in which two
>>>>>>> back to back getColumnTypeName() calls are made. Is there a difference
>>>>>>> in time between the first and second and is there still on the second
>>>>>>> call a request to the postgres server.
>>>>>>>
>>>>>>> The code between 9.0-801 and later version does have a change in it
>>>>>>> that looks like for the caching for field metadata through
>>>>>>> fetchFieldMetaData().
>>>>>>> That method is called in the later versions for getColumnTypeName()
>>>>>>> with isAutoIncrement(), with was also added in later versions.
>>>>>>>
>>>>>>> danap.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Sent via pgsql-jdbc mailing list (pgsql-jdbc(at)postgresql(dot)org)
>>>>>>> To make changes to your subscription:
>>>>>>> http://www.postgresql.org/mailpref/pgsql-jdbc
>>
>>
>> --
>> Sent via pgsql-jdbc mailing list (pgsql-jdbc(at)postgresql(dot)org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-jdbc
>
>
> --
> Sent via pgsql-jdbc mailing list (pgsql-jdbc(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-jdbc

In response to

Browse pgsql-jdbc by date

  From Date Subject
Next Message Luis Flores 2012-10-11 14:07:09 Re: bug report: slow getColumnTypeName
Previous Message David Johnston 2012-10-11 13:34:38 Re: bug report: slow getColumnTypeName