Re: Performance pb vs SQLServer.

From: John Arbash Meinel <john(at)arbash-meinel(dot)com>
To: Stéphane COEZ <scoez(at)harrysoftware(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Performance pb vs SQLServer.
Date: 2005-08-15 00:27:38
Message-ID: 42FFE17A.6010208@arbash-meinel.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Stéphane COEZ wrote:

>Hi,
>
>I have a perfomance issue :
>
>I run PG (8.0.3) and SQLServer2000 on a Windows2000 Server (P4 1,5Ghz 512Mo)
>I have a table (3200000 rows) and I run this single query :
>
>select cod from mytable group by cod
>I have an index on cod (char(4) - 88 different values)
>
>PG = ~ 20 sec.
>SQLServer = < 8 sec
>
>
>the explain is :
>
>HashAggregate (cost=64410.09..64410.09 rows=55 width=8)
> -> Seq Scan on mytable (cost=0.00..56325.27 rows=3233927 width=8)
>
>
>if I switch to "enable_hashagg = false" (just for a try...)
>the planner will choose my index :
>
>Group (cost=0.00..76514.01 rows=55 width=8)
> -> Index Scan using myindex on mytable (cost=0.00..68429.20 rows=3233927
>width=8)
>
>but performance will be comparable to previous test.
>
>So with or without using Index I have the same result.
>
>

My guess is that this is part of a larger query. There isn't really much
you can do. If you want all 3.2M rows, then you have to wait for them to
be pulled in.

What you generally can do for performance, is to restructure things, so
that you *don't* have to touch all 3.2M rows.
If you are just trying to determine what the unique entries are for cod,
you probably are better off doing some normalization, and keeping a
separate table of cod values.

I'm guessing the reason your query is faster with SQLServer is because
of how postgres handles MVCC. Basically, it still has to fetch the main
page to determine if a row exists. While SQL server doesn't do MVCC, so
it can just look things up in the index.

You might also try a different query, something like:

SELECT DISTINCT cod FROM mytable ORDER BY cod GROUP BY cod;
(You may or may not want order by, or group by, try the different
combinations.)
It might be possible to have the planner realize that all you want is
unique rows, just doing a group by doesn't give you that.

John
=:->

>
>Thanks for help.
>
>Stéphane COEZ
>
>
>
>
>---------------------------(end of broadcast)---------------------------
>TIP 9: In versions below 8.0, the planner will ignore your desire to
> choose an index scan if your joining column's datatypes do not
> match
>
>
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2005-08-15 00:38:00 Re: How many views is ok?
Previous Message John Arbash Meinel 2005-08-15 00:22:12 Re: How many views is ok?