Re: Speeding up query, Joining 55mil and 43mil records.

From: nicky <nicky(at)valuecare(dot)nl>
To: Sven Geisler <sgeisler(at)aeccom(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Speeding up query, Joining 55mil and 43mil records.
Date: 2006-06-22 12:10:50
Message-ID: 449A88CA.3000108@valuecare.nl
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hello Sven,

We have the following indexes on src_faktuur_verrsec
/
CREATE INDEX src_faktuur_verrsec_idx0
ON src.src_faktuur_verrsec
USING btree
(id);

CREATE INDEX src_faktuur_verrsec_idx1
ON src.src_faktuur_verrsec
USING btree
(substr(code::text, 1, 2));

CREATE INDEX src_faktuur_verrsec_idx2
ON src.src_faktuur_verrsec
USING btree
(substr(correctie::text, 4, 1));/

and another two on src_faktuur_verricht

/ CREATE INDEX src_faktuur_verricht_idx0
ON src.src_faktuur_verricht
USING btree
(id);

CREATE INDEX src_faktuur_verricht_idx1
ON src.src_faktuur_verricht
USING btree
(date_part('year'::text, datum))
TABLESPACE src_index;/

PostgreSQL elects not to use them. I assume, because it most likely
needs to traverse the entire table anyway.

if i change: / substr(t0.code,1,2) not in
('14','15','16','17')/
to (removing the NOT): / substr(t0.code,1,2) in ('14','15','16','17')/

it uses the index, but it's not the query that needs to be run anymore.

Greetings,
Nick

Sven Geisler wrote:
> Hi Nicky,
>
> Did you tried to create an index to avoid the sequential scans?
>
> Seq Scan on src_faktuur_verrsec t0...
>
> I think, you should try
>
> CREATE INDEX src.src_faktuur_verrsec_codesubstr ON
> src.src_faktuur_verrsec (substr(src.src_faktuur_verrsec.code,1,2))
>
> Cheers
> Sven.
>
> nicky schrieb:
>> Hello again,
>>
>> thanks for all the quick replies.
>>
>> It seems i wasn't entirely correct on my previous post, i've mixed up
>> some times/numbers.
>>
>> Below the correct numbers
>>
>> MSSQL: SELECT COUNT(*) from JOIN (without insert) 17 minutes
>> PostgreSQL: SELECT COUNT(*) from JOIN (without insert) 33 minutes
>> PostgreSQL: complete query 55 minutes
>
> <snip snip snip>
>>
>> A lot of improvement also in the select count: 33 minutes vs 10 minutes.
>>
>>
>> To us, the speeds are good. Very happy with the performance increase
>> on that select with join, since 90% of the queries are SELECT based.
>>
>> The query results in 7551616 records, so that's about 4500 inserts
>> per second. I'm not sure if that is fast or not. Any further tips
>> would be welcome.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
> choose an index scan if your joining column's datatypes do not
> match
>
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Magnus Hagander 2006-06-22 12:17:29 Re: Speeding up query, Joining 55mil and 43mil records.
Previous Message Sven Geisler 2006-06-22 11:29:41 Re: Speeding up query, Joining 55mil and 43mil records.