Re: How to speed up product code and subcode match

From: Andrus <kobruleht2(at)hot(dot)ee>
To: "Bzm(at)g" <bzm(at)2bz(dot)de>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: How to speed up product code and subcode match
Date: 2023-05-23 13:38:30
Message-ID: 1e26fc09-46e4-6cdc-cec1-98bb5a975294@hot.ee
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi!

I ran

analyze toode;

create index vordlusajuhinnak_toode_pattern_idx on
vordlusajuhinnak(toode bpchar_pattern_ops);

create index vordlusajuhinnak_toode_idx on vordlusajuhinnak(toode);
analyze vordlusajuhinnak;

Select runs now more than one hour. Output from explain

explain create table peatoode as
select toode.toode , n2, n3, n4
from toode, vordlusajuhinnak
WHERE toode.toode=vordlusajuhinnak.toode OR  toode.toode LIKE
vordlusajuhinnak.toode||'/%'

"Gather  (cost=1000.55..443361894.55 rows=143828910 width=78)"
"  Workers Planned: 2"
"  ->  Nested Loop  (cost=0.55..428978003.55 rows=59928712 width=78)"
"        Join Filter: ((toode.toode = (vordlusajuhinnak.toode)::bpchar)
OR (toode.toode ~~ ((vordlusajuhinnak.toode)::text || '/%'::text)))"
"        ->  Parallel Index Only Scan using toode_pkey on toode 
(cost=0.55..95017.93 rows=303869 width=60)"
"        ->  Seq Scan on vordlusajuhinnak  (cost=0.00..721.33 rows=39433
width=32)"

with

Set enable_nestloop to off;

explain output is:

"Gather  (cost=10000001000.55..10443361906.55 rows=143828910 width=78)"
"  Workers Planned: 2"
"  ->  Nested Loop  (cost=10000000000.55..10428978015.55 rows=59928712
width=78)"
"        Join Filter: ((toode.toode = (vordlusajuhinnak.toode)::bpchar)
OR (toode.toode ~~ ((vordlusajuhinnak.toode)::text || '/%'::text)))"
"        ->  Parallel Index Only Scan using toode_pkey on toode 
(cost=0.55..95029.93 rows=303869 width=60)"
"        ->  Seq Scan on vordlusajuhinnak  (cost=0.00..721.33 rows=39433
width=32)"

How to speed it up?

Andrus.

23.05.2023 14:32 Bzm(at)g kirjutas:
> Great,
>
> However I think it is still way to slow.
> Next step is to run analyze also for the other table  vordlusajuhinnak.
>
> And make sure you have an index on vordlusajuhinnak.toode similar to
> the index on toode.toode
>
> --
> Boris
>
>
>> Am 23.05.2023 um 12:56 schrieb Andrus <kobruleht2(at)hot(dot)ee>:
>>
>> 
>>
>> Hi!
>>
>> I ran analyze firma2.toode and changed where clause to use like:
>>
>> create table peatoode as
>> select toode.toode , n2, n3, n4
>> from toode, vordlusajuhinnak
>> WHERE toode.toode=vordlusajuhinnak.toode OR  toode.toode LIKE
>> vordlusajuhinnak.toode||'/%'
>>
>> In this case it took 37 minutes, returned 277966 rows.
>>
>> Thank you for help.
>>
>> Andrus.
>>
>> 23.05.2023 11:24 Bzm(at)g kirjutas:
>>> Also your row count is way off I guess. Did you ever run analyze bigtable?
>>>
>>> --
>>> Boris
>>>
>>>
>>>> Am 23.05.2023 um 10:22 schriebbzm(at)2bz(dot)de:
>>>>
>>>> Hi there,
>>>>
>>>> I guess the main problem is the nested loop.
>>>>
>>>> As a quick recheck what happened if you run your query Without nested loop?
>>>>
>>>> This is not a solution but a quickt test
>>>>
>>>> In a Session
>>>>
>>>> Set enable_nestedloop = off;
>>>> Explain Select your query ;
>>>> --
>>>> Boris
>>>>
>>>>
>>>>> Am 23.05.2023 um 08:53 schrieb Andrus<kobruleht2(at)hot(dot)ee>:
>>>>>
>>>>> Hi!
>>>>>
>>>>> Price list of main products vordlusajuhinnak contains 3 prices for product (column toode) and has 39433 products:
>>>>>
>>>>> create table vordlusajuhinnak( toode varchar(60), n2 numeric(8,2), n3 numeric(8,2), n4 numeric(8,2) );
>>>>>
>>>>> toode column in unique, may be primary key in table and contains upper case letters, digits and - characters.
>>>>>
>>>>> product table (toode) contains 733021 products:
>>>>>
>>>>> CREATE TABLE toode (
>>>>> grupp character(1),
>>>>> toode character(60) primary key,
>>>>> ... lot of other columns
>>>>> );
>>>>>
>>>>> Both tables have pattern indexes to speed up queries:
>>>>>
>>>>> CREATE INDEX toode_toode_pattern_idx
>>>>> ON toode (toode bpchar_pattern_ops ASC NULLS LAST);
>>>>> -- This index is probably not used, should removed:
>>>>> CREATE INDEX vordlusajuhinnak_toode_pattern_idx ON vordlusajuhinnak(toode bpchar_pattern_ops);
>>>>>
>>>>> Product table as both main products and subproducts with sizes. Size is separated by main product code always by / character:
>>>>>
>>>>>
>>>>> SHOE1-BLACK
>>>>> SHOE1-BLACK/38
>>>>> SHOE1-BLACK/41
>>>>> SHOE1-BLACK/42
>>>>> SHOE1-BLACK/43
>>>>> SHOE2/XXL
>>>>> SHOE2/L
>>>>>
>>>>> Product codes contain upper characers only in this table also.
>>>>>
>>>>> Trying to get prices for all products using
>>>>>
>>>>> create table peatoode as
>>>>> select toode.toode , n2, n3, n4
>>>>> from toode, vordlusajuhinnak
>>>>> where toode.toode between vordlusajuhinnak.toode and vordlusajuhinnak.toode||'/z'
>>>>>
>>>>> Takes 4.65 hours. How to speed this query up?
>>>>> Output from explain:
>>>>>
>>>>>> "Nested Loop (cost=0.55..272273178.12 rows=3202240012 width=78)""
>>>>>> -> Seq Scan on vordlusajuhinnak (cost=0.00..721.33 rows=39433 width=32)" " -> Index Only Scan using toode_pkey on toode
>>>>>> (cost=0.55..6092.62 rows=81207 width=60)" " Index Cond: (toode
>>>>>>> = (vordlusajuhinnak.toode)::bpchar)" " Filter: ((toode)::text <= ((vordlusajuhinnak.toode)::text || '/z'::text))"
>>>>> Using PostgreSQL 13.2, compiled by Visual C++ build 1900, 64-bit in Windows server and psqlODBC driver.
>>>>> Upgrading Postgres is possible, if this helps.
>>>>>
>>>>> Tried also using like:
>>>>>
>>>>> WHERE toode.toode=vordlusajuhinnak.toode OR
>>>>> toode.toode LIKE vordlusajuhinnak.toode||'/%'
>>>>>
>>>>> Posted also in
>>>>>
>>>>> https://stackoverflow.com/questions/76311957/how-to-match-character-columns-with-separated-subcolumns
>>>>>
>>>>> Andrus.
>>>>>
>>>>>
>>>>>

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2023-05-23 14:00:25 Re: How to speed up product code and subcode match
Previous Message Stephen Frost 2023-05-23 12:46:57 Re: Would PostgreSQL 16 native transparent data encryption support database level encryption?