Re: Need optimization in query

From: Shubham Mittal <mittalshubham30(at)gmail(dot)com>
To: PostgreSQL General <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Need optimization in query
Date: 2022-06-13 08:21:16
Message-ID: CA+ERcR8Vg47EyowDG-Xr7Dubb_SON9CD_kP5Yw3M6V9=q0K0Kw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi Team,

Does anybody tried to have a look at it and would like to suggest any
optimisations?

Thanks

On Sat, Jun 11, 2022 at 12:48 AM Shubham Mittal <mittalshubham30(at)gmail(dot)com>
wrote:

> Hi Team.
>
> *I have a use case to get the result as follows:*
>
> 1. asin_ymm is never null.
> 2. If there is more than 1 entry for an asin_ymm with both null and
> non-null submodelId, I should return rows with non-null submodelId only,
> otherwise if there is no submodelid present for a asin_ymm, then return
> that row with null submodelid.
> 3. Also if submodelid is null , assuming fitment_key would always be null
> in the table.
> 4. Using that resultset, If there is more than 1 entry for an (asin_ymm
> ,SubmodelID) with both null and non-null fitment_key, I should return rows
> with non-null fitment_key only, otherwise if there is no fitment_key
> present for a (asin_ymm,submodelId), then return that row with null
> fitment_key.
> 5. Using that resultset, i need to return those rows having maximum values
> in fitment_key(ie. for eg.(out of these two 'A', 'D','1--3-4' and A',
> 'D','1-2-3-4-7', i should return row having A', 'D','1-2-3-4-7)
>
> create table fitment_records(asin_ymm varchar(50), SubmodelID varchar(50),
> fitment_key varchar(50));
>
> insert into fitment_records(asin_ymm, SubmodelID,fitment_key) values('A',
> null,null);
> insert into fitment_records(asin_ymm, SubmodelID,fitment_key) values('B',
> null,null);
> insert into fitment_records(asin_ymm, SubmodelID,fitment_key ) values('C',
> null,null);
> insert into fitment_records(asin_ymm, SubmodelID,fitment_key) values('A',
> 'D','1--3-4');
> insert into fitment_records(asin_ymm, SubmodelID,fitment_key) values('B',
> 'E','2-3-4-5');
> insert into fitment_records(asin_ymm, SubmodelID,fitment_key) values('B',
> 'E', null);
> insert into fitment_records(asin_ymm, SubmodelID,fitment_key) values('A',
> 'F','2-3');
> insert into fitment_records(asin_ymm, SubmodelID,fitment_key) values('A',
> 'E', null);
> insert into fitment_records(asin_ymm, SubmodelID,fitment_key) values('A',
> 'D', null);
> insert into fitment_records(asin_ymm, SubmodelID,fitment_key) values('A',
> 'D','1-2-3-4-7');
>
> output should be:
>
> asin_ymm | submodelid | fitment_key
> ----------+------------+-------------
> A | D | 1-2-3-4-7
> A | E | null
> A | F | 2-3
> B | E | 2-3-4-5
> C | null | null
>
> Currently i have written these queries for this usecase. Can we optimise it further? Considering data is in millions
>
> create temporary view tv1 as (SELECT *
> FROM fitment_records fr_1
> WHERE fitment_key IS NOT NULL OR
> (fitment_key IS NULL AND
> NOT EXISTS (SELECT 1 FROM fitment_records fr_2
> WHERE fr_2.asin_ymm = fr_1.asin_ymm AND
> fr_2.SubmodelID = fr_1.SubmodelID and
> fr_2.fitment_key IS NOT NULL)));
>
> create temporary view tv2 as (select *
> FROM tv1 fr_1
> WHERE SubmodelID IS NOT NULL OR
> (SubmodelID IS NULL AND
> NOT EXISTS (SELECT 1 FROM fitment_records fr_2
> WHERE fr_2.asin_ymm = fr_1.asin_ymm AND
> fr_2.SubmodelID IS NOT NULL) ));
>
> create temporary view fitment_records_with_fitment_key_size as (
> select asin_ymm, SubmodelID, fitment_key, Array_Length(string_to_array(fitment_key, '-'),1) as fitment_key_size
> from tv2
> where SubmodelID is not null
> and fitment_key is not null
> group by asin_ymm, SubmodelID, fitment_key
> );
>
> create temporary view fitment_records_with_fitment_key_max_size as (
> select asin_ymm, SubmodelID, max(fitment_key_size) as max_fitment_key_size
> from fitment_records_with_fitment_key_size
> group by asin_ymm, SubmodelID
> );
>
> select * from tv2
> except
> select f2.*
> from fitment_records_with_fitment_key_size frws, fitment_records_with_fitment_key_max_size frwms,
> tv2 f2
> where frws.asin_ymm = frwms.asin_ymm
> AND frws.SubmodelID = frwms.SubmodelID
> AND frws.fitment_key_size < frwms.max_fitment_key_size
> AND frws.SubmodelID = f2.SubmodelID
> AND frws.asin_ymm = f2.asin_ymm
> AND frws.fitment_key = f2.fitment_key;
>
> Thanks & Regards
>
>
>

In response to

Browse pgsql-general by date

  From Date Subject
Next Message houzj.fnst@fujitsu.com 2022-06-13 10:08:16 RE: Support logical replication of DDLs
Previous Message Pavel Stehule 2022-06-13 06:07:01 Re: "A block containing an EXCEPTION clause is significantly more expensive to enter and exit than a block without one"