Re: Postgresql Sort cost Poor performance?

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: "tank(dot)zhang" <6220104(at)qq(dot)com>
Cc: postgres performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Postgresql Sort cost Poor performance?
Date: 2019-04-02 07:42:07
Message-ID: CAKJS1f8aMw0Fbd7K1D4n08fRMUS_19D9pTNXd6u9Qi+eMj6wOQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Tue, 2 Apr 2019 at 20:00, tank.zhang <6220104(at)qq(dot)com> wrote:
> 2、 Adding a DISTINCT response time was very slow
>
> qis3_dp2=# SELECT COUNT(DISTINCT SVIN) AS CHECKCARNUM FROM
> QIS_CARPASSEDSTATION A WHERE 1=1 AND A.SSTATIONCD = 'VQ3_LYG' AND A.SLINENO
> IN ( '1F' , '2F' , '3F' ) AND A.DWORKDATE >=
> TO_DATE('2017-02-11','YYYY-MM-DD') AND A.DWORKDATE <=
> TO_DATE('2019-03-11','YYYY-MM-DD');
> checkcarnum
> -------------
> 1071367
> (1 row)

That's because of how DISTINCT is implemented within an aggregate
function in PostgreSQL. Internally within the aggregate code in the
executor, a sort is performed on the entire input to the aggregate
node. The planner is currently unable to make use of any indexes that
provide pre-sorted input.

One way to work around this would be to perform the DISTINCT and
COUNT(*) in separate stages using a subquery.

From your original query, something like:

SELECT COUNT(SVIN) AS CHECKCARNUM,SMTOC
FROM (
SELECT SMTOC,SVIN
FROM QIS_CARPASSEDSTATION A
WHERE 1=1 AND A.SSTATIONCD = 'VQ3_LYG'
AND A.SLINENO IN ( '1F' , '2F' , '3F' )
AND A.DWORKDATE >= TO_DATE('2017-02-11','YYYY-MM-DD')
AND A.DWORKDATE <= TO_DATE('2019-03-11','YYYY-MM-DD')
GROUP BY SMTOC,SVIN
) A GROUP BY SMTOC;

An index something like:
CREATE INDEX ON QIS_CARPASSEDSTATION (SMTOC, SVIN, SSTATIONCD, DWORKDATE);

Should help speed up the subquery and provide pre-sorted input to the
outer aggregate. If you like, you could add SLINENO to the end of the
index to allow an index-only scan which may result in further
performance improvements.

Without the index, you're forced to sort, but at least it's just one
sort instead of two.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message tank.zhang 2019-04-02 08:23:50 Re: Postgresql Sort cost Poor performance?
Previous Message tank.zhang 2019-04-02 07:00:09 Re: Postgresql Sort cost Poor performance?