From: | David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> |
---|---|
To: | "tank(dot)zhang" <6220104(at)qq(dot)com> |
Cc: | postgres performance list <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Postgresql Sort cost Poor performance? |
Date: | 2019-04-02 07:42:07 |
Message-ID: | CAKJS1f8aMw0Fbd7K1D4n08fRMUS_19D9pTNXd6u9Qi+eMj6wOQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On Tue, 2 Apr 2019 at 20:00, tank.zhang <6220104(at)qq(dot)com> wrote:
> 2、 Adding a DISTINCT response time was very slow
>
> qis3_dp2=# SELECT COUNT(DISTINCT SVIN) AS CHECKCARNUM FROM
> QIS_CARPASSEDSTATION A WHERE 1=1 AND A.SSTATIONCD = 'VQ3_LYG' AND A.SLINENO
> IN ( '1F' , '2F' , '3F' ) AND A.DWORKDATE >=
> TO_DATE('2017-02-11','YYYY-MM-DD') AND A.DWORKDATE <=
> TO_DATE('2019-03-11','YYYY-MM-DD');
> checkcarnum
> -------------
> 1071367
> (1 row)
That's because of how DISTINCT is implemented within an aggregate
function in PostgreSQL. Internally within the aggregate code in the
executor, a sort is performed on the entire input to the aggregate
node. The planner is currently unable to make use of any indexes that
provide pre-sorted input.
One way to work around this would be to perform the DISTINCT and
COUNT(*) in separate stages using a subquery.
From your original query, something like:
SELECT COUNT(SVIN) AS CHECKCARNUM,SMTOC
FROM (
SELECT SMTOC,SVIN
FROM QIS_CARPASSEDSTATION A
WHERE 1=1 AND A.SSTATIONCD = 'VQ3_LYG'
AND A.SLINENO IN ( '1F' , '2F' , '3F' )
AND A.DWORKDATE >= TO_DATE('2017-02-11','YYYY-MM-DD')
AND A.DWORKDATE <= TO_DATE('2019-03-11','YYYY-MM-DD')
GROUP BY SMTOC,SVIN
) A GROUP BY SMTOC;
An index something like:
CREATE INDEX ON QIS_CARPASSEDSTATION (SMTOC, SVIN, SSTATIONCD, DWORKDATE);
Should help speed up the subquery and provide pre-sorted input to the
outer aggregate. If you like, you could add SLINENO to the end of the
index to allow an index-only scan which may result in further
performance improvements.
Without the index, you're forced to sort, but at least it's just one
sort instead of two.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | tank.zhang | 2019-04-02 08:23:50 | Re: Postgresql Sort cost Poor performance? |
Previous Message | tank.zhang | 2019-04-02 07:00:09 | Re: Postgresql Sort cost Poor performance? |