From: Frank Millman <frank(at)chagford(dot)com>
To: Olivier Gautherot <ogautherot(at)gautherot(dot)net>
Cc: Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Slow SELECT
Date: 2020-05-26 08:38:51
Message-ID: 8d91512c-5bdd-893e-af97-9336357e39fd@chagford.com
Lists: pgsql-general
On 2020-05-26 9:32 AM, Olivier Gautherot wrote:
> Hi Frank,
>
> On Tue, May 26, 2020 at 9:23 AM Frank Millman <frank(at)chagford(dot)com>
> wrote:
>
>     Hi all
>
>     I have a SELECT that runs over 5 times slower on PostgreSQL compared
>     with SQL Server and sqlite3. I am trying to understand why.
>
>     I have a table that looks like this (simplified) -
>
>     CREATE TABLE my_table (
>         row_id SERIAL PRIMARY KEY,
>         deleted_id INT DEFAULT 0,
>         fld_1 INT REFERENCES table_1(row_id),
>         fld_2 INT REFERENCES table_2(row_id),
>         fld_3 INT REFERENCES table_3(row_id),
>         fld_4 INT REFERENCES table_4(row_id),
>         tran_date DATE,
>         tran_total DEC(21,2)
>     );
>
>     CREATE UNIQUE INDEX my_table_ndx ON my_table
>         (fld_1, fld_2, fld_3, fld_4, tran_date) WHERE deleted_id = 0;
>
>     The table sizes are -
>         my_table : 167 rows
>         table_1 : 21 rows
>         table_2 : 11 rows
>         table_3 : 3 rows
>         table_4 : 16 rows
>
>     Therefore there are potentially 21 x 11 x 3 x 16 = 11088 combinations
>     of the four lookup tables. For most of them there is no matching row
>     in my_table, so most of the selected values will be null.
>
>     I want to select the row_id for the last tran_date for each of those
>     potential groups. This is my select -
>
>     SELECT (
>         SELECT a.row_id FROM my_table a
>         WHERE a.fld_1 = b.row_id
>         AND a.fld_2 = c.row_id
>         AND a.fld_3 = d.row_id
>         AND a.fld_4 = e.row_id
>         AND a.deleted_id = 0
>         ORDER BY a.tran_date DESC LIMIT 1
>         )
>     FROM table_1 b, table_2 c, table_3 d, table_4 e
>
>     Out of 11088 rows selected, 103 are not null.
>
>     On identical data, this takes 0.06 sec on SQL Server, 0.04 sec on
>     sqlite3, and 0.31 sec on PostgreSQL.
>
>
> SQL Server does a good job at caching data in memory. PostgreSQL does
> too on consecutive calls to the same table. What execution time do you
> get if you issue the query a second time?
>
> My first guess would be to add an index on my_table.tran_date and check
> in EXPLAIN that you don't have a SEQUENTIAL SCAN on that table.
>
>     I have looked at the EXPLAIN, but I don't really know what to look
>     for. I can supply it if that would help.
>
>     Thanks for any advice.
>
Thanks Olivier. Unfortunately that did not help.

I was already running the query twice and only timing the second one.

I added the index on tran_date. The timing is the same, and EXPLAIN
shows that it is using a SEQUENTIAL SCAN.
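
For reference, the index on tran_date was created roughly like this (the
index name is my own choice, purely illustrative) -

    CREATE INDEX my_table_tran_date_ndx ON my_table (tran_date);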
Here is the EXPLAIN -

Nested Loop  (cost=0.00..64155.70 rows=11088 width=4)
  ->  Nested Loop  (cost=0.00..10.36 rows=528 width=12)
        ->  Nested Loop  (cost=0.00..2.56 rows=33 width=8)
              ->  Seq Scan on table_2 c  (cost=0.00..1.11 rows=11 width=4)
              ->  Materialize  (cost=0.00..1.04 rows=3 width=4)
                    ->  Seq Scan on table_3 d  (cost=0.00..1.03 rows=3 width=4)
        ->  Materialize  (cost=0.00..1.24 rows=16 width=4)
              ->  Seq Scan on table_4 e  (cost=0.00..1.16 rows=16 width=4)
  ->  Materialize  (cost=0.00..1.31 rows=21 width=4)
        ->  Seq Scan on table_1 b  (cost=0.00..1.21 rows=21 width=4)
  SubPlan 1
    ->  Limit  (cost=5.77..5.77 rows=1 width=8)
          ->  Sort  (cost=5.77..5.77 rows=1 width=8)
                Sort Key: a.tran_date DESC
                ->  Seq Scan on my_table a  (cost=0.00..5.76 rows=1 width=8)
                      Filter: ((fld_1 = b.row_id) AND (fld_2 = c.row_id) AND (fld_3 = d.row_id) AND (fld_4 = e.row_id) AND (deleted_id = 0))
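
If actual timings and row counts would help, I can also post the plan with
the ANALYZE option - i.e. run the same statement as, roughly -

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT (
        SELECT a.row_id FROM my_table a
        WHERE a.fld_1 = b.row_id
        AND a.fld_2 = c.row_id
        AND a.fld_3 = d.row_id
        AND a.fld_4 = e.row_id
        AND a.deleted_id = 0
        ORDER BY a.tran_date DESC LIMIT 1
        )
    FROM table_1 b, table_2 c, table_3 d, table_4 e;
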
Frank