Re: Terribly slow query with very good plan?

From: Vijaykumar Jain <vijaykumarjain(dot)github(at)gmail(dot)com>
To: Les <nagylzs(at)gmail(dot)com>
Cc: Pgsql Performance <pgsql-performance(at)lists(dot)postgresql(dot)org>
Subject: Re: Terribly slow query with very good plan?
Date: 2022-02-05 21:25:02
Message-ID: CAM+6J95rPmi4BeuHjaNzk26axtyUWyZXSG48JSQE=n5EHWu5Kg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Fri, 4 Feb 2022 at 14:41, Les <nagylzs(at)gmail(dot)com> wrote:

> Hello,
>
> The "replath" field contains the path of the file/folder. For example:
> "/folder1/folder2/folder3/filename4.ext5". The replath field is managed by
> triggers. There are about 1M rows for files and 600K folder rows in the
> database. The files are well distributed between folders, and there are
> only 45 root folders ( parent_id is null)
>
> Replying in a separate thread, just in case this does not help.
It seems you already store relpath but as text via triggers, will the
'ltree' extension be of any help to get your results faster (to help
implement path enumeration, but has a limitation of 65K objects)
https://www.postgresql.org/docs/current/ltree.html
https://github.com/postgres/postgres/blob/master/contrib/ltree/sql/ltree.sql
(test cases for elaborate examples)
https://patshaughnessy.net/2017/12/14/manipulating-trees-using-sql-and-the-postgres-ltree-extension

also, another pattern i came across was via closure tables
https://www.slideshare.net/billkarwin/models-for-hierarchical-data
https://stackoverflow.com/questions/19834400/what-is-the-simplest-way-to-save-a-file-tree-in-a-postgres-database/19835575

ex. (from the doc)
postgres=# drop table test;
DROP TABLE
postgres=# CREATE TABLE test (path ltree);
INSERT INTO test VALUES ('Top');
INSERT INTO test VALUES ('Top.Science');
INSERT INTO test VALUES ('Top.Science.Astronomy');
INSERT INTO test VALUES ('Top.Science.Astronomy.Astrophysics');
INSERT INTO test VALUES ('Top.Science.Astronomy.Cosmology');
INSERT INTO test VALUES ('Top.Hobbies');
INSERT INTO test VALUES ('Top.Hobbies.Amateurs_Astronomy');
INSERT INTO test VALUES ('Top.Collections');
INSERT INTO test VALUES ('Top.Collections.Pictures');
INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy');
INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Stars');
INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Galaxies');
INSERT INTO test VALUES ('Top.Collections.Pictures.Astronomy.Astronauts');
CREATE INDEX path_gist_idx ON test USING GIST (path);
CREATE INDEX path_idx ON test USING BTREE (path); -- we can even make this
unique index as there would only be one path
-- we can also create partial indexes depending on the query pattern

#my focus is no rows filtered (hence less wasted operations)
postgres=# analyze test;
ANALYZE
postgres=# explain analyze select exists (SELECT 1 FROM test WHERE path ~
'*.Stars');
QUERY PLAN
------------------------------------------------------------------------------------------------------
Result (cost=1.16..1.17 rows=1 width=1) (actual time=0.010..0.010 rows=1
loops=1)
InitPlan 1 (returns $0)
-> Seq Scan on test (cost=0.00..1.16 rows=1 width=0) (actual
time=0.008..0.009 rows=1 loops=1)
Filter: (path ~ '*.Stars'::lquery)
Rows Removed by Filter: 10
Planning Time: 0.248 ms
Execution Time: 0.023 ms
(7 rows)

postgres=# set enable_seqscan TO 0; -- small table, hence
SET
postgres=# explain analyze select exists (SELECT 1 FROM test WHERE path ~
'*.Stars');
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Result (cost=8.15..8.16 rows=1 width=1) (actual time=0.020..0.021 rows=1
loops=1)
InitPlan 1 (returns $0)
-> Index Scan using path_gist_idx on test (cost=0.13..8.15 rows=1
width=0) (actual time=0.019..0.019 rows=1 loops=1)
Index Cond: (path ~ '*.Stars'::lquery)
Planning Time: 0.079 ms
Execution Time: 0.037 ms
(6 rows)

Please ignore, if not relevant to the discussion.

--
Thanks,
Vijay
LinkedIn - Vijaykumar Jain <https://www.linkedin.com/in/vijaykumarjain/>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Valli Annamalai 2022-02-07 05:15:12 Query choosing Bad Index Path
Previous Message Les 2022-02-04 14:43:55 Re: Terribly slow query with very good plan?