Re: (VERY) Slow Query - PostgreSQL 9.2

From: Alban Hertroys <haramrae(at)gmail(dot)com>
To: drum(dot)lucas(at)gmail(dot)com
Cc: Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: (VERY) Slow Query - PostgreSQL 9.2
Date: 2016-05-03 11:32:32
Message-ID: BDDAF314-0B45-43F4-B7CC-2268A1574068@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


> On 03 May 2016, at 11:55, drum(dot)lucas(at)gmail(dot)com wrote:
>
> Hi all,
>
> I'm trying to get the query below a better performance.. but just don't know what else I can do...
>
> Please, have a look and let me know if you can help somehow.. also.. if you need some extra data jet ask me please.
>
> * Note that the gorfs.inode_segments table is 1.7TB size
>
> I have the following Query:
>
> explain analyze
>
> SELECT split_part(full_path, '/', 4)::INT AS account_id,
> split_part(full_path, '/', 6)::INT AS note_id,
> split_part(full_path, '/', 9)::TEXT AS variation,
> st_size,
> segment_index,
> reverse(split_part(reverse(full_path), '/', 1)) as file_name,
> i.st_ino,
> full_path,
> (i.st_size / 1000000::FLOAT)::NUMERIC(5,2) || 'MB' AS size_mb
> FROM gorfs.inodes i
> JOIN gorfs.inode_segments s
> ON i.st_ino = s.st_ino_target
> WHERE i.checksum_md5 IS NOT NULL
> AND s.full_path ~ '^/userfiles/account/[0-9]+/[a-z]+/[0-9]+'
> AND i.st_size > 0;

(Stripped the 1-and-a-half extra queries in there, but that incomplete one might be why you're waiting?)

> • Explain analyze link: http://explain.depesz.com/s/Oc6
> The query is taking ages, and I can't get the problem solved.
>
> These are the index I've already created on the inode_segments table:

> What else can I do to improve the Performance of the Query?

The first thing I notice in your query is that you're making use of hierarchically organised data without storing it hierarchically, namely that full_path field. The result of that is that both your table and your index contain a lot of redundant information.

Now I'm not so sure a hierarchical table + query are going to help get you much performance out of this (probably worth an experiment or two, mind that O/S's usually use inode trees for such things), but reducing the redundancy in the index would probably help:

create index gorfs.inodes_accounts_idx on gorfs.inodes (substring (full_path from 20)) where full_path like '/userfiles/account/%';

and then use similar expressions in your query of course:

where full_path like '/userfiles/account/%' and substring(full_path from 20) ~ '^[0-9]+/[a-z]+/[0-9]+';

Good luck!
Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Jan Keirse 2016-05-03 11:43:44 Vacuum full of parent without partitions possible?
Previous Message Durumdara 2016-05-03 11:28:34 Field size become unlimited in union...