Re: postgresql long running query

From: liam saffioti <liam(dot)saffiotti(at)gmail(dot)com>
To: "Dischner, Anton" <Anton(dot)Dischner(at)med(dot)uni-muenchen(dot)de>
Cc: pgsql-admin <pgsql-admin(at)lists(dot)postgresql(dot)org>, Julien Rouhaud <rjuju123(at)gmail(dot)com>
Subject: Re: postgresql long running query
Date: 2021-12-08 08:34:42
Message-ID: CAGHjuaaAKsrSWjyjexo8AzgWkpdo_mKtp72hr=Q9hy7oqOHjLw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Hi again,
I created additional index for just one column. (CREATE INDEX ON
"T_CMN_SLAHISTORY"
("DefinitionId");
The query now runs faster. But I don't think that's my problem.
My query takes a very long time sometimes, not always. I could not
understand the reason for this.
I mean, why is it taking longer at that time for example?

My another example is here:

SELECT nspname AS schema_name, relname AS table_name,
pg_total_relation_size(C.oid) AS table_size FROM pg_class C LEFT JOIN
pg_namespace N ON (N.oid = C.relnamespace) WHERE nspname NOT IN
('pg_catalog', 'information_schema') AND pg_total_relation_size(C.oid) >
100000000 AND C.relkind <> 'i' AND nspname !~ '^pg_toast' ORDER BY
pg_total_relation_size(C.oid) DESC LIMIT 30;

This query takes 800ms to run. But this query took 18 minutes last night. I
don't understand why the query that normally always takes a short time is
taking so long at this time. Also there are no locks.

The query logs are here:

2021-12-08 04:23:48.235 [local] surface postgres 508411 SELECT LOG:
*duration:
1124822.358 ms* statement: SELECT nspname as schema_name, relname AS
table_name, pg_total_relation_size (C .oid) AS table_size FROM
pg_class C LEFT JOIN pg_namespace N ON (N.oid = C .relnamespace) WHERE
nspname NOT IN ( 'pg_catalog', 'information_schema' ) and
pg_total_relation_size (C .oid)>100000000 AND C .relkind <> 'i' AND nspname
!~ '^pg_toast' ORDER BY pg_total_relation_size (C .oid) DESC LIMIT 30

2021-12-08 04:23:48.227 [local] surface postgres 524884 SELECT LOG: d*uration:
525254.222 ms* statement: SELECT nspname as schema_name, relname AS
table_name, pg_total_relation_size (C .oid) AS table_size FROM
pg_class C LEFT JOIN pg_namespace N ON (N.oid = C .relnamespace) WHERE
nspname NOT IN ( 'pg_catalog', 'information_schema' ) and
pg_total_relation_size (C .oid)>100000000 AND C .relkind <> 'i' AND nspname
!~ '^pg_toast' ORDER BY pg_total_relation_size (C .oid) DESC LIMIT 30

2021-12-08 04:23:48.235 [local] surface postgres 516992 SELECT LOG: * duration:
824718.012 ms* statement: SELECT nspname as schema_name, relname AS
table_name, pg_total_relation_size (C .oid) AS table_size FROM
pg_class C LEFT JOIN pg_namespace N ON (N.oid = C .relnamespace) WHERE
nspname NOT IN ( 'pg_catalog', 'information_schema' ) and
pg_total_relation_size (C .oid)>100000000 AND C .relkind <> 'i' AND nspname
!~ '^pg_toast' ORDER BY pg_total_relation_size (C .oid) DESC LIMIT 30

Finally, let me point out; the execution plan is here

EXPLAIN ANALYZE SELECT nspname AS schema_name, relname AS table_name,
pg_total_relation_size(C.oid) AS table_size FROM pg_class C LEFT JOIN
pg_namespace N ON (N.oid = C.relnamespace) WHERE nspname NOT IN
('pg_catalog', 'information_schema') AND pg_total_relation_size(C.oid) >
100000000 AND C.relkind <> 'i' AND nspname !~ '^pg_toast' ORDER BY
pg_total_relation_size(C.oid) DESC LIMIT 30;
QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=34231.14..34231.22 rows=30 width=136) (actual
time=837.358..837.367 rows=30 loops=1)
-> Sort (cost=34231.14..34233.24 rows=840 width=136) (actual
time=837.356..837.362 rows=30 loops=1)
Sort Key: (pg_total_relation_size((c.oid)::regclass)) DESC
Sort Method: top-N heapsort Memory: 38kB
-> Hash Join (cost=204.31..34206.33 rows=840 width=136) (actual
time=70.550..837.181 rows=161 loops=1)
Hash Cond: (c.relnamespace = n.oid)
-> Seq Scan on pg_class c (cost=0.00..33995.56 rows=1664
width=72) (actual time=0.246..801.685 rows=202 loops=1)
Filter: ((relkind <> 'i'::"char") AND
(pg_total_relation_size((oid)::regclass) > 100000000))
Rows Removed by Filter: 9022
-> Hash (cost=166.91..166.91 rows=2992 width=68) (actual
time=15.494..15.494 rows=3001 loops=1)
Buckets: 4096 Batches: 1 Memory Usage: 326kB
-> Seq Scan on pg_namespace n (cost=0.00..166.91
rows=2992 width=68) (actual time=0.026..14.616 rows=3001 loops=1)
Filter: ((nspname <> ALL
('{pg_catalog,information_schema}'::name[])) AND (nspname !~
'^pg_toast'::text))
Rows Removed by Filter: 2926
Planning Time: 2.627 ms
*Execution Time: 837.660 ms*
(16 rows)

Thank you so much for your help.

Dischner, Anton <Anton(dot)Dischner(at)med(dot)uni-muenchen(dot)de>, 6 Ara 2021 Pzt, 11:21
tarihinde şunu yazdı:

> Hi all,
>
>
>
> please report your findings trying to use a new index special for
> DefinitionId as suggested by Edward.
>
>
>
> We recently found for our data:
>
>
>
> By adding indexes the running times of queries were reduced from days to
> several seconds.
>
> Don’t be scared: you can always remove indexes by dropping them!
>
>
>
> best,
>
>
>
> Anton
>
>
>
> *Von:* liam saffioti <liam(dot)saffiotti(at)gmail(dot)com>
> *Gesendet:* Freitag, 3. Dezember 2021 14:12
> *An:* Dischner, Anton <Anton(dot)Dischner(at)med(dot)uni-muenchen(dot)de>
> *Cc:* pgsql-admin <pgsql-admin(at)lists(dot)postgresql(dot)org>; Julien Rouhaud <
> rjuju123(at)gmail(dot)com>
> *Betreff:* Re: postgresql long running query
>
>
>
> Hi Anton and Samed,
>
> Thank you for your reply.
>
> I use PostgreSQL 12.9 on RHEL 8.4.
>
> There are no DDL queries on the database in a whole day, so there is no
> lock state.
>
> The pgbadger slow queries report shows this query ran for 43min 11sec.
>
> And table and indexes definition is:
>
>
>
> \d+ "T_CMN_SLAHISTORY"
> Table
> "T_CMN_SLAHISTORY"
> Column | Type | Collation | Nullable |
> Default | Storage | Stats target | Description
>
> ------------------+-----------------------------+-----------+----------+----------------------------------+----------+--------------+-------------
> HistoryId | integer | | not null |
> generated by default as identity | plain | |
> DefinitionId | integer | | not null |
> | plain | |
> RuleName | character varying(200) | | not null |
> | extended | |
> CreateUserId | integer | | not null |
> | plain | |
> CreateUserTime | timestamp without time zone | | not null |
> | plain | |
> LastRunningTime | timestamp without time zone | | |
> | plain | |
> Cost | integer | | |
> | plain | |
> UpdateUserId | integer | | |
> | plain | |
> UpdateUserTime | timestamp without time zone | | |
> | plain | |
> RecId | character varying(36) | | |
> | extended | |
> MyTimeStamp | bytea | | |
> | extended | |
> IsOldRecordBatch | boolean | | |
> | plain | |
> InstanceId | bigint | | |
> | plain | |
> Indexes:
> "T_CMN_SLAHISTORY_HistoryId_idx" UNIQUE, btree ("HistoryId")
> "Index-20180207-152712" btree ("RuleName", "DefinitionId")
> "Index-20180712-192739" btree ("DefinitionId", "IsOldRecordBatch")
> "Index-20180712-192836" btree ("InstanceId")
> Access method: heap
>
>
>
>
>
> Schema | Name | Type | Owner | Size | Description
> ---------+------------------+-------+----------+--------+-------------
> public | T_CMN_SLAHISTORY | table | postgres | 244 MB |
>
>
>
> Dischner, Anton <Anton(dot)Dischner(at)med(dot)uni-muenchen(dot)de>, 3 Ara 2021 Cum,
> 14:39 tarihinde şunu yazdı:
>
> Hi Liam,
>
>
>
> top tuning tip for every relational database is: indexes!
>
>
>
> I have adminstrated databases where data was 300 GB and created indexes
> 600 GB
>
> You may think thats useless redundant but ist definitively not.
>
>
>
> Rules for indexing:
>
>
>
> Does it make sense to create an index for a table with one or several
> douzends of entries? -> YES!
>
> Does it make sense to create an index that is optimized for certain
> queries? -> YES!
>
> Does it make sense to create an index, make the query and drop the index?
> -> YES
>
>
>
> Even with tables with billions of entries you should get answer times of
> less than a second!
>
> … if you have a good index: date for example where the database engine
> plays ist full potental.
>
>
>
> For example see:
> https://www.enterprisedb.com/postgres-tutorials/overview-postgresql-indexes
>
>
>
> Which indexes do you have?
>
>
>
> best,
>
>
>
> A
>
>
>
> *Von:* liam saffioti <liam(dot)saffiotti(at)gmail(dot)com>
> *Gesendet:* Freitag, 3. Dezember 2021 12:24
> *An:* pgsql-admin <pgsql-admin(at)lists(dot)postgresql(dot)org>; Julien Rouhaud <
> rjuju123(at)gmail(dot)com>
> *Betreff:* postgresql long running query
>
>
>
> Hello Team,
>
>
>
> I have a problem with a query that consumes a long time.
>
> The query' execution plan is :
>
>
>
>
>
> EXPLAIN ANALYZE SELECT x."HistoryId", x."SlaDefinition" FROM
> "T_CMN_SLAHISTORY" AS x WHERE x."DefinitionId" = '302';
>
> QUERY PLAN
>
>
> -----------------------------------------------------------------------------------------------------------------------------------------------------------
> Bitmap Heap Scan on "T_CMN_SLAHISTORY" x (cost=2144.77..35086.42
> rows=139012 width=8) (actual time=58.806..116.874 rows=135498 loops=1)
> Recheck Cond: ("DefinitionId" = 302)
> Heap Blocks: exact=2175
> -> Bitmap Index Scan on "Index-20180712-192739" (cost=0.00..2110.02
> rows=139012 width=0) (actual time=57.043..57.045 rows=135498 loops=1)
> Index Cond: ("DefinitionId" = 302)
> Planning Time: 11.132 ms
> Execution Time: 120.320 ms
>
>
>
>
>
>
>
> But, the query execution time was 43min 11seconds in the morning in
> pgbadger report. I don't understand why the query is taking so long. Can
> you guide me?
>
>
>
> Thank you so much.
>
>
>
>

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Paul Smith 2021-12-08 09:18:22 Re: PostgreSQL(v9.6.5) not releasing old/inactive IDLE connections
Previous Message Ron 2021-12-08 07:23:32 Re: PostgreSQL(v9.6.5) not releasing old/inactive IDLE connections