Need help with slow query - postgres 12 on linux RHEL 8

From: "Wilson, Maria Louise (LARC-E301)[RSES]" <m(dot)l(dot)wilson(at)nasa(dot)gov>
To: "pgsql-admin(at)lists(dot)postgresql(dot)org" <pgsql-admin(at)lists(dot)postgresql(dot)org>
Subject: Need help with slow query - postgres 12 on linux RHEL 8
Date: 2023-12-27 21:40:41
Message-ID: 902F274D-3E26-4D7C-BF7F-07536C3857E6@nasa.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Hello folks!

I am having a complex query slowing over time increasing in duration. If anyone has a few cycles that they could lend a hand or just point me in the right direction with this – I would surely appreciate it! Fairly beefy Linux server with Postgres 12 (latest) – this particular query has been getting slower over time & seemingly slowing everything else down. The server is dedicated entirely to this particular database. Let me know if I can provide any additional information!! Thanks in advance!

Here’s my background – Linux RHEL 8 – PostgreSQL 12.17. –

MemTotal: 263216840 kB

MemFree: 3728224 kB

MemAvailable: 197186864 kB

Buffers: 6704 kB

Cached: 204995024 kB

SwapCached: 19244 kB

free -m

total used free shared buff/cache available

Mem: 257047 51860 3722 10718 201464 192644

Swap: 4095 855 3240

Here are a few of the settings in our postgres server:

max_connections = 300 # (change requires restart)

shared_buffers = 10GB

temp_buffers = 24MB

work_mem = 2GB

maintenance_work_mem = 1GB

most everything else is set to the default.

The query is complex with several joins:

SELECT anon_1.granule_collection_id AS anon_1_granule_collection_id, anon_1.granule_create_date AS anon_1_granule_create_date, anon_1.granule_delete_date AS anon_1_granule_delete_date, ST_AsGeoJSON(anon_1.granule_geography) AS anon_1_granule_geography, ST_AsGeoJSON(anon_1.granule_geometry) AS anon_1_granule_geometry, anon_1.granule_is_active AS anon_1_granule_is_active, anon_1.granule_properties AS anon_1_granule_properties, anon_1.granule_update_date AS anon_1_granule_update_date, anon_1.granule_uuid AS anon_1_granule_uuid, anon_1.granule_visibility_last_update_date AS anon_1_granule_visibility_last_update_date, anon_1.granule_visibility_id AS anon_1_granule_visibility_id, collection_1.id AS collection_1_id, collection_1.entry_id AS collection_1_entry_id, collection_1.short_name AS collection_1_short_name, collection_1.version AS collection_1_version, file_1.id AS file_1_id, file_1.location AS file_1_location, file_1.md5 AS file_1_md5, file_1.name AS file_1_name, file_1.size AS file_1_size, file_1.type AS file_1_type, visibility_1.id AS visibility_1_id, visibility_1.name AS visibility_1_name, visibility_1.value AS visibility_1_value

FROM (SELECT granule.collection_id AS granule_collection_id, granule.create_date AS granule_create_date, granule.delete_date AS granule_delete_date, granule.geography AS granule_geography, granule.geometry AS granule_geometry, granule.is_active AS granule_is_active, granule.properties AS granule_properties, granule.update_date AS granule_update_date, granule.uuid AS granule_uuid, granule.visibility_last_update_date AS granule_visibility_last_update_date, granule.visibility_id AS granule_visibility_id

FROM granule JOIN collection ON collection.id = granule.collection_id

WHERE granule.is_active = true AND (collection.entry_id LIKE 'AJAX_CO2_CH4_1' OR collection.entry_id LIKE 'AJAX_O3_1' OR collection.entry_id LIKE 'AJAX_CH2O_1' OR collection.entry_id LIKE 'AJAX_MMS_1') AND ((granule.properties #>> '{temporal_extent, range_date_times, 0, beginning_date_time}') > '2015-10-06T23:59:59+00:00' OR (granule.properties #>> '{temporal_extent, single_date_times, 0}') > '2015-10-06T23:59:59+00:00' OR (granule.properties #>> '{temporal_extent, periodic_date_times, 0, start_date}') > '2015-10-06T23:59:59+00:00') AND ((granule.properties #>> '{temporal_extent, range_date_times, 0, end_date_time}') < '2015-10-09T00:00:00+00:00' OR (granule.properties #>> '{temporal_extent, single_date_times, 0}') < '2015-10-09T00:00:00+00:00' OR (granule.properties #>> '{temporal_extent, periodic_date_times, 0, end_date}') < '2015-10-09T00:00:00+00:00') ORDER BY granule.uuid

LIMIT 26) AS anon_1 LEFT OUTER JOIN collection AS collection_1 ON collection_1.id = anon_1.granule_collection_id LEFT OUTER JOIN (granule_file AS granule_file_1 JOIN file AS file_1 ON file_1.id = granule_file_1.file_id) ON anon_1.granule_uuid = granule_file_1.granule_uuid LEFT OUTER JOIN visibility AS visibility_1 ON visibility_1.id = anon_1.granule_visibility_id ORDER BY anon_1.granule_uuid

Here’s the explain:

Sort (cost=10914809.92..10914810.27 rows=141 width=996)

Sort Key: granule.uuid

-> Hash Left Join (cost=740539.73..10914804.89 rows=141 width=996)

Hash Cond: (granule.visibility_id = visibility_1.id)

-> Hash Right Join (cost=740537.56..10914731.81 rows=141 width=1725)

Hash Cond: (granule_file_1.granule_uuid = granule.uuid)

-> Hash Join (cost=644236.90..10734681.93 rows=22332751 width=223)

Hash Cond: (file_1.id = granule_file_1.file_id)

-> Seq Scan on file file_1 (cost=0.00..9205050.88 rows=22068888 width=207)

-> Hash (cost=365077.51..365077.51 rows=22332751 width=20)

-> Seq Scan on granule_file granule_file_1 (cost=0.00..365077.51 rows=22332751 width=20)

-> Hash (cost=96300.33..96300.33 rows=26 width=1518)

-> Nested Loop Left Join (cost=96092.55..96300.33 rows=26 width=1518)

-> Limit (cost=96092.27..96092.33 rows=26 width=1462)

-> Sort (cost=96092.27..96100.47 rows=3282 width=1462)

Sort Key: granule.uuid

-> Nested Loop (cost=0.56..95998.73 rows=3282 width=1462)

-> Seq Scan on collection (cost=0.00..3366.24 rows=1 width=4)

Filter: (((entry_id)::text ~~ 'AJAX_CO2_CH4_1'::text) OR ((entry_id)::text ~~ 'AJAX_O3_1'::text) OR ((entry_id)::text ~~ 'AJAX_CH2O_1'::text) OR ((entry_id)::text ~~ 'AJAX_MMS_1'::text))

-> Index Scan using ix_granule_collection_id on granule (cost=0.56..92445.36 rows=18713 width=1462)

Index Cond: (collection_id = collection.id)

Filter: (is_active AND (((properties #>> '{temporal_extent,range_date_times,0,beginning_date_time}'::text[]) > '2015-10-06T23:59:59+00:00'::text) OR ((properties #>> '{temporal_extent,single_d

ate_times,0}'::text[]) > '2015-10-06T23:59:59+00:00'::text) OR ((properties #>> '{temporal_extent,periodic_date_times,0,start_date}'::text[]) > '2015-10-06T23:59:59+00:00'::text)) AND (((properties #>> '{temporal_extent,range_date_times,0,end_

date_time}'::text[]) < '2015-10-09T00:00:00+00:00'::text) OR ((properties #>> '{temporal_extent,single_date_times,0}'::text[]) < '2015-10-09T00:00:00+00:00'::text) OR ((properties #>> '{temporal_extent,periodic_date_times,0,end_date}'::text[])

< '2015-10-09T00:00:00+00:00'::text)))

-> Index Scan using collection_pkey on collection collection_1 (cost=0.28..7.99 rows=1 width=56)

Index Cond: (id = granule.collection_id)

-> Hash (cost=1.52..1.52 rows=52 width=16)

-> Seq Scan on visibility visibility_1 (cost=0.00..1.52 rows=52 width=16)

Heres a bit about the tables –

Granule
Collection
Granule_file
Visibility

Granule:

public | granule | table | ims_api_writer | 36 GB |

ims_api=# \d+ granule

Table "public.granule"

Column | Type | Collation | Nullable | Default | Storage | Stats target | Description

-----------------------------+-----------------------------+-----------+----------+---------+----------+--------------+-------------

collection_id | integer | | not null | | plain | |

create_date | timestamp without time zone | | not null | | plain | |

delete_date | timestamp without time zone | | | | plain | |

geometry | geometry(Geometry,4326) | | | | main | |

is_active | boolean | | | | plain | |

properties | jsonb | | | | extended | |

update_date | timestamp without time zone | | not null | | plain | |

uuid | uuid | | not null | | plain | |

visibility_id | integer | | not null | | plain | |

geography | geography(Geometry,4326) | | | | main | |

visibility_last_update_date | timestamp without time zone | | | | plain | |

Indexes:

"granule_pkey" PRIMARY KEY, btree (uuid)

"granule_is_active_idx" btree (is_active)

"granule_properties_producer_id_idx" btree ((properties ->> 'producer_granule_id'::text))

"granule_update_date_idx" btree (update_date)

"idx_granule_geometry" gist (geometry)

"ix_granule_collection_id" btree (collection_id)

Foreign-key constraints:

"granule_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)

"granule_visibility_id_fkey" FOREIGN KEY (visibility_id) REFERENCES visibility(id)

Referenced by:

TABLE "granule_file" CONSTRAINT "granule_file_granule_uuid_fkey" FOREIGN KEY (granule_uuid) REFERENCES granule(uuid)

TABLE "granule_temporal_range" CONSTRAINT "granule_temporal_range_granule_uuid_fkey" FOREIGN KEY (granule_uuid) REFERENCES granule(uuid)

Triggers:

granule_temporal_range_trigger AFTER INSERT OR DELETE OR UPDATE ON granule FOR EACH ROW EXECUTE FUNCTION sync_granule_temporal_range()

Access method: heap

Collection:

public | collection | table | ims_api_writer | 39 MB |

ims_api=# \d collection

Table "public.collection"

Column | Type | Collation | Nullable | Default

------------------------------+-----------------------------+-----------+----------+----------------------------------------

id | integer | | not null | nextval('collection_id_seq'::regclass)

access_constraints | text | | |

additional_attributes | jsonb | | |

ancillary_keywords | character varying(160)[] | | |

create_date | timestamp without time zone | | not null |

dataset_language | character varying(80)[] | | |

dataset_progress | text | | |

data_resolutions | jsonb | | |

dataset_citation | jsonb | | |

delete_date | timestamp without time zone | | |

distribution | jsonb | | |

doi | character varying(220) | | |

entry_id | character varying(80) | | not null |

entry_title | character varying(1030) | | |

geometry | geometry(Geometry,4326) | | |

is_active | boolean | | not null |

iso_topic_categories | character varying[] | | |

last_update_date | timestamp without time zone | | not null |

locations | jsonb | | |

long_name | character varying(1024) | | |

metadata_associations | jsonb | | |

metadata_dates | jsonb | | |

personnel | jsonb | | |

platforms | jsonb | | |

processing_level_id | integer | | |

product_flag | text | | |

project_id | integer | | |

properties | jsonb | | |

quality | jsonb | | |

references | character varying(12000)[] | | |

related_urls | jsonb | | |

summary | jsonb | | |

short_name | character varying(80) | | |

temporal_extents | jsonb | | |

version | character varying(80) | | |

use_constraints | jsonb | | |

version_description | text | | |

visibility_id | integer | | not null |

world_date | timestamp without time zone | | |

tiling_identification_system | jsonb | | |

collection_data_type | text | | |

standard_product | boolean | | not null | false

Indexes:

"collection_pkey" PRIMARY KEY, btree (id)

"collection_entry_id_key" UNIQUE CONSTRAINT, btree (entry_id)

"idx_collection_geometry" gist (geometry)

Foreign-key constraints:

"collection_processing_level_id_fkey" FOREIGN KEY (processing_level_id) REFERENCES processing_level(id)

"collection_project_id_fkey" FOREIGN KEY (project_id) REFERENCES project(id)

"collection_visibility_id_fkey" FOREIGN KEY (visibility_id) REFERENCES visibility(id)

Referenced by:

TABLE "collection_organization" CONSTRAINT "collection_organization_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)

TABLE "collection_science_keyword" CONSTRAINT "collection_science_keyword_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)

TABLE "collection_spatial_processing_hint" CONSTRAINT "collection_spatial_processing_hint_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)

TABLE "granule" CONSTRAINT "granule_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)

TABLE "granule_temporal_range" CONSTRAINT "granule_temporal_range_collection_id_fkey" FOREIGN KEY (collection_id) REFERENCES collection(id)

Granule_file:

public | granule_file | table | ims_api_writer | 1108 MB |

\d granule_file

Table "public.granule_file"

Column | Type | Collation | Nullable | Default

--------------+---------+-----------+----------+---------

granule_uuid | uuid | | |

file_id | integer | | |

Foreign-key constraints:

"granule_file_file_id_fkey" FOREIGN KEY (file_id) REFERENCES file(id)

"granule_file_granule_uuid_fkey" FOREIGN KEY (granule_uuid) REFERENCES granule(uuid)

Visibility:

public | visibility | table | ims_api_writer | 40 kB |

\d visibility

Table "public.visibility"

Column | Type | Collation | Nullable | Default

--------+-----------------------+-----------+----------+----------------------------------------

id | integer | | not null | nextval('visibility_id_seq'::regclass)

name | character varying(80) | | not null |

value | integer | | not null |

Indexes:

"visibility_pkey" PRIMARY KEY, btree (id)

"visibility_name_key" UNIQUE CONSTRAINT, btree (name)

"visibility_value_key" UNIQUE CONSTRAINT, btree (value)

Referenced by:

TABLE "collection" CONSTRAINT "collection_visibility_id_fkey" FOREIGN KEY (visibility_id) REFERENCES visibility(id)

TABLE "granule" CONSTRAINT "granule_visibility_id_fkey" FOREIGN KEY (visibility_id) REFERENCES visibility(id)

Thanks for the help!

Maria Wilson

Nasa/Langley Research Center

Hampton, Virginia USA

m(dot)l(dot)wilson(at)nasa(dot)gov<mailto:m(dot)l(dot)wilson(at)nasa(dot)gov>

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Rajesh Kumar 2023-12-28 05:38:54 Re: Switchover
Previous Message Tom Lane 2023-12-27 21:38:21 Re: how to troubleshoot a wrong query plan