Quick Links

Understanding query planner cpu usage

From:	Lucas Fairchild-Madar <lucas(dot)madar(at)gmail(dot)com>
To:	pgsql-general(at)postgresql(dot)org
Subject:	Understanding query planner cpu usage
Date:	2018-02-21 22:56:43
Message-ID:	CAJmoq7M4iFo3=W891x5H+N5+axSOh5biWesAmrHsjVMvaBuSYA@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

I'm having an perplexing issue in PG 10.1 wherein deleting a large amount
of rows from a table causes query planning time to spike dramatically for a
while. This happens with or without autovacuums so vacuuming isn't the
issue.

CPU usage during this time spikes as well. I can't determine if the query
planning issue is caused by CPU starvation or if the CPU starvation is
caused by the query planning. Either way, query planning spikes from 15ms
to nearly 7 seconds, and the load average on the machine spikes from around
1.0 to 80 or 90. Despite query planning time increasing by several orders
of magnitude, query execution time remains very similar.

The schema and query are similar to what's below, with each table
containing about 60 million rows. The load spike happens after deleting
approximately 1% of the table data from each of the tables in a
transaction. It happens immediately after the transaction commits, not
during the transaction itself. The transaction takes about a minute to
execute.

It's possible there are other longer lived transactions that may be working
on data in these tables, and I think there's probably a clue to the cause
here.

Does anyone have any clues or pointers for what I can look for? Is there a
way to interrogate the query planner and see what it's doing? (explain
analyze only gives the final chosen plan and how long query planning took
to execute)

CREATE TABLE data (
data_id bigint NOT NULL PRIMARY KEY
);

CREATE TABLE segment (
data_id bigint NOT NULL,
segment_id bigint NOT NULL PRIMARY KEY
);

CREATE TABLE raw (
segment_id bigint NOT NULL,
raw_id bigint NOT NULL PRIMARY KEY
);

CREATE TABLE processed (
segment_id bigint NOT NULL,
raw_id bigint NOT NULL,
processed_id bigint NOT NULL PRIMARY KEY,
magical_id integer
);

All primary keys are auto-incrementing sequences.

data -> segment is 1:many (but typically 1:1)
segment -> raw is 1:many (but almost always 1:1)
raw -> processed is 1:1

select * from processed
left join raw using (raw_id, segment_id)
left join segment using (segment_id)
left join data using (data_id)
where processed.magical_id = [magical_id]
order by processed_id desc;

Responses

Re: Understanding query planner cpu usage at 2018-02-22 00:14:36 from Tom Lane

Browse pgsql-general by date

	From	Date	Subject
Next Message	Rich Shepard	2018-02-21 23:17:12	Re: initdb when data/ folder has mount points
Previous Message	Ron Johnson	2018-02-21 22:38:53	initdb when data/ folder has mount points