Query takes way longer with LIMIT, and EXPLAIN takes way longer than actual query

From: Ken Tanzer <ken(dot)tanzer(at)gmail(dot)com>
To: pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Query takes way longer with LIMIT, and EXPLAIN takes way longer than actual query
Date: 2020-08-14 21:34:52
Message-ID: CAD3a31UJVhu-Ojsikjz12PEdvbz_ha_4U-BzPn5smKUyrECO6A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi. I've got a query that runs fine (~50ms). When I add a "LIMIT 25" to
it though, it takes way longer. The query itself then takes about 4.5
seconds. And when I do an explain, it takes 90+ seconds for the same query!

Explains and detailed table/view info below. tbl_log has 1.2M records,
tbl_reference has 550k. This is 9.6.19 on CentOS 6 with PDGG packages.

I know the query itself could be re-written, but it's coming from an ORM,
so I'm really focused on why the adding a limit is causing such performance
degradation, and what to do about it. Any help or insight would be
appreciated. Also the discrepancy between the actual query and the
explain. Thanks!

Ken

*The good query (no LIMIT):*

agency=> EXPLAIN (ANALYZE,VERBOSE,BUFFERS,TIMING) SELECT * FROM Log WHERE
log_id IN (SELECT CASE WHEN to_table='client' THEN from_id WHEN
from_table='client' THEN to_id END FROM reference WHERE ((from_id_field =
E'client_id'
AND from_id = E'34918'
AND from_table = E'client'
AND to_table = E'log'
)
OR (to_id_field = E'client_id'
AND to_id = E'34918'
AND to_table = E'client'
AND from_table = E'log'
))) ORDER BY added_at DESC;

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------
Sort (cost=167065.81..168594.77 rows=611586 width=336) (actual
time=43.942..46.566 rows=1432 loops=1)
Output: tbl_log.log_id, tbl_log.log_type_code, tbl_log.subject,
tbl_log.log_text, tbl_log.occurred_at, tbl_log.shift_report,
tbl_log.written_by, tbl_log.added_by, tbl_log.added_at, tbl_log.changed_by,
tbl_log.changed_at, tbl_log.sys_log
Sort Key: tbl_log.added_at DESC
Sort Method: quicksort Memory: 999kB
Buffers: shared hit=7026
-> Nested Loop (cost=4313.36..14216.18 rows=611586 width=336) (actual
time=10.837..38.177 rows=1432 loops=1)
Output: tbl_log.log_id, tbl_log.log_type_code, tbl_log.subject,
tbl_log.log_text, tbl_log.occurred_at, tbl_log.shift_report,
tbl_log.written_by, tbl_log.added_by, tbl_log.added_at, tbl_log.changed_by,
tbl_log.changed_at, tbl_log.sys_log
Buffers: shared hit=7026
-> HashAggregate (cost=4312.93..4325.68 rows=1275 width=136)
(actual time=10.802..13.800 rows=1433 loops=1)
Output: tbl_reference.to_table, tbl_reference.from_id,
tbl_reference.from_table, tbl_reference.to_id
Group Key: CASE WHEN (tbl_reference.to_table =
'client'::name) THEN tbl_reference.from_id WHEN (tbl_reference.from_table =
'client'::name) THEN tbl_reference.to_id ELSE NULL::integer END
Buffers: shared hit=1288
-> Bitmap Heap Scan on public.tbl_reference
(cost=46.69..4309.74 rows=1276 width=136) (actual time=0.747..6.822
rows=1433 loops=1)
Output: tbl_reference.to_table, tbl_reference.from_id,
tbl_reference.from_table, tbl_reference.to_id, CASE WHEN
(tbl_reference.to_table = 'client'::name) THEN tbl_reference.from_id WHEN
(tbl_reference.from_table = 'client'::name) THEN tbl_reference.to_id ELSE
NULL::integer END
Recheck Cond: ((tbl_reference.from_id_field =
'client_id'::name) OR ((tbl_reference.to_table = 'client'::name) AND
(tbl_reference.to_id = 34918)))
Filter: ((NOT tbl_reference.is_deleted) AND
(((tbl_reference.from_id_field = 'client_id'::name) AND
(tbl_reference.from_id = 34918) AND (tbl_reference.from_table =
'client'::name) AND (tbl_reference.to_table = 'log'::name)) OR
((tbl_reference.to_id_field = 'client_id'::name) AND (tbl_reference.to_id =
34918) AND (tbl_reference.to_table = 'client'::na
me) AND (tbl_reference.from_table = 'log'::name))))
Rows Removed by Filter: 15
Heap Blocks: exact=1275
Buffers: shared hit=1288
-> BitmapOr (cost=46.69..46.69 rows=1319 width=0)
(actual time=0.453..0.454 rows=0 loops=1)
Buffers: shared hit=13
-> Bitmap Index Scan on
index_tbl_reference_from_id_field (cost=0.00..4.43 rows=1 width=0) (actual
time=0.025..0.026 rows=0 loops=1)
Index Cond: (tbl_reference.from_id_field =
'client_id'::name)
Buffers: shared hit=3
-> Bitmap Index Scan on
index_tbl_reference_to_table_id (cost=0.00..41.61 rows=1319 width=0)
(actual time=0.421..0.423 rows=1448 loops=1)
Index Cond: ((tbl_reference.to_table =
'client'::name) AND (tbl_reference.to_id = 34918))
Buffers: shared hit=10
-> Index Scan using tbl_log_pkey on public.tbl_log
(cost=0.43..7.75 rows=1 width=336) (actual time=0.007..0.009 rows=1
loops=1433)
Output: tbl_log.log_id, tbl_log.log_type_code,
tbl_log.subject, tbl_log.log_text, tbl_log.occurred_at, tbl_log.written_by,
tbl_log.added_by, tbl_log.added_at, tbl_log.changed_by, tbl_log.changed_at,
tbl_log.sys_log, tbl_log.shift_report
Index Cond: (tbl_log.log_id = CASE WHEN
(tbl_reference.to_table = 'client'::name) THEN tbl_reference.from_id WHEN
(tbl_reference.from_table = 'client'::name) THEN tbl_reference.to_id ELSE
NULL::integer END)
Buffers: shared hit=5738
Planning time: 0.866 ms
Execution time: 48.915 ms
(33 rows)

*The bad query (LIMIT):*

agency=> EXPLAIN (ANALYZE,VERBOSE,BUFFERS,TIMING) SELECT * FROM Log WHERE
log_id IN (SELECT CASE WHEN to_table='client' THEN from_id WHEN
from_table='client' THEN to_id END FROM reference WHERE ((from_id_field =
E'client_id'
AND from_id = E'34918'
AND from_table = E'client'
AND to_table = E'log'
)
OR (to_id_field = E'client_id'
AND to_id = E'34918'
AND to_table = E'client'
AND from_table = E'log'
))) ORDER BY added_at DESC LIMIT 25;

QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------
Limit (cost=47.11..1329.32 rows=25 width=336) (actual
time=47.103..97236.235 rows=25 loops=1)
Output: tbl_log.log_id, tbl_log.log_type_code, tbl_log.subject,
tbl_log.log_text, tbl_log.occurred_at, tbl_log.shift_report,
tbl_log.written_by, tbl_log.added_by, tbl_log.added_at, tbl_log.changed_by,
tbl_log.changed_at, tbl_log.sys_log
Buffers: shared hit=3820
-> Nested Loop Semi Join (cost=47.11..31367302.81 rows=611586
width=336) (actual time=47.098..97236.123 rows=25 loops=1)
Output: tbl_log.log_id, tbl_log.log_type_code, tbl_log.subject,
tbl_log.log_text, tbl_log.occurred_at, tbl_log.shift_report,
tbl_log.written_by, tbl_log.added_by, tbl_log.added_at, tbl_log.changed_by,
tbl_log.changed_at, tbl_log.sys_log
Join Filter: (tbl_log.log_id = CASE WHEN (tbl_reference.to_table =
'client'::name) THEN tbl_reference.from_id WHEN (tbl_reference.from_table =
'client'::name) THEN tbl_reference.to_id ELSE NULL::integer END)
Rows Removed by Join Filter: 28364477
Buffers: shared hit=3820
-> Index Scan Backward using tbl_log_added_at on public.tbl_log
(cost=0.43..147665.96 rows=1223171 width=336) (actual time=0.016..123.097
rows=19794 loops=1)
Output: tbl_log.log_id, tbl_log.log_type_code,
tbl_log.subject, tbl_log.log_text, tbl_log.occurred_at, tbl_log.written_by,
tbl_log.added_by, tbl_log.added_at, tbl_log.changed_by, tbl_log.changed_at,
tbl_log.sys_log, tbl_log.shift_report
Buffers: shared hit=2532
-> Materialize (cost=46.69..4316.12 rows=1276 width=136) (actual
time=0.002..2.351 rows=1433 loops=19794)
Output: tbl_reference.to_table, tbl_reference.from_id,
tbl_reference.from_table, tbl_reference.to_id
Buffers: shared hit=1288
-> Bitmap Heap Scan on public.tbl_reference
(cost=46.69..4309.74 rows=1276 width=136) (actual time=0.508..5.594
rows=1433 loops=1)
Output: tbl_reference.to_table, tbl_reference.from_id,
tbl_reference.from_table, tbl_reference.to_id
Recheck Cond: ((tbl_reference.from_id_field =
'client_id'::name) OR ((tbl_reference.to_table = 'client'::name) AND
(tbl_reference.to_id = 34918)))
Filter: ((NOT tbl_reference.is_deleted) AND
(((tbl_reference.from_id_field = 'client_id'::name) AND
(tbl_reference.from_id = 34918) AND (tbl_reference.from_table =
'client'::name) AND (tbl_reference.to_table = 'log'::name)) OR
((tbl_reference.to_id_field = 'client_id'::name) AND (tbl_reference.to_id =
34918) AND (tbl_reference.to_table = 'client'::na
me) AND (tbl_reference.from_table = 'log'::name))))
Rows Removed by Filter: 15
Heap Blocks: exact=1275
Buffers: shared hit=1288
-> BitmapOr (cost=46.69..46.69 rows=1319 width=0)
(actual time=0.313..0.315 rows=0 loops=1)
Buffers: shared hit=13
-> Bitmap Index Scan on
index_tbl_reference_from_id_field (cost=0.00..4.43 rows=1 width=0) (actual
time=0.011..0.013 rows=0 loops=1)
Index Cond: (tbl_reference.from_id_field =
'client_id'::name)
Buffers: shared hit=3
-> Bitmap Index Scan on
index_tbl_reference_to_table_id (cost=0.00..41.61 rows=1319 width=0)
(actual time=0.296..0.298 rows=1448 loops=1)
Index Cond: ((tbl_reference.to_table =
'client'::name) AND (tbl_reference.to_id = 34918))
Buffers: shared hit=10
Planning time: 0.650 ms
Execution time: 97236.582 ms
(31 rows)

Time: 97238.387 ms

*The bad query, as actual query, not explain:*

agency=> SELECT * FROM Log WHERE log_id IN (SELECT CASE WHEN
to_table='client' THEN from_id WHEN from_table='client' THEN to_id END FROM
reference WHERE ((from_id_field = E'client_id'
AND from_id = E'34918'
AND from_table = E'client'
AND to_table = E'log'
)
OR (to_id_field = E'client_id'
AND to_id = E'34918'
AND to_table = E'client'
AND from_table = E'log'
))) ORDER BY added_at DESC LIMIT 25;
log_id | log_type_code | subject
|

log_text

| occurred_at | shift_report | written_by | added_by |
added_at | changed_by | changed_at | sys_log
---------+---------------+---------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------+---------------------+--------------+------------+----------+---------------------+------------+---------------------+---------
(actual results snipped)

*Time: 4654.190 ms*

*Description of tables and views:*

agency=> \d+ log
View "public.log"
Column | Type | Modifiers | Storage |
Description
---------------+--------------------------------+-----------+----------+-------------
log_id | integer | | plain |
log_type_code | character varying(10)[] | | extended |
subject | character varying(80) | | extended |
log_text | text | | extended |
occurred_at | timestamp(0) without time zone | | plain |
shift_report | boolean | | plain |
written_by | integer | | plain |
added_by | integer | | plain |
added_at | timestamp(0) without time zone | | plain |
changed_by | integer | | plain |
changed_at | timestamp(0) without time zone | | plain |
sys_log | text | | extended |
View definition:
SELECT tbl_log.log_id,
tbl_log.log_type_code,
tbl_log.subject,
tbl_log.log_text,
tbl_log.occurred_at,
tbl_log.shift_report,
tbl_log.written_by,
tbl_log.added_by,
tbl_log.added_at,
tbl_log.changed_by,
tbl_log.changed_at,
tbl_log.sys_log
FROM tbl_log;

agency=> \d tbl_log
Table "public.tbl_log"
Column | Type |
Modifiers
---------------+--------------------------------+----------------------------------------------------------
log_id | integer | not null default
nextval('tbl_log_log_id_seq'::regclass)
log_type_code | character varying(10)[] | not null
subject | character varying(80) | not null
log_text | text |
occurred_at | timestamp(0) without time zone |
written_by | integer | not null
added_by | integer | not null
added_at | timestamp(0) without time zone | not null default now()
changed_by | integer | not null
changed_at | timestamp(0) without time zone | not null default now()
sys_log | text |
shift_report | boolean | default false
Indexes:
"tbl_log_pkey" PRIMARY KEY, btree (log_id)
"index_tbl_log_log_type_code" btree (log_type_code)
"tbl_log_added_at" btree (added_at)
"tbl_log_added_by" btree (added_by)
"tbl_log_added_by2" btree (added_at DESC)
"tbl_log_event_time" btree ((COALESCE(occurred_at, added_at)))
"tbl_log_log_type_code" btree (log_type_code)
"tbl_log_log_type_code_gin" gin (log_type_code)
"tbl_log_occurred_at" btree (occurred_at)
"tbl_log_subject" btree (subject)
"tbl_log_test" btree (added_at, log_type_code)
"tbl_log_test2" btree (log_type_code, added_at)
"tbl_log_written_by" btree (written_by)
Foreign-key constraints:
"tbl_log_added_by_fkey" FOREIGN KEY (added_by) REFERENCES
tbl_staff(staff_id)
"tbl_log_changed_by_fkey" FOREIGN KEY (changed_by) REFERENCES
tbl_staff(staff_id)
"tbl_log_written_by_fkey" FOREIGN KEY (written_by) REFERENCES
tbl_staff(staff_id)
Triggers:
tbl_log_changed_at_update BEFORE UPDATE ON tbl_log FOR EACH ROW EXECUTE
PROCEDURE auto_changed_at_update()
tbl_log_log_chg AFTER INSERT OR DELETE OR UPDATE ON tbl_log FOR EACH
ROW EXECUTE PROCEDURE table_log()

agency=> \d+ reference
View "public.reference"
Column | Type | Modifiers | Storage |
Description
-----------------+--------------------------------+-----------+----------+-------------
reference_id | integer | | plain |
from_table | name | | plain |
from_id_field | name | | plain |
from_id | integer | | plain |
to_table | name | | plain |
to_id_field | name | | plain |
to_id | integer | | plain |
added_at | timestamp(0) without time zone | | plain |
added_by | integer | | plain |
changed_at | timestamp(0) without time zone | | plain |
changed_by | integer | | plain |
is_deleted | boolean | | plain |
deleted_at | timestamp(0) without time zone | | plain |
deleted_by | integer | | plain |
deleted_comment | text | | extended |
sys_log | text | | extended |
View definition:
SELECT tbl_reference.reference_id,
tbl_reference.from_table,
tbl_reference.from_id_field,
tbl_reference.from_id,
tbl_reference.to_table,
tbl_reference.to_id_field,
tbl_reference.to_id,
tbl_reference.added_at,
tbl_reference.added_by,
tbl_reference.changed_at,
tbl_reference.changed_by,
tbl_reference.is_deleted,
tbl_reference.deleted_at,
tbl_reference.deleted_by,
tbl_reference.deleted_comment,
tbl_reference.sys_log
FROM tbl_reference
WHERE NOT tbl_reference.is_deleted;

agency=> \d+ tbl_reference
Table
"public.tbl_reference"
Column | Type |
Modifiers | Storage | Stats target |
Description
-----------------+--------------------------------+----------------------------------------------------------------------+----------+--------------+-------------
reference_id | integer | not null default
nextval('tbl_reference_reference_id_seq'::regclass) | plain |
|
from_table | name | not null
| plain | |
from_id_field | name | not null
| plain | |
from_id | integer | not null
| plain | |
to_table | name | not null
| plain | |
to_id_field | name | not null
| plain | |
to_id | integer | not null
| plain | |
added_at | timestamp(0) without time zone | not null default now()
| plain | |
added_by | integer | not null
| plain | |
changed_at | timestamp(0) without time zone | not null default now()
| plain | |
changed_by | integer | not null
| plain | |
is_deleted | boolean | not null default false
| plain | |
deleted_at | timestamp(0) without time zone |
| plain | |
deleted_by | integer |
| plain | |
deleted_comment | text |
| extended | |
sys_log | text |
| extended | |
Indexes:
"tbl_reference_pkey" PRIMARY KEY, btree (reference_id)
"unique_index_tbl_reference" UNIQUE, btree (from_table, from_id_field,
from_id, to_table, to_id_field, to_id)
"index_tbl_reference_from_id" btree (from_id)
"index_tbl_reference_from_id_field" btree (from_id_field)
"index_tbl_reference_from_table" btree (from_table)
"index_tbl_reference_is_deleted" btree (is_deleted)
"index_tbl_reference_to_id" btree (to_id)
"index_tbl_reference_to_id_field" btree (to_id_field)
"index_tbl_reference_to_table" btree (to_table)
"index_tbl_reference_to_table_id" btree (to_table, to_id)
Foreign-key constraints:
"tbl_reference_added_by_fkey" FOREIGN KEY (added_by) REFERENCES
tbl_staff(staff_id)
"tbl_reference_changed_by_fkey" FOREIGN KEY (changed_by) REFERENCES
tbl_staff(staff_id)
"tbl_reference_deleted_by_fkey" FOREIGN KEY (deleted_by) REFERENCES
tbl_staff(staff_id)
Triggers:
tbl_reference_alert_notify AFTER INSERT OR DELETE OR UPDATE ON
tbl_reference FOR EACH ROW EXECUTE PROCEDURE table_alert_notify()
tbl_reference_changed_at_update BEFORE UPDATE ON tbl_reference FOR EACH
ROW EXECUTE PROCEDURE auto_changed_at_update()
tbl_reference_log_chg AFTER INSERT OR DELETE OR UPDATE ON tbl_reference
FOR EACH ROW EXECUTE PROCEDURE table_log()

--
AGENCY Software
A Free Software data system
By and for non-profits
*http://agency-software.org/ <http://agency-software.org/>*
*https://demo.agency-software.org/client
<https://demo.agency-software.org/client>*
ken(dot)tanzer(at)agency-software(dot)org
(253) 245-3801

Subscribe to the mailing list
<agency-general-request(at)lists(dot)sourceforge(dot)net?body=subscribe> to
learn more about AGENCY or
follow the discussion.

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Justin Pryzby 2020-08-14 22:04:24 Re: Query takes way longer with LIMIT, and EXPLAIN takes way longer than actual query
Previous Message Tom Lane 2020-08-11 23:30:58 Re: Hstore index for full text search