Re: index scan backward plan question

From: Kevin Murphy <murphy(at)genome(dot)chop(dot)edu>
To: PostgreSQL general <pgsql-general(at)postgresql(dot)org>
Subject: Re: index scan backward plan question
Date: 2006-03-23 21:15:49
Message-ID: 44231005.6070208@genome.chop.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

> Neil Conway wrote:
> There is no such thing as "PG 8.3".

I meant 8.1.3

> please provide the queries that trigger the problem and the
> relevant schema definitions.

Sorry about not posting more details initially. I was running out the
door and was hasty.

> Try re-running ANALYZE and retrying EXPLAIN ANALYZE.

It was analyzed (and unmodified since analysis). I also rebuilt the
table from scratch - same results.

I'm beginning to think my performance problem is mostly related to tight
memory. The same query that was insanely slow on one machine (1GB RAM)
was reasonably fast on another with more memory (8GB RAM). Also, when I
freed a bit more memory on the original machine, the performance wall
went away. PostgreSQL seems to be making OK decisions about when to use
the 'index scan backward' plan, but under tight memory conditions, the
index scan would take forever while the alternative would be fine.

My curiosity has sunk below my need to get real work done, so I don't
require any further response.

If anybody has too much time on his or her hands, you can read what I
wrote (below) before I decided it wasn't worth pursuing this any further.

Thanks,
Kevin Murphy

\d merged
Table "public.merged"
Column | Type | Modifiers
--------------+---------+-----------
pmid | integer |
first_author | text |
authors | text |
title | text |
journal | text |
date | date |
display_date | text |
volume | text |
issue | text |
pages | text |
symbol | text |
weight | real |
Indexes:
"merged_data_idx" btree (date, first_author)
"merged_first_author_idx" btree (first_author)
"merged_pmid_idx" btree (pmid)
"merged_symbol_idx" btree (symbol)
"merged_weight_date_idx" btree (weight, date)

The query in question searches on 'symbol', orders by the combination of
'weight' and 'date', and constrains by a limit and offset. The table
has 7,839,274 rows; 16,247 distinct symbol values; 17,279 distinct
weight values; 831,007 distinct weight-date combinations.

Depending on the query search term and offset value, one of two
different plans is chosen. For search terms matching large numbers of
rows, and with a low enough offset, an 'index scan backward' plan is
chosen, which is sometimes extremely fast (faster than the other plan).
However, for the search terms matching very large numbers of rows, the
'index scan backward' plan can be molasses-slow (as much as 2 and 3
orders of magnitude). During such slow queries, the CPU is not doing
much: postgres is consuming <=3% of CPU, and nothing else is happening
on the box ('top' is using the most CPU). Disk I/O is consistent and
low at around 1MB/sec while this is going on. Top doesn't show any
swapping, although free physical memory is very low, but on the other
hand there is lots of inactive memory (720MB). During these very long
queries, I see a lock on merged_weight_date_idx (which is 168 MB in
size, BTW) the whole time. There are no messages in the system or
postgresql logs. Can it really be taking PG 490 seconds (see below) to
read some fraction of 168MB into memory?

* * * * * * * *

postgresql.conf settings:

shared_buffers = 15000
work_mem = 32768
maintenance_work_mem = 65536
default_statistics_target = 1000

I just noticed that default_statistics_target is set at one extreme.
Since many of the columns have large numbers of values, highly skewed
distributions, and the oddballs are often important, it was my
understanding that this was not unreasonable (and early tests seemed to
bear this out). I stand ready to be corrected.

* * * * * * * *

OS:

OS X 10.4, 1 GB RAM, dual 1.8 Ghz PPC

* * * * * * * *

The following plan is produced for almost all values of 'symbol' except
the values with the most records. It is also produced for the
high-frequency symbols if the OFFSET is large enough. The following is
a search for a less common symbol than the problematic cases ('CFTR'
matches 8920 rows):

EXPLAIN ANALYZE SELECT pmid, authors, title, journal, date,
display_date, volume, issue, pages
FROM merged
WHERE symbol = 'CFTR'
ORDER BY weight DESC, date DESC
OFFSET 0 LIMIT 25;
QUERY
PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=29364.77..29364.83 rows=25 width=229) (actual
time=1351.026..1351.040 rows=25 loops=1)
-> Sort (cost=29364.77..29385.54 rows=8310 width=229) (actual
time=1351.023..1351.028 rows=25 loops=1)
Sort Key: weight, date
-> Bitmap Heap Scan on merged (cost=55.08..28823.76 rows=8310
width=229) (actual time=60.076..1198.231 rows=8920 loops=1)
Recheck Cond: (symbol = 'CFTR'::text)
-> Bitmap Index Scan on merged_symbol_idx
(cost=0.00..55.08 rows=8310 width=0) (actual time=39.128..39.128
rows=8920 loops=1)
Index Cond: (symbol = 'CFTR'::text)
Total runtime: 1357.684 ms

* * * * * * * *

In contrast, low-offset searches for the symbol 'ERVK6' (which matches
142814 rows) produce a different plan, based on 'index scan backward'.
For symbol 'ERVK6', I found that offset values of 1234 and less produce
this plan (with a limit of 25). Offset 1235 and above produce the
non-"index scan backward" plan.

Here's the plan for symbol = 'ERVK6', OFFSET = 1234:

EXPLAIN ANALYZE SELECT pmid, authors, title, journal, date,
display_date, volume, issue, pages
FROM merged
WHERE symbol = 'ERVK6'
ORDER BY weight DESC, date DESC
OFFSET 1234 LIMIT 25;

QUERY
PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=272879.06..278407.40 rows=25 width=229) (actual
time=489694.717..489893.358 rows=25 loops=1)
-> Index Scan Backward using merged_weight_date_idx on merged
(cost=0.00..31295954.88 rows=141525 width=229) (actual
time=4.974..489891.348 rows=1259 loops=1)
Filter: (symbol = 'ERVK6'::text)
Total runtime: 489893.584 ms

* * * * * * * *

Here's the plan for the same symbol with OFFSET = 1235:

EXPLAIN ANALYZE SELECT pmid, authors, title, journal, date,
display_date, volume, issue, pages
FROM merged
WHERE symbol = 'ERVK6'
ORDER BY weight DESC, date DESC
OFFSET 1235 LIMIT 25;


QUERY
PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=278441.55..278441.61 rows=25 width=229) (actual
time=5940.268..5940.299 rows=25 loops=1)
-> Sort (cost=278438.46..278792.27 rows=141525 width=229) (actual
time=5938.818..5940.130 rows=1260 loops=1)
Sort Key: weight, date
-> Bitmap Heap Scan on merged (cost=919.34..245370.50
rows=141525 width=229) (actual time=71.680..1169.222 rows=142814 loops=1)
Recheck Cond: (symbol = 'ERVK6'::text)
-> Bitmap Index Scan on merged_symbol_idx
(cost=0.00..919.34 rows=141525 width=0) (actual time=69.068..69.068
rows=142814 loops=1)
Index Cond: (symbol = 'ERVK6'::text)
Total runtime: 6692.934 ms

The difference doesn't seem to be due to disk caching. I can bounce
back and forth between the two queries getting consistent relative
timings (slow/fast/slow/fast).

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Bjørn T Johansen 2006-03-23 21:19:06 Re: Logging of sql statements?
Previous Message Chris Browne 2006-03-23 21:03:32 Re: Advantages of PostgreSQL over MySQL 5.0