Re: Inconsistent performance with LIKE and bind variable on long-lived connection

From: Steven Grimm <sgrimm(at)thesegovia(dot)com>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Inconsistent performance with LIKE and bind variable on long-lived connection
Date: 2017-06-11 08:10:44
Message-ID: CAOFXwWXa8xeL2oc3k08MHph_tgYyU4fQYnzEyBEcC3keO6cm+Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Sat, Jun 10, 2017 at 11:10 PM, David G. Johnston <
david(dot)g(dot)johnston(at)gmail(dot)com> wrote:

> https://www.postgresql.org/docs/current/static/sql-prepare.html
>
> Specifically, the notes section.

That seems to fit the behavior. Thanks; I wasn't aware of that feature of
prepared statements. I changed the Python code to do EXPLAIN ANALYZE
EXECUTE rather than EXECUTE, and I do indeed see a change in plan after the
fifth iteration:

(first five iterations)

Index Only Scan using test_pkey on test (cost=0.29..476.29 rows=9999
width=4) (actual time=0.058..2.439 rows=10000 loops=1)
Index Cond: (col1 = 'xyz'::text)
Filter: (col2 ~~ '%'::text)
Heap Fetches: 0
Execution time: 2.957 ms

(iterations 6+)

Sort (cost=205.41..205.54 rows=50 width=4) (actual time=104.986..105.784
rows=10000 loops=1)
Sort Key: col2
Sort Method: quicksort Memory: 853kB
-> Seq Scan on test (cost=0.00..204.00 rows=50 width=4) (actual
time=0.014..2.100 rows=10000 loops=1)
Filter: ((col2 ~~ $2) AND (col1 = $1))
Execution time: 106.282 ms

So the problem here may be that the cost estimate for the generic execution
plan is way off in the case of a LIKE bind variable that matches a large
number of rows. I did make sure to have the Java code do a VACUUM ANALYZE
after doing its inserts, just to eliminate lack of statistics as a possible
explanation. Maybe the incorrect row count estimate (50 instead of 10000)
is causing it to think the quicksort will be a lot cheaper than it ends up
being with the actual rows?

Interesting that the Java version switches to the suboptimal plan after 9
iterations rather than 5. I don't know how to get the JDBC driver to do an
EXPLAIN on a prepared statement, so I can't confirm that the same thing is
happening there, but it seems plausible. Happy to try that if there's a way
to do it.

-Steve

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Alvaro Herrera 2017-06-11 08:17:01 Re: ERROR: unexpected chunk number 0 (expected 1) for toast value 76753264 in pg_toast_10920100
Previous Message David G. Johnston 2017-06-11 06:10:19 Re: Inconsistent performance with LIKE and bind variable on long-lived connection