Re: Number of characters in column preventing index usage

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Number of characters in column preventing index usage
Date: 2017-02-17 23:19:15
Message-ID: CAKFQuwaDX7vJSrX2Rkz9njXseFVRe8eAStOvCQzqd4pc_vhcNg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Fri, Feb 17, 2017 at 3:49 PM, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
wrote:

> That may seem a bit strange, but I'd bet it finds the short value in some
> statistic (MCV, histogram) ans so can provide very accurate estimate.

​​ -> Index Only Scan using tab_idx1 on tab (cost=0.27..8.29 rows=1
width=0) (actual time=0.043..0.043 rows=0 loops=1)

​I'm not seeing how any of the statistic columns would capture a value that
doesn't actually appear in the table...(actual ... row=0)​

Unless there is some prefix matching going on here since the short value is
a substring(1, n) of the longer one which does appear 5 times.

​I guess maybe because the value doesn't appear it uses the index (via IOS)
to confirm absence (or near absence, i.e., 1) while, knowing the larger
value appears 5 times out of 223, it decides a quick table scan is faster
than any form of double-lookup (whether on the visibility map or the heap).

​https://www.postgresql.org/docs/9.6/static/indexes-index-only-scans.html​

​David J.​

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2017-02-18 00:04:41 Re: Number of characters in column preventing index usage
Previous Message Hustler DBA 2017-02-17 23:16:23 Re: Number of characters in column preventing index usage