Re: Number of characters in column preventing index usage

From: Hustler DBA <hustlerdba(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Number of characters in column preventing index usage
Date: 2017-02-18 00:16:48
Message-ID: CAM00CHFeSv-01S31WKNTkeyNJFf7buZMgrFqnU-4mbmdBKZbjQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Thanks you guys are correct... the size of the table caused the optimizer
to do a seq scan instead of using the index. I tried it on a 24 MB and 1
GB table and the expected index was used.

On Fri, Feb 17, 2017 at 7:04 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> "David G. Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> writes:
> > On Fri, Feb 17, 2017 at 3:49 PM, Tomas Vondra <
> tomas(dot)vondra(at)2ndquadrant(dot)com>
> > wrote:
> >> That may seem a bit strange, but I'd bet it finds the short value in
> some
> >> statistic (MCV, histogram) ans so can provide very accurate estimate.
>
> > ​I'm not seeing how any of the statistic columns would capture a value
> that
> > doesn't actually appear in the table...(actual ... row=0)​
>
> I think it's the other way around. It found
> '01625cfa-2bf8-45cf-bf4c-aa5f3c6fa8ea' in the stats, concluded
> (accurately) that there would be five matches, and on the strength of that
> decided that a seqscan over this very tiny table would be faster than an
> indexscan. In the other case, the short string exists neither in the
> table nor the stats, and the default estimate is turning out to be that
> there's a single match, for which it likes the indexscan solution. This
> is all pretty unsurprising if '01625cfa-2bf8-45cf-bf4c-aa5f3c6fa8ea'
> is in the most-common-values list. Anything that's *not* in that list
> is going to get a smaller rowcount estimate. (I don't think that the
> string length, per se, has anything to do with it.)
>
> I'm not sure what performance problem the OP was looking to solve,
> but expecting experiments on toy-sized tables to give the same plans
> as you get on large tables is a standard mistake when learning to work
> with the PG planner.
>
> Also, if toy-sized tables are all you've got, meaning the whole database
> can be expected to stay RAM-resident at all times, it'd be a good idea
> to reduce random_page_cost to reflect that. The default planner cost
> settings are meant for data that's mostly on spinning rust.
>
> regards, tom lane
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Mike Beaton 2017-02-18 07:57:10 Re: Correct use of cursors for very large result sets in Postgres
Previous Message Tom Lane 2017-02-18 00:04:41 Re: Number of characters in column preventing index usage