From: | Hustler DBA <hustlerdba(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Number of characters in column preventing index usage |
Date: | 2017-02-18 00:16:48 |
Message-ID: | CAM00CHFeSv-01S31WKNTkeyNJFf7buZMgrFqnU-4mbmdBKZbjQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Thanks you guys are correct... the size of the table caused the optimizer
to do a seq scan instead of using the index. I tried it on a 24 MB and 1
GB table and the expected index was used.
On Fri, Feb 17, 2017 at 7:04 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "David G. Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> writes:
> > On Fri, Feb 17, 2017 at 3:49 PM, Tomas Vondra <
> tomas(dot)vondra(at)2ndquadrant(dot)com>
> > wrote:
> >> That may seem a bit strange, but I'd bet it finds the short value in
> some
> >> statistic (MCV, histogram) ans so can provide very accurate estimate.
>
> > I'm not seeing how any of the statistic columns would capture a value
> that
> > doesn't actually appear in the table...(actual ... row=0)
>
> I think it's the other way around. It found
> '01625cfa-2bf8-45cf-bf4c-aa5f3c6fa8ea' in the stats, concluded
> (accurately) that there would be five matches, and on the strength of that
> decided that a seqscan over this very tiny table would be faster than an
> indexscan. In the other case, the short string exists neither in the
> table nor the stats, and the default estimate is turning out to be that
> there's a single match, for which it likes the indexscan solution. This
> is all pretty unsurprising if '01625cfa-2bf8-45cf-bf4c-aa5f3c6fa8ea'
> is in the most-common-values list. Anything that's *not* in that list
> is going to get a smaller rowcount estimate. (I don't think that the
> string length, per se, has anything to do with it.)
>
> I'm not sure what performance problem the OP was looking to solve,
> but expecting experiments on toy-sized tables to give the same plans
> as you get on large tables is a standard mistake when learning to work
> with the PG planner.
>
> Also, if toy-sized tables are all you've got, meaning the whole database
> can be expected to stay RAM-resident at all times, it'd be a good idea
> to reduce random_page_cost to reflect that. The default planner cost
> settings are meant for data that's mostly on spinning rust.
>
> regards, tom lane
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>
From | Date | Subject | |
---|---|---|---|
Next Message | Mike Beaton | 2017-02-18 07:57:10 | Re: Correct use of cursors for very large result sets in Postgres |
Previous Message | Tom Lane | 2017-02-18 00:04:41 | Re: Number of characters in column preventing index usage |