Re: Substantial different index use between 9.5 and 9.6

From: Bill Measday <bill(at)measday(dot)com>
To: Daniel Blanch Bataller <daniel(dot)blanch(dot)bataller(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Substantial different index use between 9.5 and 9.6
Date: 2016-12-03 20:42:02
Message-ID: 6586dbf1-3c1e-e7b1-bdae-c2a3d2914f49@measday.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Seems to be a replicable issue in PostGis - ticket raised at their end,
so I'll wait for a resolution of the root cause.

Thanks for your help/thoughts.

Rgds

Bill

On 3/12/2016 2:41 AM, Daniel Blanch Bataller wrote:
> ANALYZE takes samples at random, so statistics might be different even
> with same postgresql version:
>
> https://www.postgresql.org/docs/current/static/sql-analyze.html
>
> For large tables, ANALYZE takes a random sample of the table
> contents, rather than examining every row. This allows even very
> large tables to be analyzed in a small amount of time. Note,
> however, that the statistics are only approximate, and will change
> slightly each time ANALYZE is run, even if the actual table
> contents did not change. This might result in small changes in the
> planner's estimated costs shown by EXPLAIN
> <https://www.postgresql.org/docs/current/static/sql-explain.html>.
> In rare situations, this non-determinism will cause the planner's
> choices of query plans to change after ANALYZE is run. To avoid
> this, raise the amount of statistics collected by ANALYZE, as
> described below.
>
>
> Though, having that round (x 1000) difference, my bet is that you have
> different statistics target whether on database, table or columns, see:
>
> The extent of analysis can be controlled by adjusting the
> default_statistics_target
> <https://www.postgresql.org/docs/current/static/runtime-config-query.html#GUC-DEFAULT-STATISTICS-TARGET> configuration
> variable, or on a column-by-column basis by setting the per-column
> statistics target with ALTER TABLE ... ALTER COLUMN ... SET
> STATISTICS (see ALTER TABLE
> <https://www.postgresql.org/docs/current/static/sql-altertable.html>).
> The target value sets the maximum number of entries in the
> most-common-value list and the maximum number of bins in the
> histogram. The default target value is 100, but this can be
> adjusted up or down to trade off accuracy of planner estimates
> against the time taken for ANALYZE and the amount of space
> occupied in pg_statistic. In particular, setting the statistics
> target to zero disables collection of statistics for that column.
> It might be useful to do that for columns that are never used as
> part of the WHERE, GROUP BY, or ORDER BY clauses of queries, since
> the planner will have no use for statistics on such columns.
>
>
> Here is some help on how to see statistics per column:
>
> http://stackoverflow.com/questions/15034622/check-statistics-targets-in-postgresql
>
> Check if this is the case.
>
>
>
>
>
>
>> El 2 dic 2016, a las 1:26, Bill Measday <bill(at)measday(dot)com
>> <mailto:bill(at)measday(dot)com>> escribió:
>>
>> Thanks Tom.
>>
>> First, this wasn't a migration but new db loaded from scratch (if
>> that matters).
>>
>> As per the end of the original post "I have vacuum analysed both
>> tables". I assume this is what you meant?
>>
>> My gut feel was that it isn't a postgis issue since the third example
>> I gave uses the index, but I will take it up with them too.
>>
>> Rgds
>>
>>
>> Bill
>>
>> On 2/12/2016 10:48 AM, Tom Lane wrote:
>>> Bill Measday <bill(at)measday(dot)com <mailto:bill(at)measday(dot)com>> writes:
>>>> Substantial different index use between 9.5 and 9.6
>>> Maybe you missed an ANALYZE after migrating? The plan difference
>>> seems to be due to a vast difference in rowcount estimate for the
>>> m_elevations condition:
>>>
>>>> -> Bitmap Heap Scan on m_elevations e
>>>> (cost=282802.21..37401439.43 rows=3512160 width=8)
>>>> -> Seq Scan on m_elevations e
>>>> (cost=10000000000.00..13296950520.12 rows=3512159563 width=8)
>>> If you don't know where that factor-of-1000 came from, maybe take
>>> it up with the postgis folk. It'd mostly be coming out of their
>>> selectivity estimation routines.
>>>
>>> regards, tom lane
>>
>>
>>
>> --
>> Sent via pgsql-performance mailing list
>> (pgsql-performance(at)postgresql(dot)org
>> <mailto:pgsql-performance(at)postgresql(dot)org>)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-performance
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Jeff Janes 2016-12-05 02:28:16 Re: performance issue with bitmap index scans on huge amounts of big jsonb documents
Previous Message Daniel Blanch Bataller 2016-12-02 15:41:56 Re: Substantial different index use between 9.5 and 9.6