Re: 8.4.4, 9.0, and 9.1 Planner Differences

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Anthony Presley <anthony(at)resolution(dot)com>
Cc: pgsql-performance <pgsql-performance(at)postgresql(dot)org>, Erik Van Gilder <evg(at)resolution(dot)com>
Subject: Re: 8.4.4, 9.0, and 9.1 Planner Differences
Date: 2011-10-22 15:58:26
Message-ID: 9317.1319299106@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Anthony Presley <anthony(at)resolution(dot)com> writes:
> We have a dev machine running 9.0.1 (an i3 laptop, with a regular hard disk,
> with 4GB of RAM, and a mostly untuned postgresql.conf file). The changed
> lines are:
> shared_buffers = 512MB
> temp_buffers = 48MB
> work_mem = 32MB
> maintenance_work_mem = 348MB
> checkpoint_segments = 10
> effective_cache_size = 512MB

> The same database is loaded onto a production server running 9.1.1 (dual QC
> processors, RAID-10 SAS drives, 36GB of RAM), which replicates to a backup
> server. This has a lot of changed properties:
> shared_buffers = 8500MB
> work_mem = 35MB
> maintenance_work_mem = 512MB
> wal_level = hot_standby
> checkpoint_segments = 50
> max_wal_senders = 3
> wal_keep_segments = 144
> random_page_cost = 1.0
> effective_cache_size = 16384MB
> effective_io_concurrency = 6

That random_page_cost setting is going to have a huge effect on the
planner's choices, and the larger effective_cache_size setting will
likely affect plans too. I don't find it surprising in the least
that you're getting different plan choices ... and even less so when
your "dev" and "production" DBs aren't even the same major version.
You might want to think about making your dev environment more like
your production.

> The same DB is loaded on both the production and the dev environment, and in
> all cases (about 5000 distinct different queries), the production
> environment is about 500x faster, except for one type of query (both
> databases were loaded from the same pg_dump on an 8.4.4 database):

> On the dev box, we have: http://explain.depesz.com/s/rwU - about 131
> seconds
> On the production box, we have: http://explain.depesz.com/s/3dt -
> about .25 seconds

Did you mislabel these? Because if you didn't, the numbers are right
in line with what you say above. But anyway, the problem with the
slower query appears to be poor rowcount estimates, leading the planner
to use a nestloop join when it shouldn't. You haven't provided nearly
enough context to let anyone guess why the estimates are off, other
than boilerplate suggestions like making sure the tables have been
ANALYZEd recently, and maybe increasing the statistics targets.

regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Anthony Presley 2011-10-22 23:12:51 Re: 8.4.4, 9.0, and 9.1 Planner Differences
Previous Message Radhya sahal 2011-10-22 01:11:41 explain workload