Re: Odd problem with performance in duplicate database

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgreSQL(dot)org
Subject: Re: Odd problem with performance in duplicate database
Date: 2003-08-12 00:51:39
Message-ID: 200308111751.39998.josh@agliodbs.com
Lists: pgsql-performance

Tom,

> Partly. The numbers suggest that in ANALYZE's default sample of 3000
> rows, it's only finding about a dozen non-null tgroup_ids (yielding the
> 0.996 null_frac value); and that in one case all dozen are different and
> in the other case there are two duplicates. It would help if you
> boosted the stats target for this column by a factor of 10. (You can
> do that in 7.2, btw --- IIRC the only problem is that a pg_dump won't
> show that you did so.)
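For reference, the stats-target bump Tom describes could be done like this (a sketch only: the table name case_clients and column name tgroup_id are taken from this thread, and the target value of 1000 matches the reply below):

```sql
-- Raise the per-column statistics target so ANALYZE samples more rows
-- (available in 7.2; as Tom notes, pg_dump at that time did not
-- preserve the setting).
ALTER TABLE case_clients ALTER COLUMN tgroup_id SET STATISTICS 1000;

-- Re-sample so the planner sees the larger sample and a better
-- n_distinct / null_frac estimate for the mostly-null column.
ANALYZE case_clients (tgroup_id);
```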

Hmmm. No dice. I raised the statistics target to 1000, which increased n_distinct
to 108, which is pretty close to accurate. However, the planner still
insists on using a seq scan on case_clients unless I drop random_page_cost to
1.5 (which is up from 1.2 but still somewhat unreasonable).
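The cost-parameter experiment described above can be tried per-session before touching postgresql.conf (the 1.5 value is the one from this message; the rest is a generic sketch, not a command from the thread):

```sql
-- Session-local experiment; the setting persists only until disconnect.
-- Re-run EXPLAIN on the problem query afterward to confirm the planner
-- switches from the seq scan on case_clients to the index scan.
SET random_page_cost = 1.5;
SHOW random_page_cost;
```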

> But the other part of the problem is that in 7.2, the join selectivity
> estimator is way off when you are joining a unique column (like the pkey
> on the other side) to a column with a very large fraction of nulls.
> We only discovered this recently; it's fixed as of 7.3.3:

OK, I'll talk to the client about upgrading.

--
-Josh Berkus

______AGLIO DATABASE SOLUTIONS___________________________
Josh Berkus
Complete information technology josh(at)agliodbs(dot)com
and data management solutions (415) 565-7293
for law firms, small businesses fax 621-2533
and non-profit organizations. San Francisco
