Updating histogram_bounds after a delete

From: Derrick Rice <derrick(dot)rice(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Updating histogram_bounds after a delete
Date: 2011-03-16 19:40:55
Message-ID: AANLkTik6FnXpdEF+edr7Jh-cmEUfjsaPRoGOmMaxxPFO@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Greetings.

I recently ran into a problem with a planner opting for a sequential scan
rather than a bitmap heap scan because the stats suggested that my delete
query was going to affect 33% of the rows, rather than the 1% it really
was. I was able to follow the planner's logic and came to the realization
that it was a result of the histogram_bounds for that column being out of
date.

The table is regularly purged of some of it's oldest data, and new data is
constantly added. It seems to me that PostgreSQL *should* be able to
identify a query which is going to delete all rows within a histogram
bucket, and could possibly react by updating the histogram_bounds at
commit-time, rather than needing an additional analyze or needing
auto-analyze settings jacked way up.

Alternatively, it might be nice to be able to manually describe the table
(I've been following the "no hints" discussion) by providing information
along the lines of "always assume that column event_date is uniformly
distributed". This would be provided as schema information, not additional
SQL syntax for hints.

Is this something that is remotely feasible, has the suggestion been made
before, or am I asking for something where a solution already exists?

Thanks,

Derrick

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Derrick Rice 2011-03-16 19:41:55 Re: Updating histogram_bounds after a delete
Previous Message Claudio Freire 2011-03-16 18:39:26 Re: Re: Adding additional index causes 20,000x slowdown for certain select queries - postgres 9.0.3