From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Mark Mielke <mark(at)mark(dot)mielke(dot)cc> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Conor Walsh <ctw(at)adverb(dot)ly>, jd(at)commandprompt(dot)com, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: Does auto-analyze work on dirty writes? |
Date: | 2011-02-10 17:51:31 |
Message-ID: | AANLkTimeTk_reda4x3WwsgOfXN6NO6Be63YyjJ5UnZLz@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers pgsql-performance |
On Fri, Feb 4, 2011 at 8:50 PM, Mark Mielke <mark(at)mark(dot)mielke(dot)cc> wrote:
> On 02/04/2011 10:41 AM, Tom Lane wrote:
>>
>> 1. Autovacuum fires when the stats collector's insert/update/delete
>> counts have reached appropriate thresholds. Those counts are
>> accumulated from messages sent by backends at transaction commit or
>> rollback, so they take no account of what's been done by transactions
>> still in progress.
>>
>> 2. Only live rows are included in the stats computed by ANALYZE.
>> (IIRC it uses SnapshotNow to decide whether rows are live.)
>>
>> Although the stats collector does track an estimate of the number of
>> dead rows for the benefit of autovacuum, this isn't used by planning.
>> Table bloat is accounted for only in terms of growth of the physical
>> size of the table in blocks.
>
> Thanks, Tom.
>
> Does this un-analyzed "bloat" not impact queries? I guess the worst case
> here is if autovaccum is disabled for some reason and 99% of the table is
> dead rows. If I understand the above correctly, I think analyze might
> generate a bad plan under this scenario, thinking that a value is unique,
> using the index - but every tuple in the index has the same value and each
> has to be looked up in the table to see if it is visible?
It sounds like you're describing something like a one-row table with a
unique index on one of its column, getting updates that can't be made
HOT, and not getting vacuumed. That scenario does suck - I had a test
case I was using it a while back that generated something similar -
but I'm not sure how much it's worth worrying about the plan, because
either an index scan or a sequential scan is going to be awful.
To put that another way, I've founded that the optimizer copes pretty
well with adjusting plans as tables get bloated - mostly by using
index scans rather than sequential scans. It's possible there is some
improvement still to be had there, but I would be a lot more
interested in fixing the bloat, at least based on my own experiences.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From | Date | Subject | |
---|---|---|---|
Next Message | Greg Smith | 2011-02-10 17:56:10 | Re: Why we don't want hints Was: Slow count(*) again... |
Previous Message | Kevin Grittner | 2011-02-10 17:44:29 | Re: Why we don't want hints Was: Slow count(*) again... |
From | Date | Subject | |
---|---|---|---|
Next Message | Greg Smith | 2011-02-10 17:56:10 | Re: Why we don't want hints Was: Slow count(*) again... |
Previous Message | Kevin Grittner | 2011-02-10 17:44:29 | Re: Why we don't want hints Was: Slow count(*) again... |