Quick Links

Re: [GENERAL] ANALYZE getting dead tuple count hopelessly wrong

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	"Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: [GENERAL] ANALYZE getting dead tuple count hopelessly wrong
Date:	2008-04-02 23:04:09
Message-ID:	17953.1207177449@sss.pgh.pa.us
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general pgsql-hackers

I wrote:
> ... What I propose therefore
> is combining this patch with my older one so that ANALYZE counts
> according to the following rules:

> REDIRECT line pointer: ignore
> DEAD line pointer: count as dead
> HEAPTUPLE_LIVE tuple: count as live, include in statistics pool
> HEAPTUPLE_DEAD: count as dead
> HEAPTUPLE_RECENTLY_DEAD: count as dead
> HEAPTUPLE_INSERT_IN_PROGRESS: ignore
> HEAPTUPLE_DELETE_IN_PROGRESS: count as live

While working on this I realized that there's a special case ANALYZE has
to face that is not faced by VACUUM: it might see tuples inserted or
deleted by its own transaction. For example consider

begin;
... load lots of data into mytable ...
analyze mytable;
... issue complex queries against mytable ...
commit;

This is not an uncommon scenario, particularly with respect to temporary
tables. ANALYZE's historical behavior of sampling everything that's
good according to SnapshotNow does the right thing here, but ignoring
INSERT_IN_PROGRESS tuples would not.

The right way seems to be to treat our own insertions as live during
ANALYZE, but then subtract off our own pending insertions from the
live-tuples count sent to the stats collector. pgstat_report_analyze()
can handle the latter part by groveling through the backend's pending
statistics data.

Comments?

regards, tom lane

In response to

Re: [GENERAL] ANALYZE getting dead tuple count hopelessly wrong at 2008-04-02 20:29:31 from Tom Lane

Responses

Re: [GENERAL] ANALYZE getting dead tuple count hopelessly wrong at 2008-04-03 16:12:48 from Pavan Deolasee

Browse pgsql-general by date

	From	Date	Subject
Next Message	Ron Mayer	2008-04-02 23:41:16	modules
Previous Message	Bruce Momjian	2008-04-02 23:03:09	Re: [GENERAL] SHA1 on postgres 8.3

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Decibel!	2008-04-02 23:05:39	Re: writing a MIN(RECORD) aggregate
Previous Message	Bruce Momjian	2008-04-02 23:03:09	Re: [GENERAL] SHA1 on postgres 8.3