Re: Massive table bloat

From: Thom Brown <thombrown(at)gmail(dot)com>
To: Markus Wollny <Markus(dot)Wollny(at)computec(dot)de>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Massive table bloat
Date: 2010-03-03 15:56:07
Message-ID: bddc86151003030756o722902f7t31accd588365ef3b@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 3 March 2010 15:33, Markus Wollny <Markus(dot)Wollny(at)computec(dot)de> wrote:
> Hi!
>
> I've set up some system to track slow page executions in one of our (as
> yet not live) web apps. The tracking itself is handled completely within
> the database using a function. Within a very short time (approx. 1 week)
> and although we haven't got that much traffic on our testpages, the
> table in question as grown beyond a size of 23 GB, even though a SELECT
> count(*) on it will tell me that it only contains 235 rows. I'm sure I
> must be missing something obvious here...
>
> Here's the DDL for the table:
>
> CREATE TABLE stats.slowpages
> (
> url text NOT NULL,
> lastexecduration integer NOT NULL,
> avgslowexecduration integer,
> execcount integer,
> lastexectime timestamp without time zone,
> site_id integer NOT NULL,
> slowestexecduration integer,
> totaltimespent bigint,
> CONSTRAINT "slowpages_pkey" PRIMARY KEY (url)
> )WITHOUT OIDS;
>
> -- Indexes
> CREATE INDEX idx_slowpages_duration ON stats.slowpages USING btree
> (lastexecduration);
> CREATE INDEX idx_slowpages_avgduration ON stats.slowpages USING btree
> (avgslowexecduration);
> CREATE INDEX idx_slowpages_execcount ON stats.slowpages USING btree
> (execcount);
> CREATE INDEX idx_slowpages_lastexec ON stats.slowpages USING btree
> (lastexectime);
> CREATE INDEX idx_slowpages_site ON stats.slowpages USING btree
> (site_id);
> CREATE UNIQUE INDEX uidx_slowpages_url_site ON stats.slowpages USING
> btree (url, site_id);
> CREATE INDEX idx_slowpages_totaltimespent ON stats.slowpages USING btree
> (totaltimespent);
>
> And this here is the function we use to insert or update entries in this
> table:
>
> CREATE or REPLACE FUNCTION "stats"."iou_slowpages"(
> IN "_site_id" integer,
> IN "_url" text,
> IN "_duration" integer)
> RETURNS void AS
> $BODY$
> BEGIN
>    LOOP
>
>        UPDATE stats.slowpages
>               SET  avgslowexecduration =
> ((avgslowexecduration*execcount)+_duration)/(execcount+1)
>               ,    execcount = execcount+1
>               ,    lastexectime = now()
>               ,    lastexecduration = _duration
>               ,    totaltimespent = totaltimespent + _duration
>               ,    slowestexecduration = CASE WHEN _duration >
> slowestexecduration
>                    THEN _duration ELSE slowestexecduration END
>               WHERE url = _url AND site_id = _site_id;
>        IF found THEN
>            RETURN;
>        END IF;
>
>        BEGIN
>            INSERT INTO
> stats.slowpages(url,lastexecduration,avgslowexecduration,slowestexecdura
> tion,totaltimespent,execcount,lastexectime,site_id)
>            VALUES (_url, _duration, _duration,_duration,_duration, 1,
> now(), _site_id);
>            RETURN;
>        EXCEPTION WHEN unique_violation THEN
>
>        END;
>    END LOOP;
> END;
> $BODY$
> LANGUAGE 'plpgsql' VOLATILE;
>
> _site_id is a small integer value, _url is a full URL string to a page
> and _duration is a value in milliseconds. We're on PostgreSQL 8.3.7.
>
> Any idea about what I may be missing here?
>
> Kind regards
>
>   Markus
>

If you update rows, it actually creates a new version of it. The old
one doesn't get removed until the VACUUM process cleans it up, so
maybe you need to run that against the database?

Regards

Thom

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Grzegorz Jaśkiewicz 2010-03-03 15:57:24 Re: Massive table bloat
Previous Message Richard Huxton 2010-03-03 15:50:03 Re: disable triggers isolated to transaction only?