Re: Call closed: Almost happy ending (from "Data files became huge with no apparent reason" thread)

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Dario Fumagalli <dfumagalli(at)tin(dot)it>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Call closed: Almost happy ending (from "Data files became huge with no apparent reason" thread)
Date: 2002-09-03 14:23:51
Message-ID: 20020904002351.D19438@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Tue, Sep 03, 2002 at 04:08:18PM +0200, Dario Fumagalli wrote:
> Martijn van Oosterhout wrote:
> > What happens if you say "REINDEX INDEX products_pkey" or "REINDEX INDEX
> > det_prod_dep_consumpt_pkey". Do those numbers change?
> >
>
>
> Yes, it did it!
>
> proj_store=# REINDEX INDEX products_pkey;
> REINDEX
> proj_store=# REINDEX INDEX det_prod_dep_consumpt_pkey
> REINDEX
> proj_store=# select relname, relfilenode as node, reltoastrelid as relid,
> proj_store=# reltoastidxid as idxid, relkind as k, relpages as pag from
> pg_class order by relname;
>
> [Snip]
> relname | node | relid | idxid | k | pag
> --------------------------------+---------+--------+--------+---+------
> det_prod_dep_consumpt_pkey | 1224634 | 0 | 0 | i | 6
> [Snip]
> products_pkey | 1224633 | 0 | 0 | i | 5
>
> BUT... there is a but...
>
> The du -h command says that, despite the reported index reduction, the
> overall database size has increased to 105 MB (?)!.
>
> And this raises a question: why a reindex proj_store force in single
> user mode did not packed the indexes while from psql / multiuser
> postmaster it did? Bah!

It sounds like there is actually an awful lot of activity going on.

> NOTICE: Index idx_products_csc: Pages 832; Tuples 976: Deleted 90768.
> CPU 0.59s
> /1.12u sec.

There you go. 90000 deleted tuples, 900 active. When was the last time you
did a vacuum? Basically, you need to do a vacuum whenever you want to reuse
space. Especially prior to 7.2.

> NOTICE: --Relation pg_toast_258417--
> NOTICE: Pages 0: Changed 0, reaped 0, Empty 0, New 0; Tup 0: Vac 0,
> Keep/VTL 0/
> 0, Crash 0, UnUsed 0, MinLen 0, MaxLen 0; Re-using: Free/Avail. Space
> 0/0; EndEm
> pty/Avail. Pages 0/0. CPU 0.00s/0.00u sec.
> NOTICE: Index pg_toast_258417_idx: Pages 1; Tuples 0. CPU 0.00s/0.00u sec.

Toast table/index is empty, so it's not that.

> I finally made a VACUUM VERBOSE ANALYZE.
>
> It did things (lots of deletions and some moves) on the affected tables.
> But now, if I reissue the pg_class query, I get figures like the following:
> relname | node | relid | idxid | k | pag
> ---------------------------------+---------+--------+--------+---+------
> idx_det_prod_comp_vid_connsp | 1094065 | 0 | 0 | i | 1
> idx_det_prod_dep_consumptdp | 1094068 | 0 | 0 | i | 479
> idx_det_prod_vid_connsp | 1094059 | 0 | 0 | i | 1
> idx_det_prod_vid_resp | 1094053 | 0 | 0 | i | 1
> idx_det_prod_vid_stdsp | 1094056 | 0 | 0 | i | 1
> idx_det_turnover_c | 1094095 | 0 | 0 | i | 4
> idx_det_turnover_t | 1094098 | 0 | 0 | i | 4
> idx_products_b | 1094044 | 0 | 0 | i | 733
> idx_products_b2 | 1094047 | 0 | 0 | i | 281
> idx_products_csc | 1094041 | 0 | 0 | i | 832
> idx_products_v | 1094050 | 0 | 0 | i | 270
>
> So, new big index file have born!
> After this vacuum, du -h reports:
>
> [postgres(at)web base]$ du -h
> 1.6M ./1
> 1.5M ./18719
> 24M ./242014
> 11M ./46821
> 1.7M ./197097
> 2.3M ./279236
> 43M
>
> A lower disk usage than after the reindex above (still more than the
> expected 26MB). The development machine (with 3 weeks old outdated data,
> unfortunately, but still with the same record figures (+-5%)):

Your disk space seems to fluctuate an awful lot. Are you doing any queries
that UPDATE rows, even if the values are not changing?

> This question is about a different database server for another (very
> big) company. Here records are not hundreds, they are 10 millions up.
> And a vacuum I fear will last for more that some seconds.

Quite true. Hence the non-blocking vacuum in 7.2. Similarly, scheduling a
VACUUM at 4am and no-one will notice :)

> I hoped to be able to use a Debian. I had to mediate between Open Source
> (Debian, the preferred for us the developers) and a closed, commercial
> solution (loved by management... sigh). The solution was a Caldera... a
> Linux solution but with some commercial bells and whistles on it to make
> everyone happy. But I don't like it as Debian for programming purposes.
> It is too Red-Hattish :) and comes with ancient tools and libraries.

Hey, if it works, it works, right :)

> - Given that single user REINDEX [database name] FORCE in single user
> modedid rebuild the indexes but did not optimize them;
> - Given that a REINDEX INDEX [index name] did optimize and shrink a
> single index but increased the overall data size;
> - Given that a VACUUM VERBOSE ANALYZE somehow alters the overall
> database files size (seems to re-distribute the file space evenly across
> indexes);
> - Given that I was at a loss and ready to do anything with a logical
> sense (or not);

Hmm. BTW, I'd never heard of REINDEX DATABASE before yesterday, so I'm not
quite clear on what it does. REINDEX INDEX has always worked for me.

> 1) I reindexed all excessively big indexes one by one in psql:
>
> 2) I performed a VACUUM VERBOSE ANALYZE, that having nothing to
> re-distribute, simply "ate" all the wasted space.
>
> I know this is probably the weirdest and uninformed / untechnical
> "reasoning" you saw in this list from years, but it has one small,
> almost unnoticeable advantage: IT WORKED AND NOW THE DATABASE IS ITS
> ORIGINAL SIZE AGAIN (even less).

Actually, that seems to me to be the way to squeeze the most space out of
the DB. It should work very reliably.

Anyway, you should be up and running now. Still odd though.
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> There are 10 kinds of people in the world, those that can do binary
> arithmetic and those that can't.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Martijn van Oosterhout 2002-09-03 14:24:36 Re: DELETE SQL too slow.
Previous Message Dario Fumagalli 2002-09-03 14:08:18 Call closed: Almost happy ending (from "Data files became huge with no apparent reason" thread)