Re: index bloat question

From: Szymon Guz <mabewlun(at)gmail(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>, PostgreSQL <pgsql-general(at)postgresql(dot)org>
Subject: Re: index bloat question
Date: 2011-10-17 16:57:28
Message-ID: CAFjNrYuaA3npvMcpb7KZpJg6BVsysbKm0rjNy=wGwVp-At2YgA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 17 October 2011 15:42, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:

> On Mon, Oct 17, 2011 at 2:15 AM, Szymon Guz <mabewlun(at)gmail(dot)com> wrote:
> >
> >
> > On 17 October 2011 02:01, Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com> wrote:
> >>
> >> On Sun, Oct 16, 2011 at 2:41 PM, Szymon Guz <mabewlun(at)gmail(dot)com> wrote:
> >> > Hi,
> >> > just a couple of questions:
> >> > will there be an index bloat if I have:
> >> > - a serial column and only add rows to the table?
> >> > - a text column and I only add rows to the table?
> >> > For the serial column the numbers are only incremented, for the text
> >> > column
> >> > I add random strings.
> >>
> >> With no deletes or updates, the only bloat will be from a non 100% fill
> >> factor.
> >
> > Hi Scott,
> >
> > if there is no bloat, how could you explain this:
> >
> > Simple test:
> >
> > CREATE TABLE test (
> > id text primary key,
> > category_id text not null
> > );
> >
> > CREATE INDEX i_category ON test (category_id);
> >
> >
> > I make 500k inserts in one transaction using a python script.
> > For the random text I use random uuid from the function:
> >
> > uuid.uuid4()
> >
> > After those inserts I create another index:
> >
> > CREATE INDEX i_new ON test (category_id);
> >
> >
> > select
> > pg_size_pretty(pg_relation_size('i_category')),
> > pg_size_pretty(pg_relation_size('i_new'))
> > ;
> >
> > Results:
> >
> > '37 MB';'28 MB'
>
> You didn't post your insertion script. btree indexes are always going
> to have extra space in them due to pages splitting and being only
> partially filled -- insertion order over the range of your datum plays
> into this (you'll get different index arrangements from random vs
> ordered insertion). for kicks, try reindexing both indexes and see
> what the size is afterwords.
>
> merlin
>

Yep, after reindexing the sizes are the same, I even get it why sizes were
different, thanks for the info.

regards
Szymon

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Lincoln Yeoh 2011-10-17 17:03:54 Re: 9.1 got really fast ;)
Previous Message Tom Lane 2011-10-17 15:44:34 Re: 9.1 got really fast ;)