From: | Richard Broersma Jr <rabroersma(at)yahoo(dot)com> |
---|---|
To: | Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: For index bloat: VACUUM ANALYZE vs REINDEX/CLUSTER |
Date: | 2007-09-19 14:29:37 |
Message-ID: | 616354.597.qm@web31802.mail.mud.yahoo.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
--- Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com> wrote:
> Then I am confused again about how the fill factor stuff works. Let's
> say I have a table with four BTREE indexes. Should all of them have a
> fill factor of about 60 (lower than the default 90, that is) to be
> effective? Or will it help if I lower the fill factor on only a couple
> of the most often used ones? The primary key index is very, very
> rarely updated so I don't need it to have a fill factor.
>
> I could try and see these one by one, but that's no better than
> touching/feeling the database blind-folded. I would love to see some
> writeup about this whole thing, but it seems hard to come by!
I will try to explain everything that I understand about indexes and tables. I am sure that if
some part of my understanding is incorrect, someone will chime in to correct me.
In PostgreSQL, tables are physically distinct from indexes. This means that any give table is
written to disk as a file(s), and indexes are also written to disk as a separate file(s). A table
and index are both saved to disk in segmented block referred to a pages (I believe the default
size is 8K).
The advantage of the index file is that it is significantly smaller in size, so it takes less time
to sequentially scan and less time to read from disk.
Now when you want to find a record, PostgreSQL will/(may choose to) sequentially scan the index
until it find the record entry that corresponds with your criteria. This entry has a table
cross-reference to the actual page that contains the record that is "pointed" at by the index.
Lastly, the entire table page containing your record is read from disk to memory for further query
processing.
<ASSUMPTION>
When you insert a record into a table that generates an entry into the b-tree index file,
PostgreSQL will scan the pages of the index file to find the correct place and index page to add
this entry. If the page is already full, PostgreSQL "probably" replaces the old full pages with
two new pages with a distribution of that chunk of the B-tree index, and then adds the new entry
to one of those pages. This operation can become very expensive if many new pages need to be
created from single INSERT/UPDATE statement.
</ASSUMPTION>
By using fill factor, you are telling PostgreSQL to automatically leave a portion of any newly
created index page partially blank for future use. When a newly created index entry needs to be
saved, it can be stored in one of the "holes" left in the index page.
A large fill factor create both advantages and dis-advantages. For writing operations, it is a
big advantage because, a large fill factor will leave alot of holes for INSERT and UPDATE
operations to use. This can help increase the number of UPDATE/INSERT per seconds that you server
can handle since, they index pages do not have to be rebuilt very often.
However, the disadvantage is that, a newly created index with a large fill factor has "index
bloat" designed into it. This mean that the index pages have a large portion of holes. So
PostgreSQL will create more index pages than it would normally in order to hold both your index
and the pre-defined holes that you specified by your fill-factor. Larger indexes require more
time to be read from disk to memory and will require more time to sequentially scan to find to
find the cross-reference table page location of records of interest. So the net effect is that
larger indexes will make SELECT statement slower.
This is my understanding for tables indexes and fill factor. I hope it helps.
Regards,
Richard Broersma Jr.
From | Date | Subject | |
---|---|---|---|
Next Message | Decibel! | 2007-09-19 14:38:22 | Re: Is this good spec for a PostgreSQL server? |
Previous Message | Martijn van Oosterhout | 2007-09-19 14:10:48 | Re: a problem with pg_hba.conf |