From: | Boszormenyi Zoltan <zb(at)cybertec(dot)at> |
---|---|
To: | pgsql-general(at)postgresql(dot)org |
Cc: | Tomas Vondra <tv(at)fuzzy(dot)cz>, Hans-Juergen Schoenig <hs(at)cybertec(dot)at> |
Subject: | Re: FILLFACTOR and increasing index |
Date: | 2011-06-12 19:14:41 |
Message-ID: | 4DF51021.2020607@cybertec.at |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
Hi,
2011-05-12 00:28 keltezéssel, Tomas Vondra írta:
> Hi,
>
> I've studied the implementation of the btree indexes and how exactly the
> fillfactor is used, and in general
>
> - when a page split happens, the process needs to obtain more locks
> than with simple insert, which may result in contention with other
> processes that modify the index (the same pages)
>
> - the fillfactor is used only for the leaf pages, the rest of the index
> does not use it (directly)
>
> So lowering the number of page splits might remove some contention when
> there's a lot of processes accessing the same pages.
>
> But that's the theory - I really was not able to come up with a test
> that benefits from lower fillfactor. Actually the lower the fillfactor,
> the larger the index, which may be a significant issue with huge indexes.
We recently had a testcase for exercising FILLFACTOR on indexes.
Several (15+) GB raw data arrives daily and must be imported into
the database for analytic purposes, the table is heavily partitioned
and each partition has 5 or 6 indexes. The importer is heavily threaded
and uses COPY to insert the data. This is strictly an INSERT-only scenario,
the analysis comes later. This is where FILLFACTOR=70 helped to
reduce the index contention.
With the default 90% (and 100% on the table itself) when a lot of data
arrived in burst that were in the same time interval so 4-8 threads tried
to push data into the same partition, individual data chunks (about 10-15MB
each, about 63500 pieces) were imported in about 1.5-2.5 minutes, with
a few spikes over 4 minutes. This was the result of a few days long
network outage, the data collectors sent their backlogs.
When all indexes were re-created with FILLFACTOR=70, the import times
went down considerably. Out of about 63000 raw data chunks, there were
only 6 or 8 where the import time fell in the 10-25 seconds range, the rest
of them were imported under 10 seconds and the majority of them (65%)
were imported under 3 seconds. The system has 24 cores, so we can use
8 of them dedicated to importing. This almost 1TB data was imported
in about 3.5-4 hours after the FILLFACTOR change. Without it, it didn't
finish in 2 days.
Best regards,
Zoltán Böszörményi
> So in your case, I'd probably go with the default fillfactor (90), and
> maybe I'd consider even using fillfactor=100, to make the index as dense
> as possible.
>
> Anyway I guess the impact of this will be negligible, compared to other
> parameters (shared buffers, work_mem, etc.).
>
> regards
> Tomas
>
--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
http://www.postgresql.at/
From | Date | Subject | |
---|---|---|---|
Next Message | pedz | 2011-06-12 20:50:40 | Out of tree build issue |
Previous Message | Vincent Veyron | 2011-06-12 19:02:26 | Re: Tweaking bytea / large object block sizes? |