From: | Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Extremely slow intarray index creation and inserts. |
Date: | 2009-03-19 07:02:41 |
Message-ID: | Pine.LNX.4.64.0903191001380.31919@sn.sai.msu.ru |
Lists: | pgsql-performance |
We usually cite about 200 unique values as the limit for
gist__int_ops.
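
As a rough illustration of that rule of thumb, here is a minimal sketch that counts the distinct elements across all arrays in a column and suggests an operator class. The function names and the exact comparison are assumptions made for this example only; the 200 figure is the approximate limit mentioned above, not a hard boundary.

```python
from typing import Iterable, Sequence

# Approximate cutover from the message above; a rule of thumb, not a hard limit.
CARDINALITY_LIMIT = 200


def count_distinct_elements(arrays: Iterable[Sequence[int]]) -> int:
    """Count the unique integers across every array in the column."""
    distinct = set()
    for arr in arrays:
        distinct.update(arr)
    return len(distinct)


def suggest_opclass(arrays: Iterable[Sequence[int]],
                    limit: int = CARDINALITY_LIMIT) -> str:
    """Suggest an intarray GiST operator class (hypothetical helper)."""
    # What matters is the cardinality of the whole set, not array length.
    if count_distinct_elements(arrays) <= limit:
        return "gist__int_ops"
    return "gist__intbig_ops"
```

The suggested name would then go into the index definition, for example USING gist (col gist__intbig_ops), where col stands for your own integer-array column.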
On Wed, 18 Mar 2009, Tom Lane wrote:
> Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com> writes:
>> Oleg Bartunov wrote:
>> OB:> it's not about short or long arrays, it's about small or big
>> OB:> cardinality of the whole set (the number of unique elements)
>
>> I'm re-reading the docs and it still wasn't obvious to me. A
>> potential docs patch is attached below.
>
> Done, though not in exactly those words. I wonder though if we can
> be less vague about it --- can we suggest a typical cutover point?
> Like "use gist__intbig_ops if there are more than about 10,000 distinct
> array values"? Even a rough order of magnitude for where to worry
> about this would save a lot of people time.
>
> regards, tom lane
>
> Index: intarray.sgml
> ===================================================================
> RCS file: /cvsroot/pgsql/doc/src/sgml/intarray.sgml,v
> retrieving revision 1.5
> retrieving revision 1.6
> diff -c -r1.5 -r1.6
> *** intarray.sgml 10 Dec 2007 05:32:51 -0000 1.5
> --- intarray.sgml 18 Mar 2009 20:18:18 -0000 1.6
> ***************
> *** 237,245 ****
> <para>
> Two GiST index operator classes are provided:
> <literal>gist__int_ops</> (used by default) is suitable for
> ! small and medium-size arrays, while
> <literal>gist__intbig_ops</> uses a larger signature and is more
> ! suitable for indexing large arrays.
> </para>
>
> <para>
> --- 237,246 ----
> <para>
> Two GiST index operator classes are provided:
> <literal>gist__int_ops</> (used by default) is suitable for
> ! small- to medium-size data sets, while
> <literal>gist__intbig_ops</> uses a larger signature and is more
> ! suitable for indexing large data sets (i.e., columns containing
> ! a large number of distinct array values).
> </para>
>
> <para>
>
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru)
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg(at)sai(dot)msu(dot)su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83