Re: R: Index on integer or on string field

From: William Dunn <dunnwjr(at)gmail(dot)com>
To: Job <Job(at)colliniconsulting(dot)it>
Cc: PostgreSQL General <pgsql-general(at)postgresql(dot)org>
Subject: Re: R: Index on integer or on string field
Date: 2015-05-15 23:36:52
Message-ID: CAEva=Vm8WJzu1KDOrEA9PJZK2jtJM849fVRnBuqoRZbE-f-DNg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hello Francesco,

You should probably set timing on, run an explain analyze, and use pgbadger
to diagnose your performance issue.

While it may be the case that comparison in the index might be slightly
faster because of the modulo arithmetic, those in-memory operations are
extremely fast and it is likely that the seek in that index is the fastest
part of your query. And since you only have 50 distinct values the btree is
probably extremely shallow and there will be very few comparisons anyway.

I don't know much about your query but I suspect that the issue is that
your index scan is not selective enough so Postgres needs to scan a lot
from disk (which is *extremely* slow). If you want to improve the
performance you should first try to make the query as selective as
possible, and try to put an index on a more selective column of the WHERE
clause.

If you really must rely primarily on that column which has only 50 distinct
values you can try periodically running a CLUSTER command on the table for
that column index (doc:
http://www.postgresql.org/docs/devel/static/sql-cluster.html) or partition
the table (doc:
http://www.postgresql.org/docs/devel/static/ddl-partitioning.html) so that
the data you are scanning is close together on disk and you can get as much
of it per IO operation as possible.

*Will J. Dunn*
*willjdunn.com <http://willjdunn.com>*

On Fri, May 15, 2015 at 1:32 PM, Arthur Silva <arthurprs(at)gmail(dot)com> wrote:

> Yes that's my suggestion. Btree-Gin deals with lots of repeated values
> much better than the Btree index as repeated keys are only stored once.
> Em 15/05/2015 12:38, "Job" <Job(at)colliniconsulting(dot)it> escreveu:
>
>> Hello Arthur!
>>
>> So, i read that btree-gin have got "the ability to enforce uniqueness".
>>
>> If in this 10.millions long table i have, in index, 50 recurring values,
>> i can leave the alphabetical field and change to btree-gin the index on it?!
>>
>> Thank you!
>> Francesco
>>
>> ------------------------------
>> *Da:* Arthur Silva [arthurprs(at)gmail(dot)com]
>> *Inviato:* venerdì 15 maggio 2015 17.26
>> *A:* Job
>> *Cc:* pgsql-general(at)postgresql(dot)org
>> *Oggetto:* Re: [GENERAL] Index on integer or on string field
>>
>> You should probably experiment with a btree-gin index on those.
>> Em 15/05/2015 12:22, "Job" <Job(at)colliniconsulting(dot)it> escreveu:
>>
>>> Hello,
>>>
>>> i have a table of about 10 millions of records, with the index on a
>>> string field.
>>> Actually is alphabetical; since queries are about 100/200 per seconds, i
>>> was looking for a better way to improve performance and reduce workload.
>>>
>>> The unique values, of that fields, are about the 50 (category name), and
>>> we could create a second table to codify, with numerical integer values,
>>> the 50 recurring names.
>>>
>>> Is index are integer and not characteral, performance are better and
>>> workload reduces?
>>>
>>> Is there any comparisons?
>>>
>>> Thank you!
>>> Francesco
>>>
>>> --
>>> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
>>> To make changes to your subscription:
>>> http://www.postgresql.org/mailpref/pgsql-general
>>>
>>

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Scott Marlowe 2015-05-16 00:31:01 Re: Index on integer or on string field
Previous Message Daniel Begin 2015-05-15 19:17:03 Re: Restarting DB after moving to another drive