Re: Building multiple indexes on one table.

From: Felipe Santos <felipepts(at)gmail(dot)com>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>
Cc: Marc Mamin <M(dot)Mamin(at)intershop(dot)de>, Chris Ruprecht <chris(at)cdrbill(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Building multiple indexes on one table.
Date: 2014-07-23 20:19:13
Message-ID: CAPYcRiW9Xy4Nu8_VugpnNnodDPcNAakkJZBPdgJ41f-diZJ-Ng@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Your question: Is there any way that I can build multiple indexes on one
table without having to scan the table multiple times?

My answer: I don't think so. Since each index has a different indexing
rule, it will analyze the same table in a different way. I've built indexes
on a 100GB table recently and it didn't take me too much time (Amazon EC2
with 8 CPU cores / 70 GB RAM). I don't remember how much time it took, but
that's a good sign right ;-) ? Painful jobs are always remembered... (ok,
the hardware helped a lot).

So, my advice is: get yourself a good maintenance window and just build
indexes, remember that they will help a lot of people querying this table.

2014-07-23 16:49 GMT-03:00 Claudio Freire <klaussfreire(at)gmail(dot)com>:

> On Wed, Jul 23, 2014 at 4:40 PM, Marc Mamin <M(dot)Mamin(at)intershop(dot)de> wrote:
> >>On Thu, Jul 17, 2014 at 7:47 PM, Chris Ruprecht <chris(at)cdrbill(dot)com>
> wrote:
> >>> Is there any way that I can build multiple indexes on one table
> without having to scan the table multiple times? For small tables, that's
> probably not an issue, but if I have a 500 GB table that I need to create 6
> indexes on, I don't want to read that table 6 times.
> >>> Nothing I could find in the manual other than reindex, but that's not
> helping, since it only rebuilds indexes that are already there and I don't
> know if that reads the table once or multiple times. If I could create
> indexes inactive and then run reindex, which then reads the table once, I
> would have a solution. But that doesn't seem to exist either.
> >>
> >>Just build them with separate but concurrent connections, and the
> >>scans will be synchronized so it will be only one.
> >>
> >>Btw, reindex rebuilds one index at a time, so what I do is issue
> >>separate reindex for each index in parallel, to avoid the repeated
> >>scans as well.
> >>
> >>Just make sure you've got the I/O and CPU capacity for it (you'll be
> >>writing many indexes at once, so there is a lot of I/O).
> >
> > Index creation on large tables are mostly CPU bound as long as no swap
> occurs.
> > I/O may be an issue when all your indexes are similar; e.g. all on
> single int4 columns.
> > in other cases the writes will not all take place concurrently.
> > To reduce I/O due to swap, you can consider increasing
> maintenance_work_mem on the connextions/sessionns
> > that build the indexes.
>
> Usually there will always be swap, unless you've got toy indexes.
>
> But swap I/O is all sequential I/O, with a good readahead setting
> there should be no problem.
>
> It's the final writing step that can be a bottleneck if you have a
> lame I/O system and try to push 5 or 6 indexes at once.
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Reza Taheri 2014-07-24 01:18:05 High rate of transaction failure with the Serializable Isolation Level
Previous Message Claudio Freire 2014-07-23 19:49:56 Re: Building multiple indexes on one table.