Re: Clustered index to preserve data locality in a multitenant application?

From: Nicolas Grilly <nicolas(at)gardentechno(dot)com>
To: Kenneth Marshall <ktm(at)rice(dot)edu>
Cc: Vick Khera <vivek(at)khera(dot)org>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Clustered index to preserve data locality in a multitenant application?
Date: 2016-08-31 21:55:47
Message-ID: CAG3yVS5zM97YdwMK5y7NU-vYATnrx89DAQg5kR=sPzV1CmRTog@mail.gmail.com
Lists: pgsql-general

On Tue, Aug 30, 2016 at 8:17 PM, Kenneth Marshall <ktm(at)rice(dot)edu> wrote:

> We have been using the extension pg_repack to keep a table groomed into
> cluster order. With an appropriate FILLFACTOR to keep updates on the same
> page, it works well. The issue is that it needs space to rebuild the new
> index/table. If you have that, it works well.
>

It looks like Instagram has been using pg_reorg (the ancestor of pg_repack)
to keep all likes from the same user contiguous on disk, in order to
minimize disk seeks.

http://instagram-engineering.tumblr.com/post/40781627982/handling-growth-with-postgres-5-tips-from

This is very similar to what I'm trying to achieve.
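
To make the locality argument concrete, here is a rough toy model in Python. It is only a sketch under stated assumptions, not a description of PostgreSQL's actual storage layout: the table is modeled as a flat list of tenant ids, ROWS_PER_PAGE is an invented figure, and the names pages_touched and build_table are hypothetical. It counts how many pages must be read to fetch every row of one tenant, first with rows in arrival order, then with rows sorted by tenant, which approximates the ordering a CLUSTER or pg_repack rewrite leaves behind.

```python
import random

ROWS_PER_PAGE = 100  # assumed rows per page, purely illustrative


def pages_touched(tenant_ids, tenant):
    """Count the distinct pages holding at least one row of `tenant`."""
    return len({pos // ROWS_PER_PAGE
                for pos, t in enumerate(tenant_ids) if t == tenant})


def build_table(n_tenants=50, rows_per_tenant=200, seed=0):
    """Return tenant ids in arrival order, with tenants randomly interleaved."""
    rows = [t for t in range(n_tenants) for _ in range(rows_per_tenant)]
    random.Random(seed).shuffle(rows)
    return rows


scattered = build_table()
clustered = sorted(scattered)  # approximates the table after reordering by tenant

scattered_pages = pages_touched(scattered, tenant=7)
clustered_pages = pages_touched(clustered, tenant=7)
print(scattered_pages, clustered_pages)
```

In this model the 200 rows of a tenant fit in 2 pages once the table is sorted, while in arrival order they are spread over most of the table's 100 pages. New rows and updates gradually undo the ordering, which is why the rewrite has to be repeated periodically.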

The article is 3 years old. I'd be curious to know if they still do that.

Nicolas
