Re: Will pg_repack improve this query performance?

From: Josh Kupershmidt <schmiddy(at)gmail(dot)com>
To: Alban Hertroys <haramrae(at)gmail(dot)com>
Cc: Abelard Hoffman <abelardhoffman(at)gmail(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Will pg_repack improve this query performance?
Date: 2014-10-16 20:32:44
Message-ID: CAK3UJREvAASY4s4PDpkzAdoZkcnX=upDU0ifVzqrz4fA-FoyNg@mail.gmail.com
Lists: pgsql-general

On Wed, Oct 15, 2014 at 5:03 AM, Alban Hertroys <haramrae(at)gmail(dot)com> wrote:
> A CLUSTER would help putting rows with the same to_id together. Disk access would be less random that way, so it would help some.
>
> According to your query plan, accessing disks (assuming that’s what made the difference) was 154 (7700 ms / 50 ms) times slower than accessing memory. I don’t have the numbers for your disks or memory, but that doesn’t look like an incredibly unrealistic difference. That raises the question: how random was that disk access, and how much can be gained from clustering that data?
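
The clustering suggested above could be sketched as follows (table and index names here are illustrative, not from the original thread):

```sql
-- Physically reorder the table's rows by the index on to_id.
-- Note: CLUSTER holds an ACCESS EXCLUSIVE lock for the whole rewrite.
CLUSTER messages USING messages_to_id_idx;

-- Refresh planner statistics after the rewrite:
ANALYZE messages;

-- pg_repack can do the same reordering with only brief locks, e.g.:
--   pg_repack --table=messages --order-by=to_id mydb
```
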

Other than grouping tuples in a more favorable order to minimize I/O,
the big benefit of running a CLUSTER or pg_repack is that you
eliminate any accumulated bloat. (And if bloat is your real problem,
ideally you can adjust your autovacuum settings to avoid it
in the future.) You may want to check on the bloat of that table and
its indexes with something like this:

https://wiki.postgresql.org/wiki/Show_database_bloat
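
As a quicker, rougher alternative to the wiki query, the pgstattuple contrib extension (if it is available on your install) reports dead-tuple and free-space percentages directly; the table and index names below are illustrative:

```sql
CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- Table bloat: dead_tuple_percent and free_percent near zero are healthy;
-- large values suggest a CLUSTER / pg_repack (or vacuum tuning) would help.
SELECT tuple_percent, dead_tuple_percent, free_percent
FROM pgstattuple('messages');

-- Index bloat (btree indexes only): a low avg_leaf_density means
-- the index pages are mostly empty and a reindex would shrink it.
SELECT avg_leaf_density, leaf_fragmentation
FROM pgstatindex('messages_to_id_idx');
```

If bloat keeps coming back, per-table autovacuum settings can be tightened so vacuum kicks in sooner, for example:

```sql
-- Vacuum after ~2% of rows are dead instead of the default 20%:
ALTER TABLE messages SET (autovacuum_vacuum_scale_factor = 0.02);
```
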
