Quick Links

Re: Clustering with enough work_mem: copy heap in mem?

From:	Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>
To:	Scara Maccai <m_lists(at)yahoo(dot)it>
Cc:	pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Clustering with enough work_mem: copy heap in mem?
Date:	2009-08-20 14:35:25
Message-ID:	dcc563d10908200735o843e986sfbecc3990098d3b7@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

On Thu, Aug 20, 2009 at 8:28 AM, Scara Maccai<m_lists(at)yahoo(dot)it> wrote:
> Hi,
>
> I have a table with 15M rows. Table is around 5GB on disk.
>
> Clustering the table takes 5 minutes.
>
> A seq scan takes 20 seconds.
>
> I guess clustering is done using a seq scan on the index and then fetching the proper rows in the heap.
> If that's the case, fetching random rows on disk is the cause of the enormous time it takes to cluster the table.

Yep.

> Since I can set work_mem > 5GB. couldn't postgres do something like:
>
> - read the whole table in memory
> - access the table in memory instead of the disk when reading the "indexed" data

I've found it easier to select everything into another table, truncate
the original table, then insert the rows as:

insert into orig_table select * from mytemptable order by field1,field2;

If needs be you can lock the original table to prevent modifications
while doing this.

In response to

Clustering with enough work_mem: copy heap in mem? at 2009-08-20 14:28:12 from Scara Maccai

Responses

Re: Clustering with enough work_mem: copy heap in mem? at 2009-08-20 14:46:50 from Scara Maccai

Browse pgsql-general by date

	From	Date	Subject
Next Message	Scara Maccai	2009-08-20 14:46:50	Re: Clustering with enough work_mem: copy heap in mem?
Previous Message	Scara Maccai	2009-08-20 14:28:12	Clustering with enough work_mem: copy heap in mem?