From: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
---|---|
To: | "Sahagian, David" <david(dot)sahagian(at)emc(dot)com> |
Cc: | "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | trying to use CLUSTER |
Date: | 2013-02-14 01:01:06 |
Message-ID: | CAMkU=1zcK+O+_hvQpj+4D+-+mH_LiVxpY+hzW8hOL3wtaQosxQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Tuesday, February 12, 2013, Sahagian, David wrote:
> Version=9.1.7****
>
> ** **
>
> INFO: clustering "my_cool_table" using sequential scan and sort****
>
> INFO: "my_cool_table": found 1 removable, 1699139 nonremovable row
> versions in 49762 pages****
>
> Detail: 1689396 dead row versions cannot be removed yet.****
>
> CPU 9.80s/4.98u sec elapsed 175.92 sec. ****
>
> **
>
...
> ** **
>
> Can somebody tell me why some "dead row versions cannot be removed yet" ?*
> ***
>
> I assume that means CLUSTER must write them to the new table ?
>
It means that while the CLUSTER itself considers them dead, they might
still look alive to some older transaction, so they need to be copied.
(That older transaction must not have touched the table yet, or else it
would hold a lock that would prevent the CLUSTER from taking place). You
might want to hunt down the source of those long-lived transactions and try
to eliminated them.
> It seems very costly to do the CLUSTER, if the new table is not really
> going to be a tiny fraction of the old table.****
>
> Is there a way for me to discover the approx number of "non-removables"
> BEFORE I do the CLUSTER ?
>
Not that I can think of. Well, other than doing a VACUUM VERBOSE, but that
itself would be very costly to do and wasteful if it most of the table is
dead and immediately going to be CLUSTERed anyway.
You might be able to use the contrib module pageinspect to come up with
your own sampling technique. Although I don't immediately see how to do
that.
> **
>
> Also, does the use of [index scan on "pk_cool"] basically depend on the
> ratio of removable/nonremovable row versions ?
>
I don't think so. It mostly depends on the correlation (i.e. how well it
is already clustered), the maintenance_work_mem, and the size the table in
pages, and the estimated size/number of the "live" rows. (The true costs
depend on the live+nonremovable, but the planner does not have that number
available to it so it uses "live" instead in making the estimates)
Cheers,
Jeff
From | Date | Subject | |
---|---|---|---|
Next Message | Aleksey Tsalolikhin | 2013-02-14 01:17:26 | Graphing query results from within psql. |
Previous Message | Sergey Konoplev | 2013-02-14 00:44:18 | Re: Determining if an hstore is empty |