Re: 10.1: hash index size exploding on vacuum full analyze

From: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, AP <pgsql(at)inml(dot)weebeastie(dot)net>, PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: 10.1: hash index size exploding on vacuum full analyze
Date: 2017-12-16 03:54:03
Message-ID: CAE9k0Pn=yKbDLu-TKQBMj4GL38bYR-GFo4-7G08tCaoWisdnMg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi,

On Sat, Dec 16, 2017 at 8:34 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
wrote:
> On Sat, Dec 16, 2017 at 8:03 AM, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
wrote:
>> Hi,
>>
>> On Fri, Dec 15, 2017 at 8:08 PM, Teodor Sigaev <teodor(at)sigaev(dot)ru> wrote:
>>> Hi!
>>>
>>>> I think if we update the stats in copy_heap_data after copying the
>>>> data, then we don't see such problem. Attached patch should fix the
>>>> issue. You can try this patch to see if it fixes the issue for you.
>>>

Here are the stats i saw before and after VACUUM FULL - with and without
the patch. Please note i have followed the steps shared in - [1].

*A) Without patch - Stats before and after VACUUM FULL:*

postgres[43768]=# select * from pgstathashindex('hash_index');
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
| version | bucket_pages | overflow_pages | bitmap_pages | unused_pages |
live_items | dead_items | free_percent |
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
| 4 | 2560 | 2024 | 1 | 0 |
1000000 | 0 | 46.4793701521013 |
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
(1 row)

Time: 6.531 ms

postgres[43768]=# select oid, relname, relpages, reltuples from pg_class
where relname = 'hash_index';
+-------+------------+----------+-----------+
| oid | relname | relpages | reltuples |
+-------+------------+----------+-----------+
| 16398 | hash_index | 4586 | 1e+06 |
+-------+------------+----------+-----------+
(1 row)

Time: 0.369 ms
postgres[43768]=# VACUUM FULL;
VACUUM
Time: 4145.813 ms (00:04.146)

postgres[43768]=# select * from pgstathashindex('hash_index');
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
| version | bucket_pages | overflow_pages | bitmap_pages | unused_pages |
live_items | dead_items | free_percent |
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
| 4 | 3072 | 2019 | 1 | 0 |
1000000 | 0 | 51.8093562713087 |
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
(1 row)

Time: 9.194 ms

postgres[43768]=# select oid, relname, relpages, reltuples from pg_class
where relname = 'hash_index';
+-------+------------+----------+-----------+
| oid | relname | relpages | reltuples |
+-------+------------+----------+-----------+
| 16398 | hash_index | 5093 | 1e+06 |
+-------+------------+----------+-----------+
(1 row)

Time: 1.289 ms

*B) With Patch- Stats before and after VACUUM FULL:*

postgres[31111]=# select * from pgstathashindex('hash_index');
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
| version | bucket_pages | overflow_pages | bitmap_pages | unused_pages |
live_items | dead_items | free_percent |
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
| 4 | 2560 | 2024 | 1 | 0
| 1000000 | 0 | 46.4793701521013 |
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
(1 row)

Time: 6.539 ms

postgres[31111]=# select oid, relname, relpages, reltuples from pg_class
where relname = 'hash_index';
+-------+------------+----------+-----------+
| oid | relname | relpages | reltuples |
+-------+------------+----------+-----------+
| 16398 | hash_index | 4586 | 1e+06 |
+-------+------------+----------+-----------+
(1 row)

Time: 0.379 ms

postgres[31111]=# VACUUM FULL;
VACUUM
Time: 4265.662 ms (00:04.266)

postgres[31111]=# select * from pgstathashindex('hash_index');
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
| version | bucket_pages | overflow_pages | bitmap_pages | unused_pages |
live_items | dead_items | free_percent |
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
| 4 | 2560 | 2024 | 1 | 0
| 1000000 | 0 | 46.4793701521013 |
+---------+--------------+----------------+--------------+--------------+------------+------------+------------------+
(1 row)

Time: 6.699 ms

postgres[31111]=# select oid, relname, relpages, reltuples from pg_class
where relname = 'hash_index';
+-------+------------+----------+-----------+
| oid | relname | relpages | reltuples |
+-------+------------+----------+-----------+
| 16398 | hash_index | 4586 | 1e+06 |
+-------+------------+----------+-----------+
(1 row)

Time: 0.893 ms

[1] -
https://www.postgresql.org/message-id/CAE9k0P%3DihFyPAKfrMX9GaDo5RaeGSJ4i4nb28fGev15wKOPYog%40mail.gmail.com

>>> I'm afraid I'm not able to reproduce the problem which patch should fix.
>>>
>>> What I did (today's master, without patch):
>>> autovacuum off
>>> pgbench -i -s 100
>>>
>>> select relname, relpages, reltuples from pg_class where relname =
>>> 'pgbench_accounts';
>>> relname | relpages | reltuples
>>> ------------------+----------+-----------
>>> pgbench_accounts | 163935 | 1e+07
>>>
>>> vacuum full pgbench_accounts;
>>>
>>> # select relname, relpages, reltuples from pg_class where relname =
>>> 'pgbench_accounts';
>>> relname | relpages | reltuples
>>> ------------------+----------+-----------
>>> pgbench_accounts | 163935 | 1e+07
>>>
>>>
>>> I've tried to add hash index to that table and print notice about
number of
>>> pages and tuples immediately after estimate_rel_size() in hashbuild().
hash
>>> index got right estimation even I deleted all rows before vacuum full.
What
>>> am I doing wrong?
>>>
>>> Patch looks good except, seems, updating stats is better to move to
>>> swap_relation_files(), then it will work even for toast tables.
>>>
>>>
>>
>> I haven't looked into the patch properly, but, i could reproduce the
>> issue. Here are the steps that i am following,
>>
>> CREATE TABLE hash_index_table (keycol INT);
>> INSERT INTO hash_index_table (keycol) SELECT (a - 1) % 1000 + 1 FROM
>> GENERATE_SERIES(1, 1000000) a;
>>
>> CREATE INDEX CONCURRENTLY hash_index on hash_index_table USING HASH
>> (keycol) with (fillfactor = 80);
>>
>> CREATE EXTENSION pgstattuple;
>>
>> select oid, relname, relpages, reltuples from pg_class where relname =
>> 'hash_index';
>>
>> select relname, relpages, reltuples from pg_class where relname =
>> 'hash_index_table';
>>
>> select * from pgstathashindex('hash_index');
>>
>> DROP INDEX hash_index;
>>
>> CREATE INDEX CONCURRENTLY hash_index on hash_index_table USING HASH
>> (keycol) with (fillfactor = 100);
>>
>> select * from pgstathashindex('hash_index');
>>
>> select oid, relname, relpages, reltuples from pg_class where relname =
>> 'hash_index';
>>
>> select relname, relpages, reltuples from pg_class where relname =
>> 'hash_index_table';
>>
>> VACUUM FULL;
>>
>> select * from pgstathashindex('hash_index');
>>
>> select oid, relname, relpages, reltuples from pg_class where relname =
>> 'hash_index';
>>
>> select relname, relpages, reltuples from pg_class where relname =
>> 'hash_index_table';
>>
>> I think the issue is only visible when VACUUM FULL is executed after
>> altering the index table fill-factor. Could you please try with above
>> steps and let us know your observations. Thanks.
>>
>> With patch, I could see that the index table stats before and after
>> VACUUM FULL are same.
>>
>
> I think you should have shared the value of stats before and after
> patch so that we can see if the above is a right way to validate.
>
>
> --
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message AP 2017-12-16 06:57:29 Re: 10.1: hash index size exploding on vacuum full analyze
Previous Message Amit Kapila 2017-12-16 03:38:23 Re: 10.1: hash index size exploding on vacuum full analyze