From: | Scott Marlowe <scott(dot)marlowe(at)ihs(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Subject: bool / vacuum full bug followup part 2 |
Date: | 2002-05-03 23:30:47 |
Message-ID: | Pine.LNX.4.33.0205031702200.1471-100000@css120.ihs.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Fri, 3 May 2002, Tom Lane wrote:
> Scott Marlowe <scott(dot)marlowe(at)ihs(dot)com> writes:
> > Well, my keys aren't changing and the index is growing like they are.
>
> Could we see the exact details of your test case?
Sure. I think I posted most of it here already...
Here's my table:
scott.marlowe=# \d test
Table "test"
Column | Type | Modifiers
--------+---------+-----------
buf | text |
yn | boolean |
id | integer |
Indexes: test_id_dx,
test_yn_dx
Here's the indexes:
scott.marlowe=# \di test_id_dx
List of relations
Name | Type | Owner
------------+-------+---------------
test_id_dx | index | scott.marlowe
scott.marlowe=# \di test_yn_dx
List of relations
Name | Type | Owner
------------+-------+---------------
test_yn_dx | index | scott.marlowe
(1 row)
Here's the php script that creates the test data, it makes $count number
of rows and sets the bool to true or false randomly:
set_time_limit(3600);
$conn = pg_connect("dbname=scott.marlowe host=css120.ihs.com");
$count = 100000;
pg_exec($conn,"begin");
$flag[0]="false";
$flag[1]="true";
for ($i=0;$i<$count;$i++){
if ($i%1000==0) {
pg_exec($conn,"end");
pg_exec($conn,"begin");
}
$letter = chr(rand(65,91));
$tf = rand(0,1);
$query = "insert into test (buf,yn) values ('";
$query.= $letter."',".$flag[$tf].")";
pg_exec($conn,$query);
}
pg_exec($conn,"end");
Here's the files in my database directory, and their size by du in
kbytes after vacuum full;
vacuum;
reindex index test_id_dx;
reindex index test_yn_dx;
[postgres(at)css120 16556]$ oid2name -d scott.marlowe
All tables from database "scott.marlowe":
---------------------------------
126572 = accounts
126574 = accounts_pkey
126566 = branches
126568 = branches_pkey
126575 = history
126569 = tellers
126571 = tellers_pkey
16557 = test
1126687 = test_id_dx
1126688 = test_yn_dx
[postgres(at)css120 16556]$ du -s 16557 1126687 1126688
11448 16557 (test)
1772 1126687 (test_id_dx)
1772 1126688 (test_yn_dx)
WHAT I DID:
scott.marlowe=# update test set yn=true where yn=true;
UPDATE 50080
[postgres(at)css120 16556]$ du -s 16557 1126687 1126688
17176 16557
3516 1126687
2924 1126688
scott.marlowe=# vacuum;
VACUUM
[postgres(at)css120 16556]$ du -s 16557 1126687 1126688
17176 16557
3516 1126687
2924 1126688
scott.marlowe=# vacuum full;
VACUUM
[postgres(at)css120 16556]$ du -s 16557 1126687 1126688
11448 16557
3516 1126687
4052 1126688 <-- Notice that the index here just GREW
scott.marlowe=# update test set yn=true where yn=true;
scott.marlowe=# update test set yn=true where yn=true;
scott.marlowe=# update test set yn=true where yn=true;
scott.marlowe=# update test set yn=true where yn=true;
scott.marlowe=# update test set yn=true where yn=true;
scott.marlowe=# update test set yn=true where yn=true;
scott.marlowe=# update test set yn=true where yn=true;
scott.marlowe=# update test set yn=true where yn=true;
scott.marlowe=# update test set yn=true where yn=true;
scott.marlowe=# update test set yn=true where yn=true;
[postgres(at)css120 16556]$ du -s 16557 1126687 1126688
68744 16557
13980 1126687
15660 1126688
scott.marlowe=# vacuum;
VACUUM
[postgres(at)css120 16556]$ du -s 16557 1126687 1126688
68736 16557
13964 1126687
15652 1126688
scott.marlowe=# vacuum full;
VACUUM
[postgres(at)css120 16556]$ du -s 16557 1126687 1126688
11448 16557
13964 1126687
16808 1126688
***************
So, now thinking the problem might be just vacuum full, I try plain old
vacuums
***************
scott.marlowe=# update test set yn=true where yn=true;
scott.marlowe=# update test set yn=true where yn=true;
scott.marlowe=# vacuum;
VACUUM
[postgres(at)css120 16556]$ du -s 16557 1126687 1126688
22908 16557
13964 1126687
20088 1126688
scott.marlowe=# update test set yn=true where yn=true;
scott.marlowe=# update test set yn=true where yn=true;
scott.marlowe=# vacuum;
VACUUM
[postgres(at)css120 16556]$ du -s 16557 1126687 1126688
22908 16557
13964 1126687
22380 1126688
*****************************
Nope, the index on the bools just keeps growing and growing.
Given a few million updates and it will be bigger than the data it is
supposed to index.
scott.marlowe=# reindex index test_yn_dx;
REINDEX
scott.marlowe=# reindex index test_id_dx;
REINDEX
oid2name -d scott.marlowe |grep dx
1126690 = test_id_dx
1126689 = test_yn_dx
[postgres(at)css120 16556]$ du -s 16557 1126690 1126689
22908 16557
1772 1126690
1772 1126689
and now they're small again.
It would at least be nice if reindex was operational in a transaction so
it would be safe to use on a live database, since it appears to not be
intended for this purpose really, but for fixing broken indexes. Til then
I'll write a script that asks pg_indexes that drops the index and
recreates it in a transaction to keep my data store svelt and clean.
Thanks for the attention to this. Scott.
From | Date | Subject | |
---|---|---|---|
Next Message | Uros Gruber | 2002-05-03 23:31:54 | Re: storing intermediate results in recursive plpgsql f |
Previous Message | Tom Lane | 2002-05-03 22:36:31 | Re: Subject: bool / vacuum full bug followup part 2 |