VJK <vjkmail(at)gmail(dot)com> wrote:
> the source 1.9GB (19MB x 100) resulted in 5GB of actual disk IO
> Deletion (delete from x2) took 32 seconds with 12 seconds CPU and
> 20 sec sleep + wait for IO. Actual disk IO was about 4GB.
>
> Since Pg does not use the concept of rollback segments, it is
> unclear why deletion produces so much disk IO (4GB).
One delete would mark the xmax of the tuple, so that transactions
without that transaction ID in their visible set would ignore it.
The next table scan would set hint bits, which would store
information within the tuple to indicate that the deleting
transaction successfully committed, then the vacuum would later wake
up and rewrite the page with the deleted tuples removed.
If you have enough battery backed cache space on a hardware RAID
controller card, and that cache is configured in write-back mode,
many of these writes might be combined -- the original delete, the
hint bit write, and the vacuum might all combine into one physical
write to disk. What does your disk system look like, exactly?
-Kevin