Re: Confine vacuum skip logic to lazy_scan_skip

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Noah Misch <noah(at)leadboat(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>
Subject: Re: Confine vacuum skip logic to lazy_scan_skip
Date: 2025-02-05 22:26:49
Message-ID: CAAKRu_ZoOSZHexy=7YZ9W0CfeSHc1tGx7VCperZHWBxxz0gxzw@mail.gmail.com
Lists: pgsql-hackers

On Sat, Jan 18, 2025 at 11:51 AM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
>
> Sure. I repeated the benchmark with v13, and it seems the behavior did
> change. I no longer see the "big" regression when most of the pages get
> updated (and need vacuuming).
>
> I can't be 100% sure this is due to changes in the patch, because I did
> some significant upgrades to the machine since that time - it has Ryzen
> 9900x instead of the ancient i5-2500k, new mobo/RAM/... It's pretty
> much a new machine, I only kept the "old" SATA SSD RAID storage so that
> I can do some tests with non-NVMe.
>
> So there's a (small) chance the previous runs were hitting a bottleneck
> that does not exist on the new hardware.
>
> Anyway, just to make this information more complete, the machine now has
> this configuration:
>
> * Ryzen 9 9900x (12/24C), 64GB RAM
> * storage:
> - data: Samsung SSD 990 PRO 4TB (NVMe)
> - raid-nvme: RAID0 4x Samsung SSD 990 PRO 1TB (NVMe)
> - raid-sata: RAID0 6x Intel DC3700 100GB (SATA)
>
> Attached is the script, raw results (CSV) and two PDFs summarizing the
> results as a pivot table for different test parameters. Compared to the
> earlier run I tweaked the script to also vary io_combine_limit (ioc), as
> I wanted to see how it interacts with effective_io_concurrency (eic).
>
> Looking at the new results, I don't see any regressions, except for two
> cases - data (single NVMe) and raid-nvme (4x NVMe). There's a small area
> of regression for eic=32 and perc=0.0005, but only with WAL-logging.
>
> I'm not sure this is worth worrying about too much. It's a heuristic,
> and for every heuristic there's some combination of parameters where it
> doesn't quite do the optimal thing. The areas where the patch brings
> massive improvements (or does not regress) are much more significant.
>
> I personally am happy with this behavior, seems to be performing fine.

Yes, looking at these results, I also feel good about it. I've updated
the commit metadata in the attached v14, but I could use a round of
review before pushing it.

- Melanie

Attachment                                                Content-Type  Size
v14-0002-Use-streaming-I-O-in-VACUUM-s-third-phase.patch  text/x-patch  3.6 KB
v14-0001-Use-streaming-I-O-in-VACUUM-s-first-phase.patch  text/x-patch  9.5 KB
