From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Melanie Plageman <melanieplageman(at)gmail(dot)com> |
Cc: | Tomas Vondra <tomas(at)vondra(dot)me>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: BitmapHeapScan streaming read user and prelim refactoring |
Date: | 2025-02-12 19:08:35 |
Message-ID: | CA+TgmoaKO=4tWsorTOpKtukMBuLauQCUWz3R0FhjPqXfBBZDzg@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Feb 10, 2025 at 4:41 PM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
> I had taken to thinking of it as "queue depth". But I think that's not
> really accurate. Do you know why we set it to 1 as the default? I
> thought it was because the default should be just prefetch one block
> ahead. But if it was meant to be spindles, was it that most people had
> one spindle (I don't really know what a spindle is)?
AFAIK, we're imagining that people are running PostgreSQL on a hard
drive like the one my desktop machine had when I was in college, which
indeed only had one spindle. There were one or possibly several
magnetic platters rotating around a single point. But today, nobody
has that, certainly not on a production machine. For instance, if you
are using RAID or LVM, you have several drives hooked up together, so
that would be multiple spindles even if all of them were HDDs, but
probably they are all SSDs, which don't have spindles anyway. If you
are on a virtual machine, you have a slice of the underlying machine's
I/O, and whatever that is, it's probably something more complicated
than a single spindle.
To be fair, I guess the concept of effective_io_concurrency isn't
completely meaningless if you think about it as the number of
simultaneous I/O requests that can be in flight at a time. But I still
think it's true that it's hard to make any reasonable guess as to how
you should set this value to get good performance. Maybe people who
work as consultants and configure lots of systems have good intuition
here, but anybody else is probably either not going to know about this
parameter or flail around trying to figure out how to set it.
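As a rough illustration of why a good value is hard to guess, here is a toy model that treats the setting as the number of I/O requests in flight and applies Little's law (throughput is roughly requests in flight divided by per-request latency, capped by what the device can sustain). The function name `estimated_iops`, the device names, and all latency and IOPS figures are assumptions made up for this sketch, not measurements and not anything PostgreSQL computes.

```python
def estimated_iops(in_flight, latency_s, device_max_iops):
    """Toy model: throughput = requests in flight / latency, capped by the device."""
    if in_flight < 1:
        raise ValueError("in_flight must be at least 1")
    return min(in_flight / latency_s, device_max_iops)


# Hypothetical devices; the numbers are illustrative only.
devices = {
    "single HDD": {"latency_s": 0.008, "device_max_iops": 200},
    "cloud SSD volume": {"latency_s": 0.001, "device_max_iops": 16000},
}

for name, dev in devices.items():
    for depth in (1, 4, 16, 64):
        # Print the modeled throughput for each candidate number of in-flight requests.
        print(name, depth, round(estimated_iops(depth, **dev)))
```

In this model the hypothetical HDD stops benefiting after a few requests in flight, while the hypothetical SSD volume keeps scaling much further, so the point at which extra concurrency stops helping depends entirely on hardware characteristics that most users do not know.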
It might be interesting to try some experiments on a really small VM
from some cloud provider and see what value is optimal there these
days. Then we could set the default to that value or perhaps somewhat
more.
--
Robert Haas
EDB: http://www.enterprisedb.com