Re: Trying out read streams in pgvector (an extension)

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Trying out read streams in pgvector (an extension)
Date: 2024-09-06 04:28:52
Message-ID: CA+hUKG+x2BcqWzBC77cN0ewhzMF0kYhC6c4G_T2gJLPbqYQ6Ow@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jun 12, 2024 at 3:37 AM Jonathan S. Katz <jkatz(at)postgresql(dot)org> wrote:
> If you're curious, I can fire up some of my more serious benchmarks on
> this to do a before/after to see if there's anything interesting. I have
> a few large datasets (10s of millions) of larger vectors (1536dim => 6KB
> payloads) that could see the net effect here.
>
> > (Make sure you remember to set effective_io_concurrency to an
> > interesting number if you want to generate a lot of overlapping
> > fadvise calls.)
>
> What would you recommend as an "interesting number?" - particularly
> using the data parameters above.

Hi Jonathan,

Sorry for not replying sooner (ETOOMANYPROJECTS). For HNSW, I think
the maximum useful effective_io_concurrency is bound by the number of
connections per HNSW layer ("m"). Here are some times I measured
using m=16 on two laptops:

| linux (xfs) | macos (apfs)
branch | eic | avg | speedup | stdev | avg | speedup | stdev
--------+-----+--------+---------+--------+--------+---------+--------
master | | 73.959 | 1.0 | 24.168 | 72.290 | 1.0 | 11.851
stream | 0 | 70.117 | 1.1 | 36.699 | 76.289 | 1.0 | 12.742
stream | 1 | 57.983 | 1.3 | 5.845 | 79.969 | 1.2 | 8.308
stream | 2 | 35.629 | 2.1 | 4.088 | 49.198 | 2.0 | 7.686
stream | 3 | 28.477 | 2.6 | 2.607 | 37.540 | 2.5 | 5.272
stream | 4 | 26.493 | 2.8 | 3.691 | 33.014 | 2.7 | 4.444
stream | 5 | 23.711 | 3.1 | 2.435 | 32.622 | 3.0 | 2.270
stream | 6 | 22.885 | 3.2 | 1.908 | 31.254 | 3.2 | 4.170
stream | 7 | 21.910 | 3.4 | 2.153 | 33.669 | 3.3 | 4.616
stream | 8 | 20.741 | 3.6 | 1.594 | 34.182 | 3.5 | 3.819
stream | 9 | 22.471 | 3.3 | 3.094 | 30.690 | 3.2 | 2.677
stream | 10 | 19.895 | 3.7 | 1.695 | 32.631 | 3.6 | 4.976
stream | 11 | 19.447 | 3.8 | 1.647 | 31.163 | 3.7 | 3.351
stream | 12 | 18.658 | 4.0 | 1.503 | 30.817 | 3.9 | 3.538
stream | 13 | 18.886 | 3.9 | 0.874 | 29.184 | 3.8 | 4.832
stream | 14 | 18.667 | 4.0 | 1.692 | 28.783 | 3.9 | 3.459
stream | 15 | 19.080 | 3.9 | 1.429 | 28.928 | 3.8 | 3.396
stream | 16 | 18.929 | 3.9 | 3.469 | 29.282 | 3.8 | 2.868

Those are millisecond times to run the test() function shown earlier,
with empty kernel cache and PostgreSQL cache (see below) for maximum
physical I/O. I ran the master test 30 times, and each
effective_io_concurrency level 10 times, to show that the variance
decreases even at the default effective_io_concurency = 1, so we're
not only talking about the avg speed improving.

The all-cached performance also seems to improve, ~8.9ms -> ~6.9ms on
Linux, but I can't fully explain why that is, maybe just some random
stuff about memory layout run-to-run in my quick and dirty test or
something like that, so I'm not claiming that is significant. It
certainly didn't get slower, anyway.

I think you would get very different numbers on a high latency storage
system (say, non-local cloud storage) and potentially much more
speedup with your large test indexes. Also my 6d random number test
may not be very representative and you may be able to come up with
much better tests.

Here's a new version with a TODO tidied up. I also understood that we
need to tweak the read_stream_reset() function, so that it doesn't
forget its current readhead distance when it hops between HNSW nodes
(which is something that comes up in several other potential uses
cases including another one I am working in in core). Without this
patch for PostgreSQL, it reads 1, 2, 4, 7 blocks (= 16 in total)
before it has to take a break to hop to a new page, and then it start
again at 1. Oops. With this patch, it is less forgetful, and reaches
the full possible I/O concurrency of 16 (or whatever the minimum of
HNSW's m parameter and effective_io_concurrency is for you).

PSA two patches, one for PostgreSQL and one for pgvector.

I am not actively working on this right now. If someone wants to try
to develop it further, please feel free! I haven't looked at IVFFlat
at all.

--- function to let you do SELECT uncache('t_embedding_idx'),
--- which is the opposite of SELECT pg_prewarm('t_embedding_idx')
--- see also "echo 1 | sudo tee /proc/sys/vm/drop_caches" (Linux)
--- "sudo purge" (macOS)
create extension pg_buffercache;
create or replace function uncache(name text) returns bool
begin atomic;
select bool_and(pg_buffercache_evict(bufferid))
from pg_buffercache where relfilenode = name::regclass;
end;

Attachment Content-Type Size
0001-Remember-ReadStream-look-ahead-distance-on-reset.patch text/x-patch 2.4 KB
v2-0001-Use-streaming-I-O-for-HNSW-blocks.patch text/x-patch 6.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2024-09-06 04:34:15 Re: pgsql: Add more SQL/JSON constructor functions
Previous Message Tatsuo Ishii 2024-09-06 04:21:31 Re: Add memory/disk usage for WindowAgg nodes in EXPLAIN