Re: Streaming read-ready sequential scan code

From: Alexander Lakhin <exclusion(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Streaming read-ready sequential scan code
Date: 2024-05-18 19:00:00
Message-ID: d1fb5c09-dd03-2540-9ec2-86dbfdfa2c65@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello Thomas,

18.05.2024 07:47, Thomas Munro wrote:
> After more debugging, we learned a lot more things...
>
> 1. That query produces spectacularly bad estimates, so we finish up
> having to increase the number of buckets in a parallel hash join many
> times. That is quite interesting, but unrelated to new code.
> 2. Parallel hash join is quite slow at negotiating an increase in the
> number of hash bucket, if all of the input tuples are being filtered
> out by quals, because of the choice of where workers check for
> PHJ_GROWTH_NEED_MORE_BUCKETS. That could be improved quite easily I
> think. I have put that on my todo list 'cause that's also my code,
> but it's not a new issue it's just one that is now highlighted...
> 3. This bit of read_stream.c is exacerbating unfairness in the
> underlying scan, so that 1 and 2 come together and produce a nasty
> slowdown, which goes away if you change it like so:
>
> - BlockNumber blocknums[16];
> + BlockNumber blocknums[1];
>
> I will follow up after some more study.

Thank you for the information!

Unfortunately, I can't see significant differences in my environment with
parallel_leader_participation=off.

With blocknums[1], timing is changed, but the effect is not persistent.
10 query15 executions in a row, b7b0f3f27:
277.932 ms
281.805 ms
278.335 ms
281.565 ms
284.167 ms
283.171 ms
281.165 ms
281.615 ms
285.394 ms
277.301 ms

b7b0f3f27~1:
159.789 ms
165.407 ms
160.893 ms
159.343 ms
160.936 ms
161.577 ms
161.637 ms
163.421 ms
163.143 ms
167.109 ms

b7b0f3f27 + blocknums[1]:
164.133 ms
280.920 ms
160.748 ms
163.182 ms
161.709 ms
161.998 ms
161.239 ms
276.256 ms
161.601 ms
160.384 ms

I placed PGDATA on tmpfs to rule out any blockdev specifics (increasing
blockdev ra from 256 to 4096 didn't help me with PGDATA on NVME either.)

Best regards,
Alexander

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Yasir 2024-05-18 19:26:11 Re: Ignore Visual Studio's Temp Files While Working with PG on Windows
Previous Message Andrey M. Borodin 2024-05-18 18:52:11 Sort functions with specialized comparators