Re: Speeding Up Bitmapset

From: Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Speeding Up Bitmapset
Date: 2023-06-26 00:55:04
Message-ID: CAEudQAqArnSDczkz7a_wq9NCQFQ_r=6kHx_LODrOP+VVG0ywVA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Em dom., 25 de jun. de 2023 às 17:49, David Rowley <dgrowleyml(at)gmail(dot)com>
escreveu:

> There's no reason in the world
> that those will speed up Bitmapsets, so why include them?
>
Of course optimization is the most important thing,
but since you're going to touch the source, why not make it more readable.

>
> > v7 outperforms v4 by 29% (Query A)
>
> I tested v7 with query-a and I also see additional gains. However,
> it's entirely down to your changes to bms_is_subset(). It seems, by
> chance, with the given Bitmapsets that looping backwards for the given
> sets is able to determine the result more quickly
>
I redid the tests and it seems that most of the difference comes from
bms_subset.

>
> Here's some results from "perf top"
>
> query-a
> v4
>
> 30.08% postgres [.] bms_is_subset
> 15.84% postgres [.] create_join_clause
> 13.54% postgres [.] bms_equal
> 11.03% postgres [.] get_eclass_for_sort_expr
> 8.53% postgres [.] generate_implied_equalities_for_column
> 3.11% postgres [.] generate_join_implied_equalities_normal
> 1.03% postgres [.] add_child_rel_equivalences
> 0.82% postgres [.] SearchCatCacheInternal
> 0.73% postgres [.] AllocSetAlloc
> 0.53% postgres [.] find_ec_member_matching_expr
> 0.40% postgres [.] hash_search_with_hash_value
> 0.36% postgres [.] palloc
> 0.36% postgres [.] palloc0
>
> latency average = 452.480 ms
>
> v7
> 20.51% postgres [.] create_join_clause
> 15.33% postgres [.] bms_equal
> 14.17% postgres [.] get_eclass_for_sort_expr
> 12.05% postgres [.] bms_is_subset
> 10.40% postgres [.] generate_implied_equalities_for_column
> 3.90% postgres [.] generate_join_implied_equalities_normal
> 1.34% postgres [.] add_child_rel_equivalences
> 1.06% postgres [.] AllocSetAlloc
> 1.00% postgres [.] SearchCatCacheInternal
> 0.72% postgres [.] find_ec_member_matching_expr
> 0.58% postgres [.] palloc0
> 0.49% postgres [.] palloc
> 0.47% postgres [.] hash_search_with_hash_value
> 0.44% libc.so.6 [.] __memmove_avx_unaligned_erms
>
>
> latency average = 350.543 ms
>
> modified v7's bms_is_subset to go forwards then I get latency average
> = 445.987 ms.
>
> If I add some debugging to bms_is_subset to have it record how many
> words it checks, I see:
>
> postgres=# select sum(nwords) from forward;
> sum
> -----------
> 181490660
> (1 row)
>
> postgres=# select sum(nwords) from backwards;
> sum
> ----------
> 11322564
> (1 row)
>
> So, it took about 181 million loops in bms_is_member to plan query-a
> when looping forward, but only 11 million when looping backwards.
>
> I think unless you've got some reason that you're able to justify why
> we're always more likely to have to perform fewer word checks in
> bms_is_subset() by looping backwards instead of forwards then I think
> the performance gains you're showing here only happen to show better
> results due to the given workload. It's just as easy to imagine that
> you'll apply the equivalent slowdown for some other workload where the
> initial words differ but all the remaining words all match.
>
Have you seen bms_compare?
For some reason someone thought it would be better to loop backward the
array.

Since bms_subset performed very well with backward, I think that's a good
reason to leave it as bms_compare.
As in most cases, the array size is small, in general in both modes the
performance will be equivalent.

Anyway, I made a new patch v8, based on v4 but with some changes that I
believe improve it.

Windows 64 bits
msvc 2019 64 bits

== Query A ==
psql -U postgres -f c:\postgres_bench\tmp\bitmapset\create-tables-a.sql
psql -U postgres -f c:\postgres_bench\tmp\bitmapset\query-a.sql
=============

patched v4:
Time: 2305,445 ms (00:02,305)
Time: 2185,972 ms (00:02,186)
Time: 2177,434 ms (00:02,177)
Time: 2169,883 ms (00:02,170)

patched v8:
Time: 2143,532 ms (00:02,144)
Time: 2140,313 ms (00:02,140)
Time: 2138,481 ms (00:02,138)
Time: 2130,290 ms (00:02,130)

== Query B ==
psql -U postgres -f c:\postgres_bench\tmp\bitmapset\create-tables-b.sql
psql -U postgres -f c:\postgres_bench\tmp\bitmapset\query-b.sql

patched v4:
Time: 2684,360 ms (00:02,684)
Time: 2482,571 ms (00:02,483)
Time: 2452,699 ms (00:02,453)
Time: 2465,223 ms (00:02,465)

patched v8:
Time: 2493,281 ms (00:02,493)
Time: 2490,090 ms (00:02,490)
Time: 2432,515 ms (00:02,433)
Time: 2426,860 ms (00:02,427)

linux Ubuntu 64 bit
gcc 64 bits

== Query A ==
/usr/local/pgsql/bin/psql -U postgres -f
/home/ranier/Documentos/benchmarks/bitmapset/create-tables-a.sql
/usr/local/pgsql/bin/psql -U postgres -f
/home/ranier/Documentos/benchmarks/bitmapset/query-a.sql
=============

patched v4:
Time: 933,181 ms
Time: 931,520 ms
Time: 906,496 ms
Time: 872,446 ms

patched v8:
Time: 937,079 ms
Time: 930,408 ms
Time: 865,548 ms
Time: 865,382 ms

== Query B ==
/usr/local/pgsql/bin/psql -U postgres -f
/home/ranier/Documentos/benchmarks/bitmapset/create-tables-b.sql
/usr/local/pgsql/bin/psql -U postgres -f
/home/ranier/Documentos/benchmarks/bitmapset/query-b.sql

patched v4:
Time: 1581,317 ms (00:01,581)
Time: 1568,371 ms (00:01,568)
Time: 1468,036 ms (00:01,468)
Time: 1445,698 ms (00:01,446)

patched v8:
Time: 1437,997 ms (00:01,438)
Time: 1437,435 ms (00:01,437)
Time: 1440,422 ms (00:01,440)
Time: 1436,112 ms (00:01,436)

regards,
Ranier Vilela

Attachment Content-Type Size
speeding-up-bitmapset-v8.patch application/octet-stream 18.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tatsuo Ishii 2023-06-26 01:05:20 Re: Row pattern recognition
Previous Message Alena Rybakina 2023-06-25 23:39:50 Re: eqjoinsel_semi still sucks ...