| From: | Peter Smith <smithpb2250(at)gmail(dot)com> | 
|---|---|
| To: | Ajin Cherian <itsajin(at)gmail(dot)com> | 
| Cc: | Andres Freund <andres(at)anarazel(dot)de>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Euler Taveira <euler(at)eulerto(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> | 
| Subject: | Re: row filtering for logical replication | 
| Date: | 2022-02-03 00:31:00 | 
| Message-ID: | CAHut+PsBXHXoQviWuayvcDUX4LKhVBn_OkL1wz7RPW3XEk9Vag@mail.gmail.com | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
On Wed, Feb 2, 2022 at 8:16 PM Ajin Cherian <itsajin(at)gmail(dot)com> wrote:
>
> Hi Peter,
>
> I just tried scenario b that Andres suggested:
>
> For scenario b, I did some testing with row-filter-patch v74 and
> various levels of filtering. 0% replicated to 100% rows replicated.
> The times are in seconds, I did 5 runs each.
>
> Results:
>
> RUN  HEAD     "with patch 0%" "row-filter-patch 25%" "row-filter-patch
> v74 50%" "row-filter-patch 75%" "row-filter-patch v74 100%"
> 1   17.26178  12.573736       12.869635              13.742167
>           17.977112              17.75814
> 2   17.522473 12.919554       12.640879              14.202737
>           14.515481              16.961836
> 3   17.124001 12.640879       12.706631              14.220245
>           15.686613              17.219355
> 4   17.24122  12.602345       12.674566              14.219423
>           15.564312              17.432765
> 5   17.25352  12.610657       12.689842              14.210725
>           15.613708              17.403821
>
> As can see the performance seen on HEAD is similar to that which the
> patch achieves with all rows (100%) replication. The performance
> improves linearly with
> more rows filtered.
>
> The test scenario used was:
>
> 1. On publisher and subscriber:
> CREATE TABLE test (key int, value text, data jsonb, PRIMARY KEY(key, value));
>
> 2. On publisher: (based on which scenario is being tested)
> CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 0); -- 100% allowed
> CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 250000); -- 75% allowed
> CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 500000); -- 50% allowed
> CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 750000); -- 25% allowed
> CREATE PUBLICATION pub_1 FOR TABLE test WHERE (key > 1000000); -- 0% allowed
>
> 3. On the subscriber:
> CREATE SUBSCRIPTION sync_sub CONNECTION 'host=127.0.0.1 port=5432
> dbname=postgres application_name=sync_sub' PUBLICATION pub_1;
>
> 4. now modify the postgresql.conf on the publisher side
> synchronous_standby_names = 'sync_sub' and restart.
>
> 5. The test case:
>
> DO
> $do$
> BEGIN
> FOR i IN 1..1000001 BY 10 LOOP
> INSERT INTO test VALUES(i,'BAH', row_to_json(row(i)));
> UPDATE test SET value = 'FOO' WHERE key = i;
> IF I % 1000 = 0 THEN
> COMMIT;
> END IF;
> END LOOP;
> END
> $do$;
>
>
Thanks!
I have put your results as a bar chart same as for the previous workload case:
HEAD 17.25
v74 no filters NA
v74 allow 100% 17.35
v74 allow 75% 15.62
v74 allow 50% 14.21
v74 allow 25% 12.69
v74 allow 0% 12.62
PSA.
------
Kind Regards,
Peter Smith.
Fujitsu Australia.
| Attachment | Content-Type | Size | 
|---|---|---|
| workload-b.PNG | image/png | 160.5 KB | 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Bruce Momjian | 2022-02-03 00:35:36 | Unclear problem reports | 
| Previous Message | Bruce Momjian | 2022-02-02 23:02:39 | Re: Replace pg_controldata output fields with macros for better code manageability |