From: | Önder Kalacı <onderkalaci(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher |
Date: | 2022-08-20 11:02:03 |
Message-ID: | CACawEhXavM7kjzVgMfRSN+YyB_Lxqjoi8PJ2Gh7MdL-eScb9vw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
I'm a little late to catch up with your comments, but here are my replies:
> My answer for the above assumes that your question is regarding what
> happens if you ANALYZE on a partitioned table. If your question is
> something different, please let me know.
> >
>
> I was talking about inheritance cases, something like:
> create table tbl1 (a int);
> create table tbl1_part1 (b int) inherits (tbl1);
> create table tbl1_part2 (c int) inherits (tbl1);
>
> What we do in such cases is documented as: "if the table being
> analyzed has inheritance children, ANALYZE gathers two sets of
> statistics: one on the rows of the parent table only, and a second
> including rows of both the parent table and all of its children. This
> second set of statistics is needed when planning queries that process
> the inheritance tree as a whole. The child tables themselves are not
> individually analyzed in this case."
Oh, I haven't considered inherited tables. That seems right, the
statistics of the children are not updated when the parent is analyzed.
> Now, the point I was worried about was what if the changes in child
> tables (*_part1, *_part2) are much more than in tbl1? In such cases,
> we may not invalidate child rel entries, so how will logical
> replication behave for updates/deletes on child tables? There may not
> be any problem here but it is better to do some analysis of such cases
> to see how it behaves.
>
I also haven't observed any specific issues. In the end, when the user (or
autovacuum) does ANALYZE on the child, it is when the statistics are
updated for the child. Although I do not have much experience with
inherited tables, this sounds like the expected behavior?
I also pushed a test covering inherited tables. First, a basic test on the
parent. Then, show that updates on the parent can also use indexes of the
children. Also, after an ANALYZE on the child, we can re-calculate the
index and use the index with a higher cardinality column.
> > Also, for the majority of the use-cases, I think we'd probably expect an
> index on a column with high cardinality -- hence use index scan. So, bitmap
> index scans are probably not going to be that much common.
> >
>
> You are probably right here but I don't think we can make such
> assumptions. I think the safest way to avoid any regression here is to
> choose an index when the planner selects an index scan. We can always
> extend it later to bitmap scans if required. We can add a comment
> indicating the same.
>
>
Alright, I got rid of the bitmap scans.
Though, it caused few of the new tests to fail. I think because of the data
size/distribution, the planner picks bitmap scans. To make the tests
consistent and small, I added `enable_bitmapscan to off` for this new test
file. Does that sound ok to you? Or, should we change the tests to make
sure they genuinely use index scans?
*
> + /*
> + * For insert-only workloads, calculating the index is not necessary.
> + * As the calculation is not expensive, we are fine to do here (instead
> + * of during first update/delete processing).
> + */
>
> I think here instead of talking about cost, we should mention that it
> is quite an infrequent operation i.e performed only when we first time
> performs an operation on the relation or after invalidation. This is
> because I think the cost is relative.
>
Changed, does that look better?
+
> + /*
> + * Although currently it is not possible for planner to pick a
> + * partial index or indexes only on expressions,
>
> It may be better to expand this comment by describing a bit why it is
> not possible in our case. You might want to give the function
> reference where it is decided.
>
> Make sense, added some more information.
Thanks,
Onder
Attachment | Content-Type | Size |
---|---|---|
v7_0001_use_index_on_subs_when_pub_rep_ident_full.patch | application/octet-stream | 65.4 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Erik Rijkers | 2022-08-20 13:32:20 | Re: Schema variables - new implementation for Postgres 15 |
Previous Message | Daniel Gustafsson | 2022-08-20 11:02:01 | Re: sslinfo extension - add notbefore and notafter timestamps |