From: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: [bug?] Missed parallel safety checks, and wrong parallel safety
Date: 2021-06-16 03:27:24
Message-ID: OS0PR01MB57164B645E1B1AC364667EA7940F9@OS0PR01MB5716.jpnprd01.prod.outlook.com
Lists: pgsql-hackers
On Tuesday, June 15, 2021 10:01 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, Jun 15, 2021 at 7:05 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > Yeah, dealing with partitioned tables is tricky. I think if we don't
> > want to check upfront the parallel safety of all the partitions then
> > the other option as discussed could be to ask the user to specify the
> > parallel safety of partitioned tables.
>
> Just to be clear here, I don't think it really matters what we *want* to do. I don't
> think it's reasonably *possible* to check all the partitions, because we don't
> hold locks on them. When we're assessing a bunch of stuff related to an
> individual relation, we have a lock on it. I think - though we should
> double-check tablecmds.c - that this is enough to prevent all of the dependent
> objects - triggers, constraints, etc. - from changing. So the stuff we care about
> is stable. But the situation with a partitioned table is different. In that case, we
> can't even examine that stuff without locking all the partitions. And even if we
> do lock all the partitions, the stuff could change immediately afterward and we
> wouldn't know. So I think it would be difficult to make it correct.
>
> Now, maybe it could be done, and I think that's worth a little more thought. For
> example, perhaps whenever we invalidate a relation, we could also somehow
> send some new, special kind of invalidation for its parent saying, essentially,
> "hey, one of your children has changed in a way you might care about." But
> that's not good enough, because it only goes up one level. The grandparent
> would still be unaware that a change it potentially cares about has occurred
> someplace down in the partitioning hierarchy. That seems hard to patch up,
> again because of the locking rules. The child can know the OID of its parent
> without locking the parent, but it can't know the OID of its grandparent without
> locking its parent. Walking up the whole partitioning hierarchy might be an
> issue for a number of reasons, including possible deadlocks, and possible race
> conditions where we don't emit all of the right invalidations in the face of
> concurrent changes. So I don't quite see a way around this part of the problem,
> but I may well be missing something. In fact I hope I am missing something,
> because solving this problem would be really nice.
I think the checks for partitions could be even more complicated if we also need
to check the parallel safety of partition key expressions. If a user inserts
directly into a partition, we need to invoke ExecPartitionCheck, which executes
the partition key expressions of all of its parents and grandparents. That means
if we change a parent table's partition key expression (either by 1) changing a
function used in the expression, or 2) attaching the parent table as a partition
of another partitioned table), we would need to invalidate the relcache of all
of its children.
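
To illustrate what I mean, here is a rough (untested) sketch of the kind of
invalidation we would need when a parent's partition key expression changes.
InvalidatePartitionDescendants is just a hypothetical name, not something that
exists in core; it only relies on the existing find_all_inheritors() and
CacheInvalidateRelcacheByRelid(). And of course, locking the descendants here
has the deadlock/race concerns mentioned above.

#include "postgres.h"

#include "catalog/pg_inherits.h"
#include "nodes/pg_list.h"
#include "storage/lockdefs.h"
#include "utils/inval.h"

/*
 * Hypothetical helper: queue relcache invalidations for every descendant
 * of a partitioned table, so that any cached parallel-safety state is
 * rebuilt the next time each relation is opened.
 */
static void
InvalidatePartitionDescendants(Oid parentRelid)
{
	List	   *descendants;
	ListCell   *lc;

	/* find_all_inheritors returns the parent itself plus all descendants */
	descendants = find_all_inheritors(parentRelid, AccessShareLock, NULL);

	foreach(lc, descendants)
	{
		Oid			childRelid = lfirst_oid(lc);

		if (childRelid == parentRelid)
			continue;			/* skip the parent itself */

		CacheInvalidateRelcacheByRelid(childRelid);
	}

	list_free(descendants);
}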
BTW, currently, if a user attaches a partitioned table 'A' as a partition of
another partitioned table 'B', the children of 'A' are not invalidated.
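If we went in the direction sketched above, the same hypothetical helper could
presumably also be called on 'A' from the ATTACH PARTITION code path, so that
A's children pick up the new ancestor's partition key expressions, but I have
not yet looked at how intrusive that would be in tablecmds.c.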
Best regards,
houzj