Re: Partitioning with temp tables is broken

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Partitioning with temp tables is broken
Date: 2018-06-13 13:57:29
Message-ID: CAFjFpRcme4YLED_Lp3+orRC08OJ6oEaSvSu6WZp25TiwbNnFVg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jun 13, 2018 at 5:36 PM, David Rowley
<david(dot)rowley(at)2ndquadrant(dot)com> wrote:
> Hi,
>
> It seems temp tables with partitioned tables is not working correctly.
> 9140cf8269b0c has not considered that in build_simple_rel() an
> AppendRelInfo could be missing due to it having been skipped in
> expand_partitioned_rtentry()
>
> Assert(cnt_parts == nparts); fails in build_simple_rel, and without an
> assert enabled build it just crashes later when trying to dereference
> the in prune_append_rel_partitions() or
> apply_scanjoin_target_to_paths(), depending if partition pruning is
> used and does not prune the relation.
>
> There's also something pretty weird around the removal of the temp
> relation from the partition bound. I've had cases where the session
> that attached the temp table is long gone, but \d+ shows the table is
> still there and I can't attach another partition due to an overlap,
> and can't drop the temp table due to the session not existing anymore.
> I've not got a test case for that one yet, but the test case for the
> crash is:
>
> -- Session 1:
> create table listp (a int) partition by list(a);
> create table listp1 partition of listp for values in(1);
> create temp table listp2 partition of listp for values in (2);
>
> -- Session 2:
> select * from listp;

Culprit here is following code in expand_partitioned_rtentry()
/* As in expand_inherited_rtentry, skip non-local temp tables */
if (RELATION_IS_OTHER_TEMP(childrel))
{
heap_close(childrel, lockmode);
continue;
}

>
> We might want to create the RelOptInfo and somehow set it as a dummy
> rel, or consider setting it to NULL and skipping the NULLs. I'd rather
> see the latter for the future, but for PG11 we might prefer the
> former. I've not looked into how this would be done or if it can be
> done sanely.

A temporary table is a base relation. In order to create a RelOptInfo
of a base relation we need to have an entry available for it in
query->rtables, simple_rel_array. We need corresponding RTE and
AppendRelInfo so that we can reserve an entry there. Somehow this
AppendRelInfo or the RTE should convey that it's a temporary relation
from the other session and mark the RelOptInfo empty when it gets
created in build_simple_rel() or when it gets in part_rels array. We
will require accessing metadata about the temporary table while
building RelOptInfo. That may be fine. I haven't checked.

I haven't thought about setting NULL in part_rels array. That may be
safer than the first option since the places where we scan part_rels
are fewer and straight forward.

But having a temporary table with partition has other effects also e.g.
\d+ t1
Table "public.t1"
Column | Type | Collation | Nullable | Default | Storage | Stats
target | Description
--------+---------+-----------+----------+---------+---------+--------------+-------------
a | integer | | | | plain | |
b | integer | | | | plain | |
Partition key: RANGE (a)
Indexes:
"t1_a" btree (a)
Partitions: t1p1 FOR VALUES FROM (0) TO (100),
t1p2 FOR VALUES FROM (100) TO (200)

both t1p1 and t1p2 are regular tables.

create a temporary table as default partition from some other session
and insert data

create temporary table t1p3 partition of t1 default;
insert into t1 values (350, 1);

now try creating a partition from a session other than the one where
temporary table was created
create table t1p4 partition of t1 for values from (300) to (400);
ERROR: cannot access temporary tables of other sessions

The reason that happens because when creating a regular partition we
scan the default partition to check whether the regular partition has
any data that belongs to the partition being created.

Similar error will result if we try to insert a row to the partitioned
table which belongs to temporary partition from other session.

I think it's debatable whether that's a bug or not. But it's not as
bad as a crash. So, a solution to fix crash your reproduction is to
just remove the code in expand_partitioned_rtentry() which skips the
temporary tables from the other session
1712 /* As in expand_inherited_rtentry, skip non-local temp tables */
1713 if (RELATION_IS_OTHER_TEMP(childrel))
1714 {
1715 heap_close(childrel, lockmode);
1716 continue;
1717 }
If we do that we will see above error when the partition corresponding
to the temporary table from the other session is scanned i.e. if such
partition is not pruned.

But a larger question is what use such temporary partitions are?
Should we just prohibit adding temporary partitions to a permanant
partitioned table? We should allow adding temporary partitions to a
temporary partitioned table if only they both belong to the same
session.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2018-06-13 14:08:47 Re: Spilling hashed SetOps and aggregates to disk
Previous Message Robert Haas 2018-06-13 13:47:59 Re: why partition pruning doesn't work?