Re: Alias collision in `refresh materialized view concurrently`

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Bernd Helmle <mailings(at)oopsware(dot)de>
Cc: Mathis Rudolf <mathis(dot)rudolf(at)credativ(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Alias collision in `refresh materialized view concurrently`
Date: 2021-05-20 15:44:45
Message-ID: CALj2ACVLoU=HaZaxC5DSspSX+Sbn_bjgD-Zo7wTNsYrzJdb_MQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, May 20, 2021 at 7:52 PM Bernd Helmle <mailings(at)oopsware(dot)de> wrote:
>
> Am Mittwoch, dem 19.05.2021 um 18:06 +0530 schrieb Bharath Rupireddy:
> > > The corresponding Code is in `matview.c` in function
> > > `refresh_by_match_merge`. With adding a prefix like `_pg_internal_`
> > > we
> > > could make collisions pretty unlikely, without intrusive changes.
> > >
> > > The appended patch does this change for the aliases `mv`, `newdata`
> > > and
> > > `newdata2`.
> >
> > I think it's better to have some random name, see below. We could
> > either use "OIDNewHeap" or "MyBackendId" to make those column names
> > unique and almost unguessable. So, something like "pg_temp1_XXXX",
> > "pg_temp2_XXXX" or "pg_temp3_XXXX" and so on would be better IMO.
> >
> > snprintf(NewHeapName, sizeof(NewHeapName), "pg_temp_%u",
> > OIDOldHeap);
> > snprintf(namespaceName, sizeof(namespaceName), "pg_temp_%d",
> > MyBackendId);
>
> Hmm, it's an idea, but this can also lead to pretty random failures if
> you have an unlucky user who had the same idea in its generating query
> tool than the backend, no? Not sure if that's really better.
>
> With the current implementation of REFRESH MATERIALIZED VIEW
> CONCURRENTLY we always have the problem of possible collisions here,
> you'd never get out of this area without analyzing the whole query for
> such collisions.
>
> "mv" looks like a very common alias (i use it all over the time when
> testing or playing around with materialized views, so i'm wondering why
> i didn't see this issue already myself). So the risk here for such a
> collision looks very high. We can try to lower this risk by choosing an
> alias name, which is not so common. With a static alias however you get
> a static error condition, not something that fails here and then.

Another idea is to use random() function to generate required number
of uint32 random values(refresh_by_match_merge might need 3 values to
replace newdata, newdata2 and mv) and use the names like
pg_temp_rmv_<<rand_no1>>, pg_temp_rmv_<<rand_no2>> and so on. This
would make the name unguessable. Note that we use this in
choose_dsm_implementation, dsm_impl_posix.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Laurenz Albe 2021-05-20 15:52:38 Re: Update maintenance_work_mem/autovacuum_work_mem to reflect the 1GB limitation with VACUUM
Previous Message Ondřej Žižka 2021-05-20 15:40:36 Re: Synchronous commit behavior during network outage