Concurrenctly running CREATE TEMP TABLE IF NOT EXISTS [...] AS [...]

From: Thorsten Schöning <tschoening(at)am-soft(dot)de>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Concurrenctly running CREATE TEMP TABLE IF NOT EXISTS [...] AS [...]
Date: 2020-06-08 08:14:09
Message-ID: 1102807465.20200608101409@am-soft.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi all,

I have an app exposing web services to generate reports. Those web
services accept multiple reports per request and calculate them
concurrently. There's one transaction spanning each entire request and
used by ALL spawned threads. The app makes sure that e.g. committing
transactions is handled by ONE thread only and individual statements
are NEVER reused by multiple threads.

There are two types of reports: Some need to read data from the DB
themself, others can work on the data provided by the former ones. The
current implementation always creates temporary tables for results of
some query, so that one can simply check if some temporary table is
available and either only query that or create it before as necessary.

I'm using "CREATE TEMP TABLE IF NOT EXISTS [...] AS [...]" and the
associated queries can take a long time. So the following lists some
questions about executing those concurrently, even thouzgh I've
already read threads like the following:

> The bottom line is that CREATE TABLE IF NOT EXISTS doesn't pretend
> to handle concurrency issues any better than regular old CREATE
> TABLE, which is to say not very well.[...]

https://www.postgresql.org/message-id/CA+TgmoZAdYVtwBfp1FL2sMZbiHCWT4UPrzRLNnX1Nb30Ku3-gg@mail.gmail.com

When the table needs to be created, when is it visible to other
threads using the same transaction, before or after executing the
additional query?

Am I correct that with using "IF NOT EXISTS" the associated query is
only executed as well if the table needs to be created? In theory
those two things could be independent of each other, maybe resulting
in duplicated rows or stuff like that per execution. But doesn't seem
so according to my tests.

Am I correct that in my described setup I need to make sure on my own
that only one thread creates each individual temporary table and
executes the associated query? Otherwise it might happen that multiple
attempts creating the table at the same time simply fails and the only
question is if this happens before or after the associated query.

Or would creating the table itself succeeds even with many threads,
but some of them simply wouldn't read any data, becauser the
associated query is executed in a second step by that thread that
created the table in the first place? I don't think so, though.

Am I correct that because of the same transaction used by multiple
threads I need to synchronize them on web service-level? E.g. things
like advisory locks won't work because they have session or
transaction level and would be granted to all threads instantly.

Thanks for your insights!

Mit freundlichen Grüßen,

Thorsten Schöning

--
Thorsten Schöning E-Mail: Thorsten(dot)Schoening(at)AM-SoFT(dot)de
AM-SoFT IT-Systeme http://www.AM-SoFT.de/

Telefon...........05151- 9468- 55
Fax...............05151- 9468- 88
Mobil..............0178-8 9468- 04

AM-SoFT GmbH IT-Systeme, Brandenburger Str. 7c, 31789 Hameln
AG Hannover HRB 207 694 - Geschäftsführer: Andreas Muchow

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Thomas Kellerer 2020-06-08 08:18:41 Re: checking existence of a table before updating its SERIAL
Previous Message David G. Johnston 2020-06-08 08:05:59 Re: checking existence of a table before updating its SERIAL