Re: Temporary tables usage in functions

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Michael Lewis <mlewis(at)entrata(dot)com>
Cc: Yambu <hyambu(at)gmail(dot)com>, pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Temporary tables usage in functions
Date: 2020-11-09 17:45:43
Message-ID: CAFj8pRCyaN+bMG1f_vYG_ZfDwda4-1e83tDxg8+edOO9FnKP+Q@mail.gmail.com
Lists: pgsql-general

On Mon, Nov 9, 2020 at 18:19 Michael Lewis <mlewis(at)entrata(dot)com> wrote:

> Also may I know if excessive use of temporary tables may cause locks?
>>>
>>
>> Usually there are no problems with locks, but there is a problem with
>> system tables bloating. Creating and dropping temp tables is expensive like
>> creating or dropping normal tables.
>>
>
> Dropping a real table requires scanning all of shared_buffers, right? I
> wouldn't think dropping a temp table requires that kind of heavy operation
> because it should be stored within separate temp_buffers.
>

Cleaning shared buffers is probably not a problem when shared buffers are
smaller than about 20GB. The main problem is bloating of pg_class,
pg_attribute, and maybe pg_depend. And VACUUM (autovacuum) runs only after
the transaction ends. So if somebody drops and recreates a temp table when
some function starts, and this function is called 1M times inside one
transaction, then there is brutal bloating of the system catalogue. And
bloated system tables can cause a lot of other performance problems.
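
A minimal sketch of one way to reduce the churn (the function and table
names here are only illustrative, not from this thread): when the temp
table's structure is stable, create it once per session and only TRUNCATE
it between calls, so pg_class/pg_attribute rows are written only once.

-- hypothetical sketch: reuse one temp table per session instead of
-- dropping and recreating it on every call
create or replace function process_ids(p_ids int[])
returns bigint
language plpgsql
as $$
declare
  n bigint;
begin
  -- created on the first call in the session only; later calls reuse it
  create temp table if not exists scratch_ids(id int);
  truncate scratch_ids;   -- much cheaper than drop + create

  insert into scratch_ids select unnest(p_ids);
  select count(*) into n from scratch_ids;
  return n;
end;
$$;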

>
> Against other databases, there can be used arrays instead temporary
>> tables. This is significantly more effective.
>>
>
> Can you expand on this point? What do you mean? Like using a values
> statement rather than temp table? I find that I often need to create a temp
> table and analyze it to allow the planner to make wise decisions.
>

This is a correct usage of temp tables. When you need to ANALYZE some data,
there is no other possibility than using a temp table.
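
For illustration (the table names below are made up, not from this thread),
the pattern is to materialize the intermediate result, ANALYZE it so the
planner has real row counts and statistics, and only then run the expensive
query:

-- made-up example of the temp table + ANALYZE pattern
create temp table recent_orders as
  select customer_id, total
    from orders
   where created_at > now() - interval '7 days';

analyze recent_orders;

select c.name, sum(r.total)
  from recent_orders r
  join customers c on c.id = r.customer_id
 group by c.name;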

But a lot of people are coming from the MS SQL world, where temporary
tables are used significantly more often - for passing a list of ids between
procedures, for building a multi-row result, ... The implementation of MS
SQL temp tables and table variables is very different, and their usage is
much more common - typically in places where we would use arrays in
Postgres. A different example is Oracle. There, global temporary tables are
the primary tool - again with very low (zero) impact on the system catalog -
and some patterns that work well on Oracle don't work well in Postgres
(under longer, higher load).
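
A sketch of what I mean by using an array instead of a temp table (the
names are illustrative only): instead of filling a temp table with ids in
one procedure and reading it in another, pass an int[] directly, so no
catalog rows are created at all.

-- illustrative names only: the list of ids travels as an array
create or replace function orders_for_customers(p_customer_ids int[])
returns setof orders
language sql
as $$
  select * from orders where customer_id = any(p_customer_ids);
$$;

select * from orders_for_customers(array[10, 20, 30]);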

postgres=# \dt+ pg_attribute
                                          List of relations
┌────────────┬──────────────┬───────┬──────────┬─────────────┬───────────────┬────────┬─────────────┐
│   Schema   │     Name     │ Type  │  Owner   │ Persistence │ Access Method │  Size  │ Description │
╞════════════╪══════════════╪═══════╪══════════╪═════════════╪═══════════════╪════════╪═════════════╡
│ pg_catalog │ pg_attribute │ table │ postgres │ permanent   │ heap          │ 568 kB │             │
└────────────┴──────────────┴───────┴──────────┴─────────────┴───────────────┴────────┴─────────────┘
(1 row)

Timing is on.
postgres=# do $$
begin
for i in 1..1000
loop
create temp table foo(a int);
drop table foo;
end loop;
end;
$$;
DO
Time: 2043,087 ms (00:02,043)
postgres=# do $$
begin
for i in 1..1000
loop
create temp table foo(a int);
drop table foo;
end loop;
end;
$$;
DO
Time: 2082,437 ms (00:02,082)
postgres=# do $$
begin
for i in 1..1000
loop
create temp table foo(a int);
drop table foo;
end loop;
end;
$$;
DO
Time: 2124,664 ms (00:02,125)
postgres=# do $$
begin
for i in 1..1000
loop
create temp table foo(a int);
drop table foo;
end loop;
end;
$$;
DO
Time: 2137,486 ms (00:02,137)
postgres=# do $$
begin
for i in 1..1000
loop
create temp table foo(a int);
drop table foo;
end loop;
end;
$$;
DO
Time: 2188,999 ms (00:02,189)
postgres=# do $$
begin
for i in 1..1000
loop
create temp table foo(a int);
drop table foo;
end loop;
end;
$$;
DO
Time: 2209,167 ms (00:02,209)
postgres=# do $$
begin
for i in 1..1000
loop
create temp table foo(a int);
drop table foo;
end loop;
end;
$$;
DO
Time: 2344,531 ms (00:02,345)

postgres=# \dt+ pg_attribute
                                           List of relations
┌────────────┬──────────────┬───────┬──────────┬─────────────┬───────────────┬─────────┬─────────────┐
│   Schema   │     Name     │ Type  │  Owner   │ Persistence │ Access Method │  Size   │ Description │
╞════════════╪══════════════╪═══════╪══════════╪═════════════╪═══════════════╪═════════╪═════════════╡
│ pg_catalog │ pg_attribute │ table │ postgres │ permanent   │ heap          │ 1592 kB │             │
└────────────┴──────────────┴───────┴──────────┴─────────────┴───────────────┴─────────┴─────────────┘
(1 row)

So some patterns that are usual with temporary tables on Oracle or MS SQL
are bad for Postgres. This is an artificial example - reality can be worse
due to long transactions that can block vacuum.
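
If you want to watch for this effect (this query is my addition, not part
of the original measurement above), the catalog sizes are easy to check:

-- show the current size of the catalogs that temp table churn bloats
select relname, pg_size_pretty(pg_total_relation_size(oid)) as total_size
  from pg_class
 where relname in ('pg_class', 'pg_attribute', 'pg_depend')
 order by pg_total_relation_size(oid) desc;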

On the other hand, the advice for Oracle is also to use temporary tables
only when it is necessary and it is not possible to use a collection.

Regards

Pavel
