Re: Partitioning a table by integer value (preferably in place)

From: Pól Ua Laoínecháin <linehanp(at)tcd(dot)ie>
To: PostgreSQL General <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Partitioning a table by integer value (preferably in place)
Date: 2021-08-13 06:50:40
Message-ID: CAF4RT5RfUZQ0=zDL_tEGJMEDB8yROV2n=zn64oNEDX8VXbR8QQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi all - I resolved my issue - and took a 25 minute query down to 5 seconds.

> I have a 400GB joining table (one SMALLINT and the other INTEGER -
> What I would like to do is to partition by the SMALLINT (1 - 1000)
> value - which would give 1,000 tables of 400MB each.

I found this site very helpful for explaining the basics of
partitioning (LIST, RANGE and HASH):

https://www.enterprisedb.com/postgres-tutorials/how-use-table-partitioning-scale-postgresql

I then found this absolute beaut of a site which was like manna from heaven:

https://www.depesz.com/2021/01/17/are-there-limits-to-partition-counts/

which explained (they had a different issue - I adapted the code) how
what I required can be done entirely from the psql client without the
need for bash or PL/pgSQL or anything else. The "trick" here is to
combine the FORMAT function with GENERATE_SERIES as follows (code from
site):

CREATE TABLE test_ranged (id serial PRIMARY KEY, payload TEXT)
partition BY range (id);
SELECT FORMAT ('CREATE TABLE %I partition OF test_ranged FOR VALUES
FROM (%s) to (%s);', 'test_ranged_' || i, i, i+1) FROM
generate_series(1, 2) i \gexec

Output of this (having removed \gexec - another thing I learnt):

format
---------------------------------------------------------------------------------
CREATE TABLE test_ranged_1 partition OF test_ranged FOR VALUES FROM (1) to (2);
CREATE TABLE test_ranged_2 partition OF test_ranged FOR VALUES FROM (2) to (3);

So, I generated the series for (1, 1000) with my own fields using the
LIST method with a single INTEGER value in the list. Ran the script -
had my 1000 partitions in a matter of seconds. Loading them (750GB
with indexes) was an overnight job however - but that's not
PostgreSQL's fault! :-)

I really love the way that PostgreSQL/psql is so flexible that it's
possible to do heaps of stuff without having to resort to other tools.

I posted this answer to my own question in the hope that it may help
others in my situation. If I haven't been clear, or there's something
missing, please let me know - or add your own opinions/experience if
there's an alternative which may or may not be as efficient. I'm
trying to collect as many strings to my bow as possible!

Rgs,

Pól...

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Zahir Lalani 2021-08-13 08:00:26 Multi-master replication
Previous Message Mladen Gogala 2021-08-12 20:51:33 Re: log_statement GUC parameter