From: Pól Ua Laoínecháin <linehanp(at)tcd(dot)ie>
To: PostgreSQL General <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Partitioning a table by integer value (preferably in place)
Date: 2021-08-13 06:50:40
Message-ID: CAF4RT5RfUZQ0=zDL_tEGJMEDB8yROV2n=zn64oNEDX8VXbR8QQ@mail.gmail.com
Lists: pgsql-general
Hi all - I resolved my issue - and took a 25-minute query down to 5 seconds.
> I have a 400GB joining table (one SMALLINT and the other INTEGER).
> What I would like to do is to partition by the SMALLINT (1 - 1000)
> value - which would give 1,000 tables of 400MB each.
I found this site very helpful for explaining the basics of
partitioning (LIST, RANGE and HASH):
https://www.enterprisedb.com/postgres-tutorials/how-use-table-partitioning-scale-postgresql
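For quick orientation, the three strategies boil down to declarations
like these (toy tables, purely illustrative - not from the tutorial):

CREATE TABLE by_list  (v INTEGER) PARTITION BY LIST  (v);
CREATE TABLE by_range (v INTEGER) PARTITION BY RANGE (v);
CREATE TABLE by_hash  (v INTEGER) PARTITION BY HASH  (v);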
I then found this absolute beaut of a site which was like manna from heaven:
https://www.depesz.com/2021/01/17/are-there-limits-to-partition-counts/
which explained (they had a different issue; I adapted the code) how
what I required could be done entirely from the psql client, without
needing bash or PL/pgSQL or anything else. The "trick" is to combine
the FORMAT function with GENERATE_SERIES as follows (code from the
site):
CREATE TABLE test_ranged (id serial PRIMARY KEY, payload TEXT)
    PARTITION BY RANGE (id);

SELECT FORMAT('CREATE TABLE %I PARTITION OF test_ranged FOR VALUES FROM (%s) TO (%s);',
              'test_ranged_' || i, i, i + 1)
FROM generate_series(1, 2) i \gexec
Output of this (having removed \gexec - another thing I learnt):
                                     format
---------------------------------------------------------------------------------
 CREATE TABLE test_ranged_1 PARTITION OF test_ranged FOR VALUES FROM (1) TO (2);
 CREATE TABLE test_ranged_2 PARTITION OF test_ranged FOR VALUES FROM (2) TO (3);
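(With \gexec appended, psql takes each value in the result set and
sends it back to the server as a statement, so the two generated
CREATE TABLE commands above are executed immediately rather than
merely displayed.)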
So, I generated the series for (1, 1000) with my own fields, using the
LIST method with a single INTEGER value in each list. I ran the script
and had my 1,000 partitions in a matter of seconds. Loading them
(750GB with indexes) was an overnight job - but that's not
PostgreSQL's fault! :-)
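For anyone adapting the same trick, here is a minimal sketch of the
LIST variant described above. The table and column names (big_join,
small_id, other_id) are made up for illustration - substitute your
own:

-- Parent table, partitioned by the SMALLINT key:
CREATE TABLE big_join (
    small_id SMALLINT NOT NULL,
    other_id INTEGER  NOT NULL
) PARTITION BY LIST (small_id);

-- Generate and execute one CREATE TABLE per key value, all from psql:
SELECT FORMAT('CREATE TABLE %I PARTITION OF big_join FOR VALUES IN (%s);',
              'big_join_' || i, i)
FROM generate_series(1, 1000) i \gexec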
I really love the way that PostgreSQL/psql is so flexible that it's
possible to do heaps of stuff without having to resort to other tools.
I'm posting this answer to my own question in the hope that it may
help others in my situation. If I haven't been clear, or there's
something missing, please let me know - or add your own
opinions/experience if there's an alternative which may or may not be
as efficient. I'm trying to add as many strings to my bow as possible!
Rgs,
Pól...