From: | "I(dot) V(dot)" <ianchov(at)gmail(dot)com> |
---|---|
To: | "James Pang (chaolpan)" <chaolpan(at)cisco(dot)com> |
Cc: | pgsql-performance(at)lists(dot)postgresql(dot)org |
Subject: | Re: reindex option for tuning load large data |
Date: | 2022-06-19 08:46:53 |
Message-ID: | CACaCskQVWdU+osQgu7ck0qRDW37dBQy4QWW1_Q+QQP24+UArLQ@mail.gmail.com |
Lists: | pgsql-performance |
Hi James,
You should be using pgloader.
Regards,
Iantcho
On Sun, Jun 19, 2022, 10:16 James Pang (chaolpan) <chaolpan(at)cisco(dot)com>
wrote:
> We extracted the data from Oracle to csv first, and have already converted
> the schema objects from Oracle to PostgreSQL as well. Then we use COPY from
> csv into Postgres.
>
> The point is about which of the 2 options makes the data load fast;
> pg_dump is only used to dump the metadata in Postgres so we can rebuild
> the indexes and recreate the constraints.
>
> The question is whether, instead of dropping and recreating the indexes,
> we can update pg_index set indisready=false and reindex again after the load.
>
>
>
> *From:* Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
> *Sent:* Sunday, June 19, 2022 4:01 AM
> *To:* James Pang (chaolpan) <chaolpan(at)cisco(dot)com>
> *Cc:* pgsql-performance(at)lists(dot)postgresql(dot)org
> *Subject:* Re: reindex option for tuning load large data
>
>
>
>
>
>
>
> On Fri, Jun 17, 2022 at 1:34 AM James Pang (chaolpan) <chaolpan(at)cisco(dot)com>
> wrote:
>
> Hi,
>
> We plan to migrate a large database from Oracle to Postgres (version 13.6,
> OS Red Hat 8 Enterprise), and we are checking options to make the data load
> into Postgres fast. The data volume is several TB, with thousands of indexes
> and many large tables with partitions. We want the data load to run fast
> and to avoid missing any indexes when reindexing. There are 2 options for
> the reindex. Could you give some suggestions about the 2 options and which
> one is better?
>
>
>
> 1. Create the tables and indexes (empty database), update pg_index set
> indisready=false and indisvalid=false, then load the data using COPY from
> csv, then REINDEX TABLE …
>
>
>
> Where did this idea come from? This is likely to destroy your database.
>
>
>
> 2. Use pg_dump to dump the metadata only, then save the “CREATE INDEX …” SQL.
>
> Drop the indexes before the data load.
>
> After the data load, increase max_parallel_maintenance_workers and
> maintenance_work_mem.
>
> Run the CREATE INDEX … SQL to leverage the parallel create-index feature.
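The rebuild step in option 2 can be sketched roughly as follows. The session settings, values, and the table/index names here are illustrative placeholders, not taken from the thread; the right values depend on available RAM and CPU cores:

```sql
-- Session-level settings to speed up the index builds (values are examples)
SET maintenance_work_mem = '4GB';
SET max_parallel_maintenance_workers = 8;

-- Replay each saved CREATE INDEX statement, e.g.:
CREATE INDEX orders_customer_id_idx ON orders (customer_id);
```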
>
>
>
> pg_dump doesn't run against Oracle, so where is the thing you are running
> pg_dump against coming from?
>
>
>
> If you already have a fleshed-out schema in PostgreSQL, you should dump
> the sections separately (with --section=pre-data and --section=post-data)
> to get the commands to build the objects that should be run before and
> after the data is loaded.
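The section-based dump Jeff describes could look roughly like this; the database name and file names are placeholders:

```shell
# Definitions that must exist before the load (tables, types, functions)
pg_dump --schema-only --section=pre-data -f pre-data.sql mydb

# Indexes and constraints to be created after the load
pg_dump --schema-only --section=post-data -f post-data.sql mydb

# Load sequence: pre-data, then the COPY of the csv data, then post-data
psql -d mydb -f pre-data.sql
# ... COPY ... FROM ... WITH (FORMAT csv) commands here ...
psql -d mydb -f post-data.sql
```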
>
>
>
> Cheers,
>
>
>
> Jeff
>