Re: performance problem on big tables

From: Mariel Cherkassky <mariel(dot)cherkassky(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: performance problem on big tables
Date: 2017-08-15 10:06:40
Message-ID: CA+t6e1nG8bF4-hrvjijhJ_nC5OXmw32eXNYdxuYMkPSk2QLrag@mail.gmail.com
Thread:
Lists: pgsql-performance

Hi,
So I ran the checks that Jeff mentioned:
\copy (select * from oracle_remote_table) to /tmp/tmp with binary - 1 hour
and 35 minutes
\copy local_postgresql_table from /tmp/tmp with binary - didn't run, because
the remote Oracle database is currently under maintenance work.
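For reference, the two steps above can be timed together in a single psql session once the maintenance window is over (a sketch; the table names are the ones used in this thread, and \timing is a psql meta-command):

```sql
-- Sketch of the two-step dump-and-restore check, run inside psql.
\timing on
-- Step 1: pull everything from the remote table into a local binary dump.
\copy (select * from oracle_remote_table) to '/tmp/tmp' with binary
-- Step 2: load the dump into the local table, bypassing the foreign data wrapper.
\copy local_postgresql_table from '/tmp/tmp' with binary
```

Comparing the two timings separates the cost of pulling data over the FDW from the cost of the local load itself.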

So I decided to follow MichaelDBA's tips: I set the RAM on my machine to
16G and configured effective_cache_size to 14G, shared_buffers to
2G and maintenance_work_mem to 4G.
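For comparison, here is how those settings would look in postgresql.conf (a sketch of the values stated above; note that effective_cache_size is only a planner hint and is not actually allocated):

```
# Sketch of the settings described above, for a 16G machine
shared_buffers = 2GB            # allocated at server start
effective_cache_size = 14GB     # planner hint only; no memory is reserved
maintenance_work_mem = 4GB      # used per maintenance operation (e.g. CREATE INDEX)
```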

I started running the copy checks again, and so far it has copied 5G in 10
minutes. I have some questions:
1) When I run insert into local_postgresql_table select * from
remote_oracle_table, is the data inserted into the local table in bulk or row
by row? If the answer is in bulk, then why is copy a better option for this
case?
2) Should the copy from the dump into the postgresql database take less time
than the copy to the dump?
3) What do you think about the new memory parameters that I configured?
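Regarding question 1, the usual load-then-index pattern can be sketched as follows (table names are the ones from this thread; the index name and column are hypothetical placeholders). Building an index on an already-loaded table does one large sort bounded by maintenance_work_mem, instead of maintaining the index entry by entry during the load:

```sql
-- Sketch: load the data first, then build each index in one pass.
TRUNCATE local_postgresql_table;
COPY local_postgresql_table FROM '/tmp/tmp' WITH (FORMAT binary);
-- Hypothetical index; each CREATE INDEX sorts once, using maintenance_work_mem.
CREATE INDEX idx_local_example ON local_postgresql_table (some_column);
```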

2017-08-14 16:24 GMT+03:00 Mariel Cherkassky <mariel(dot)cherkassky(at)gmail(dot)com>:

> I have performance issues with two big tables. Those tables are located on
> a remote Oracle database. I'm running the query: insert into
> local_postgresql_table select * from oracle_remote_table.
>
> The first table has 45M records and its size is 23G. The import of the
> data from the remote Oracle database is taking 1 hour and 38 minutes. After
> that I create 13 regular indexes on the table, and it takes 10 minutes per
> index -> 2 hours and 10 minutes in total.
>
> The second table has 29M records and its size is 26G. The import of the
> data from the remote Oracle database is taking 2 hours and 30 minutes. The
> creation of the indexes takes 1 hour and 30 minutes (some are indexes on
> one column and their creation takes 5 min each, and some are indexes on
> multiple columns and take 11 min each).
>
> Those operations are very problematic for me, and I'm searching for a
> solution to improve the performance. The parameters I assigned:
>
> min_parallel_relation_size = 200MB
> max_parallel_workers_per_gather = 5
> max_worker_processes = 8
> effective_cache_size = 2500MB
> work_mem = 16MB
> maintenance_work_mem = 1500MB
> shared_buffers = 2000MB
> RAM : 5G
> CPU CORES : 8
>
> *-I tried running select count(*) from the table in Oracle and in PostgreSQL;
> the running time is almost equal.*
>
> *-Before importing the data I drop the indexes and the constraints.*
>
> *-I tried to copy a 23G file from the Oracle server to the PostgreSQL
> server, and it took 12 minutes.*
>
> Please advise, how can I continue? How can I improve something in this
> operation?
>

