Hi,
I am using psql to periodically dump Postgres tables into JSON files,
which are then imported into Snowflake. For large tables (e.g. 70M rows),
psql takes hours to complete. Reading the Postgres table with Spark does
not seem to work either: the Postgres read-only replica is the bottleneck,
so the Spark cluster never uses more than one worker node, and that node
either times out or runs out of memory.
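
One possible way around a single reader is to split the table into key
ranges and export each range in a separate session. Below is a minimal
Python sketch of how such ranges and queries could be generated. It
assumes the table has a reasonably dense integer key; the table name
`events` and the column name `id` are hypothetical placeholders, and
whether this actually helps depends on where the replica's bottleneck is.

```python
def key_ranges(min_id, max_id, chunks):
    """Split the inclusive range [min_id, max_id] into at most
    `chunks` contiguous, non-overlapping inclusive ranges."""
    total = max_id - min_id + 1
    size = -(-total // chunks)  # ceiling division
    return [(lo, min(lo + size - 1, max_id))
            for lo in range(min_id, max_id + 1, size)]


def export_query(table, key, lo, hi):
    """Build one SELECT per range. row_to_json is a built-in PostgreSQL
    function; `table` and `key` are hypothetical names supplied by the
    caller and are not validated here."""
    return (f"SELECT row_to_json(t) FROM {table} t "
            f"WHERE t.{key} BETWEEN {lo} AND {hi}")


if __name__ == "__main__":
    # Example: 70M ids split into 8 ranges, one query per range.
    for lo, hi in key_ranges(1, 70_000_000, 8):
        print(export_query("events", "id", lo, hi))
```

Each generated query could then be run in its own psql session (for
example with `psql -At -c "<query>" -o <file>`), so that several exports
run concurrently instead of one long scan.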
Will vertically scaling the Postgres database speed up psql? Or is there
any thread-related parameter of psql that can help? Thanks for any hints.
Regards
Lian