RE: Improving pg_dump performance

From: Kevin Brannen <KBrannen(at)efji(dot)com>
To: "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: RE: Improving pg_dump performance
Date: 2018-07-23 21:54:07
Message-ID: SN1PR19MB02554C58A953CD89C34F0B1CA4560@SN1PR19MB0255.namprd19.prod.outlook.com
Lists: pgsql-general

-----Original Message-----
From: Adrian Klaver [mailto:adrian(dot)klaver(at)aklaver(dot)com]
Sent: Monday, July 23, 2018 8:56 AM
To: Ron <ronljohnsonjr(at)gmail(dot)com>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Improving pg_dump performance

On 07/23/2018 06:47 AM, Ron wrote:
> On 07/23/2018 08:46 AM, Stephen Frost wrote:
>> Greetings,
>>
>> * Ron (ronljohnsonjr(at)gmail(dot)com) wrote:
>>> An interesting idea. To clarify: it's possible to parallel backup a
>>> running
>>> 8.4 cluster remotely from a 9.6 system?
>> Yes, you can do a parallel backup, but you won't be able to get a
>> consistent snapshot. You'll need to pause all changes to the
>> database while the pg_dump processes connect and start their
>> transactions to have the backup be consistent.
>
> I can do that!!!

Assuming you can get this set up, have you tested some subset of your data on 9.6.9?

-------------------

+1 on that! Case in point...

When we upgraded from 9.5 to 9.6 (only one version, so it doesn't sound all that bad, does it?), our application failed in two different places, which we traced down to failing SQL. Both instances were something along the lines of:

select fields from table1 join table2 on (key) where conditionA and conditionB;

What happened was that in 9.5 the planner reordered the WHERE clause and evaluated conditionB first, which always failed (at least when it mattered). In 9.6 the planner evaluated conditionA first. The problem came from conditionA needing a type conversion that didn't exist automatically, hence the failure. A simple explicit cast fixed the issue, and we really should have had it in the original version, but the data we tested with never had the characteristics that would have triggered the problem. (In our defense, the data that caused the failure had never shown itself in over 3 years of real usage, so I think I can call that rare.)
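To make that concrete, here is a minimal, made-up sketch of the same class of problem; the table and column names are hypothetical, not our actual schema. A text column usually holds digits, one WHERE condition casts it, and whether the cast ever sees a bad row depends on the plan the new version picks:

-- Hypothetical tables, for illustration only.
create table table1 (id int primary key, code text);            -- code usually holds digits
create table table2 (id int primary key, t1_id int, flag boolean);

-- Fragile: whether the cast in conditionA ever touches a non-numeric
-- row depends on the plan shape (seq scan vs. index-driven join), so
-- a planner change between versions can turn a working query into an error.
select t1.id
from table1 t1
join table2 t2 on (t2.t1_id = t1.id)
where t1.code::integer = 42   -- conditionA: runtime cast, errors on non-numeric values
  and t2.flag;                -- conditionB: may or may not have filtered those rows first

-- Safer: compare without converting the stored text at runtime, so no
-- row can make the query error no matter how the planner orders things.
select t1.id
from table1 t1
join table2 t2 on (t2.t1_id = t1.id)
where t1.code = '42'
  and t2.flag;

In our case a simple explicit cast was enough; depending on your data you may instead want a guard (a regex test or a CASE), since a plain text comparison treats values like '042' differently than an integer comparison would.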

The mistakes were ours, but the new version "tightened" some things and they caught us. The fixes were quite simple to make, but it was a real surprise to us.

So be aware that while Pg has been very good about backward compatibility, at least for us, you can still get bitten by an upgrade. Reading the release notes and looking for changes is good, but in the end, running your code against the new version is the only way to find out.

HTH,
Kevin

