Re: postgres large database backup

From: Michael Loftis <mloftis(at)wgops(dot)com>
To: Vijaykumar Jain <vijaykumarjain(dot)github(at)gmail(dot)com>
Cc: Mladen Gogala <gogala(dot)mladen(at)gmail(dot)com>, pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: postgres large database backup
Date: 2022-12-06 15:03:34
Message-ID: CAHDg04smet6qG0tJvzaP5S98qX+fueXPN+FCd57p56h_HWdqgg@mail.gmail.com
Lists: pgsql-general

On Thu, Dec 1, 2022 at 7:40 AM Vijaykumar Jain
<vijaykumarjain(dot)github(at)gmail(dot)com> wrote:
>
>
>> I do not recall ZFS snapshots being resource intensive, and they were quick. I'll ask around for the actual time.
>
>
> OK, just a small note: our ingestion pattern is write anywhere, read globally. So we stopped ingestion while the snapshot was taken, since we could afford to. The story may be different when a snapshot is taken on live systems that generate a lot of delta.

In the worst case, a ZFS snapshot copies the entire allocation tree
and adjusts ref counters, i.e. metadata only, no data copy. I don't
know if it even works that hard to create a snapshot now (it might
just make a marker); all I know is they've always been fast and cheap.
A differential zfs send | zfs recv between two snapshots is also pretty
damn fast because it knows what's shared and only sends what changed.
There have definitely been major changes in how snapshots are created
over the years to make them even quicker (ISTR it's the "bookmarks"
feature?).
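
As a rough sketch of the differential send/recv mentioned above: the
helper below only prints the pipeline rather than running it, so it is
safe to try anywhere. The function name, dataset names, and snapshot
names are hypothetical, and it assumes the older snapshot already
exists on the receiving side.

```shell
# Dry-run sketch: prints the pipeline instead of executing it.
# All dataset and snapshot names used here are hypothetical.
incremental_send_cmd() {
  src=$1   # source dataset, e.g. tank/pgdata
  old=$2   # older snapshot name, assumed already present on the target
  new=$3   # newer snapshot name
  dst=$4   # receiving dataset
  # zfs send -i sends only the blocks that changed between the two snapshots
  printf 'zfs send -i %s@%s %s@%s | zfs recv %s\n' \
    "$src" "$old" "$src" "$new" "$dst"
}

incremental_send_cmd tank/pgdata TESTSNAP0 TESTSNAP1 backup/pgdata
```

Review the printed command before running it by hand. For a recursive
snapshot set, a replication stream (zfs send -R) would be the likely
variant, which this sketch does not cover.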

This is just a small pool on my local/home NAS (TrueNAS Scale) with
around 40T of data. Note the -r: it's not creating one snapshot but,
uhm, *checks* 64 (-r also creates a snapshot of every volume/filesystem
underneath).
root@...:~ # time zfs snapshot -r tank@TESTSNAP0
0.000u 0.028s 0:00.32 6.2% 144+280k 0+0io 0pf+0w
root@...:~ #
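
For reading the timing line above, here is a small sketch that splits
it into labeled fields. It assumes the line comes from the csh/tcsh
built-in time, whose default layout starts with user CPU, system CPU,
elapsed wall-clock time, and CPU percentage; the function name is
hypothetical.

```shell
# Sketch: label the first four fields of a csh/tcsh-style `time` line.
# Assumes the default csh layout; written for POSIX sh or bash.
explain_csh_time() {
  # Split the line on whitespace into positional parameters
  set -- $1
  # $1 = user CPU ("0.000u"), $2 = system CPU ("0.028s"),
  # $3 = elapsed wall-clock time, $4 = CPU utilisation
  printf 'user=%ss system=%ss elapsed=%s cpu=%s\n' \
    "${1%u}" "${2%s}" "$3" "$4"
}

explain_csh_time "0.000u 0.028s 0:00.32 6.2% 144+280k 0+0io 0pf+0w"
```

Read this way, 0.028 is system CPU seconds and 0:00.32 is the elapsed
time for the whole recursive snapshot.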

I have no idea how many files are in there. My personal home
directory and dev tree are in one of those, and I've got at least half
a dozen versions of the Linux kernel, FreeBSD kernel, and other source
trees, plus quite a few other Very Bushy(tm) source trees, so it's a
fair number of files.

So yeah, 28 msec of system CPU time (about 0.32 s elapsed) for 64
snapshots... they're REALLY cheap to create, and since you're already
paying the performance costs, they're not very expensive to maintain.
And that performance cost isn't awful, unlike in more traditional
snapshot systems. I will say this is something of an optimal case
because I have a very fast NVMe SLOG/ZIL and the box is otherwise
effectively idle. Destroying the freshly created snapshot takes about
the same time. So does destroying six-month-old snapshots, though I
don't have a bonkers amount of changed data in my pool.

--

"Genius might be described as a supreme capacity for getting its possessors
into trouble of all kinds."
-- Samuel Butler
