Re: can we avoid pg_basebackup on planned switches?

From: Ben Chobot <bench(at)silentmedia(dot)com>
To: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Cc: Postgresql General <pgsql-general(at)postgresql(dot)org>
Subject: Re: can we avoid pg_basebackup on planned switches?
Date: 2012-08-05 18:29:25
Message-ID: 60454EF6-EE41-413F-B0FA-2C037D83F8DF@silentmedia.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


On Aug 5, 2012, at 11:12 AM, Fujii Masao wrote:

> On Sat, Jul 28, 2012 at 2:00 AM, Ben Chobot <bench(at)silentmedia(dot)com> wrote:
>> We make heavy use of streaming replication on PG 9.1 and it's been great for
>> us. We do have one issue with it, though, and that's when we switch master
>> nodes - currently, the documentation says that you must run pg_basebackup on
>> your old master to turn it into a slave. That makes sense when the old
>> master had crashed, but it seems that in the case of a planned switch, we
>> could do better. Here's what we tried that seemed to work... are we shooting
>> ourselves in the foot?
>>
>> 1. Cleanly shut down the current master.
>> 2. Pick a slave, turn it into the new master.
>
> Before promoting the standby, you have to confirm that all WAL files
> the old master generated have been shipped to the standby which you'll promote. Because the
> standby might terminate the replication before receiving all WAL
> files. Note that there is no clean way to confirm that. For example, to confirm that, you need to
> execute CHECKPOINT in the standby, run pg_controldata in both old master and
> standby, and check whether their latest checkpoint locations are the same. You
> may think to compare the latest checkpoint location in the old master and
> pg_last_xlog_replay_location in the standby. But the former indicates
> the *starting* location of the last WAL record (i.e., shutdown checkpoint WAL record). OTOH,
> the latter indicates the *ending* location of it. So you should not compare them
> without taking into consideration the above mismatch.
>
> If the standby failed to receive some WAL files, you need to manually copy them
> in pg_xlog from the old master to the standby.

Oh, I would have though that doing a clean shutdown of the old master (step 1) would have made sure that all the unstreamed wal records would be flushed to any connected slaves as part of the master shutting down. In retrospect, I don't remember reading that anywhere, so I must have made that up because I wanted it to be that way. Is it wishful thinking?
>From pgsql-general-owner(at)postgresql(dot)org Sun Aug 5 17:21:37 2012
Received: from magus.postgresql.org (magus.postgresql.org [87.238.57.229])
by mail.postgresql.org (Postfix) with ESMTP id 97F5A67D74F
for <pgsql-general(at)postgresql(dot)org>; Sun, 5 Aug 2012 17:21:36 -0300 (ADT)
Received: from sss.pgh.pa.us ([66.207.139.130])
by magus.postgresql.org with esmtp (Exim 4.72)
(envelope-from <tgl(at)sss(dot)pgh(dot)pa(dot)us>)
id 1Sy7KN-0001wM-OE
for pgsql-general(at)postgresql(dot)org; Sun, 05 Aug 2012 20:21:35 +0000
Received: from sss2.sss.pgh.pa.us (tgl(at)localhost [127.0.0.1])
by sss.pgh.pa.us (8.14.5/8.14.5) with ESMTP id q75KL60q027853;
Sun, 5 Aug 2012 16:21:06 -0400 (EDT)
From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mike Christensen <mike(at)kitchenpc(dot)com>
cc: Tomas Vondra <tv(at)fuzzy(dot)cz>, pgsql-general(at)postgresql(dot)org
Subject: Re: Is this a Postgres Bug?
In-reply-to: <CABs1bs3e2wP3A6Hdj7jed1TW9kpTapC7cAW5ZCWOtd5qjCvY-w(at)mail(dot)gmail(dot)com>
References: <CABs1bs04DNhhb0Y1J1npB9svp+F0Hk6h6C4c-5j6haKDkcDwXQ(at)mail(dot)gmail(dot)com> <501E6713(dot)20601(at)fuzzy(dot)cz> <CABs1bs3e2wP3A6Hdj7jed1TW9kpTapC7cAW5ZCWOtd5qjCvY-w(at)mail(dot)gmail(dot)com>
Comments: In-reply-to Mike Christensen <mike(at)kitchenpc(dot)com>
message dated "Sun, 05 Aug 2012 08:58:32 -0700"
Date: Sun, 05 Aug 2012 16:21:05 -0400
Message-ID: <27852(dot)1344198065(at)sss(dot)pgh(dot)pa(dot)us>
X-Pg-Spam-Score: -1.9 (-)
X-Archive-Number: 201208/86
X-Sequence-Number: 189263

Mike Christensen <mike(at)kitchenpc(dot)com> writes:
> 1) Is it possible to make int2 + int2 = int4?

We could do that, but why stop there? int4 + int4 can overflow, maybe
its result should be int8? int8 + int8 can overflow, maybe its result
should be numeric? numeric + numeric can overflow, now what? And what
about subtraction, multiplication, and various other operators?

The long and the short of it is that you've made an unwise choice of
datatype, if you have values that are close enough to the overflow
threshold for this to be an issue.

> 2) I've noticed if I have an index on (CookTime::Int4 +
> PrepTime::Int4), then the query WHERE (CookTime + PrepTime > 100)
> won't use the index. However, WHERE (CookTime::Int4 + PrepTime::Int4
> 100) *will* use the index. Is this by design, and can the query
> planner by smarter about this scenario?

Yes, and no. The planner does not know enough about the semantics of
cross-type coercions to infer that these expressions are equivalent.
In fact, they *aren't* equivalent, precisely because of the differing
prospects for overflow, and so the planner would be quite exceeding its
authority to replace one with the other.

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Mike Christensen 2012-08-06 02:56:47 Re: Is this a Postgres Bug?
Previous Message Fujii Masao 2012-08-05 18:12:22 Re: can we avoid pg_basebackup on planned switches?