From: Chris Dunlop <chris@onthe.net.au>
To: Gregory Farnum <gfarnum@redhat.com>
Cc: ceph-devel <ceph-devel@vger.kernel.org>
Subject: Re: New pacific mon won't join with octopus mons
Date: Thu, 2 Sep 2021 11:41:25 +1000	[thread overview]
Message-ID: <20210902014125.GA13333@onthe.net.au> (raw)
In-Reply-To: <CAJ4mKGZN+zAzyMM1+mWuPw5r1v=b-QQrChm+_0nfWtzcbyx=_g@mail.gmail.com>

Hi Gregory,

On Wed, Sep 01, 2021 at 10:56:56AM -0700, Gregory Farnum wrote:
> Why are you trying to create a new pacific monitor instead of
> upgrading an existing one?

The "ceph orch upgrade" failed twice at the point of upgrading the 
mons, once due to the octopus mons getting the "--init" argument added 
to their docker startup and the docker version on Debian Buster not 
supporting both the "--init" and "-v /dev:/dev" args at the same time, 
per:

https://github.com/moby/moby/pull/37665

...and once because the cluster has never had a cephfs:

https://tracker.ceph.com/issues/51673
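
(For anyone else on Buster wanting to check whether their docker is 
affected before kicking off an upgrade, I believe something along these 
lines reproduces it -- "busybox" is just an arbitrary small image, 
nothing ceph-specific:

$ docker run --rm --init -v /dev:/dev busybox true

On the old docker that combination fails at container start; on a 
docker with the fix above it exits cleanly.)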

So at one point I had one mon down due to the failed upgrade, and then 
another of the 3 originals was taken out by its host's disk filling up 
(due, I think, to the excessive logging at the time combined with 
having both docker and podman images pulled in). That left me with a 
single octopus mon running and no quorum, the cluster at a standstill, 
and me panic-learning how to deal with the situation. Fun times.
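
(For the record, the way out was essentially the documented "remove the 
dead monitors from the monmap" procedure, run against the one surviving 
mon while it was stopped -- a rough sketch, with placeholder mon ids, 
and noting that under cephadm these need to be run inside the mon's 
container environment:

$ ceph-mon -i <surviving-mon> --extract-monmap /tmp/monmap
$ monmaptool /tmp/monmap --rm <dead-mon-1> --rm <dead-mon-2>
$ ceph-mon -i <surviving-mon> --inject-monmap /tmp/monmap

...then start the surviving mon again and it can form a quorum of one.)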

So yes, I was feeling just a little leery about upgrading the octopus 
mons and potentially losing quorum again!

> I *think* what's going on here is that since you're deploying a new
> pacific mon, and you're not giving it a starting monmap, it's set up
> to assume the use of pacific features. It can find peers at the
> locations you've given it, but since they're on octopus there are
> mismatches.
>
> Now, I would expect and want this to work so you should file a bug,

https://tracker.ceph.com/issues/52488
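
(For what it's worth, my understanding is that the manual way to seed a 
new mon with the existing cluster's map, rather than letting it invent 
one with pacific features, is roughly the usual add-a-monitor dance -- 
a sketch only, with the mon id and paths as placeholders:

$ ceph mon getmap -o /tmp/monmap
$ ceph auth get mon. -o /tmp/mon.keyring
$ ceph-mon -i <new-mon> --mkfs --monmap /tmp/monmap --keyring /tmp/mon.keyring

...whereas the orchestrator deployment presumably bootstraps the new 
mon without handing it the current monmap.)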

> but the initial bootstrapping code is a bit hairy and may not account
> for cross-version initial setup in this fashion, or have gotten buggy
> since written. So I'd try upgrading the existing mons, or generating a
> new octopus mon and upgrading that one to pacific if you're feeling
> leery.

Yes, I thought a safer / less stressful way of progressing would be to 
add a new octopus mon to the existing quorum and upgrade that one 
first as a test. I went ahead with that and checked the cluster health 
immediately afterwards: "ceph -s" showed HEALTH_OK, with 4 mons, i.e. 
3 x octopus and 1 x pacific.
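
(Roughly, the idea was along these lines -- hostname, IP and image are 
placeholders, and the exact redeploy syntax may vary a little between 
releases:

$ ceph orch daemon add mon <newhost>:<ip>
$ ceph orch daemon redeploy mon.<newhost> <pacific-image>
$ ceph versions

...with "ceph versions" confirming 3 octopus mons and 1 pacific mon.)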

Nice!  

But shortly afterwards alarms started going off, and the health of the 
cluster was looking more than a little gut-wrenching, with ALL pgs 
showing up as inactive / unknown:

$ ceph -s
   cluster:
     id:     c6618970-0ce0-4cb2-bc9a-dd5f29b62e24
     health: HEALTH_WARN
             Reduced data availability: 5721 pgs inactive
             (muted: OSDMAP_FLAGS POOL_NO_REDUNDANCY)

   services:
     mon: 4 daemons, quorum k2,b2,b4,b5 (age 43m)
     mgr: b5(active, starting, since 40m), standbys: b4, b2
     osd: 78 osds: 78 up (since 4d), 78 in (since 3w)
          flags noout

   data:
     pools:   12 pools, 5721 pgs
     objects: 0 objects, 0 B
     usage:   0 B used, 0 B / 0 B avail
     pgs:     100.000% pgs unknown
              5721 unknown

$ ceph health detail
HEALTH_WARN Reduced data availability: 5721 pgs inactive; (muted: OSDMAP_FLAGS POOL_NO_REDUNDANCY)
(MUTED) [WRN] OSDMAP_FLAGS: noout flag(s) set
[WRN] PG_AVAILABILITY: Reduced data availability: 5721 pgs inactive
     pg 6.fcd is stuck inactive for 41m, current state unknown, last acting []
     pg 6.fce is stuck inactive for 41m, current state unknown, last acting []
     pg 6.fcf is stuck inactive for 41m, current state unknown, last acting []
     pg 6.fd0 is stuck inactive for 41m, current state unknown, last acting []
     ...etc.

So that was also heaps of fun for a while, until I thought to remove 
the pacific mon, at which point the health reverted to normal. Bug 
filed:

https://tracker.ceph.com/issues/52489
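
("Removing" there just meant taking the new pacific mon back out of the 
picture, something like the following, with the hostname a placeholder:

$ ceph orch daemon rm mon.<newhost> --force
$ ceph mon remove <newhost>

...after which all the pgs reported their proper states again.)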

At this point I'm more than a little gun-shy, but I'm girding my loins 
to go ahead with the rest of the upgrade on the basis that the health 
issue is "just" a temporary reporting problem (albeit a highly 
startling one!) with mixed octopus and pacific mons.
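
(The plan for the rest is just the standard orchestrator upgrade again, 
kept on a short leash this time -- the version here is only an example:

$ ceph orch upgrade start --ceph-version 16.2.5
$ ceph orch upgrade status

...with "ceph orch upgrade pause" close at hand if the mons start 
looking unhappy.)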

Cheers,

Chris
