* Luminous 12.1.1 upgrade mgr woes
@ 2017-07-18  9:03 Mark Kirkwood
  2017-07-18  9:07 ` Mark Kirkwood
  2017-07-18 11:32 ` John Spray
  0 siblings, 2 replies; 11+ messages in thread
From: Mark Kirkwood @ 2017-07-18  9:03 UTC (permalink / raw)
  To: ceph-devel

[-- Attachment #1: Type: text/plain, Size: 714 bytes --]

Hi,

Just had a go at this: upgrading to 12.1.1 from a freshly deployed Jewel
(10.2.9) on Ubuntu 16.04, following
http://docs.ceph.com/docs/master/release-notes/#upgrade-from-jewel-or-kraken.

It all worked ok *except* for the mgr deploy, which hung at the
key/caps modification stage (see attached). I managed to work around it:

- switch cephx to none in ceph.conf

- restart mon

- redeploy mgr

- edit client.admin and add missing: caps mgr = "allow *"

- switch to cephx again, restart mon, mgr

- ...and continue. But this makes things a whole lot messier than
needed; it would be good for this not to trip up upgraders on important
systems (I'm just on a play setup, so no stress here)!
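
For anyone repeating the cephx toggle on a test cluster, the ceph.conf
edit can be scripted. What follows is only a minimal Python sketch. It
assumes the auth settings live in the [global] section under the option
names auth_cluster_required, auth_service_required and
auth_client_required. The helper name set_auth_mode is made up for
illustration, and the mon restart, mgr redeploy and caps edit still have
to be done by hand:

```python
import configparser
import io

# Assumption: these are the three options that select cephx or none.
AUTH_OPTIONS = (
    "auth_cluster_required",
    "auth_service_required",
    "auth_client_required",
)


def set_auth_mode(conf_text: str, mode: str) -> str:
    """Return conf_text with the [global] auth options set to mode.

    Hypothetical helper: it treats ceph.conf as plain INI, so comments
    in the original file are dropped when the text is written back out.
    """
    if mode not in ("cephx", "none"):
        raise ValueError("mode must be 'cephx' or 'none'")

    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    if not parser.has_section("global"):
        parser.add_section("global")

    # Option names may be written with spaces or underscores; remove
    # either spelling first so no conflicting duplicate is left behind.
    for existing in list(parser["global"]):
        if existing.replace(" ", "_") in AUTH_OPTIONS:
            parser.remove_option("global", existing)

    for option in AUTH_OPTIONS:
        parser.set("global", option, mode)

    out = io.StringIO()
    parser.write(out)
    return out.getvalue()
```

Calling set_auth_mode(text, "none") before the redeploy and
set_auth_mode(text, "cephx") afterwards mirrors the first and last steps
of the list above. Disabling cephx removes authentication entirely, so
this only makes sense on a throwaway cluster like the one described.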

regards

Mark



[-- Attachment #2: mgr.log --]
[-- Type: text/x-log, Size: 2744 bytes --]

[2017-07-18 20:43:09,773][ceph_deploy.conf][DEBUG ] found configuration file at: /home/markir/.cephdeploy.conf
[2017-07-18 20:43:09,773][ceph_deploy.cli][INFO  ] Invoked (1.5.38): /usr/local/bin/ceph-deploy mgr create ceph1
[2017-07-18 20:43:09,773][ceph_deploy.cli][INFO  ] ceph-deploy options:
[2017-07-18 20:43:09,773][ceph_deploy.cli][INFO  ]  username                      : None
[2017-07-18 20:43:09,773][ceph_deploy.cli][INFO  ]  verbose                       : False
[2017-07-18 20:43:09,774][ceph_deploy.cli][INFO  ]  mgr                           : [('ceph1', 'ceph1')]
[2017-07-18 20:43:09,774][ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[2017-07-18 20:43:09,774][ceph_deploy.cli][INFO  ]  subcommand                    : create
[2017-07-18 20:43:09,774][ceph_deploy.cli][INFO  ]  quiet                         : False
[2017-07-18 20:43:09,774][ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7fd9ea1fbab8>
[2017-07-18 20:43:09,774][ceph_deploy.cli][INFO  ]  cluster                       : ceph
[2017-07-18 20:43:09,774][ceph_deploy.cli][INFO  ]  func                          : <function mgr at 0x7fd9ea6cb0c8>
[2017-07-18 20:43:09,774][ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[2017-07-18 20:43:09,774][ceph_deploy.cli][INFO  ]  default_release               : False
[2017-07-18 20:43:09,774][ceph_deploy.mgr][DEBUG ] Deploying mgr, cluster ceph hosts ceph1:ceph1
[2017-07-18 20:43:10,230][ceph1][DEBUG ] connection detected need for sudo
[2017-07-18 20:43:10,681][ceph1][DEBUG ] connected to host: ceph1 
[2017-07-18 20:43:10,681][ceph1][DEBUG ] detect platform information from remote host
[2017-07-18 20:43:10,693][ceph1][DEBUG ] detect machine type
[2017-07-18 20:43:10,696][ceph_deploy.mgr][INFO  ] Distro info: Ubuntu 16.04 xenial
[2017-07-18 20:43:10,696][ceph_deploy.mgr][DEBUG ] remote host will use systemd
[2017-07-18 20:43:10,696][ceph_deploy.mgr][DEBUG ] deploying mgr bootstrap to ceph1
[2017-07-18 20:43:10,697][ceph1][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[2017-07-18 20:43:10,699][ceph1][DEBUG ] create path if it doesn't exist
[2017-07-18 20:43:10,700][ceph1][INFO  ] Running command: sudo ceph --cluster ceph --name client.bootstrap-mgr --keyring /var/lib/ceph/bootstrap-mgr/ceph.keyring auth get-or-create mgr.ceph1 mon allow profile mgr osd allow * mds allow * -o /var/lib/ceph/mgr/ceph-ceph1/keyring
[2017-07-18 20:43:10,967][ceph1][INFO  ] Running command: sudo systemctl enable ceph-mgr@ceph1
[2017-07-18 20:43:11,033][ceph1][INFO  ] Running command: sudo systemctl start ceph-mgr@ceph1
[2017-07-18 20:43:11,068][ceph1][INFO  ] Running command: sudo systemctl enable ceph.target

* Re: Luminous 12.1.1 upgrade mgr woes
  2017-07-18  9:03 Luminous 12.1.1 upgrade mgr woes Mark Kirkwood
@ 2017-07-18  9:07 ` Mark Kirkwood
  2017-07-18 13:24   ` Sage Weil
  2017-07-18 11:32 ` John Spray
  1 sibling, 1 reply; 11+ messages in thread
From: Mark Kirkwood @ 2017-07-18  9:07 UTC (permalink / raw)
  To: ceph-devel

On 18/07/17 21:03, Mark Kirkwood wrote:

> So it all worked ok *except* for the the mgr deploy, this hang at the 
> key/caps modification stage (see attached). 

Hmm - sorry, managed to leave off the important bit of the log:
[ceph1][WARNIN] No data was received after 300 seconds, disconnecting...
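
In a longer ceph-deploy log this warning is easy to miss, so it can help
to scan for it. A minimal Python sketch, assuming the message format
shown above; find_timeouts is a hypothetical helper, not part of
ceph-deploy:

```python
import re

# Matches "[host][WARNIN] No data was received after N seconds", with or
# without a leading "[timestamp]" field and padding inside the level tag.
TIMEOUT_RE = re.compile(
    r"\[(?P<host>[^\]]+)\]\[WARNIN\s*\]\s+"
    r"No data was received after (?P<seconds>\d+) seconds"
)


def find_timeouts(lines):
    """Return (host, seconds) for every disconnect warning in lines."""
    hits = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            hits.append((match.group("host"), int(match.group("seconds"))))
    return hits
```

Passing the lines of a saved log to find_timeouts gives one tuple per
host that stopped responding.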


* Re: Luminous 12.1.1 upgrade mgr woes
  2017-07-18  9:03 Luminous 12.1.1 upgrade mgr woes Mark Kirkwood
  2017-07-18  9:07 ` Mark Kirkwood
@ 2017-07-18 11:32 ` John Spray
  2017-07-18 12:17   ` Joao Eduardo Luis
  2017-07-19  2:27   ` Mark Kirkwood
  1 sibling, 2 replies; 11+ messages in thread
From: John Spray @ 2017-07-18 11:32 UTC (permalink / raw)
  To: Mark Kirkwood; +Cc: ceph-devel

On Tue, Jul 18, 2017 at 10:03 AM, Mark Kirkwood
<mark.kirkwood@catalyst.net.nz> wrote:
> Hi,
>
> Just had a go at this - 12.1.1 from a freshly deployed Jewel (10.2.9) on
> Ubuntu 16.04, following
> http://docs.ceph.com/docs/master/release-notes/#upgrade-from-jewel-or-kraken.
>
> So it all worked ok *except* for the the mgr deploy, this hang at the
> key/caps modification stage (see attached). Now I managed to work around it:
>
> - switch cephx to none in ceph.conf
>
> - restart mon
>
> - redeploy mgr

Hmm, I suspect the issue is with the bootstrap-mgr keyring.  I notice
that when trying a "mgr create" on an upgraded cluster, ceph-deploy is
prompting me to do a "gatherkeys", at which point it generates the
keyring.  However, the bootstrap-mgr identity that I have inside the
mon is weird: its key is AAAAAAAAAAAAAAAA.

Even after I've got the bootstrap-mgr keyring (whose AAA... key
matches the weird one that the mon has), I get EINVAL connecting, and
the mon is logging "error when trying to handle auth request, probably
malformed request".

So yeah, something's pretty broken here!

John


> - edit client.admin and add missing: caps mgr = "allow *"
>
> - switch to cephx again, restart mon, mgr
>
> - ...and continue, but it makes things a whole lot more messy than needed,
> would be good for this not to trip up upgraders on important systems (I'm
> just on a play setup, so no stress here)!
>
> regards
>
> Mark
>
>

* Re: Luminous 12.1.1 upgrade mgr woes
  2017-07-18 11:32 ` John Spray
@ 2017-07-18 12:17   ` Joao Eduardo Luis
  2017-07-18 12:20     ` John Spray
  2017-07-19  2:27   ` Mark Kirkwood
  1 sibling, 1 reply; 11+ messages in thread
From: Joao Eduardo Luis @ 2017-07-18 12:17 UTC (permalink / raw)
  To: John Spray, Mark Kirkwood; +Cc: ceph-devel

On 07/18/2017 12:32 PM, John Spray wrote:
> On Tue, Jul 18, 2017 at 10:03 AM, Mark Kirkwood
> <mark.kirkwood@catalyst.net.nz> wrote:
>> Hi,
>>
>> Just had a go at this - 12.1.1 from a freshly deployed Jewel (10.2.9) on
>> Ubuntu 16.04, following
>> http://docs.ceph.com/docs/master/release-notes/#upgrade-from-jewel-or-kraken.
>>
>> So it all worked ok *except* for the the mgr deploy, this hang at the
>> key/caps modification stage (see attached). Now I managed to work around it:
>>
>> - switch cephx to none in ceph.conf
>>
>> - restart mon
>>
>> - redeploy mgr
> 
> Hmm, I suspect the issue is with the bootstrap-mgr keyring.  I notice
> that when trying a "mgr create" on an upgraded cluster, ceph-deploy is
> prompting me to do a "gatherkeys", at which point it generates the
> keyring.  However, the bootstrap-mgr identity that I have inside the
> mon is weird, its key is AAAAAAAAAAAAAAAA.
> 
> Even after I've got the bootstrap-mgr keyring (whose AAA... key
> matches the weird one that the mon has), I get EINVAL connecting, and
> the mon is logging "error when trying to handle auth request, probably
> malformed request".
> 
> So yeah, something's pretty broken here!

I was having that when working on `osd new`, I think, but IIRC I managed 
to fix the bug.

This may be somehow related to the refactoring I did on AuthMonitor though.

Is this just a matter of running a 'mgr create' on an upgraded cluster? 
If so, I'll try reproducing this in the afternoon and see if I can 
figure out what went wrong.

   -Joao

* Re: Luminous 12.1.1 upgrade mgr woes
  2017-07-18 12:17   ` Joao Eduardo Luis
@ 2017-07-18 12:20     ` John Spray
  2017-07-18 14:42       ` Joao Eduardo Luis
  0 siblings, 1 reply; 11+ messages in thread
From: John Spray @ 2017-07-18 12:20 UTC (permalink / raw)
  To: Joao Eduardo Luis; +Cc: Mark Kirkwood, ceph-devel

On Tue, Jul 18, 2017 at 1:17 PM, Joao Eduardo Luis <joao@suse.de> wrote:
> On 07/18/2017 12:32 PM, John Spray wrote:
>>
>> On Tue, Jul 18, 2017 at 10:03 AM, Mark Kirkwood
>> <mark.kirkwood@catalyst.net.nz> wrote:
>>>
>>> Hi,
>>>
>>> Just had a go at this - 12.1.1 from a freshly deployed Jewel (10.2.9) on
>>> Ubuntu 16.04, following
>>>
>>> http://docs.ceph.com/docs/master/release-notes/#upgrade-from-jewel-or-kraken.
>>>
>>> So it all worked ok *except* for the the mgr deploy, this hang at the
>>> key/caps modification stage (see attached). Now I managed to work around
>>> it:
>>>
>>> - switch cephx to none in ceph.conf
>>>
>>> - restart mon
>>>
>>> - redeploy mgr
>>
>>
>> Hmm, I suspect the issue is with the bootstrap-mgr keyring.  I notice
>> that when trying a "mgr create" on an upgraded cluster, ceph-deploy is
>> prompting me to do a "gatherkeys", at which point it generates the
>> keyring.  However, the bootstrap-mgr identity that I have inside the
>> mon is weird, its key is AAAAAAAAAAAAAAAA.
>>
>> Even after I've got the bootstrap-mgr keyring (whose AAA... key
>> matches the weird one that the mon has), I get EINVAL connecting, and
>> the mon is logging "error when trying to handle auth request, probably
>> malformed request".
>>
>> So yeah, something's pretty broken here!
>
>
> I was having that when working on `osd new`, I think, but IIRC I managed to
> fix the bug.
>
> This may be somehow related to the refactoring I did on AuthMonitor though.
>
> Is this just a matter of running a 'mgr create' on an upgraded cluster? If
> so, I'll try reproducing this in the afternoon and see if I can figure out
> what went wrong.

Pretty much -- my cluster was a bit different though because it had
been kraken, so the mon nodes already had mgrs on them.  I was running
"mgr create" on one of the nodes that had never had a mgr or monitor
on it.

John



>
>   -Joao

* Re: Luminous 12.1.1 upgrade mgr woes
  2017-07-18  9:07 ` Mark Kirkwood
@ 2017-07-18 13:24   ` Sage Weil
  0 siblings, 0 replies; 11+ messages in thread
From: Sage Weil @ 2017-07-18 13:24 UTC (permalink / raw)
  To: Mark Kirkwood; +Cc: ceph-devel

On Tue, 18 Jul 2017, Mark Kirkwood wrote:
> On 18/07/17 21:03, Mark Kirkwood wrote:
> 
> > So it all worked ok *except* for the the mgr deploy, this hang at the
> > key/caps modification stage (see attached). 
> 
> Hmm - sorry, managed to leave off the important bit of the log:
> [ceph1][WARNIN] No data was received after 300 seconds, disconnecting...

Opened http://tracker.ceph.com/issues/20666

Thanks!
sage

* Re: Luminous 12.1.1 upgrade mgr woes
  2017-07-18 12:20     ` John Spray
@ 2017-07-18 14:42       ` Joao Eduardo Luis
  2017-07-18 15:08         ` Sage Weil
  0 siblings, 1 reply; 11+ messages in thread
From: Joao Eduardo Luis @ 2017-07-18 14:42 UTC (permalink / raw)
  To: John Spray; +Cc: Mark Kirkwood, ceph-devel

On 07/18/2017 01:20 PM, John Spray wrote:
> On Tue, Jul 18, 2017 at 1:17 PM, Joao Eduardo Luis <joao@suse.de> wrote:
>> On 07/18/2017 12:32 PM, John Spray wrote:
>>>
>>> On Tue, Jul 18, 2017 at 10:03 AM, Mark Kirkwood
>>> <mark.kirkwood@catalyst.net.nz> wrote:
>>>>
>>>> Hi,
>>>>
>>>> Just had a go at this - 12.1.1 from a freshly deployed Jewel (10.2.9) on
>>>> Ubuntu 16.04, following
>>>>
>>>> http://docs.ceph.com/docs/master/release-notes/#upgrade-from-jewel-or-kraken.
>>>>
>>>> So it all worked ok *except* for the the mgr deploy, this hang at the
>>>> key/caps modification stage (see attached). Now I managed to work around
>>>> it:
>>>>
>>>> - switch cephx to none in ceph.conf
>>>>
>>>> - restart mon
>>>>
>>>> - redeploy mgr
>>>
>>>
>>> Hmm, I suspect the issue is with the bootstrap-mgr keyring.  I notice
>>> that when trying a "mgr create" on an upgraded cluster, ceph-deploy is
>>> prompting me to do a "gatherkeys", at which point it generates the
>>> keyring.  However, the bootstrap-mgr identity that I have inside the
>>> mon is weird, its key is AAAAAAAAAAAAAAAA.
>>>
>>> Even after I've got the bootstrap-mgr keyring (whose AAA... key
>>> matches the weird one that the mon has), I get EINVAL connecting, and
>>> the mon is logging "error when trying to handle auth request, probably
>>> malformed request".
>>>
>>> So yeah, something's pretty broken here!
>>
>>
>> I was having that when working on `osd new`, I think, but IIRC I managed to
>> fix the bug.
>>
>> This may be somehow related to the refactoring I did on AuthMonitor though.
>>
>> Is this just a matter of running a 'mgr create' on an upgraded cluster? If
>> so, I'll try reproducing this in the afternoon and see if I can figure out
>> what went wrong.
> 
> Pretty much -- my cluster was a bit different though because it had
> been kraken, so the mon nodes already had mgrs on them.  I was running
> "mgr create" one one of the nodes that had never had a mgr or monitor
> on it.

Looks like the problem is due to the auth entity not having a key at all 
when it's added during upgrade.

PR https://github.com/ceph/ceph/pull/16395 fixes it.

   -Joao

* Re: Luminous 12.1.1 upgrade mgr woes
  2017-07-18 14:42       ` Joao Eduardo Luis
@ 2017-07-18 15:08         ` Sage Weil
  2017-07-18 15:30           ` Joao Eduardo Luis
  0 siblings, 1 reply; 11+ messages in thread
From: Sage Weil @ 2017-07-18 15:08 UTC (permalink / raw)
  To: Joao Eduardo Luis; +Cc: John Spray, Mark Kirkwood, ceph-devel

On Tue, 18 Jul 2017, Joao Eduardo Luis wrote:
> On 07/18/2017 01:20 PM, John Spray wrote:
> > On Tue, Jul 18, 2017 at 1:17 PM, Joao Eduardo Luis <joao@suse.de> wrote:
> > > On 07/18/2017 12:32 PM, John Spray wrote:
> > > > 
> > > > On Tue, Jul 18, 2017 at 10:03 AM, Mark Kirkwood
> > > > <mark.kirkwood@catalyst.net.nz> wrote:
> > > > > 
> > > > > Hi,
> > > > > 
> > > > > Just had a go at this - 12.1.1 from a freshly deployed Jewel (10.2.9)
> > > > > on
> > > > > Ubuntu 16.04, following
> > > > > 
> > > > > http://docs.ceph.com/docs/master/release-notes/#upgrade-from-jewel-or-kraken.
> > > > > 
> > > > > So it all worked ok *except* for the the mgr deploy, this hang at the
> > > > > key/caps modification stage (see attached). Now I managed to work
> > > > > around
> > > > > it:
> > > > > 
> > > > > - switch cephx to none in ceph.conf
> > > > > 
> > > > > - restart mon
> > > > > 
> > > > > - redeploy mgr
> > > > 
> > > > 
> > > > Hmm, I suspect the issue is with the bootstrap-mgr keyring.  I notice
> > > > that when trying a "mgr create" on an upgraded cluster, ceph-deploy is
> > > > prompting me to do a "gatherkeys", at which point it generates the
> > > > keyring.  However, the bootstrap-mgr identity that I have inside the
> > > > mon is weird, its key is AAAAAAAAAAAAAAAA.
> > > > 
> > > > Even after I've got the bootstrap-mgr keyring (whose AAA... key
> > > > matches the weird one that the mon has), I get EINVAL connecting, and
> > > > the mon is logging "error when trying to handle auth request, probably
> > > > malformed request".
> > > > 
> > > > So yeah, something's pretty broken here!
> > > 
> > > 
> > > I was having that when working on `osd new`, I think, but IIRC I managed
> > > to
> > > fix the bug.
> > > 
> > > This may be somehow related to the refactoring I did on AuthMonitor
> > > though.
> > > 
> > > Is this just a matter of running a 'mgr create' on an upgraded cluster? If
> > > so, I'll try reproducing this in the afternoon and see if I can figure out
> > > what went wrong.
> > 
> > Pretty much -- my cluster was a bit different though because it had
> > been kraken, so the mon nodes already had mgrs on them.  I was running
> > "mgr create" one one of the nodes that had never had a mgr or monitor
> > on it.
> 
> Looks like the problem is due to the auth entity not having a key at all when
> it's added during upgrade.
> 
> PR https://github.com/ceph/ceph/pull/16395 fixes it.

That fix looks right to me.  Were you able to reproduce the original 
issue, and/or did you test with the fix?

Thanks!
sage

* Re: Luminous 12.1.1 upgrade mgr woes
  2017-07-18 15:08         ` Sage Weil
@ 2017-07-18 15:30           ` Joao Eduardo Luis
  0 siblings, 0 replies; 11+ messages in thread
From: Joao Eduardo Luis @ 2017-07-18 15:30 UTC (permalink / raw)
  To: Sage Weil; +Cc: John Spray, Mark Kirkwood, ceph-devel

On 07/18/2017 04:08 PM, Sage Weil wrote:
> On Tue, 18 Jul 2017, Joao Eduardo Luis wrote:
>> On 07/18/2017 01:20 PM, John Spray wrote:
>>> On Tue, Jul 18, 2017 at 1:17 PM, Joao Eduardo Luis <joao@suse.de> wrote:
>>>> On 07/18/2017 12:32 PM, John Spray wrote:
>>>>>
>>>>> On Tue, Jul 18, 2017 at 10:03 AM, Mark Kirkwood
>>>>> <mark.kirkwood@catalyst.net.nz> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Just had a go at this - 12.1.1 from a freshly deployed Jewel (10.2.9)
>>>>>> on
>>>>>> Ubuntu 16.04, following
>>>>>>
>>>>>> http://docs.ceph.com/docs/master/release-notes/#upgrade-from-jewel-or-kraken.
>>>>>>
>>>>>> So it all worked ok *except* for the the mgr deploy, this hang at the
>>>>>> key/caps modification stage (see attached). Now I managed to work
>>>>>> around
>>>>>> it:
>>>>>>
>>>>>> - switch cephx to none in ceph.conf
>>>>>>
>>>>>> - restart mon
>>>>>>
>>>>>> - redeploy mgr
>>>>>
>>>>>
>>>>> Hmm, I suspect the issue is with the bootstrap-mgr keyring.  I notice
>>>>> that when trying a "mgr create" on an upgraded cluster, ceph-deploy is
>>>>> prompting me to do a "gatherkeys", at which point it generates the
>>>>> keyring.  However, the bootstrap-mgr identity that I have inside the
>>>>> mon is weird, its key is AAAAAAAAAAAAAAAA.
>>>>>
>>>>> Even after I've got the bootstrap-mgr keyring (whose AAA... key
>>>>> matches the weird one that the mon has), I get EINVAL connecting, and
>>>>> the mon is logging "error when trying to handle auth request, probably
>>>>> malformed request".
>>>>>
>>>>> So yeah, something's pretty broken here!
>>>>
>>>>
>>>> I was having that when working on `osd new`, I think, but IIRC I managed
>>>> to
>>>> fix the bug.
>>>>
>>>> This may be somehow related to the refactoring I did on AuthMonitor
>>>> though.
>>>>
>>>> Is this just a matter of running a 'mgr create' on an upgraded cluster? If
>>>> so, I'll try reproducing this in the afternoon and see if I can figure out
>>>> what went wrong.
>>>
>>> Pretty much -- my cluster was a bit different though because it had
>>> been kraken, so the mon nodes already had mgrs on them.  I was running
>>> "mgr create" one one of the nodes that had never had a mgr or monitor
>>> on it.
>>
>> Looks like the problem is due to the auth entity not having a key at all when
>> it's added during upgrade.
>>
>> PR https://github.com/ceph/ceph/pull/16395 fixes it.
> 
> That fix looks right to me.  Were you able to reproduce the original
> issue, and/or did you test with the fix?

Yes to both.

Reproducing is trivial, coming from kraken:

- have a vstart kraken cluster
- upgrade monitors to luminous
- see a 'client.bootstrap-mgr' entry popping up on 'auth list' with a 
key something like 'AAAAAAAA'

After the fix, the key is a proper cephx key.
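
The placeholder key is easy to recognise because a base64 string made up
only of 'A' characters decodes to nothing but zero bytes. A minimal
Python sketch of that check, assuming the key has already been copied
out of the 'auth list' output; looks_like_empty_key is a hypothetical
helper, not something Ceph ships:

```python
import base64
import binascii


def looks_like_empty_key(key_b64: str) -> bool:
    """Return True if key_b64 decodes to only zero bytes (or nothing)."""
    try:
        raw = base64.b64decode(key_b64, validate=True)
    except (binascii.Error, ValueError):
        # Not valid base64 at all, so not the all-'A' placeholder.
        return False
    # all() over an empty sequence is True, so an empty key also counts.
    return all(byte == 0 for byte in raw)
```

Both 'AAAAAAAAAAAAAAAA' and 'AAAAAAAA' are flagged by this check, while
a key with any non-zero byte is not.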

   -Joao

* Re: Luminous 12.1.1 upgrade mgr woes
  2017-07-18 11:32 ` John Spray
  2017-07-18 12:17   ` Joao Eduardo Luis
@ 2017-07-19  2:27   ` Mark Kirkwood
  2017-07-19 17:33     ` Joao Eduardo Luis
  1 sibling, 1 reply; 11+ messages in thread
From: Mark Kirkwood @ 2017-07-19  2:27 UTC (permalink / raw)
  To: John Spray; +Cc: ceph-devel

On 18/07/17 23:32, John Spray wrote:

> On Tue, Jul 18, 2017 at 10:03 AM, Mark Kirkwood
> <mark.kirkwood@catalyst.net.nz> wrote:
>> Hi,
>>
>> Just had a go at this - 12.1.1 from a freshly deployed Jewel (10.2.9) on
>> Ubuntu 16.04, following
>> http://docs.ceph.com/docs/master/release-notes/#upgrade-from-jewel-or-kraken.
>>
>> So it all worked ok *except* for the the mgr deploy, this hang at the
>> key/caps modification stage (see attached). Now I managed to work around it:
>>
>> - switch cephx to none in ceph.conf
>>
>> - restart mon
>>
>> - redeploy mgr
> Hmm, I suspect the issue is with the bootstrap-mgr keyring.  I notice
> that when trying a "mgr create" on an upgraded cluster, ceph-deploy is
> prompting me to do a "gatherkeys", at which point it generates the
> keyring.  However, the bootstrap-mgr identity that I have inside the
> mon is weird, its key is AAAAAAAAAAAAAAAA.
>
> Even after I've got the bootstrap-mgr keyring (whose AAA... key
> matches the weird one that the mon has), I get EINVAL connecting, and
> the mon is logging "error when trying to handle auth request, probably
> malformed request".
>
>

Yeah, that is what I'm seeing here (I didn't notice that the 
bootstrap-mgr key was weird/truncated...well spotted).

regards
Mark

* Re: Luminous 12.1.1 upgrade mgr woes
  2017-07-19  2:27   ` Mark Kirkwood
@ 2017-07-19 17:33     ` Joao Eduardo Luis
  0 siblings, 0 replies; 11+ messages in thread
From: Joao Eduardo Luis @ 2017-07-19 17:33 UTC (permalink / raw)
  To: Mark Kirkwood, John Spray; +Cc: ceph-devel

On 07/19/2017 03:27 AM, Mark Kirkwood wrote:
> On 18/07/17 23:32, John Spray wrote:
> 
>> On Tue, Jul 18, 2017 at 10:03 AM, Mark Kirkwood
>> <mark.kirkwood@catalyst.net.nz> wrote:
>>> Hi,
>>>
>>> Just had a go at this - 12.1.1 from a freshly deployed Jewel (10.2.9) on
>>> Ubuntu 16.04, following
>>> http://docs.ceph.com/docs/master/release-notes/#upgrade-from-jewel-or-kraken. 
>>>
>>>
>>> So it all worked ok *except* for the the mgr deploy, this hang at the
>>> key/caps modification stage (see attached). Now I managed to work 
>>> around it:
>>>
>>> - switch cephx to none in ceph.conf
>>>
>>> - restart mon
>>>
>>> - redeploy mgr
>> Hmm, I suspect the issue is with the bootstrap-mgr keyring.  I notice
>> that when trying a "mgr create" on an upgraded cluster, ceph-deploy is
>> prompting me to do a "gatherkeys", at which point it generates the
>> keyring.  However, the bootstrap-mgr identity that I have inside the
>> mon is weird, its key is AAAAAAAAAAAAAAAA.
>>
>> Even after I've got the bootstrap-mgr keyring (whose AAA... key
>> matches the weird one that the mon has), I get EINVAL connecting, and
>> the mon is logging "error when trying to handle auth request, probably
>> malformed request".
>>
>>
> 
> Yeah, that is what I'm seeing here (I didn't notice that the 
> bootstrap-mgr key was weird/truncated...well spotted).

That should now be fixed on master.

Let us know if it doesn't solve the issue.

   -Joao
