* ceph-disk and /dev/dm-* permissions - race condition?
@ 2016-11-04 14:51 Wyllys Ingersoll
       [not found] ` <CADEhWsOYJeq3kVUL31fzsCmi6E2jueYQsy08OV+jXx-waqZe5w@mail.gmail.com>
  2016-11-22 14:48 ` Loic Dachary
  0 siblings, 2 replies; 11+ messages in thread
From: Wyllys Ingersoll @ 2016-11-04 14:51 UTC (permalink / raw)
  To: Ceph Development

We are running 10.2.3 with encrypted OSDs and journals using the old
(i.e. non-LUKS) keys, and we are seeing issues with the ceph-osd
processes after a reboot of a storage server.  Our data and journals
are on separate partitions of the same disk.

After a reboot, the OSDs sometimes fail to start because of
permissions problems.  The /dev/dm-* devices sometimes come back with
ownership set to "root:disk" instead of "ceph:ceph".  Weirder still,
sometimes ceph-osd will start and work in spite of the incorrect
permissions (root:disk), and other times it will fail and the logs
show permission errors when trying to access the journals.  Sometimes
half of the /dev/dm-* devices are "root:disk" and the others are
"ceph:ceph".  There's no clear pattern, which is what leads me to
think it's a race condition in the ceph_disk "dmcrypt_map" function.
(A quick way to check for the mismatch after a boot is sketched below.)
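
For reference, a quick way to check for the mismatch after a boot
(just a sketch of the obvious commands):

  # dm device ownership; the dmcrypt OSD/journal devices should show ceph:ceph
  ls -l /dev/dm-*
  # map the dm-N names back to their dmcrypt UUIDs
  ls -l /dev/mapper/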

Is there a known issue with ceph-disk and/or ceph-osd related to the
timing of the encrypted devices being set up and the permissions
being changed so that the ceph processes can access them?

Wyllys Ingersoll
Keeper Technology, LLC


* Re: ceph-disk and /dev/dm-* permissions - race condition?
       [not found] ` <CADEhWsOYJeq3kVUL31fzsCmi6E2jueYQsy08OV+jXx-waqZe5w@mail.gmail.com>
@ 2016-11-05 12:36   ` Wyllys Ingersoll
  2016-11-07 20:09     ` Wyllys Ingersoll
  0 siblings, 1 reply; 11+ messages in thread
From: Wyllys Ingersoll @ 2016-11-05 12:36 UTC (permalink / raw)
  To: Rajib Hossen, Ceph Development

That's an interesting workaround; I may end up using it if all else fails.

I watched the permissions on the /dev/dm-* devices during the boot
process: they start out correctly as "ceph:ceph", but at the end of
the ceph disk preparation a "ceph-disk trigger" is executed, which
seems to cause the permissions to be reset back to "root:disk".  This
leaves the ceph-osd processes that are already running able to
continue, but if they have to restart for any reason, they will fail.
(A simple way to watch the flip happen is sketched below.)

It could be a problem with the udev rules for the encrypted data and
journal partitions.  Debugging udev is a nightmare.  I'm hoping someone
else has already solved this one.
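
For illustration, something as simple as the following loop, run from
another console during boot, makes the moment of the flip visible
(a rough sketch; adjust to taste):

  # print dm device ownership once per second so the transition from
  # ceph:ceph to root:disk shows up in the output with a timestamp
  while true; do
      date
      ls -l /dev/dm-* 2>/dev/null
      sleep 1
  done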



On Sat, Nov 5, 2016 at 1:13 AM, Rajib Hossen
<rajib.hossen.ipvision@gmail.com> wrote:
> Hello,
> I had a similar issue and solved it via a cronjob: in "crontab -e" I
> added "@reboot chown -R ceph:ceph /dev/vdb1".  Say my journal is on
> disk vdb, first partition (vdb1); vdb2 is my data disk.
>
> On Fri, Nov 4, 2016 at 8:51 PM, Wyllys Ingersoll
> <wyllys.ingersoll@keepertech.com> wrote:
>>
>> We are running 10.2.3 with encrypted OSDs and journals using the old
>> (i.e. non-Luks) keys and are seeing issues with the ceph-osd processes
>> after a reboot of a storage server.  Our data and journals are on
>> separate partitions on the same disk.
>>
>> After a reboot, sometimes the OSDs fail to start because of
>> permissions problems.  The /dev/dm-* devices come back with
>> permissions set to "root:disk" sometimes instead of "ceph:ceph".
>> Weirder still is that sometimes the ceph-osd will start and work in
>> spite of the incorrect perrmissions (root:disk) and other times they
>> will fail and the logs show permissions errors when trying to access
>> the journals. Sometimes half of the /dev/dm- devices are "root:disk"
>> and others are "ceph:ceph".  There's no clear pattern, so that's what
>> leads me to think its a race condition in the ceph_disk "dmcrypt_map"
>> function.
>>
>> Is there a known issue with ceph-disk and/or ceph-osd related to
>> timing of the encrypted devices being setup and the permissions
>> getting changed to the ceph processes can access them?
>>
>> Wyllys Ingersoll
>> Keeper Technology, LLC
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>


* Re: ceph-disk and /dev/dm-* permissions - race condition?
  2016-11-05 12:36   ` Wyllys Ingersoll
@ 2016-11-07 20:09     ` Wyllys Ingersoll
  2016-11-07 21:35       ` Loic Dachary
  0 siblings, 1 reply; 11+ messages in thread
From: Wyllys Ingersoll @ 2016-11-07 20:09 UTC (permalink / raw)
  To: Rajib Hossen, Ceph Development

The workaround of putting "@reboot chown -R ceph:ceph /dev/vdb1" in
crontab doesn't work, because the /dev/dm-* devices change ownership
again after they are set up.

I'm not sure of all of the interactions between ceph-osd, udev and
/dev/mapper for handling encrypted partitions, but somewhere late in
the startup process, just after ceph-osd has started running, the
permissions on the /dev/dm-* devices change from "ceph:ceph" to
"root:disk", which makes it impossible for an OSD process to ever
restart, because it can no longer read the encrypted journal.

My workaround was to add a line to the udev 55-dm.rules file, just
before the 'GOTO="dm_end"' line towards the end of that file:
OWNER:="ceph", GROUP:="ceph", MODE:="0660"
(What that edit looks like in context is sketched below.)
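
For illustration, the edit amounts to something like this near the end
of /lib/udev/rules.d/55-dm.rules (the surrounding lines are
paraphrased and the exact rules and path vary by distro and
device-mapper version; the ":=" form makes the assignment final so
later rules cannot override it):

  # ... existing device-mapper rules ...
  # added: force ownership/mode so ceph-osd can open the dm devices
  OWNER:="ceph", GROUP:="ceph", MODE:="0660"
  GOTO="dm_end"

  LABEL="dm_end"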

Even though this workaround seems to work for our situation, I still
maintain that there is a bug in the ceph-osd startup sequence that
causes the ownership to change back to "root:disk" when it should
stay "ceph:ceph".

Wyllys Ingersoll
Keeper Technology, LLC



On Sat, Nov 5, 2016 at 8:36 AM, Wyllys Ingersoll
<wyllys.ingersoll@keepertech.com> wrote:
>
> Thats an interesting workaround, I may end up using it if all else fails.
>
> I watch the permissions on /dev/dm-* devices during the boot
> processes, they start out correctly as "ceph:ceph", but at the end of
> the ceph disk preparation, a "ceph-disk trigger" is executed which
> seems to cause the permissions to get reset back to "root:disk".  This
> leaves the ceph-osd processes that are running able to continue, but
> if they have to restart for any reason, they will fail to restart.
>
> It could be a problem with the udev rules for the encrypted data and
> journal partitions.  Debugging udev is a nightmare.  Im hoping someone
> else has already solved this one.
>
>
>
> On Sat, Nov 5, 2016 at 1:13 AM, Rajib Hossen
> <rajib.hossen.ipvision@gmail.com> wrote:
> > Hello,
> > I had the similar issue. I solved it via a cronjob. In crontab -e
> > "@reboot chown -R ceph:ceph /dev/vdb1". say my journal is in disk vdb and
> > first partition(vdb1). vdb2 is my data disk.
> >
> > On Fri, Nov 4, 2016 at 8:51 PM, Wyllys Ingersoll
> > <wyllys.ingersoll@keepertech.com> wrote:
> >>
> >> We are running 10.2.3 with encrypted OSDs and journals using the old
> >> (i.e. non-Luks) keys and are seeing issues with the ceph-osd processes
> >> after a reboot of a storage server.  Our data and journals are on
> >> separate partitions on the same disk.
> >>
> >> After a reboot, sometimes the OSDs fail to start because of
> >> permissions problems.  The /dev/dm-* devices come back with
> >> permissions set to "root:disk" sometimes instead of "ceph:ceph".
> >> Weirder still is that sometimes the ceph-osd will start and work in
> >> spite of the incorrect perrmissions (root:disk) and other times they
> >> will fail and the logs show permissions errors when trying to access
> >> the journals. Sometimes half of the /dev/dm- devices are "root:disk"
> >> and others are "ceph:ceph".  There's no clear pattern, so that's what
> >> leads me to think its a race condition in the ceph_disk "dmcrypt_map"
> >> function.
> >>
> >> Is there a known issue with ceph-disk and/or ceph-osd related to
> >> timing of the encrypted devices being setup and the permissions
> >> getting changed to the ceph processes can access them?
> >>
> >> Wyllys Ingersoll
> >> Keeper Technology, LLC
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> >> the body of a message to majordomo@vger.kernel.org
> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >
> >


* Re: ceph-disk and /dev/dm-* permissions - race condition?
  2016-11-07 20:09     ` Wyllys Ingersoll
@ 2016-11-07 21:35       ` Loic Dachary
  0 siblings, 0 replies; 11+ messages in thread
From: Loic Dachary @ 2016-11-07 21:35 UTC (permalink / raw)
  To: Wyllys Ingersoll, Rajib Hossen, Ceph Development

Hi,

I created http://tracker.ceph.com/issues/17813 to track this issue.

On 07/11/2016 21:09, Wyllys Ingersoll wrote:
> add a line to the udev 55-dm.rules file just
> before the 'GOTO="dm_end"' line towards the end of that file:
> OWNER:="ceph", GROUP:="ceph", MODE:="0660"

That looks like the proper workaround to me. I'm not sure how it should be packaged, though, or whether there is a better fix.

Cheers

-- 
Loïc Dachary, Artisan Logiciel Libre


* Re: ceph-disk and /dev/dm-* permissions - race condition?
  2016-11-04 14:51 ceph-disk and /dev/dm-* permissions - race condition? Wyllys Ingersoll
       [not found] ` <CADEhWsOYJeq3kVUL31fzsCmi6E2jueYQsy08OV+jXx-waqZe5w@mail.gmail.com>
@ 2016-11-22 14:48 ` Loic Dachary
  2016-11-22 15:13   ` Wyllys Ingersoll
  1 sibling, 1 reply; 11+ messages in thread
From: Loic Dachary @ 2016-11-22 14:48 UTC (permalink / raw)
  To: Wyllys Ingersoll, Ceph Development

Hi,

It should be enough to add After=local-fs.target to /lib/systemd/system/ceph-disk@.service and have "ceph-disk trigger --sync" chown ceph:ceph /dev/XXX to fix this issue (and others). Since local-fs.target indirectly depends on dm, this guarantees that ceph-disk activation only happens after dm has finished. The ownership may well still be incorrect when "ceph-disk trigger --sync" starts running, but it no longer races with dm, so it can safely chown ceph:ceph and proceed with activation. (A sketch of the unit change is below.)
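
For illustration only, the ordering half of that change can be
expressed as a systemd drop-in (the actual proposal edits
/lib/systemd/system/ceph-disk@.service directly; a drop-in is just a
convenient way to show, and to test, the same ordering):

  # /etc/systemd/system/ceph-disk@.service.d/after-local-fs.conf
  [Unit]
  # do not start ceph-disk activation until local mounts (and the
  # device-mapper devices they depend on) are in place
  After=local-fs.target

After dropping that in, a "systemctl daemon-reload" picks it up.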

I'm testing this with https://github.com/ceph/ceph/pull/12136, but I'm
not sure yet whether I'm missing something or whether that's the right
thing to do.

What do you think?

On 04/11/2016 15:51, Wyllys Ingersoll wrote:
> We are running 10.2.3 with encrypted OSDs and journals using the old
> (i.e. non-Luks) keys and are seeing issues with the ceph-osd processes
> after a reboot of a storage server.  Our data and journals are on
> separate partitions on the same disk.
> 
> After a reboot, sometimes the OSDs fail to start because of
> permissions problems.  The /dev/dm-* devices come back with
> permissions set to "root:disk" sometimes instead of "ceph:ceph".
> Weirder still is that sometimes the ceph-osd will start and work in
> spite of the incorrect perrmissions (root:disk) and other times they
> will fail and the logs show permissions errors when trying to access
> the journals. Sometimes half of the /dev/dm- devices are "root:disk"
> and others are "ceph:ceph".  There's no clear pattern, so that's what
> leads me to think its a race condition in the ceph_disk "dmcrypt_map"
> function.
> 
> Is there a known issue with ceph-disk and/or ceph-osd related to
> timing of the encrypted devices being setup and the permissions
> getting changed to the ceph processes can access them?
> 
> Wyllys Ingersoll
> Keeper Technology, LLC
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

-- 
Loïc Dachary, Artisan Logiciel Libre


* Re: ceph-disk and /dev/dm-* permissions - race condition?
  2016-11-22 14:48 ` Loic Dachary
@ 2016-11-22 15:13   ` Wyllys Ingersoll
  2016-11-22 17:07     ` Loic Dachary
  0 siblings, 1 reply; 11+ messages in thread
From: Wyllys Ingersoll @ 2016-11-22 15:13 UTC (permalink / raw)
  To: Loic Dachary; +Cc: Ceph Development

I think that sounds reasonable, though obviously more testing will be
needed to verify it.  Our situation occurred on an Ubuntu Trusty
(upstart-based, not systemd) server, so I don't think this will help
non-systemd systems.

On Tue, Nov 22, 2016 at 9:48 AM, Loic Dachary <loic@dachary.org> wrote:
> Hi,
>
> It should be enough to add After=local-fs.target to /lib/systemd/system/ceph-disk@.service and have ceph-disk trigger --sync chown ceph:ceph /dev/XXX to fix this issue (and others). Since local-fs.target indirectly depends on dm, this ensures ceph disk activation will only happen after dm is finished. It is entirely possible that the ownership is incorrect when ceph-disk trigger --sync starts running, but it will no longer race with dm and it can safely chown ceph:ceph and proceed with activation.
>
> I'm testing this with https://github.com/ceph/ceph/pull/12136 but I'm not sure yet if I'm missing something or if that's the right thing to do.
>
> What do you think ?
>
> On 04/11/2016 15:51, Wyllys Ingersoll wrote:
>> We are running 10.2.3 with encrypted OSDs and journals using the old
>> (i.e. non-Luks) keys and are seeing issues with the ceph-osd processes
>> after a reboot of a storage server.  Our data and journals are on
>> separate partitions on the same disk.
>>
>> After a reboot, sometimes the OSDs fail to start because of
>> permissions problems.  The /dev/dm-* devices come back with
>> permissions set to "root:disk" sometimes instead of "ceph:ceph".
>> Weirder still is that sometimes the ceph-osd will start and work in
>> spite of the incorrect perrmissions (root:disk) and other times they
>> will fail and the logs show permissions errors when trying to access
>> the journals. Sometimes half of the /dev/dm- devices are "root:disk"
>> and others are "ceph:ceph".  There's no clear pattern, so that's what
>> leads me to think its a race condition in the ceph_disk "dmcrypt_map"
>> function.
>>
>> Is there a known issue with ceph-disk and/or ceph-osd related to
>> timing of the encrypted devices being setup and the permissions
>> getting changed to the ceph processes can access them?
>>
>> Wyllys Ingersoll
>> Keeper Technology, LLC
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>
> --
> Loïc Dachary, Artisan Logiciel Libre


* Re: ceph-disk and /dev/dm-* permissions - race condition?
  2016-11-22 15:13   ` Wyllys Ingersoll
@ 2016-11-22 17:07     ` Loic Dachary
  2016-11-22 19:13       ` Wyllys Ingersoll
  0 siblings, 1 reply; 11+ messages in thread
From: Loic Dachary @ 2016-11-22 17:07 UTC (permalink / raw)
  To: Wyllys Ingersoll; +Cc: Ceph Development



On 22/11/2016 16:13, Wyllys Ingersoll wrote:
> I think that sounds reasonable, obviously more testing will be needed
> to verify.  Our situation occurred on an Ubuntu Trusty (upstart based,
> not systemd) server, so I dont think this will help for non-systemd
> systems.

I don't think there is a way to enforce that kind of ordering with upstart. But maybe there is? If you don't know of one, I will research it.

> On Tue, Nov 22, 2016 at 9:48 AM, Loic Dachary <loic@dachary.org> wrote:
>> Hi,
>>
>> It should be enough to add After=local-fs.target to /lib/systemd/system/ceph-disk@.service and have ceph-disk trigger --sync chown ceph:ceph /dev/XXX to fix this issue (and others). Since local-fs.target indirectly depends on dm, this ensures ceph disk activation will only happen after dm is finished. It is entirely possible that the ownership is incorrect when ceph-disk trigger --sync starts running, but it will no longer race with dm and it can safely chown ceph:ceph and proceed with activation.
>>
>> I'm testing this with https://github.com/ceph/ceph/pull/12136 but I'm not sure yet if I'm missing something or if that's the right thing to do.
>>
>> What do you think ?
>>
>> On 04/11/2016 15:51, Wyllys Ingersoll wrote:
>>> We are running 10.2.3 with encrypted OSDs and journals using the old
>>> (i.e. non-Luks) keys and are seeing issues with the ceph-osd processes
>>> after a reboot of a storage server.  Our data and journals are on
>>> separate partitions on the same disk.
>>>
>>> After a reboot, sometimes the OSDs fail to start because of
>>> permissions problems.  The /dev/dm-* devices come back with
>>> permissions set to "root:disk" sometimes instead of "ceph:ceph".
>>> Weirder still is that sometimes the ceph-osd will start and work in
>>> spite of the incorrect perrmissions (root:disk) and other times they
>>> will fail and the logs show permissions errors when trying to access
>>> the journals. Sometimes half of the /dev/dm- devices are "root:disk"
>>> and others are "ceph:ceph".  There's no clear pattern, so that's what
>>> leads me to think its a race condition in the ceph_disk "dmcrypt_map"
>>> function.
>>>
>>> Is there a known issue with ceph-disk and/or ceph-osd related to
>>> timing of the encrypted devices being setup and the permissions
>>> getting changed to the ceph processes can access them?
>>>
>>> Wyllys Ingersoll
>>> Keeper Technology, LLC
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>
>> --
>> Loïc Dachary, Artisan Logiciel Libre
> 

-- 
Loïc Dachary, Artisan Logiciel Libre


* Re: ceph-disk and /dev/dm-* permissions - race condition?
  2016-11-22 17:07     ` Loic Dachary
@ 2016-11-22 19:13       ` Wyllys Ingersoll
  2016-11-22 23:33         ` Loic Dachary
  2016-11-23 11:42         ` Loic Dachary
  0 siblings, 2 replies; 11+ messages in thread
From: Wyllys Ingersoll @ 2016-11-22 19:13 UTC (permalink / raw)
  To: Loic Dachary; +Cc: Ceph Development

I don't know, but making the change in the 55-dm.rules file seems to
do the trick well enough for now.

On Tue, Nov 22, 2016 at 12:07 PM, Loic Dachary <loic@dachary.org> wrote:
>
>
> On 22/11/2016 16:13, Wyllys Ingersoll wrote:
>> I think that sounds reasonable, obviously more testing will be needed
>> to verify.  Our situation occurred on an Ubuntu Trusty (upstart based,
>> not systemd) server, so I dont think this will help for non-systemd
>> systems.
>
> I don't think there is a way to enforce an order with upstart. But maybe there is ? If you don't know about it I will research.
>
>> On Tue, Nov 22, 2016 at 9:48 AM, Loic Dachary <loic@dachary.org> wrote:
>>> Hi,
>>>
>>> It should be enough to add After=local-fs.target to /lib/systemd/system/ceph-disk@.service and have ceph-disk trigger --sync chown ceph:ceph /dev/XXX to fix this issue (and others). Since local-fs.target indirectly depends on dm, this ensures ceph disk activation will only happen after dm is finished. It is entirely possible that the ownership is incorrect when ceph-disk trigger --sync starts running, but it will no longer race with dm and it can safely chown ceph:ceph and proceed with activation.
>>>
>>> I'm testing this with https://github.com/ceph/ceph/pull/12136 but I'm not sure yet if I'm missing something or if that's the right thing to do.
>>>
>>> What do you think ?
>>>
>>> On 04/11/2016 15:51, Wyllys Ingersoll wrote:
>>>> We are running 10.2.3 with encrypted OSDs and journals using the old
>>>> (i.e. non-Luks) keys and are seeing issues with the ceph-osd processes
>>>> after a reboot of a storage server.  Our data and journals are on
>>>> separate partitions on the same disk.
>>>>
>>>> After a reboot, sometimes the OSDs fail to start because of
>>>> permissions problems.  The /dev/dm-* devices come back with
>>>> permissions set to "root:disk" sometimes instead of "ceph:ceph".
>>>> Weirder still is that sometimes the ceph-osd will start and work in
>>>> spite of the incorrect perrmissions (root:disk) and other times they
>>>> will fail and the logs show permissions errors when trying to access
>>>> the journals. Sometimes half of the /dev/dm- devices are "root:disk"
>>>> and others are "ceph:ceph".  There's no clear pattern, so that's what
>>>> leads me to think its a race condition in the ceph_disk "dmcrypt_map"
>>>> function.
>>>>
>>>> Is there a known issue with ceph-disk and/or ceph-osd related to
>>>> timing of the encrypted devices being setup and the permissions
>>>> getting changed to the ceph processes can access them?
>>>>
>>>> Wyllys Ingersoll
>>>> Keeper Technology, LLC
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>
>>>
>>> --
>>> Loïc Dachary, Artisan Logiciel Libre
>>
>
> --
> Loïc Dachary, Artisan Logiciel Libre


* Re: ceph-disk and /dev/dm-* permissions - race condition?
  2016-11-22 19:13       ` Wyllys Ingersoll
@ 2016-11-22 23:33         ` Loic Dachary
  2016-11-23 11:42         ` Loic Dachary
  1 sibling, 0 replies; 11+ messages in thread
From: Loic Dachary @ 2016-11-22 23:33 UTC (permalink / raw)
  To: Wyllys Ingersoll; +Cc: Ceph Development



On 22/11/2016 20:13, Wyllys Ingersoll wrote:
> I dont know, but making the change in the 55-dm.rules file seems to do
> the trick well enough for now.

It does. But there does not seem to be a way to package this workaround, which is why I'm trying to find another fix.

Cheers

> 
> On Tue, Nov 22, 2016 at 12:07 PM, Loic Dachary <loic@dachary.org> wrote:
>>
>>
>> On 22/11/2016 16:13, Wyllys Ingersoll wrote:
>>> I think that sounds reasonable, obviously more testing will be needed
>>> to verify.  Our situation occurred on an Ubuntu Trusty (upstart based,
>>> not systemd) server, so I dont think this will help for non-systemd
>>> systems.
>>
>> I don't think there is a way to enforce an order with upstart. But maybe there is ? If you don't know about it I will research.
>>
>>> On Tue, Nov 22, 2016 at 9:48 AM, Loic Dachary <loic@dachary.org> wrote:
>>>> Hi,
>>>>
>>>> It should be enough to add After=local-fs.target to /lib/systemd/system/ceph-disk@.service and have ceph-disk trigger --sync chown ceph:ceph /dev/XXX to fix this issue (and others). Since local-fs.target indirectly depends on dm, this ensures ceph disk activation will only happen after dm is finished. It is entirely possible that the ownership is incorrect when ceph-disk trigger --sync starts running, but it will no longer race with dm and it can safely chown ceph:ceph and proceed with activation.
>>>>
>>>> I'm testing this with https://github.com/ceph/ceph/pull/12136 but I'm not sure yet if I'm missing something or if that's the right thing to do.
>>>>
>>>> What do you think ?
>>>>
>>>> On 04/11/2016 15:51, Wyllys Ingersoll wrote:
>>>>> We are running 10.2.3 with encrypted OSDs and journals using the old
>>>>> (i.e. non-Luks) keys and are seeing issues with the ceph-osd processes
>>>>> after a reboot of a storage server.  Our data and journals are on
>>>>> separate partitions on the same disk.
>>>>>
>>>>> After a reboot, sometimes the OSDs fail to start because of
>>>>> permissions problems.  The /dev/dm-* devices come back with
>>>>> permissions set to "root:disk" sometimes instead of "ceph:ceph".
>>>>> Weirder still is that sometimes the ceph-osd will start and work in
>>>>> spite of the incorrect perrmissions (root:disk) and other times they
>>>>> will fail and the logs show permissions errors when trying to access
>>>>> the journals. Sometimes half of the /dev/dm- devices are "root:disk"
>>>>> and others are "ceph:ceph".  There's no clear pattern, so that's what
>>>>> leads me to think its a race condition in the ceph_disk "dmcrypt_map"
>>>>> function.
>>>>>
>>>>> Is there a known issue with ceph-disk and/or ceph-osd related to
>>>>> timing of the encrypted devices being setup and the permissions
>>>>> getting changed to the ceph processes can access them?
>>>>>
>>>>> Wyllys Ingersoll
>>>>> Keeper Technology, LLC
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>>> the body of a message to majordomo@vger.kernel.org
>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>
>>>>
>>>> --
>>>> Loïc Dachary, Artisan Logiciel Libre
>>>
>>
>> --
>> Loïc Dachary, Artisan Logiciel Libre
> 

-- 
Loïc Dachary, Artisan Logiciel Libre


* Re: ceph-disk and /dev/dm-* permissions - race condition?
  2016-11-22 19:13       ` Wyllys Ingersoll
  2016-11-22 23:33         ` Loic Dachary
@ 2016-11-23 11:42         ` Loic Dachary
  2016-11-23 15:49           ` Wyllys Ingersoll
  1 sibling, 1 reply; 11+ messages in thread
From: Loic Dachary @ 2016-11-23 11:42 UTC (permalink / raw)
  To: Wyllys Ingersoll; +Cc: Ceph Development

I think that could work as well:

In ceph-disk.conf (the upstart job):

description "ceph-disk async worker"

start on (ceph-disk and local-filesystems)

instance $dev/$pid
export dev
export pid

exec flock /var/lock/ceph-disk -c 'ceph-disk --verbose --log-stdout trigger --sync $dev'

with https://github.com/ceph/ceph/pull/12136/commits/72f0b2aa1eb4b7b2a2222c2847d26f99400a8374

What do you say?
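
For readability, here is the same stanza with my annotations (the
comments are mine, not part of the proposed file):

  description "ceph-disk async worker"

  # wait for both the ceph-disk event and upstart's local-filesystems
  # event (emitted once local mounts are done) before starting
  start on (ceph-disk and local-filesystems)

  # one job instance per device/pid pair carried by the event
  instance $dev/$pid
  export dev
  export pid

  # serialize activations behind a single lock file
  exec flock /var/lock/ceph-disk -c 'ceph-disk --verbose --log-stdout trigger --sync $dev'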

On 22/11/2016 20:13, Wyllys Ingersoll wrote:
> I dont know, but making the change in the 55-dm.rules file seems to do
> the trick well enough for now.
> 
> On Tue, Nov 22, 2016 at 12:07 PM, Loic Dachary <loic@dachary.org> wrote:
>>
>>
>> On 22/11/2016 16:13, Wyllys Ingersoll wrote:
>>> I think that sounds reasonable, obviously more testing will be needed
>>> to verify.  Our situation occurred on an Ubuntu Trusty (upstart based,
>>> not systemd) server, so I dont think this will help for non-systemd
>>> systems.
>>
>> I don't think there is a way to enforce an order with upstart. But maybe there is ? If you don't know about it I will research.
>>
>>> On Tue, Nov 22, 2016 at 9:48 AM, Loic Dachary <loic@dachary.org> wrote:
>>>> Hi,
>>>>
>>>> It should be enough to add After=local-fs.target to /lib/systemd/system/ceph-disk@.service and have ceph-disk trigger --sync chown ceph:ceph /dev/XXX to fix this issue (and others). Since local-fs.target indirectly depends on dm, this ensures ceph disk activation will only happen after dm is finished. It is entirely possible that the ownership is incorrect when ceph-disk trigger --sync starts running, but it will no longer race with dm and it can safely chown ceph:ceph and proceed with activation.
>>>>
>>>> I'm testing this with https://github.com/ceph/ceph/pull/12136 but I'm not sure yet if I'm missing something or if that's the right thing to do.
>>>>
>>>> What do you think ?
>>>>
>>>> On 04/11/2016 15:51, Wyllys Ingersoll wrote:
>>>>> We are running 10.2.3 with encrypted OSDs and journals using the old
>>>>> (i.e. non-Luks) keys and are seeing issues with the ceph-osd processes
>>>>> after a reboot of a storage server.  Our data and journals are on
>>>>> separate partitions on the same disk.
>>>>>
>>>>> After a reboot, sometimes the OSDs fail to start because of
>>>>> permissions problems.  The /dev/dm-* devices come back with
>>>>> permissions set to "root:disk" sometimes instead of "ceph:ceph".
>>>>> Weirder still is that sometimes the ceph-osd will start and work in
>>>>> spite of the incorrect perrmissions (root:disk) and other times they
>>>>> will fail and the logs show permissions errors when trying to access
>>>>> the journals. Sometimes half of the /dev/dm- devices are "root:disk"
>>>>> and others are "ceph:ceph".  There's no clear pattern, so that's what
>>>>> leads me to think its a race condition in the ceph_disk "dmcrypt_map"
>>>>> function.
>>>>>
>>>>> Is there a known issue with ceph-disk and/or ceph-osd related to
>>>>> timing of the encrypted devices being setup and the permissions
>>>>> getting changed to the ceph processes can access them?
>>>>>
>>>>> Wyllys Ingersoll
>>>>> Keeper Technology, LLC
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>>> the body of a message to majordomo@vger.kernel.org
>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>
>>>>
>>>> --
>>>> Loïc Dachary, Artisan Logiciel Libre
>>>
>>
>> --
>> Loïc Dachary, Artisan Logiciel Libre
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

-- 
Loïc Dachary, Artisan Logiciel Libre


* Re: ceph-disk and /dev/dm-* permissions - race condition?
  2016-11-23 11:42         ` Loic Dachary
@ 2016-11-23 15:49           ` Wyllys Ingersoll
  0 siblings, 0 replies; 11+ messages in thread
From: Wyllys Ingersoll @ 2016-11-23 15:49 UTC (permalink / raw)
  To: Loic Dachary; +Cc: Ceph Development

That doesn't appear to work on 10.2.3 (with the modified
ceph-disk/main.py from your fix above).  I think it ends up trying to
access the /dev/mapper/UUID files before they have been established,
so the ceph-osd starter process fails because there are no mapped dm
partitions yet.  Adding 'local-filesystems' to the "start on" line is
forcing it to start too soon, I think.  I see these errors in the
upstart log for ceph-osd-all-starter.log (see the note after the excerpt):

ceph-disk: Cannot discover filesystem type: device
/dev/mapper/00457719-b9b0-4cd0-a912-8e6e5efff7cd: Command
'/sbin/blkid' returned non-zero exit status 2
ceph-disk: Cannot discover filesystem type: device
/dev/mapper/eb056779-7bd0-4768-86cb-d757174a2046: Command
'/sbin/blkid' returned non-zero exit status 2
ceph-disk: Cannot discover filesystem type: device
/dev/mapper/f1300502-1143-4c91-b43c-051342b36933: Command
'/sbin/blkid' returned non-zero exit status 2
ceph-disk: Error: One or more partitions failed to activate
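
For context, a hedged reading of those errors (the command below is an
example; ceph-disk's exact blkid invocation may differ): blkid exits
with status 2 when it cannot identify a filesystem, which is exactly
what you get if the dmcrypt mapping does not exist yet.

  # with the mapping absent (or holding no recognizable filesystem):
  /sbin/blkid -o value -s TYPE /dev/mapper/00457719-b9b0-4cd0-a912-8e6e5efff7cd
  echo $?   # prints 2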



On Wed, Nov 23, 2016 at 6:42 AM, Loic Dachary <loic@dachary.org> wrote:
> I think that could work as well:
>
> in ceph-disk.conf
>
> description "ceph-disk async worker"
>
> start on (ceph-disk and local-filesystems)
>
> instance $dev/$pid
> export dev
> export pid
>
> exec flock /var/lock/ceph-disk -c 'ceph-disk --verbose --log-stdout trigger --sync $dev'
>
> with https://github.com/ceph/ceph/pull/12136/commits/72f0b2aa1eb4b7b2a2222c2847d26f99400a8374
>
> What do you say ?
>
> On 22/11/2016 20:13, Wyllys Ingersoll wrote:
>> I dont know, but making the change in the 55-dm.rules file seems to do
>> the trick well enough for now.
>>
>> On Tue, Nov 22, 2016 at 12:07 PM, Loic Dachary <loic@dachary.org> wrote:
>>>
>>>
>>> On 22/11/2016 16:13, Wyllys Ingersoll wrote:
>>>> I think that sounds reasonable, obviously more testing will be needed
>>>> to verify.  Our situation occurred on an Ubuntu Trusty (upstart based,
>>>> not systemd) server, so I dont think this will help for non-systemd
>>>> systems.
>>>
>>> I don't think there is a way to enforce an order with upstart. But maybe there is ? If you don't know about it I will research.
>>>
>>>> On Tue, Nov 22, 2016 at 9:48 AM, Loic Dachary <loic@dachary.org> wrote:
>>>>> Hi,
>>>>>
>>>>> It should be enough to add After=local-fs.target to /lib/systemd/system/ceph-disk@.service and have ceph-disk trigger --sync chown ceph:ceph /dev/XXX to fix this issue (and others). Since local-fs.target indirectly depends on dm, this ensures ceph disk activation will only happen after dm is finished. It is entirely possible that the ownership is incorrect when ceph-disk trigger --sync starts running, but it will no longer race with dm and it can safely chown ceph:ceph and proceed with activation.
>>>>>
>>>>> I'm testing this with https://github.com/ceph/ceph/pull/12136 but I'm not sure yet if I'm missing something or if that's the right thing to do.
>>>>>
>>>>> What do you think ?
>>>>>
>>>>> On 04/11/2016 15:51, Wyllys Ingersoll wrote:
>>>>>> We are running 10.2.3 with encrypted OSDs and journals using the old
>>>>>> (i.e. non-Luks) keys and are seeing issues with the ceph-osd processes
>>>>>> after a reboot of a storage server.  Our data and journals are on
>>>>>> separate partitions on the same disk.
>>>>>>
>>>>>> After a reboot, sometimes the OSDs fail to start because of
>>>>>> permissions problems.  The /dev/dm-* devices come back with
>>>>>> permissions set to "root:disk" sometimes instead of "ceph:ceph".
>>>>>> Weirder still is that sometimes the ceph-osd will start and work in
>>>>>> spite of the incorrect perrmissions (root:disk) and other times they
>>>>>> will fail and the logs show permissions errors when trying to access
>>>>>> the journals. Sometimes half of the /dev/dm- devices are "root:disk"
>>>>>> and others are "ceph:ceph".  There's no clear pattern, so that's what
>>>>>> leads me to think its a race condition in the ceph_disk "dmcrypt_map"
>>>>>> function.
>>>>>>
>>>>>> Is there a known issue with ceph-disk and/or ceph-osd related to
>>>>>> timing of the encrypted devices being setup and the permissions
>>>>>> getting changed to the ceph processes can access them?
>>>>>>
>>>>>> Wyllys Ingersoll
>>>>>> Keeper Technology, LLC
>>>>>> --
>>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>>>> the body of a message to majordomo@vger.kernel.org
>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>>
>>>>>
>>>>> --
>>>>> Loïc Dachary, Artisan Logiciel Libre
>>>>
>>>
>>> --
>>> Loïc Dachary, Artisan Logiciel Libre
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>
> --
> Loïc Dachary, Artisan Logiciel Libre


end of thread

Thread overview: 11+ messages
2016-11-04 14:51 ceph-disk and /dev/dm-* permissions - race condition? Wyllys Ingersoll
     [not found] ` <CADEhWsOYJeq3kVUL31fzsCmi6E2jueYQsy08OV+jXx-waqZe5w@mail.gmail.com>
2016-11-05 12:36   ` Wyllys Ingersoll
2016-11-07 20:09     ` Wyllys Ingersoll
2016-11-07 21:35       ` Loic Dachary
2016-11-22 14:48 ` Loic Dachary
2016-11-22 15:13   ` Wyllys Ingersoll
2016-11-22 17:07     ` Loic Dachary
2016-11-22 19:13       ` Wyllys Ingersoll
2016-11-22 23:33         ` Loic Dachary
2016-11-23 11:42         ` Loic Dachary
2016-11-23 15:49           ` Wyllys Ingersoll
