* ceph-disk and /dev/dm-* permissions - race condition?
@ 2016-11-04 14:51 Wyllys Ingersoll
  [not found] ` <CADEhWsOYJeq3kVUL31fzsCmi6E2jueYQsy08OV+jXx-waqZe5w@mail.gmail.com>
  2016-11-22 14:48 ` Loic Dachary
  0 siblings, 2 replies; 11+ messages in thread

From: Wyllys Ingersoll @ 2016-11-04 14:51 UTC (permalink / raw)
To: Ceph Development

We are running 10.2.3 with encrypted OSDs and journals using the old
(i.e. non-LUKS) keys, and we are seeing issues with the ceph-osd
processes after a reboot of a storage server. Our data and journals
are on separate partitions of the same disk.

After a reboot, the OSDs sometimes fail to start because of permission
problems: the /dev/dm-* devices come back owned by "root:disk" instead
of "ceph:ceph". Stranger still, sometimes a ceph-osd will start and
run despite the incorrect permissions (root:disk), while other times
it fails and the logs show permission errors when it tries to access
the journal. Sometimes half of the /dev/dm-* devices are "root:disk"
and the others are "ceph:ceph". There is no clear pattern, which is
what leads me to think it is a race condition in the ceph-disk
"dmcrypt_map" function.

Is there a known issue with ceph-disk and/or ceph-osd related to the
timing of the encrypted devices being set up and the permissions being
changed so that the ceph processes can access them?

Wyllys Ingersoll
Keeper Technology, LLC

^ permalink raw reply	[flat|nested] 11+ messages in thread
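A quick way to confirm the symptom described above on an affected node, and to
nurse a failed OSD back up by hand until a proper fix is in place (the device
name and OSD id below are examples only; which dm-* node backs which OSD
varies from host to host):

    # inspect ownership of the dmcrypt mappings
    ls -l /dev/dm-*
    # temporary manual recovery for one OSD whose mapping came up root:disk
    chown ceph:ceph /dev/dm-0
    start ceph-osd id=12     # upstart; on systemd hosts: systemctl start ceph-osd@12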
[parent not found: <CADEhWsOYJeq3kVUL31fzsCmi6E2jueYQsy08OV+jXx-waqZe5w@mail.gmail.com>]
* Re: ceph-disk and /dev/dm-* permissions - race condition?
       [not found] ` <CADEhWsOYJeq3kVUL31fzsCmi6E2jueYQsy08OV+jXx-waqZe5w@mail.gmail.com>
@ 2016-11-05 12:36   ` Wyllys Ingersoll
  2016-11-07 20:09     ` Wyllys Ingersoll
  0 siblings, 1 reply; 11+ messages in thread

From: Wyllys Ingersoll @ 2016-11-05 12:36 UTC (permalink / raw)
To: Rajib Hossen, Ceph Development

That's an interesting workaround; I may end up using it if all else fails.

I watched the permissions on the /dev/dm-* devices during the boot
process: they start out correctly as "ceph:ceph", but at the end of the
ceph-disk preparation a "ceph-disk trigger" is executed, which seems to
cause the permissions to be reset to "root:disk". The ceph-osd processes
that are already running can continue, but if they have to restart for
any reason they will fail.

It could be a problem with the udev rules for the encrypted data and
journal partitions. Debugging udev is a nightmare. I'm hoping someone
else has already solved this one.

On Sat, Nov 5, 2016 at 1:13 AM, Rajib Hossen
<rajib.hossen.ipvision@gmail.com> wrote:
> Hello,
> I had a similar issue. I solved it via a cronjob, in crontab -e:
> "@reboot chown -R ceph:ceph /dev/vdb1". Say my journal is on disk vdb,
> first partition (vdb1); vdb2 is my data disk.
>
> On Fri, Nov 4, 2016 at 8:51 PM, Wyllys Ingersoll
> <wyllys.ingersoll@keepertech.com> wrote:
>>
>> We are running 10.2.3 with encrypted OSDs and journals using the old
>> (i.e. non-LUKS) keys, and we are seeing issues with the ceph-osd
>> processes after a reboot of a storage server. Our data and journals
>> are on separate partitions of the same disk.
>>
>> After a reboot, the OSDs sometimes fail to start because of permission
>> problems: the /dev/dm-* devices come back owned by "root:disk" instead
>> of "ceph:ceph". Stranger still, sometimes a ceph-osd will start and
>> run despite the incorrect permissions (root:disk), while other times
>> it fails and the logs show permission errors when it tries to access
>> the journal. Sometimes half of the /dev/dm-* devices are "root:disk"
>> and the others are "ceph:ceph". There is no clear pattern, which is
>> what leads me to think it is a race condition in the ceph-disk
>> "dmcrypt_map" function.
>>
>> Is there a known issue with ceph-disk and/or ceph-osd related to the
>> timing of the encrypted devices being set up and the permissions being
>> changed so that the ceph processes can access them?
>>
>> Wyllys Ingersoll
>> Keeper Technology, LLC
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>

^ permalink raw reply	[flat|nested] 11+ messages in thread
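For anyone chasing the same reset, a few standard udev/shell commands make the
ownership flip visible without guessing (nothing Ceph-specific is assumed here;
/dev/dm-0 is an example node):

    # follow udev events for block devices while the OSDs activate
    udevadm monitor --udev --subsystem-match=block
    # in a second terminal, poll ownership of the dm nodes to catch the flip
    watch -n1 'ls -l /dev/dm-*'
    # afterwards, dump what udev recorded for a given mapping
    udevadm info --query=all --name=/dev/dm-0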
* Re: ceph-disk and /dev/dm-* permissions - race condition?
  2016-11-05 12:36   ` Wyllys Ingersoll
@ 2016-11-07 20:09     ` Wyllys Ingersoll
  2016-11-07 21:35       ` Loic Dachary
  0 siblings, 1 reply; 11+ messages in thread

From: Wyllys Ingersoll @ 2016-11-07 20:09 UTC (permalink / raw)
To: Rajib Hossen, Ceph Development

The workaround of putting "@reboot chown -R ceph:ceph /dev/vdb1" in
crontab doesn't work, because the /dev/dm-* devices change ownership
again after they come up.

I'm not sure of all the interactions between ceph-osd, udev and
/dev/mapper for handling encrypted partitions, but somewhere late in
the startup process, just after ceph-osd has started running, the
permissions on the /dev/dm-* devices change from ceph:ceph to
"root:disk", which makes it impossible for an OSD process to ever
restart again because it can no longer read the encrypted journal.

My workaround was to add a line to the udev 55-dm.rules file, just
before the 'GOTO="dm_end"' line towards the end of that file:

    OWNER:="ceph", GROUP:="ceph", MODE:="0660"

Even though this workaround seems to work for our situation, I still
maintain that there is a bug in the ceph-osd startup sequence that
causes the ownership to change back to "root:disk" when it should be
"ceph:ceph".

Wyllys Ingersoll
Keeper Technology, LLC

On Sat, Nov 5, 2016 at 8:36 AM, Wyllys Ingersoll
<wyllys.ingersoll@keepertech.com> wrote:
>
> That's an interesting workaround; I may end up using it if all else fails.
>
> I watched the permissions on the /dev/dm-* devices during the boot
> process: they start out correctly as "ceph:ceph", but at the end of the
> ceph-disk preparation a "ceph-disk trigger" is executed, which seems to
> cause the permissions to be reset to "root:disk". The ceph-osd processes
> that are already running can continue, but if they have to restart for
> any reason they will fail.
>
> It could be a problem with the udev rules for the encrypted data and
> journal partitions. Debugging udev is a nightmare. I'm hoping someone
> else has already solved this one.
>
> On Sat, Nov 5, 2016 at 1:13 AM, Rajib Hossen
> <rajib.hossen.ipvision@gmail.com> wrote:
> > Hello,
> > I had a similar issue. I solved it via a cronjob, in crontab -e:
> > "@reboot chown -R ceph:ceph /dev/vdb1". Say my journal is on disk vdb,
> > first partition (vdb1); vdb2 is my data disk.
> >
> > On Fri, Nov 4, 2016 at 8:51 PM, Wyllys Ingersoll
> > <wyllys.ingersoll@keepertech.com> wrote:
> >>
> >> We are running 10.2.3 with encrypted OSDs and journals using the old
> >> (i.e. non-LUKS) keys, and we are seeing issues with the ceph-osd
> >> processes after a reboot of a storage server. Our data and journals
> >> are on separate partitions of the same disk.
> >>
> >> After a reboot, the OSDs sometimes fail to start because of permission
> >> problems: the /dev/dm-* devices come back owned by "root:disk" instead
> >> of "ceph:ceph". Stranger still, sometimes a ceph-osd will start and
> >> run despite the incorrect permissions (root:disk), while other times
> >> it fails and the logs show permission errors when it tries to access
> >> the journal. Sometimes half of the /dev/dm-* devices are "root:disk"
> >> and the others are "ceph:ceph". There is no clear pattern, which is
> >> what leads me to think it is a race condition in the ceph-disk
> >> "dmcrypt_map" function.
> >>
> >> Is there a known issue with ceph-disk and/or ceph-osd related to the
> >> timing of the encrypted devices being set up and the permissions being
> >> changed so that the ceph processes can access them?
> >>
> >> Wyllys Ingersoll
> >> Keeper Technology, LLC
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> >> the body of a message to majordomo@vger.kernel.org
> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >
> >

^ permalink raw reply	[flat|nested] 11+ messages in thread
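For reference, a sketch of the 55-dm.rules change described above. The added
line forces ownership on every device-mapper node that rules file handles,
which is fine on a box where all dm devices are Ceph dmcrypt data or journal
mappings, but too broad on a host that also carries LVM or other dm volumes.
The surrounding lines are schematic; the actual contents of the file vary by
distribution and lvm2 version:

    # /lib/udev/rules.d/55-dm.rules (schematic excerpt)
    # ... existing device-mapper rules ...

    # added workaround: let ceph-osd open its dmcrypted data/journal mappings
    OWNER:="ceph", GROUP:="ceph", MODE:="0660"
    GOTO="dm_end"

    LABEL="dm_end"

Copying the edited file to /etc/udev/rules.d/55-dm.rules makes it take
precedence over the packaged copy, so the change is not lost when the lvm2
package is upgraded.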
* Re: ceph-disk and /dev/dm-* permissions - race condition?
  2016-11-07 20:09     ` Wyllys Ingersoll
@ 2016-11-07 21:35       ` Loic Dachary
  0 siblings, 0 replies; 11+ messages in thread

From: Loic Dachary @ 2016-11-07 21:35 UTC (permalink / raw)
To: Wyllys Ingersoll, Rajib Hossen, Ceph Development

Hi,

I created http://tracker.ceph.com/issues/17813 to track that issue.

On 07/11/2016 21:09, Wyllys Ingersoll wrote:
> add a line to the udev 55-dm.rules file just
> before the 'GOTO="dm_end"' line towards the end of that file:
> OWNER:="ceph", GROUP:="ceph", MODE:="0660"

It looks like the proper workaround to me. I'm not sure how that should
be packaged, though, or whether there is a better fix.

Cheers

-- 
Loïc Dachary, Artisan Logiciel Libre

^ permalink raw reply	[flat|nested] 11+ messages in thread
* Re: ceph-disk and /dev/dm-* permissions - race condition?
  2016-11-04 14:51 ceph-disk and /dev/dm-* permissions - race condition? Wyllys Ingersoll
       [not found] ` <CADEhWsOYJeq3kVUL31fzsCmi6E2jueYQsy08OV+jXx-waqZe5w@mail.gmail.com>
@ 2016-11-22 14:48   ` Loic Dachary
  2016-11-22 15:13     ` Wyllys Ingersoll
  1 sibling, 1 reply; 11+ messages in thread

From: Loic Dachary @ 2016-11-22 14:48 UTC (permalink / raw)
To: Wyllys Ingersoll, Ceph Development

Hi,

It should be enough to add After=local-fs.target to
/lib/systemd/system/ceph-disk@.service and have "ceph-disk trigger
--sync" chown ceph:ceph /dev/XXX to fix this issue (and others). Since
local-fs.target indirectly depends on dm, this ensures ceph-disk
activation will only happen after dm is finished. It is entirely
possible that the ownership is incorrect when "ceph-disk trigger
--sync" starts running, but it will no longer race with dm and it can
safely chown ceph:ceph and proceed with activation.

I'm testing this with https://github.com/ceph/ceph/pull/12136 but I'm
not sure yet if I'm missing something or if that's the right thing to do.

What do you think?

On 04/11/2016 15:51, Wyllys Ingersoll wrote:
> We are running 10.2.3 with encrypted OSDs and journals using the old
> (i.e. non-LUKS) keys, and we are seeing issues with the ceph-osd
> processes after a reboot of a storage server. Our data and journals
> are on separate partitions of the same disk.
>
> After a reboot, the OSDs sometimes fail to start because of permission
> problems: the /dev/dm-* devices come back owned by "root:disk" instead
> of "ceph:ceph". Stranger still, sometimes a ceph-osd will start and
> run despite the incorrect permissions (root:disk), while other times
> it fails and the logs show permission errors when it tries to access
> the journal. Sometimes half of the /dev/dm-* devices are "root:disk"
> and the others are "ceph:ceph". There is no clear pattern, which is
> what leads me to think it is a race condition in the ceph-disk
> "dmcrypt_map" function.
>
> Is there a known issue with ceph-disk and/or ceph-osd related to the
> timing of the encrypted devices being set up and the permissions being
> changed so that the ceph processes can access them?
>
> Wyllys Ingersoll
> Keeper Technology, LLC
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

-- 
Loïc Dachary, Artisan Logiciel Libre

^ permalink raw reply	[flat|nested] 11+ messages in thread
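On a systemd host, the same ordering can be tried locally, without patching the
packaged unit, as a drop-in override; the drop-in file name below is arbitrary,
and the pull request above remains the authoritative change to the shipped
ceph-disk@.service:

    # /etc/systemd/system/ceph-disk@.service.d/10-after-local-fs.conf (example name)
    [Unit]
    # Delay ceph-disk activation until local-fs.target is reached, which
    # indirectly waits for device-mapper setup, so the chown performed by
    # "ceph-disk trigger --sync" no longer races with dm/udev.
    After=local-fs.target

Run "systemctl daemon-reload" afterwards so the override is picked up.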
* Re: ceph-disk and /dev/dm-* permissions - race condition? 2016-11-22 14:48 ` Loic Dachary @ 2016-11-22 15:13 ` Wyllys Ingersoll 2016-11-22 17:07 ` Loic Dachary 0 siblings, 1 reply; 11+ messages in thread From: Wyllys Ingersoll @ 2016-11-22 15:13 UTC (permalink / raw) To: Loic Dachary; +Cc: Ceph Development I think that sounds reasonable, obviously more testing will be needed to verify. Our situation occurred on an Ubuntu Trusty (upstart based, not systemd) server, so I dont think this will help for non-systemd systems. On Tue, Nov 22, 2016 at 9:48 AM, Loic Dachary <loic@dachary.org> wrote: > Hi, > > It should be enough to add After=local-fs.target to /lib/systemd/system/ceph-disk@.service and have ceph-disk trigger --sync chown ceph:ceph /dev/XXX to fix this issue (and others). Since local-fs.target indirectly depends on dm, this ensures ceph disk activation will only happen after dm is finished. It is entirely possible that the ownership is incorrect when ceph-disk trigger --sync starts running, but it will no longer race with dm and it can safely chown ceph:ceph and proceed with activation. > > I'm testing this with https://github.com/ceph/ceph/pull/12136 but I'm not sure yet if I'm missing something or if that's the right thing to do. > > What do you think ? > > On 04/11/2016 15:51, Wyllys Ingersoll wrote: >> We are running 10.2.3 with encrypted OSDs and journals using the old >> (i.e. non-Luks) keys and are seeing issues with the ceph-osd processes >> after a reboot of a storage server. Our data and journals are on >> separate partitions on the same disk. >> >> After a reboot, sometimes the OSDs fail to start because of >> permissions problems. The /dev/dm-* devices come back with >> permissions set to "root:disk" sometimes instead of "ceph:ceph". >> Weirder still is that sometimes the ceph-osd will start and work in >> spite of the incorrect perrmissions (root:disk) and other times they >> will fail and the logs show permissions errors when trying to access >> the journals. Sometimes half of the /dev/dm- devices are "root:disk" >> and others are "ceph:ceph". There's no clear pattern, so that's what >> leads me to think its a race condition in the ceph_disk "dmcrypt_map" >> function. >> >> Is there a known issue with ceph-disk and/or ceph-osd related to >> timing of the encrypted devices being setup and the permissions >> getting changed to the ceph processes can access them? >> >> Wyllys Ingersoll >> Keeper Technology, LLC >> -- >> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > > -- > Loïc Dachary, Artisan Logiciel Libre ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: ceph-disk and /dev/dm-* permissions - race condition? 2016-11-22 15:13 ` Wyllys Ingersoll @ 2016-11-22 17:07 ` Loic Dachary 2016-11-22 19:13 ` Wyllys Ingersoll 0 siblings, 1 reply; 11+ messages in thread From: Loic Dachary @ 2016-11-22 17:07 UTC (permalink / raw) To: Wyllys Ingersoll; +Cc: Ceph Development On 22/11/2016 16:13, Wyllys Ingersoll wrote: > I think that sounds reasonable, obviously more testing will be needed > to verify. Our situation occurred on an Ubuntu Trusty (upstart based, > not systemd) server, so I dont think this will help for non-systemd > systems. I don't think there is a way to enforce an order with upstart. But maybe there is ? If you don't know about it I will research. > On Tue, Nov 22, 2016 at 9:48 AM, Loic Dachary <loic@dachary.org> wrote: >> Hi, >> >> It should be enough to add After=local-fs.target to /lib/systemd/system/ceph-disk@.service and have ceph-disk trigger --sync chown ceph:ceph /dev/XXX to fix this issue (and others). Since local-fs.target indirectly depends on dm, this ensures ceph disk activation will only happen after dm is finished. It is entirely possible that the ownership is incorrect when ceph-disk trigger --sync starts running, but it will no longer race with dm and it can safely chown ceph:ceph and proceed with activation. >> >> I'm testing this with https://github.com/ceph/ceph/pull/12136 but I'm not sure yet if I'm missing something or if that's the right thing to do. >> >> What do you think ? >> >> On 04/11/2016 15:51, Wyllys Ingersoll wrote: >>> We are running 10.2.3 with encrypted OSDs and journals using the old >>> (i.e. non-Luks) keys and are seeing issues with the ceph-osd processes >>> after a reboot of a storage server. Our data and journals are on >>> separate partitions on the same disk. >>> >>> After a reboot, sometimes the OSDs fail to start because of >>> permissions problems. The /dev/dm-* devices come back with >>> permissions set to "root:disk" sometimes instead of "ceph:ceph". >>> Weirder still is that sometimes the ceph-osd will start and work in >>> spite of the incorrect perrmissions (root:disk) and other times they >>> will fail and the logs show permissions errors when trying to access >>> the journals. Sometimes half of the /dev/dm- devices are "root:disk" >>> and others are "ceph:ceph". There's no clear pattern, so that's what >>> leads me to think its a race condition in the ceph_disk "dmcrypt_map" >>> function. >>> >>> Is there a known issue with ceph-disk and/or ceph-osd related to >>> timing of the encrypted devices being setup and the permissions >>> getting changed to the ceph processes can access them? >>> >>> Wyllys Ingersoll >>> Keeper Technology, LLC >>> -- >>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >>> the body of a message to majordomo@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>> >> >> -- >> Loïc Dachary, Artisan Logiciel Libre > -- Loïc Dachary, Artisan Logiciel Libre ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: ceph-disk and /dev/dm-* permissions - race condition? 2016-11-22 17:07 ` Loic Dachary @ 2016-11-22 19:13 ` Wyllys Ingersoll 2016-11-22 23:33 ` Loic Dachary 2016-11-23 11:42 ` Loic Dachary 0 siblings, 2 replies; 11+ messages in thread From: Wyllys Ingersoll @ 2016-11-22 19:13 UTC (permalink / raw) To: Loic Dachary; +Cc: Ceph Development I dont know, but making the change in the 55-dm.rules file seems to do the trick well enough for now. On Tue, Nov 22, 2016 at 12:07 PM, Loic Dachary <loic@dachary.org> wrote: > > > On 22/11/2016 16:13, Wyllys Ingersoll wrote: >> I think that sounds reasonable, obviously more testing will be needed >> to verify. Our situation occurred on an Ubuntu Trusty (upstart based, >> not systemd) server, so I dont think this will help for non-systemd >> systems. > > I don't think there is a way to enforce an order with upstart. But maybe there is ? If you don't know about it I will research. > >> On Tue, Nov 22, 2016 at 9:48 AM, Loic Dachary <loic@dachary.org> wrote: >>> Hi, >>> >>> It should be enough to add After=local-fs.target to /lib/systemd/system/ceph-disk@.service and have ceph-disk trigger --sync chown ceph:ceph /dev/XXX to fix this issue (and others). Since local-fs.target indirectly depends on dm, this ensures ceph disk activation will only happen after dm is finished. It is entirely possible that the ownership is incorrect when ceph-disk trigger --sync starts running, but it will no longer race with dm and it can safely chown ceph:ceph and proceed with activation. >>> >>> I'm testing this with https://github.com/ceph/ceph/pull/12136 but I'm not sure yet if I'm missing something or if that's the right thing to do. >>> >>> What do you think ? >>> >>> On 04/11/2016 15:51, Wyllys Ingersoll wrote: >>>> We are running 10.2.3 with encrypted OSDs and journals using the old >>>> (i.e. non-Luks) keys and are seeing issues with the ceph-osd processes >>>> after a reboot of a storage server. Our data and journals are on >>>> separate partitions on the same disk. >>>> >>>> After a reboot, sometimes the OSDs fail to start because of >>>> permissions problems. The /dev/dm-* devices come back with >>>> permissions set to "root:disk" sometimes instead of "ceph:ceph". >>>> Weirder still is that sometimes the ceph-osd will start and work in >>>> spite of the incorrect perrmissions (root:disk) and other times they >>>> will fail and the logs show permissions errors when trying to access >>>> the journals. Sometimes half of the /dev/dm- devices are "root:disk" >>>> and others are "ceph:ceph". There's no clear pattern, so that's what >>>> leads me to think its a race condition in the ceph_disk "dmcrypt_map" >>>> function. >>>> >>>> Is there a known issue with ceph-disk and/or ceph-osd related to >>>> timing of the encrypted devices being setup and the permissions >>>> getting changed to the ceph processes can access them? >>>> >>>> Wyllys Ingersoll >>>> Keeper Technology, LLC >>>> -- >>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >>>> the body of a message to majordomo@vger.kernel.org >>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>>> >>> >>> -- >>> Loïc Dachary, Artisan Logiciel Libre >> > > -- > Loïc Dachary, Artisan Logiciel Libre ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: ceph-disk and /dev/dm-* permissions - race condition? 2016-11-22 19:13 ` Wyllys Ingersoll @ 2016-11-22 23:33 ` Loic Dachary 2016-11-23 11:42 ` Loic Dachary 1 sibling, 0 replies; 11+ messages in thread From: Loic Dachary @ 2016-11-22 23:33 UTC (permalink / raw) To: Wyllys Ingersoll; +Cc: Ceph Development On 22/11/2016 20:13, Wyllys Ingersoll wrote: > I dont know, but making the change in the 55-dm.rules file seems to do > the trick well enough for now. It does. But there does not seem to be a way to package this workaround. This is the reason why I'm trying to find another fix. Cheers > > On Tue, Nov 22, 2016 at 12:07 PM, Loic Dachary <loic@dachary.org> wrote: >> >> >> On 22/11/2016 16:13, Wyllys Ingersoll wrote: >>> I think that sounds reasonable, obviously more testing will be needed >>> to verify. Our situation occurred on an Ubuntu Trusty (upstart based, >>> not systemd) server, so I dont think this will help for non-systemd >>> systems. >> >> I don't think there is a way to enforce an order with upstart. But maybe there is ? If you don't know about it I will research. >> >>> On Tue, Nov 22, 2016 at 9:48 AM, Loic Dachary <loic@dachary.org> wrote: >>>> Hi, >>>> >>>> It should be enough to add After=local-fs.target to /lib/systemd/system/ceph-disk@.service and have ceph-disk trigger --sync chown ceph:ceph /dev/XXX to fix this issue (and others). Since local-fs.target indirectly depends on dm, this ensures ceph disk activation will only happen after dm is finished. It is entirely possible that the ownership is incorrect when ceph-disk trigger --sync starts running, but it will no longer race with dm and it can safely chown ceph:ceph and proceed with activation. >>>> >>>> I'm testing this with https://github.com/ceph/ceph/pull/12136 but I'm not sure yet if I'm missing something or if that's the right thing to do. >>>> >>>> What do you think ? >>>> >>>> On 04/11/2016 15:51, Wyllys Ingersoll wrote: >>>>> We are running 10.2.3 with encrypted OSDs and journals using the old >>>>> (i.e. non-Luks) keys and are seeing issues with the ceph-osd processes >>>>> after a reboot of a storage server. Our data and journals are on >>>>> separate partitions on the same disk. >>>>> >>>>> After a reboot, sometimes the OSDs fail to start because of >>>>> permissions problems. The /dev/dm-* devices come back with >>>>> permissions set to "root:disk" sometimes instead of "ceph:ceph". >>>>> Weirder still is that sometimes the ceph-osd will start and work in >>>>> spite of the incorrect perrmissions (root:disk) and other times they >>>>> will fail and the logs show permissions errors when trying to access >>>>> the journals. Sometimes half of the /dev/dm- devices are "root:disk" >>>>> and others are "ceph:ceph". There's no clear pattern, so that's what >>>>> leads me to think its a race condition in the ceph_disk "dmcrypt_map" >>>>> function. >>>>> >>>>> Is there a known issue with ceph-disk and/or ceph-osd related to >>>>> timing of the encrypted devices being setup and the permissions >>>>> getting changed to the ceph processes can access them? 
>>>>> >>>>> Wyllys Ingersoll >>>>> Keeper Technology, LLC >>>>> -- >>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in >>>>> the body of a message to majordomo@vger.kernel.org >>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>>>> >>>> >>>> -- >>>> Loïc Dachary, Artisan Logiciel Libre >>> >> >> -- >> Loïc Dachary, Artisan Logiciel Libre > -- Loïc Dachary, Artisan Logiciel Libre ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: ceph-disk and /dev/dm-* permissions - race condition?
  2016-11-22 19:13       ` Wyllys Ingersoll
  2016-11-22 23:33         ` Loic Dachary
@ 2016-11-23 11:42         ` Loic Dachary
  2016-11-23 15:49           ` Wyllys Ingersoll
  1 sibling, 1 reply; 11+ messages in thread

From: Loic Dachary @ 2016-11-23 11:42 UTC (permalink / raw)
To: Wyllys Ingersoll; +Cc: Ceph Development

I think that could work as well: in ceph-disk.conf

    description "ceph-disk async worker"

    start on (ceph-disk and local-filesystems)

    instance $dev/$pid
    export dev
    export pid

    exec flock /var/lock/ceph-disk -c 'ceph-disk --verbose --log-stdout trigger --sync $dev'

together with
https://github.com/ceph/ceph/pull/12136/commits/72f0b2aa1eb4b7b2a2222c2847d26f99400a8374

What do you say?

On 22/11/2016 20:13, Wyllys Ingersoll wrote:
> I don't know, but making the change in the 55-dm.rules file seems to do
> the trick well enough for now.
>
> On Tue, Nov 22, 2016 at 12:07 PM, Loic Dachary <loic@dachary.org> wrote:
>>
>> On 22/11/2016 16:13, Wyllys Ingersoll wrote:
>>> I think that sounds reasonable; obviously more testing will be needed
>>> to verify. Our situation occurred on an Ubuntu Trusty (upstart-based,
>>> not systemd) server, so I don't think this will help for non-systemd
>>> systems.
>>
>> I don't think there is a way to enforce an order with upstart. But maybe
>> there is? If you don't know about it I will research.
>>
>>> On Tue, Nov 22, 2016 at 9:48 AM, Loic Dachary <loic@dachary.org> wrote:
>>>> Hi,
>>>>
>>>> It should be enough to add After=local-fs.target to
>>>> /lib/systemd/system/ceph-disk@.service and have "ceph-disk trigger
>>>> --sync" chown ceph:ceph /dev/XXX to fix this issue (and others). Since
>>>> local-fs.target indirectly depends on dm, this ensures ceph-disk
>>>> activation will only happen after dm is finished. It is entirely
>>>> possible that the ownership is incorrect when "ceph-disk trigger
>>>> --sync" starts running, but it will no longer race with dm and it can
>>>> safely chown ceph:ceph and proceed with activation.
>>>>
>>>> I'm testing this with https://github.com/ceph/ceph/pull/12136 but I'm
>>>> not sure yet if I'm missing something or if that's the right thing to do.
>>>>
>>>> What do you think?
>>>>
>>>> On 04/11/2016 15:51, Wyllys Ingersoll wrote:
>>>>> We are running 10.2.3 with encrypted OSDs and journals using the old
>>>>> (i.e. non-LUKS) keys, and we are seeing issues with the ceph-osd
>>>>> processes after a reboot of a storage server. Our data and journals
>>>>> are on separate partitions of the same disk.
>>>>>
>>>>> After a reboot, the OSDs sometimes fail to start because of permission
>>>>> problems: the /dev/dm-* devices come back owned by "root:disk" instead
>>>>> of "ceph:ceph". Stranger still, sometimes a ceph-osd will start and
>>>>> run despite the incorrect permissions (root:disk), while other times
>>>>> it fails and the logs show permission errors when it tries to access
>>>>> the journal. Sometimes half of the /dev/dm-* devices are "root:disk"
>>>>> and the others are "ceph:ceph". There is no clear pattern, which is
>>>>> what leads me to think it is a race condition in the ceph-disk
>>>>> "dmcrypt_map" function.
>>>>>
>>>>> Is there a known issue with ceph-disk and/or ceph-osd related to the
>>>>> timing of the encrypted devices being set up and the permissions being
>>>>> changed so that the ceph processes can access them?
>>>>>
>>>>> Wyllys Ingersoll
>>>>> Keeper Technology, LLC
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>>> the body of a message to majordomo@vger.kernel.org
>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>
>>>> --
>>>> Loïc Dachary, Artisan Logiciel Libre
>>>
>>
>> --
>> Loïc Dachary, Artisan Logiciel Libre
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

-- 
Loïc Dachary, Artisan Logiciel Libre

^ permalink raw reply	[flat|nested] 11+ messages in thread
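A quick way to sanity-check an edited upstart job like the one sketched above
before rebooting, using the standard upstart tooling on trusty (the job file is
assumed to live at /etc/init/ceph-disk.conf):

    init-checkconf /etc/init/ceph-disk.conf    # syntax-check the job definition
    initctl list | grep ceph                   # after boot, confirm the ceph jobs ran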
* Re: ceph-disk and /dev/dm-* permissions - race condition?
  2016-11-23 11:42         ` Loic Dachary
@ 2016-11-23 15:49           ` Wyllys Ingersoll
  0 siblings, 0 replies; 11+ messages in thread

From: Wyllys Ingersoll @ 2016-11-23 15:49 UTC (permalink / raw)
To: Loic Dachary; +Cc: Ceph Development

That doesn't appear to work on 10.2.3 (with the modified
ceph-disk/main.py from your fix above). I think it ends up trying to
access the /dev/mapper/UUID files before they have been established, so
the ceph-osd starter process fails because there are no mapped dm
partitions yet. Adding 'local-filesystems' to the "start on" line is
forcing it to start too soon, I think.

I see these errors in the upstart log ceph-osd-all-starter.log:

ceph-disk: Cannot discover filesystem type: device /dev/mapper/00457719-b9b0-4cd0-a912-8e6e5efff7cd: Command '/sbin/blkid' returned non-zero exit status 2
ceph-disk: Cannot discover filesystem type: device /dev/mapper/eb056779-7bd0-4768-86cb-d757174a2046: Command '/sbin/blkid' returned non-zero exit status 2
ceph-disk: Cannot discover filesystem type: device /dev/mapper/f1300502-1143-4c91-b43c-051342b36933: Command '/sbin/blkid' returned non-zero exit status 2
ceph-disk: Error: One or more partitions failed to activate

On Wed, Nov 23, 2016 at 6:42 AM, Loic Dachary <loic@dachary.org> wrote:
> I think that could work as well: in ceph-disk.conf
>
>     description "ceph-disk async worker"
>
>     start on (ceph-disk and local-filesystems)
>
>     instance $dev/$pid
>     export dev
>     export pid
>
>     exec flock /var/lock/ceph-disk -c 'ceph-disk --verbose --log-stdout trigger --sync $dev'
>
> together with
> https://github.com/ceph/ceph/pull/12136/commits/72f0b2aa1eb4b7b2a2222c2847d26f99400a8374
>
> What do you say?
>
> On 22/11/2016 20:13, Wyllys Ingersoll wrote:
>> I don't know, but making the change in the 55-dm.rules file seems to do
>> the trick well enough for now.
>>
>> On Tue, Nov 22, 2016 at 12:07 PM, Loic Dachary <loic@dachary.org> wrote:
>>>
>>> On 22/11/2016 16:13, Wyllys Ingersoll wrote:
>>>> I think that sounds reasonable; obviously more testing will be needed
>>>> to verify. Our situation occurred on an Ubuntu Trusty (upstart-based,
>>>> not systemd) server, so I don't think this will help for non-systemd
>>>> systems.
>>>
>>> I don't think there is a way to enforce an order with upstart. But maybe
>>> there is? If you don't know about it I will research.
>>>
>>>> On Tue, Nov 22, 2016 at 9:48 AM, Loic Dachary <loic@dachary.org> wrote:
>>>>> Hi,
>>>>>
>>>>> It should be enough to add After=local-fs.target to
>>>>> /lib/systemd/system/ceph-disk@.service and have "ceph-disk trigger
>>>>> --sync" chown ceph:ceph /dev/XXX to fix this issue (and others). Since
>>>>> local-fs.target indirectly depends on dm, this ensures ceph-disk
>>>>> activation will only happen after dm is finished. It is entirely
>>>>> possible that the ownership is incorrect when "ceph-disk trigger
>>>>> --sync" starts running, but it will no longer race with dm and it can
>>>>> safely chown ceph:ceph and proceed with activation.
>>>>>
>>>>> I'm testing this with https://github.com/ceph/ceph/pull/12136 but I'm
>>>>> not sure yet if I'm missing something or if that's the right thing to do.
>>>>>
>>>>> What do you think?
>>>>>
>>>>> On 04/11/2016 15:51, Wyllys Ingersoll wrote:
>>>>>> We are running 10.2.3 with encrypted OSDs and journals using the old
>>>>>> (i.e. non-LUKS) keys, and we are seeing issues with the ceph-osd
>>>>>> processes after a reboot of a storage server. Our data and journals
>>>>>> are on separate partitions of the same disk.
>>>>>>
>>>>>> After a reboot, the OSDs sometimes fail to start because of permission
>>>>>> problems: the /dev/dm-* devices come back owned by "root:disk" instead
>>>>>> of "ceph:ceph". Stranger still, sometimes a ceph-osd will start and
>>>>>> run despite the incorrect permissions (root:disk), while other times
>>>>>> it fails and the logs show permission errors when it tries to access
>>>>>> the journal. Sometimes half of the /dev/dm-* devices are "root:disk"
>>>>>> and the others are "ceph:ceph". There is no clear pattern, which is
>>>>>> what leads me to think it is a race condition in the ceph-disk
>>>>>> "dmcrypt_map" function.
>>>>>>
>>>>>> Is there a known issue with ceph-disk and/or ceph-osd related to the
>>>>>> timing of the encrypted devices being set up and the permissions being
>>>>>> changed so that the ceph processes can access them?
>>>>>>
>>>>>> Wyllys Ingersoll
>>>>>> Keeper Technology, LLC
>>>>>> --
>>>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>>>> the body of a message to majordomo@vger.kernel.org
>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>
>>>>> --
>>>>> Loïc Dachary, Artisan Logiciel Libre
>>>>
>>>
>>> --
>>> Loïc Dachary, Artisan Logiciel Libre
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>
> --
> Loïc Dachary, Artisan Logiciel Libre

^ permalink raw reply	[flat|nested] 11+ messages in thread
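The failure above suggests activation fires before the dmcrypt mappings exist.
A rough way to confirm that by hand on an affected node (the UUID is taken from
the log above; the timeout is an example value, and this is a diagnostic
sketch, not the fix):

    #!/bin/sh
    # Wait up to 60s for one dmcrypt mapping to appear, then inspect it.
    uuid=00457719-b9b0-4cd0-a912-8e6e5efff7cd
    for i in $(seq 1 60); do
        [ -b "/dev/mapper/$uuid" ] && break
        sleep 1
    done
    ls -l "/dev/mapper/$uuid"
    blkid "/dev/mapper/$uuid"   # exit status 2 generally means blkid could not identify the device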