* teuthology SELinux failures
@ 2017-05-31 20:23 Yehuda Sadeh-Weinraub
  2017-05-31 21:12 ` John Spray
  2017-06-01 15:33 ` Boris Ranto
  0 siblings, 2 replies; 11+ messages in thread
From: Yehuda Sadeh-Weinraub @ 2017-05-31 20:23 UTC (permalink / raw)
  To: Boris Ranto; +Cc: ceph-devel

We started seeing SELinux-related failures in recent teuthology runs, e.g.:
http://pulpito.ceph.com/yehudasa-2017-05-30_14:55:10-rgw-wip-rgw-mdsearch---basic-smithi/

They seem to be unrelated to the runs themselves; possibly postfix,
which is running in the background, is triggering them. Any idea what we
should do there?

Yehuda


* Re: teuthology SELinux failures
  2017-05-31 20:23 teuthology SELinux failures Yehuda Sadeh-Weinraub
@ 2017-05-31 21:12 ` John Spray
  2017-06-06 13:26   ` John Spray
  2017-06-01 15:33 ` Boris Ranto
  1 sibling, 1 reply; 11+ messages in thread
From: John Spray @ 2017-05-31 21:12 UTC (permalink / raw)
  To: Yehuda Sadeh-Weinraub; +Cc: Boris Ranto, ceph-devel

On Wed, May 31, 2017 at 9:23 PM, Yehuda Sadeh-Weinraub
<yehuda@redhat.com> wrote:
> We started seeing SELinux-related failures in recent teuthology runs, e.g.:
> http://pulpito.ceph.com/yehudasa-2017-05-30_14:55:10-rgw-wip-rgw-mdsearch---basic-smithi/
>
> They seem to be unrelated to the runs themselves; possibly postfix,
> which is running in the background, is triggering them. Any idea what we
> should do there?

My assumption had been that the new failures came from a kernel
change, because I had two runs overnight, one with "-k distro" and one
with "-k testing", and only the testing one had the masses of SELinux
failures.

Possibly these rgw jobs did not have a -k flag and therefore ended up
with the testing kernel that had been installed by previous jobs.

I have no insight about the nature of the selinux breakage itself though!

John

>
> Yehuda
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: teuthology SELinux failures
  2017-05-31 20:23 teuthology SELinux failures Yehuda Sadeh-Weinraub
  2017-05-31 21:12 ` John Spray
@ 2017-06-01 15:33 ` Boris Ranto
  2017-06-01 16:15   ` Vasu Kulkarni
  1 sibling, 1 reply; 11+ messages in thread
From: Boris Ranto @ 2017-06-01 15:33 UTC (permalink / raw)
  To: Yehuda Sadeh-Weinraub; +Cc: ceph-devel

I did not check all of the failed tests, but those that I checked
complained about dac_read_search. A denial from the dac_* family of
capabilities means that root is trying to access a file that the
standard permissions do not allow root to access (e.g. a file with
mode 600 and ceph/ceph user/group).
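
A minimal sketch of the situation described above, written in Python
purely for illustration; it is not teuthology or kernel code. It models
only the classic owner/group/other permission bits (no ACLs), and the
uid/gid value 167 used for ceph is an assumption. When the bits deny
access, root can still get in only by exercising CAP_DAC_READ_SEARCH or
CAP_DAC_OVERRIDE, and those are the capabilities that SELinux reports
on as dac_read_search and dac_override.

```python
import stat


def dac_allows_read(mode, file_uid, file_gid, uid, gids):
    """Return True if the plain permission bits grant read access.

    Exactly one class applies, checked in this order: owner, group, other.
    """
    if uid == file_uid:
        return bool(mode & stat.S_IRUSR)
    if file_gid in gids:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)


def needs_dac_capability(mode, file_uid, file_gid, uid=0, gids=(0,)):
    """Return True if the caller (root by default) must bypass the bits."""
    return not dac_allows_read(mode, file_uid, file_gid, uid, gids)


CEPH_ID = 167  # assumed uid/gid of the ceph user, for illustration only

# Mode 600, ceph/ceph: root is neither owner nor in the group, and the
# "other" bits are empty, so a capability is required.
print(needs_dac_capability(0o600, CEPH_ID, CEPH_ID))
# Mode 644, ceph/ceph: the "other" read bit is set, no capability needed.
print(needs_dac_capability(0o644, CEPH_ID, CEPH_ID))
```

The first call models the 600 ceph/ceph case and the second a
world-readable file, so only the first one requires a capability.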

However, there are a lot of dac_* failures throughout the system, and
the target contexts differ between these files (i.e. there would
have to be a lot of files like that), so I am inclined to say that this
is a kernel bug, especially considering that it does not show up on
older/stock kernels, which already support dac_override.

Anyway, it should be safe to ignore these (not our processes, not our
files...)

Regards,
Boris
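
One way to check the observation that the denials are spread over many
target contexts is to tally them per permission and target context. The
following is a rough Python sketch, assuming the usual
"avc:  denied  { ... } ... tcontext=..." layout of audit messages. The
sample lines are hypothetical and were not taken from any of the runs
in this thread.

```python
import re
from collections import Counter

# Captures the denied permission list and the target context of one AVC line.
AVC_RE = re.compile(
    r"avc:\s+denied\s+\{\s*(?P<perms>[^}]*?)\s*\}.*?\btcontext=(?P<tcontext>\S+)"
)


def tally_denials(lines):
    """Count denials per (permission, target context) pair."""
    counts = Counter()
    for line in lines:
        match = AVC_RE.search(line)
        if match is None:
            continue  # not an AVC denial line
        for perm in match.group("perms").split():
            counts[(perm, match.group("tcontext"))] += 1
    return counts


# Hypothetical sample input, for illustration only.
SAMPLE = [
    'avc:  denied  { dac_read_search } for  pid=101 comm="a" capability=2 '
    "scontext=u:r:a_t:s0 tcontext=u:r:a_t:s0 tclass=capability",
    'avc:  denied  { dac_read_search } for  pid=102 comm="b" capability=2 '
    "scontext=u:r:b_t:s0 tcontext=u:r:b_t:s0 tclass=capability",
    "some unrelated log line",
]

for (perm, tcontext), count in sorted(tally_denials(SAMPLE).items()):
    print(perm, tcontext, count)
```

Many distinct contexts for the same dac_* permission would fit the
kernel-bug reading better than a single mislabeled file would.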



* Re: teuthology SELinux failures
  2017-06-01 15:33 ` Boris Ranto
@ 2017-06-01 16:15   ` Vasu Kulkarni
  2017-06-01 16:39     ` John Spray
  0 siblings, 1 reply; 11+ messages in thread
From: Vasu Kulkarni @ 2017-06-01 16:15 UTC (permalink / raw)
  To: Boris Ranto; +Cc: Yehuda Sadeh-Weinraub, ceph-devel

I believe the nodes are pretty much messed up with various kernel
versions. Ideally we should not be seeing this on the distro kernel (as
per Boris's comment), so we should probably just schedule the kernel
client tests on vps, so that the smithi and mira nodes have only the
latest distro kernels and the other suites are tested correctly.

Also, it would be nice to automatically reimage the nodes back to
distro once a week.


* Re: teuthology SELinux failures
  2017-06-01 16:15   ` Vasu Kulkarni
@ 2017-06-01 16:39     ` John Spray
  2017-06-01 16:50       ` Yuri Weinstein
  0 siblings, 1 reply; 11+ messages in thread
From: John Spray @ 2017-06-01 16:39 UTC (permalink / raw)
  To: Vasu Kulkarni; +Cc: Boris Ranto, Yehuda Sadeh-Weinraub, ceph-devel

On Thu, Jun 1, 2017 at 5:15 PM, Vasu Kulkarni <vakulkar@redhat.com> wrote:
> I believe the nodes are pretty much messed up with various kernel
> versions. Ideally we should not be seeing this on the distro kernel (as
> per Boris's comment), so we should probably just schedule the kernel
> client tests on vps, so that the smithi and mira nodes have only the
> latest distro kernels and the other suites are tested correctly.

We do need to be able to run with "-k testing" for the kcephfs, knfs and
multimds suites (this has always been the case), and I don't think
moving all of those off our main test nodes would be realistic.

Folks can already use "-k distro" on their runs (I have used this on
all my fs suite runs for a long time). It would probably make sense
for teuthology to use that as the default so that people are not
getting a random kernel.

John


* Re: teuthology SELinux failures
  2017-06-01 16:39     ` John Spray
@ 2017-06-01 16:50       ` Yuri Weinstein
  2017-06-02  2:44         ` Nathan Cutler
  2017-06-02 15:31         ` Yuri Weinstein
  0 siblings, 2 replies; 11+ messages in thread
From: Yuri Weinstein @ 2017-06-01 16:50 UTC (permalink / raw)
  To: John Spray; +Cc: Vasu Kulkarni, Boris Ranto, Yehuda Sadeh-Weinraub, ceph-devel

All k* suites run with -k testing in the nightlies.


* Re: teuthology SELinux failures
  2017-06-01 16:50       ` Yuri Weinstein
@ 2017-06-02  2:44         ` Nathan Cutler
  2017-06-02 15:31         ` Yuri Weinstein
  1 sibling, 0 replies; 11+ messages in thread
From: Nathan Cutler @ 2017-06-02  2:44 UTC (permalink / raw)
  To: Yuri Weinstein, John Spray
  Cc: Vasu Kulkarni, Boris Ranto, Yehuda Sadeh-Weinraub, ceph-devel

> all k* suites run with -k testing in nightlies

What I hear John saying is you should add an explicit "-k distro" to all 
the other suites in the nightlies.

We had the same issue in backports testing, and fixed it by specifying 
"-k distro" on all teuthology-suite commands we use, except for the k* 
suites of course.

Not specifying "-k" is the equivalent of saying "I don't care, just use 
whatever kernel happens to be on each test node".

Nathan
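
A small Python sketch of the kind of audit this implies: flag every
scheduling command that does not name a kernel explicitly. It only looks
for the "-k" spelling used in this thread, and the command lines below
are hypothetical examples, not the real crontab entries.

```python
import shlex


def kernel_flag(command):
    """Return the value given to -k, or None if the flag is absent."""
    tokens = shlex.split(command)
    for index, token in enumerate(tokens):
        if token == "-k" and index + 1 < len(tokens):
            return tokens[index + 1]
    return None


def missing_kernel_flag(commands):
    """Return the commands that would run on whatever kernel is installed."""
    return [command for command in commands if kernel_flag(command) is None]


# Hypothetical schedule entries, for illustration only.
COMMANDS = [
    "teuthology-suite -s rgw -k distro",
    "teuthology-suite -s kcephfs -k testing",
    "teuthology-suite -s rados",
]

for command in missing_kernel_flag(COMMANDS):
    print("no explicit kernel:", command)
```

With these sample entries only the rados line is reported, since it is
the only one that leaves the kernel choice to chance.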


* Re: teuthology SELinux failures
  2017-06-01 16:50       ` Yuri Weinstein
  2017-06-02  2:44         ` Nathan Cutler
@ 2017-06-02 15:31         ` Yuri Weinstein
  1 sibling, 0 replies; 11+ messages in thread
From: Yuri Weinstein @ 2017-06-02 15:31 UTC (permalink / raw)
  To: John Spray; +Cc: Vasu Kulkarni, Boris Ranto, Yehuda Sadeh-Weinraub, ceph-devel

I reviewed the crontab schedules, and all suites now use "-k" explicitly.


* Re: teuthology SELinux failures
  2017-05-31 21:12 ` John Spray
@ 2017-06-06 13:26   ` John Spray
  2017-06-06 13:55     ` Ilya Dryomov
  0 siblings, 1 reply; 11+ messages in thread
From: John Spray @ 2017-06-06 13:26 UTC (permalink / raw)
  To: 严正, Ilya Dryomov; +Cc: ceph-devel

Adding Ilya and Zheng -- do you guys happen to know what might have
changed in the current testing kernel to cause all these SELinux
issues?  I'm still seeing them in my latest runs.

Here's a recent example:
http://qa-proxy.ceph.com/teuthology/jspray-2017-06-05_15:00:03-multimds-wip-jcsp-testing-20170604-testing-basic-smithi/1260536/teuthology.log

John


* Re: teuthology SELinux failures
  2017-06-06 13:26   ` John Spray
@ 2017-06-06 13:55     ` Ilya Dryomov
  2017-06-06 16:42       ` Vasu Kulkarni
  0 siblings, 1 reply; 11+ messages in thread
From: Ilya Dryomov @ 2017-06-06 13:55 UTC (permalink / raw)
  To: John Spray; +Cc: 严正, Ilya Dryomov, ceph-devel

On Tue, Jun 6, 2017 at 3:26 PM, John Spray <jspray@redhat.com> wrote:
> Adding Ilya and Zheng -- do you guys happen to know what might have
> changed in the current testing kernel to cause all these SELinux
> issues?  I'm still seeing them in my latest runs.
>
> Here's a recent example:
> http://qa-proxy.ceph.com/teuthology/jspray-2017-06-05_15:00:03-multimds-wip-jcsp-testing-20170604-testing-basic-smithi/1260536/teuthology.log

A quick grep points at

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2a4c22426955d4fc04069811997b7390c0fb858e

and that commit is blamed in

https://bugzilla.redhat.com/show_bug.cgi?id=1449108

Thanks,

                Ilya


* Re: teuthology SELinux failures
  2017-06-06 13:55     ` Ilya Dryomov
@ 2017-06-06 16:42       ` Vasu Kulkarni
  0 siblings, 0 replies; 11+ messages in thread
From: Vasu Kulkarni @ 2017-06-06 16:42 UTC (permalink / raw)
  To: Ilya Dryomov; +Cc: John Spray, 严正, Ilya Dryomov, ceph-devel

This is a different issue from the one in the original thread. I have this:
https://github.com/ceph/teuthology/pull/1076 (if you want to test
with that commit, it will help; otherwise I am not sure we are seeing
this in smoke)

