linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH v8 3/5] mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC
       [not found] ` <20221215001205.51969-4-jeffxu@google.com>
@ 2023-06-28 11:42   ` Dominique Martinet
  2023-06-28 19:31     ` Dominique Martinet
  2023-06-29  4:13     ` Jeff Xu
  0 siblings, 2 replies; 6+ messages in thread
From: Dominique Martinet @ 2023-06-28 11:42 UTC (permalink / raw)
  To: jeffxu
  Cc: skhan, keescook, akpm, dmitry.torokhov, dverkamp, hughd, jeffxu,
	jorgelo, linux-kernel, linux-kselftest, linux-mm, jannh,
	linux-hardening, linux-security-module, kernel test robot

jeffxu@chromium.org wrote on Thu, Dec 15, 2022 at 12:12:03AM +0000:
> From: Jeff Xu <jeffxu@google.com>
> 
> The new MFD_NOEXEC_SEAL and MFD_EXEC flags allows application to
> set executable bit at creation time (memfd_create).
> 
> When MFD_NOEXEC_SEAL is set, memfd is created without executable bit
> (mode:0666), and sealed with F_SEAL_EXEC, so it can't be chmod to
> be executable (mode: 0777) after creation.
> 
> when MFD_EXEC flag is set, memfd is created with executable bit
> (mode:0777), this is the same as the old behavior of memfd_create.
> 
> The new pid namespaced sysctl vm.memfd_noexec has 3 values:
> 0: memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL acts like
>         MFD_EXEC was set.
> 1: memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL acts like
>         MFD_NOEXEC_SEAL was set.
> 2: memfd_create() without MFD_NOEXEC_SEAL will be rejected.

So, erm, I'm a bit late to the party but I was just looking at a way of
blocking memfd_create+exec in a container and this sounded perfect: my
reading is that this is a security feature meant to be set for
container's namespaces that'd totally disable something like
memfd_create followed by fexecve (because we don't want weird binaries
coming from who knows where to be executed on a shiny secure system),
but. . . is this actually supposed to work?
(see below)

> [...]
> --- a/mm/memfd.c
> +++ b/mm/memfd.c
> @@ -263,12 +264,14 @@ long memfd_fcntl(struct file *file, unsigned int cmd, unsigned long arg)
>  #define MFD_NAME_PREFIX_LEN (sizeof(MFD_NAME_PREFIX) - 1)
>  #define MFD_NAME_MAX_LEN (NAME_MAX - MFD_NAME_PREFIX_LEN)
>  
> -#define MFD_ALL_FLAGS (MFD_CLOEXEC | MFD_ALLOW_SEALING | MFD_HUGETLB)
> +#define MFD_ALL_FLAGS (MFD_CLOEXEC | MFD_ALLOW_SEALING | MFD_HUGETLB | MFD_NOEXEC_SEAL | MFD_EXEC)
>  
>  SYSCALL_DEFINE2(memfd_create,
>  		const char __user *, uname,
>  		unsigned int, flags)
>  {
> +	char comm[TASK_COMM_LEN];
> +	struct pid_namespace *ns;
>  	unsigned int *file_seals;
>  	struct file *file;
>  	int fd, error;
> @@ -285,6 +288,39 @@ SYSCALL_DEFINE2(memfd_create,
>  			return -EINVAL;
>  	}
>  
> +	/* Invalid if both EXEC and NOEXEC_SEAL are set.*/
> +	if ((flags & MFD_EXEC) && (flags & MFD_NOEXEC_SEAL))
> +		return -EINVAL;
> +
> +	if (!(flags & (MFD_EXEC | MFD_NOEXEC_SEAL))) {
> +		[code that checks the sysctl]

If flags already has either MFD_EXEC or MFD_NOEXEC_SEAL, you don't check
the sysctl at all.

This can be verified easily:
-----
$ cat > memfd_exec.c <<'EOF'
#define _GNU_SOURCE

#include <errno.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>

#ifndef MFD_EXEC
#define MFD_EXEC                0x0010U
#endif

int main() { 
    int fd = memfd_create("script", MFD_EXEC);
    if (fd == -1)l
        perror("memfd");
    
    char prog[] = "#!/bin/sh\necho Ran script\n";
    if (write(fd, prog, sizeof(prog)-1) != sizeof(prog)-1)
            perror("write");
    
    char *const argv[] = { "script", NULL };
    char *const envp[] = { NULL };
    fexecve(fd, argv, envp);
    perror("fexecve");
}   
EOF
$ gcc -o memfd_exec memfd_exec.c
$ ./memfd_exec
Ran script
$ sysctl vm.memfd_noexec
vm.memfd_noexec = 2
-----
(as opposed to failing hard on memfd_create if flag unset on sysctl=2,
and failing on fexecve with flag unset and sysctl=1)

What am I missing?



BTW I find the current behaviour rather hard to use: setting this to 2
should still set NOEXEC by default in my opinion, just refuse anything
that explicitly requested EXEC.

Sure there's a warn_once that memfd_create was used without seal, but
right now on my system it's "used up" 5 seconds after boot by systemd:
[    5.854378] memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL, pid=1 'systemd'

And anyway, older kernels will barf up EINVAL when calling memfd_create
with MFD_NOEXEC_SEAL, so even if userspace will want to adapt they'll
need to try calling memfd_create with the flag once and retry on EINVAL,
which let's face it is going to take a while to happen.
(Also, the flag has been added to glibc, but not in any release yet)

Making calls default to noexec AND refuse exec does what you want
(forbid use of exec in an app that wasn't in a namespace that allows
exec) while allowing apps that require it to work; that sounds better
than making all applications that haven't taken the pain of adding the
new flag to me.
Well, I guess an app that did require exec without setting the flag will
fail in a weird place instead of failing at memfd_create and having a
chance to fallback, so it's not like it doesn't make any sense;
I don't have such strong feelings about this if the sysctl works, but
for my use case I'm more likely to want to take a chance at memfd_create
not needing exec than having the flag set. Perhaps a third value if I
cared enough...

-- 
Dominique Martinet | Asmadeus


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v8 3/5] mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC
  2023-06-28 11:42   ` [PATCH v8 3/5] mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC Dominique Martinet
@ 2023-06-28 19:31     ` Dominique Martinet
  2023-06-29  4:33       ` Jeff Xu
  2023-06-29  4:13     ` Jeff Xu
  1 sibling, 1 reply; 6+ messages in thread
From: Dominique Martinet @ 2023-06-28 19:31 UTC (permalink / raw)
  To: jeffxu
  Cc: skhan, keescook, akpm, dmitry.torokhov, dverkamp, hughd, jeffxu,
	jorgelo, linux-kernel, linux-kselftest, linux-mm, jannh,
	linux-hardening, linux-security-module, kernel test robot

Dominique Martinet wrote on Wed, Jun 28, 2023 at 08:42:41PM +0900:
> If flags already has either MFD_EXEC or MFD_NOEXEC_SEAL, you don't check
> the sysctl at all.
> [...repro snipped..]
> 
> What am I missing?

(Perhaps the intent is just to force people to use the flag so it is
easier to check for memfd_create in seccomp or other LSM?
But I don't see why such a check couldn't consider the absence of a flag
as well, so I don't see the point.)


> BTW I find the current behaviour rather hard to use: setting this to 2
> should still set NOEXEC by default in my opinion, just refuse anything
> that explicitly requested EXEC.

And I just noticed it's not possible to lower the value despite having
CAP_SYS_ADMIN: what the heck?! I have never seen such a sysctl and it
just forced me to reboot because I willy-nilly tested in the init pid
namespace, and quite a few applications that don't require exec broke
exactly as I described below.

If the user has CAP_SYS_ADMIN there are more container escape methods
than I can count, this is basically free pass to root on main namespace
anyway, you're not protecting anything. Please let people set the sysctl
to what they want.

> Sure there's a warn_once that memfd_create was used without seal, but
> right now on my system it's "used up" 5 seconds after boot by systemd:
> [    5.854378] memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL, pid=1 'systemd'
> 
> And anyway, older kernels will barf up EINVAL when calling memfd_create
> with MFD_NOEXEC_SEAL, so even if userspace will want to adapt they'll
> need to try calling memfd_create with the flag once and retry on EINVAL,
> which let's face it is going to take a while to happen.
> (Also, the flag has been added to glibc, but not in any release yet)
> 
> Making calls default to noexec AND refuse exec does what you want
> (forbid use of exec in an app that wasn't in a namespace that allows
> exec) while allowing apps that require it to work; that sounds better
> than making all applications that haven't taken the pain of adding the
> new flag to me.
> Well, I guess an app that did require exec without setting the flag will
> fail in a weird place instead of failing at memfd_create and having a
> chance to fallback, so it's not like it doesn't make any sense;
> I don't have such strong feelings about this if the sysctl works, but
> for my use case I'm more likely to want to take a chance at memfd_create
> not needing exec than having the flag set. Perhaps a third value if I
> cared enough...
 

-- 
Dominique Martinet | Asmadeus


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v8 3/5] mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC
  2023-06-28 11:42   ` [PATCH v8 3/5] mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC Dominique Martinet
  2023-06-28 19:31     ` Dominique Martinet
@ 2023-06-29  4:13     ` Jeff Xu
  1 sibling, 0 replies; 6+ messages in thread
From: Jeff Xu @ 2023-06-29  4:13 UTC (permalink / raw)
  To: Dominique Martinet
  Cc: skhan, keescook, akpm, dmitry.torokhov, dverkamp, hughd, jeffxu,
	jorgelo, linux-kernel, linux-kselftest, linux-mm, jannh,
	linux-hardening, linux-security-module, kernel test robot

Hello.

Thank you for your email and interested in using memfd_noexec !

On Wed, Jun 28, 2023 at 4:43 AM Dominique Martinet
<asmadeus@codewreck.org> wrote:
>
> jeffxu@chromium.org wrote on Thu, Dec 15, 2022 at 12:12:03AM +0000:
> > From: Jeff Xu <jeffxu@google.com>
> >
> > The new MFD_NOEXEC_SEAL and MFD_EXEC flags allows application to
> > set executable bit at creation time (memfd_create).
> >
> > When MFD_NOEXEC_SEAL is set, memfd is created without executable bit
> > (mode:0666), and sealed with F_SEAL_EXEC, so it can't be chmod to
> > be executable (mode: 0777) after creation.
> >
> > when MFD_EXEC flag is set, memfd is created with executable bit
> > (mode:0777), this is the same as the old behavior of memfd_create.
> >
> > The new pid namespaced sysctl vm.memfd_noexec has 3 values:
> > 0: memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL acts like
> >         MFD_EXEC was set.
> > 1: memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL acts like
> >         MFD_NOEXEC_SEAL was set.
> > 2: memfd_create() without MFD_NOEXEC_SEAL will be rejected.
>
> So, erm, I'm a bit late to the party but I was just looking at a way of
> blocking memfd_create+exec in a container and this sounded perfect: my
> reading is that this is a security feature meant to be set for
> container's namespaces that'd totally disable something like
> memfd_create followed by fexecve (because we don't want weird binaries
> coming from who knows where to be executed on a shiny secure system),
> but. . . is this actually supposed to work?
> (see below)
>
> > [...]
> > --- a/mm/memfd.c
> > +++ b/mm/memfd.c
> > @@ -263,12 +264,14 @@ long memfd_fcntl(struct file *file, unsigned int cmd, unsigned long arg)
> >  #define MFD_NAME_PREFIX_LEN (sizeof(MFD_NAME_PREFIX) - 1)
> >  #define MFD_NAME_MAX_LEN (NAME_MAX - MFD_NAME_PREFIX_LEN)
> >
> > -#define MFD_ALL_FLAGS (MFD_CLOEXEC | MFD_ALLOW_SEALING | MFD_HUGETLB)
> > +#define MFD_ALL_FLAGS (MFD_CLOEXEC | MFD_ALLOW_SEALING | MFD_HUGETLB | MFD_NOEXEC_SEAL | MFD_EXEC)
> >
> >  SYSCALL_DEFINE2(memfd_create,
> >               const char __user *, uname,
> >               unsigned int, flags)
> >  {
> > +     char comm[TASK_COMM_LEN];
> > +     struct pid_namespace *ns;
> >       unsigned int *file_seals;
> >       struct file *file;
> >       int fd, error;
> > @@ -285,6 +288,39 @@ SYSCALL_DEFINE2(memfd_create,
> >                       return -EINVAL;
> >       }
> >
> > +     /* Invalid if both EXEC and NOEXEC_SEAL are set.*/
> > +     if ((flags & MFD_EXEC) && (flags & MFD_NOEXEC_SEAL))
> > +             return -EINVAL;
> > +
> > +     if (!(flags & (MFD_EXEC | MFD_NOEXEC_SEAL))) {
> > +             [code that checks the sysctl]
>
> If flags already has either MFD_EXEC or MFD_NOEXEC_SEAL, you don't check
> the sysctl at all.
>
> This can be verified easily:
> -----
> $ cat > memfd_exec.c <<'EOF'
> #define _GNU_SOURCE
>
> #include <errno.h>
> #include <stdio.h>
> #include <sys/mman.h>
> #include <sys/types.h>
> #include <sys/wait.h>
>
> #ifndef MFD_EXEC
> #define MFD_EXEC                0x0010U
> #endif
>
> int main() {
>     int fd = memfd_create("script", MFD_EXEC);
>     if (fd == -1)l
>         perror("memfd");
>
>     char prog[] = "#!/bin/sh\necho Ran script\n";
>     if (write(fd, prog, sizeof(prog)-1) != sizeof(prog)-1)
>             perror("write");
>
>     char *const argv[] = { "script", NULL };
>     char *const envp[] = { NULL };
>     fexecve(fd, argv, envp);
>     perror("fexecve");
> }
> EOF
> $ gcc -o memfd_exec memfd_exec.c
> $ ./memfd_exec
> Ran script
> $ sysctl vm.memfd_noexec
> vm.memfd_noexec = 2
> -----
> (as opposed to failing hard on memfd_create if flag unset on sysctl=2,
> and failing on fexecve with flag unset and sysctl=1)
>
> What am I missing?
>
>
At one point, I was thinking of having a security hook to block
executable memfd [1], so this sysctl only works for the application
that doesn't set EXEC bit. Now I think it makes sense to use
vm.memfd_noexec = 2 to block the MFD_EXEC also.
Anyway the commit msg says:
2: memfd_create() without MFD_NOEXEC_SEAL will be rejected.
Not doing that is a bug. I will send a fix for that.

[1] https://lore.kernel.org/lkml/20221206150233.1963717-7-jeffxu@google.com/

>
> BTW I find the current behaviour rather hard to use: setting this to 2
> should still set NOEXEC by default in my opinion, just refuse anything
> that explicitly requested EXEC.
>
At one point [2] (v2 of patch) there were two sysctls, one is doing
overwrite, one is enforcing, later I decided with one sysctl, the
rationale is the kernel will eventually get out of the business of
overwriting user space code.  Yes. It might take a long time to
migrate all of the userspace.

In the meantime,  to meet what you want, the solution is keep
vm.memfd_noexec = 1 (for overwrite), and a new security policy
(SELInux or Landlock) that uses security hook security_memfd_create,
this can block one process from creating executable memfd. Indeed,
security policy is better fit to cases like this than sysctl.

[2] https://lore.kernel.org/linux-mm/CABi2SkWGo9Jrd=i1e2PoDWYGenGhR=pG=yGsQP5VLmizTmg-iA@mail.gmail.com/

> Sure there's a warn_once that memfd_create was used without seal, but
> right now on my system it's "used up" 5 seconds after boot by systemd:
> [    5.854378] memfd_create() without MFD_EXEC nor MFD_NOEXEC_SEAL, pid=1 'systemd'
>
> And anyway, older kernels will barf up EINVAL when calling memfd_create
> with MFD_NOEXEC_SEAL, so even if userspace will want to adapt they'll
> need to try calling memfd_create with the flag once and retry on EINVAL,
> which let's face it is going to take a while to happen.
> (Also, the flag has been added to glibc, but not in any release yet)
>
Yes. Application will need to do some detection of the kernel. This is
not avoidable.

> Making calls default to noexec AND refuse exec does what you want
> (forbid use of exec in an app that wasn't in a namespace that allows
> exec) while allowing apps that require it to work; that sounds better
> than making all applications that haven't taken the pain of adding the
> new flag to me.
> Well, I guess an app that did require exec without setting the flag will
> fail in a weird place instead of failing at memfd_create and having a
> chance to fallback, so it's not like it doesn't make any sense;
> I don't have such strong feelings about this if the sysctl works, but
> for my use case I'm more likely to want to take a chance at memfd_create
> not needing exec than having the flag set. Perhaps a third value if I
> cared enough...
>
> --
> Dominique Martinet | Asmadeus

Thanks
-Jeff


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v8 3/5] mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC
  2023-06-28 19:31     ` Dominique Martinet
@ 2023-06-29  4:33       ` Jeff Xu
  2023-06-29 10:29         ` Dominique Martinet
  0 siblings, 1 reply; 6+ messages in thread
From: Jeff Xu @ 2023-06-29  4:33 UTC (permalink / raw)
  To: Dominique Martinet
  Cc: skhan, keescook, akpm, dmitry.torokhov, dverkamp, hughd, jeffxu,
	jorgelo, linux-kernel, linux-kselftest, linux-mm, jannh,
	linux-hardening, linux-security-module, kernel test robot

Hello!

On Wed, Jun 28, 2023 at 12:31 PM Dominique Martinet
<asmadeus@codewreck.org> wrote:
>
> Dominique Martinet wrote on Wed, Jun 28, 2023 at 08:42:41PM +0900:
> > If flags already has either MFD_EXEC or MFD_NOEXEC_SEAL, you don't check
> > the sysctl at all.
> > [...repro snipped..]
> >
> > What am I missing?
>
> (Perhaps the intent is just to force people to use the flag so it is
> easier to check for memfd_create in seccomp or other LSM?
> But I don't see why such a check couldn't consider the absence of a flag
> as well, so I don't see the point.)
>
Yes. There is consideration to motivate app devs to migrate their code
to use the new EXEC/NOEXEC_SEAL flag for memfd_create, if that answers
your question.

>
> > BTW I find the current behaviour rather hard to use: setting this to 2
> > should still set NOEXEC by default in my opinion, just refuse anything
> > that explicitly requested EXEC.
>
> And I just noticed it's not possible to lower the value despite having
> CAP_SYS_ADMIN: what the heck?! I have never seen such a sysctl and it
> just forced me to reboot because I willy-nilly tested in the init pid
> namespace, and quite a few applications that don't require exec broke
> exactly as I described below.
>
> If the user has CAP_SYS_ADMIN there are more container escape methods
> than I can count, this is basically free pass to root on main namespace
> anyway, you're not protecting anything. Please let people set the sysctl
> to what they want.
>
Yama has a similar setting,  for example, 3 (YAMA_SCOPE_NO_ATTACH)
will not allow downgrading at runtime.

Since this is a security feature, not allowing downgrading at run time
is part of the security consideration. I hope you understand.

> --
> Dominique Martinet | Asmadeus

Thanks!
-Jeff


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v8 3/5] mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC
  2023-06-29  4:33       ` Jeff Xu
@ 2023-06-29 10:29         ` Dominique Martinet
  2023-06-29 21:04           ` Jeff Xu
  0 siblings, 1 reply; 6+ messages in thread
From: Dominique Martinet @ 2023-06-29 10:29 UTC (permalink / raw)
  To: Jeff Xu
  Cc: skhan, keescook, akpm, dmitry.torokhov, dverkamp, hughd, jeffxu,
	jorgelo, linux-kernel, linux-kselftest, linux-mm, jannh,
	linux-hardening, linux-security-module, kernel test robot

Jeff Xu wrote on Wed, Jun 28, 2023 at 09:33:27PM -0700:
> > > BTW I find the current behaviour rather hard to use: setting this to 2
> > > should still set NOEXEC by default in my opinion, just refuse anything
> > > that explicitly requested EXEC.
> >
> > And I just noticed it's not possible to lower the value despite having
> > CAP_SYS_ADMIN: what the heck?! I have never seen such a sysctl and it
> > just forced me to reboot because I willy-nilly tested in the init pid
> > namespace, and quite a few applications that don't require exec broke
> > exactly as I described below.
> >
> > If the user has CAP_SYS_ADMIN there are more container escape methods
> > than I can count, this is basically free pass to root on main namespace
> > anyway, you're not protecting anything. Please let people set the sysctl
> > to what they want.
>
> Yama has a similar setting,  for example, 3 (YAMA_SCOPE_NO_ATTACH)
> will not allow downgrading at runtime.
> 
> Since this is a security feature, not allowing downgrading at run time
> is part of the security consideration. I hope you understand.

I didn't remember yama had this stuck bit; that still strikes me as
unusual, and if you require a custom LSM rule for memfd anyway I don't
see why it couldn't enforce that the sysctl is unchanged, but sure.

Please, though:
 - I have a hard time thinking of 1 as a security flag in general (even
if I do agree a sloppy LSM rule could require it); I would only lock 2
 - please make it clear, I don't see any entry in the sysctl
documentation[1] about memfd_noexec, there should be one and you can
copy the wording from yama's doc[2]: "Once set, this sysctl value cannot
be changed"
[1] Documentation/admin-guide/sysctl/vm.rst
[2] Documentation/admin-guide/LSM/Yama.rst


Either way as it stands I still don't think one can expect most
userspace applications to be converted until some libc wrapper takes
care of the retry logic and a couple of years, so I'll go look for
another way of filtering this (and eventually setting this to 1) as you
suggested.
I'll leave the follow-up up to you and won't bother you more.

Thanks,
-- 
Dominique Martinet | Asmadeus


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v8 3/5] mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC
  2023-06-29 10:29         ` Dominique Martinet
@ 2023-06-29 21:04           ` Jeff Xu
  0 siblings, 0 replies; 6+ messages in thread
From: Jeff Xu @ 2023-06-29 21:04 UTC (permalink / raw)
  To: Dominique Martinet
  Cc: Jeff Xu, skhan, keescook, akpm, dmitry.torokhov, dverkamp, hughd,
	jorgelo, linux-kernel, linux-kselftest, linux-mm, jannh,
	linux-hardening, linux-security-module, kernel test robot

Hello

On Thu, Jun 29, 2023 at 3:30 AM Dominique Martinet
<asmadeus@codewreck.org> wrote:
>
> Jeff Xu wrote on Wed, Jun 28, 2023 at 09:33:27PM -0700:
> > > > BTW I find the current behaviour rather hard to use: setting this to 2
> > > > should still set NOEXEC by default in my opinion, just refuse anything
> > > > that explicitly requested EXEC.
> > >
> > > And I just noticed it's not possible to lower the value despite having
> > > CAP_SYS_ADMIN: what the heck?! I have never seen such a sysctl and it
> > > just forced me to reboot because I willy-nilly tested in the init pid
> > > namespace, and quite a few applications that don't require exec broke
> > > exactly as I described below.
> > >
> > > If the user has CAP_SYS_ADMIN there are more container escape methods
> > > than I can count, this is basically free pass to root on main namespace
> > > anyway, you're not protecting anything. Please let people set the sysctl
> > > to what they want.
> >
> > Yama has a similar setting,  for example, 3 (YAMA_SCOPE_NO_ATTACH)
> > will not allow downgrading at runtime.
> >
> > Since this is a security feature, not allowing downgrading at run time
> > is part of the security consideration. I hope you understand.
>
> I didn't remember yama had this stuck bit; that still strikes me as
> unusual, and if you require a custom LSM rule for memfd anyway I don't
> see why it couldn't enforce that the sysctl is unchanged, but sure.
>
> Please, though:
>  - I have a hard time thinking of 1 as a security flag in general (even
> if I do agree a sloppy LSM rule could require it); I would only lock 2
>  - please make it clear, I don't see any entry in the sysctl
> documentation[1] about memfd_noexec, there should be one and you can
> copy the wording from yama's doc[2]: "Once set, this sysctl value cannot
> be changed"
> [1] Documentation/admin-guide/sysctl/vm.rst
> [2] Documentation/admin-guide/LSM/Yama.rst
>
Thanks for the suggestion.
Yes, it would be good to have some documentation.
I will send patch to update Documentation/admin-guide/sysctl/vm.rst

>
> Either way as it stands I still don't think one can expect most
> userspace applications to be converted until some libc wrapper takes
> care of the retry logic and a couple of years, so I'll go look for
> another way of filtering this (and eventually setting this to 1) as you
> suggested.
> I'll leave the follow-up up to you and won't bother you more.
>
Not bothered at all! and thanks for reporting the bug to improve the
quality of memfd_noexec !

Much appreciated.
-Jeff


> Thanks,
> --
> Dominique Martinet | Asmadeus


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2023-06-29 21:05 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20221215001205.51969-1-jeffxu@google.com>
     [not found] ` <20221215001205.51969-4-jeffxu@google.com>
2023-06-28 11:42   ` [PATCH v8 3/5] mm/memfd: add MFD_NOEXEC_SEAL and MFD_EXEC Dominique Martinet
2023-06-28 19:31     ` Dominique Martinet
2023-06-29  4:33       ` Jeff Xu
2023-06-29 10:29         ` Dominique Martinet
2023-06-29 21:04           ` Jeff Xu
2023-06-29  4:13     ` Jeff Xu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).