* [LTP] [bug?] clone(CLONE_IO) failing after kernel commit commit ef2c41cf38a7 [not found] <100149681.11244932.1588661282331.JavaMail.zimbra@redhat.com> @ 2020-05-05 7:28 ` Jan Stancek 2020-05-05 7:49 ` Florian Weimer 2020-05-05 7:54 ` Andreas Schwab 0 siblings, 2 replies; 19+ messages in thread From: Jan Stancek @ 2020-05-05 7:28 UTC (permalink / raw) To: ltp Hi, I'm seeing an issue with CLONE_IO and libc' clone() on ppc64le, where flags parameter appears to be sign extended before it's passed to kernel syscall. This is an issue since kernel commit ef2c41cf38a7 ("clone3: allow spawning processes into cgroups"), because there's now a check for flag that userspace didn't intend to pass: static int cgroup_css_set_fork(struct kernel_clone_args *kargs) __acquires(&cgroup_mutex) __acquires(&cgroup_threadgroup_rwsem) if (!(kargs->flags & CLONE_INTO_CGROUP)) { // CLONE_INTO_CGROUP == 0x200000000ULL kargs->cset = cset; return 0; } f = fget_raw(kargs->cgroup); if (!f) { ret = -EBADF; goto err; } Reproducer: #define _GNU_SOURCE #include <sched.h> #include <stdio.h> #include <sys/wait.h> char stack[2*1024*1024]; static int do_child(void *arg) { printf("hello"); return 0; } int main(void) { clone(do_child, stack+1024*1024, CLONE_IO|SIGCHLD, NULL, NULL, NULL, NULL); return 0; } reliably hits EBADF with glibc-2.31.9000-12.fc33.ppc64le and 5.7.0-0.rc2 kernel: clone(child_stack=0x1011ffe0, flags=CLONE_IO|0xffffffff00000000|SIGCHLD) = -1 EBADF (Bad file descriptor) Regards, Jan ^ permalink raw reply [flat|nested] 19+ messages in thread
* [LTP] [bug?] clone(CLONE_IO) failing after kernel commit commit ef2c41cf38a7 2020-05-05 7:28 ` [LTP] [bug?] clone(CLONE_IO) failing after kernel commit commit ef2c41cf38a7 Jan Stancek @ 2020-05-05 7:49 ` Florian Weimer 2020-05-05 7:59 ` Christian Brauner 2020-05-05 8:32 ` Christian Brauner 2020-05-05 7:54 ` Andreas Schwab 1 sibling, 2 replies; 19+ messages in thread From: Florian Weimer @ 2020-05-05 7:49 UTC (permalink / raw) To: ltp * Jan Stancek via Libc-alpha: > I'm seeing an issue with CLONE_IO and libc' clone() on ppc64le, > where flags parameter appears to be sign extended before it's passed > to kernel syscall. Interesting, thanks for reporting this. The manual page clearly documents the interface as; int clone(int (*fn)(void *), void *child_stack, int flags, void *arg, ... /* pid_t *ptid, void *newtls, pid_t *ctid */ ); But the kernel uses unsigned long for clone_flags. This looks like an unintended userspace ABI breakage. Rather than dropping the invalid flags check in the kernel (having the check is valuable), I think the parameter should be changed to int or unsigned int, or the flags check should be written in such a way that it disregards bits that result from sign extensions: fail if clone_flags != (int) clone_flags, otherwise set clone_flags = 0xFFFFFFFF. ^ permalink raw reply [flat|nested] 19+ messages in thread
* [LTP] [bug?] clone(CLONE_IO) failing after kernel commit commit ef2c41cf38a7 2020-05-05 7:49 ` Florian Weimer @ 2020-05-05 7:59 ` Christian Brauner 2020-05-05 8:02 ` Christian Brauner 2020-05-05 8:32 ` Christian Brauner 1 sibling, 1 reply; 19+ messages in thread From: Christian Brauner @ 2020-05-05 7:59 UTC (permalink / raw) To: ltp On Tue, May 05, 2020 at 09:49:50AM +0200, Florian Weimer wrote: > * Jan Stancek via Libc-alpha: > > > I'm seeing an issue with CLONE_IO and libc' clone() on ppc64le, > > where flags parameter appears to be sign extended before it's passed > > to kernel syscall. > > Interesting, thanks for reporting this. The manual page clearly > documents the interface as; > > int clone(int (*fn)(void *), void *child_stack, > int flags, void *arg, ... > /* pid_t *ptid, void *newtls, pid_t *ctid */ ); > > But the kernel uses unsigned long for clone_flags. This looks like an > unintended userspace ABI breakage. > > Rather than dropping the invalid flags check in the kernel (having the > check is valuable), I think the parameter should be changed to int or > unsigned int, or the flags check should be written in such a way that > it disregards bits that result from sign extensions: fail if > clone_flags != (int) clone_flags, otherwise set clone_flags = 0xFFFFFFFF. Let me take a quick look. This just affects legacy clone it seems. Maybe we can do something to avoid that breakage without having to change extensions. Legacy clone shouldn't care about invalid flag checking since it never used to do that anyway, only clone3() does. Christian ^ permalink raw reply [flat|nested] 19+ messages in thread
* [LTP] [bug?] clone(CLONE_IO) failing after kernel commit commit ef2c41cf38a7 2020-05-05 7:59 ` Christian Brauner @ 2020-05-05 8:02 ` Christian Brauner 0 siblings, 0 replies; 19+ messages in thread From: Christian Brauner @ 2020-05-05 8:02 UTC (permalink / raw) To: ltp On Tue, May 05, 2020 at 09:59:38AM +0200, Christian Brauner wrote: > On Tue, May 05, 2020 at 09:49:50AM +0200, Florian Weimer wrote: > > * Jan Stancek via Libc-alpha: > > > > > I'm seeing an issue with CLONE_IO and libc' clone() on ppc64le, > > > where flags parameter appears to be sign extended before it's passed > > > to kernel syscall. > > > > Interesting, thanks for reporting this. The manual page clearly > > documents the interface as; > > > > int clone(int (*fn)(void *), void *child_stack, > > int flags, void *arg, ... > > /* pid_t *ptid, void *newtls, pid_t *ctid */ ); > > > > But the kernel uses unsigned long for clone_flags. This looks like an > > unintended userspace ABI breakage. > > > > Rather than dropping the invalid flags check in the kernel (having the > > check is valuable), I think the parameter should be changed to int or > > unsigned int, or the flags check should be written in such a way that > > it disregards bits that result from sign extensions: fail if > > clone_flags != (int) clone_flags, otherwise set clone_flags = 0xFFFFFFFF. > > Let me take a quick look. This just affects legacy clone it seems. Maybe > we can do something to avoid that breakage without having to change > extensions. Legacy clone shouldn't care about invalid flag checking > since it never used to do that anyway, only clone3() does. Oh I see. Ugh. Christian ^ permalink raw reply [flat|nested] 19+ messages in thread
* [LTP] [bug?] clone(CLONE_IO) failing after kernel commit commit ef2c41cf38a7 2020-05-05 7:49 ` Florian Weimer 2020-05-05 7:59 ` Christian Brauner @ 2020-05-05 8:32 ` Christian Brauner 2020-05-05 8:58 ` Jan Stancek 2020-05-05 9:05 ` Florian Weimer 1 sibling, 2 replies; 19+ messages in thread From: Christian Brauner @ 2020-05-05 8:32 UTC (permalink / raw) To: ltp On Tue, May 05, 2020 at 09:49:50AM +0200, Florian Weimer wrote: > * Jan Stancek via Libc-alpha: > > > I'm seeing an issue with CLONE_IO and libc' clone() on ppc64le, > > where flags parameter appears to be sign extended before it's passed > > to kernel syscall. > > Interesting, thanks for reporting this. The manual page clearly > documents the interface as; > > int clone(int (*fn)(void *), void *child_stack, > int flags, void *arg, ... > /* pid_t *ptid, void *newtls, pid_t *ctid */ ); > > But the kernel uses unsigned long for clone_flags. This looks like an > unintended userspace ABI breakage. > > Rather than dropping the invalid flags check in the kernel (having the > check is valuable), I think the parameter should be changed to int or > unsigned int, or the flags check should be written in such a way that > it disregards bits that result from sign extensions: fail if > clone_flags != (int) clone_flags, otherwise set clone_flags = 0xFFFFFFFF. Hm, as you observed, the kernel always defines the flags argument as unsigned long and afaict this has been the case since forever. So I'm not sure why the userspace wrapper is defined as taking an int for flags in the first place(?). Be that as it may, the current legacy clone syscall has been switched over a while ago to do: { struct kernel_clone_args args = { .flags = (clone_flags & ~CSIGNAL), .pidfd = parent_tidptr, .child_tid = child_tidptr, .parent_tid = parent_tidptr, .exit_signal = (clone_flags & CSIGNAL), .stack = newsp, .tls = tls, }; if (!legacy_clone_args_valid(&args)) return -EINVAL; return _do_fork(&args); } where legacy_clone_args_valid() does: bool legacy_clone_args_valid(const struct kernel_clone_args *kargs) { /* clone(CLONE_PIDFD) uses parent_tidptr to return a pidfd */ if ((kargs->flags & CLONE_PIDFD) && (kargs->flags & CLONE_PARENT_SETTID)) return false; return true; } And I wonder if we should just do: if (clone_flags & ~CLONE_LEGACY_FLAGS) /* return -EINVAL; but that might be a change in behavior, as legacy clone _never_ failed when invalid values where passed. (Note, that CLONE_LEGACY_FLAGS is already defined as #define CLONE_LEGACY_FLAGS 0xffffffffULL and used in clone3().) So the better option might be to do what you suggested, Florian: if (clone_flags & ~CLONE_LEGACY_FLAGS) clone_flags = CLONE_LEGACY_FLAGS? and move on? (Btw, iiuc, this would've always been a problem, right? In the sense that userspace only got away with this because there were no additional flags defined and couldn't.) Christian ^ permalink raw reply [flat|nested] 19+ messages in thread
* [LTP] [bug?] clone(CLONE_IO) failing after kernel commit commit ef2c41cf38a7 2020-05-05 8:32 ` Christian Brauner @ 2020-05-05 8:58 ` Jan Stancek 2020-05-05 9:05 ` Florian Weimer 1 sibling, 0 replies; 19+ messages in thread From: Jan Stancek @ 2020-05-05 8:58 UTC (permalink / raw) To: ltp ----- Original Message ----- > (Btw, iiuc, this would've always been a problem, right? In the sense that > userspace only got away with this because there were no additional flags > defined and couldn't.) I think so, 4.20 behaves same way, but kernel ignores any extra flags: clone(child_stack=0x1011ffe0, flags=CLONE_IO|0xffffffff00000000|SIGCHLD) = 1061 ^ permalink raw reply [flat|nested] 19+ messages in thread
* [LTP] [bug?] clone(CLONE_IO) failing after kernel commit commit ef2c41cf38a7 2020-05-05 8:32 ` Christian Brauner 2020-05-05 8:58 ` Jan Stancek @ 2020-05-05 9:05 ` Florian Weimer 2020-05-05 9:15 ` Christian Brauner 1 sibling, 1 reply; 19+ messages in thread From: Florian Weimer @ 2020-05-05 9:05 UTC (permalink / raw) To: ltp * Christian Brauner: > Hm, as you observed, the kernel always defines the flags argument as > unsigned long and afaict this has been the case since forever. So I'm > not sure why the userspace wrapper is defined as taking an int for > flags in the first place(?). We have different types in many places. Sometimes this is required by POSIX, sometimes not. See the recent effort to fix the x32 interface (mostly for size_t arguments). Flags arguments that depend on the word size are incompatible with portable system calls, so arguably what the kernel is doing here is wrong: the additional bits can never be used for anything. > And I wonder if we should just do: > > if (clone_flags & ~CLONE_LEGACY_FLAGS) /* > return -EINVAL; > > but that might be a change in behavior, as legacy clone _never_ failed > when invalid values where passed. Have any flags been added recently? > (Note, that CLONE_LEGACY_FLAGS is already defined as > #define CLONE_LEGACY_FLAGS 0xffffffffULL > and used in clone3().) > > So the better option might be to do what you suggested, Florian: > if (clone_flags & ~CLONE_LEGACY_FLAGS) > clone_flags = CLONE_LEGACY_FLAGS? > and move on? Not sure what you are suggesting here. Do you mean an unconditional masking of excess bits? clone_flags &= CLONE_LEGACY_FLAGS; I think I would prefer this: /* Userspace may have passed a sign-extended int value. */ if (clone_flags != (int) clone_flags) /* return -EINVAL; clone_flags = (unsigned) clone_flags; > (Btw, iiuc, this would've always been a problem, right? In the sense that > userspace only got away with this because there were no additional flags > defined and couldn't.) It depends on how the clone system call wrapper is implemented, and what the C function call ABI is like. In some cases, you probably get zero-extension, as expected by the kernel. ^ permalink raw reply [flat|nested] 19+ messages in thread
* [LTP] [bug?] clone(CLONE_IO) failing after kernel commit commit ef2c41cf38a7 2020-05-05 9:05 ` Florian Weimer @ 2020-05-05 9:15 ` Christian Brauner 2020-05-05 9:36 ` Florian Weimer 0 siblings, 1 reply; 19+ messages in thread From: Christian Brauner @ 2020-05-05 9:15 UTC (permalink / raw) To: ltp On Tue, May 05, 2020 at 11:05:37AM +0200, Florian Weimer wrote: > * Christian Brauner: > > > Hm, as you observed, the kernel always defines the flags argument as > > unsigned long and afaict this has been the case since forever. So I'm > > not sure why the userspace wrapper is defined as taking an int for > > flags in the first place(?). > > We have different types in many places. Sometimes this is required by > POSIX, sometimes not. See the recent effort to fix the x32 interface > (mostly for size_t arguments). > > Flags arguments that depend on the word size are incompatible with > portable system calls, so arguably what the kernel is doing here is > wrong: the additional bits can never be used for anything. I mean, the unsigned long is odd. Most syscalls are using unsigned int and new ones are basically expected to. > > > And I wonder if we should just do: > > > > if (clone_flags & ~CLONE_LEGACY_FLAGS) /* > > return -EINVAL; > > > > but that might be a change in behavior, as legacy clone _never_ failed > > when invalid values where passed. > > Have any flags been added recently? /* Flags for the clone3() syscall. */ #define CLONE_CLEAR_SIGHAND 0x100000000ULL /* Clear any signal handler and reset to SIG_DFL. */ #define CLONE_INTO_CGROUP 0x200000000ULL /* Clone into a specific cgroup given the right permissions. */ > > > (Note, that CLONE_LEGACY_FLAGS is already defined as > > #define CLONE_LEGACY_FLAGS 0xffffffffULL > > and used in clone3().) > > > > So the better option might be to do what you suggested, Florian: > > if (clone_flags & ~CLONE_LEGACY_FLAGS) > > clone_flags = CLONE_LEGACY_FLAGS? > > and move on? > > Not sure what you are suggesting here. Do you mean an unconditional > masking of excess bits? > > clone_flags &= CLONE_LEGACY_FLAGS; > > I think I would prefer this: > > /* Userspace may have passed a sign-extended int value. */ > if (clone_flags != (int) clone_flags) /* > return -EINVAL; > clone_flags = (unsigned) clone_flags; My worry is that this will cause regressions because clone() has never failed on invalid flag values. I was looking for a way to not have this problem. But given what you say below this change might be ok/worth risking? > > > (Btw, iiuc, this would've always been a problem, right? In the sense that > > userspace only got away with this because there were no additional flags > > defined and couldn't.) > > It depends on how the clone system call wrapper is implemented, and > what the C function call ABI is like. In some cases, you probably get > zero-extension, as expected by the kernel. ^ permalink raw reply [flat|nested] 19+ messages in thread
* [LTP] [bug?] clone(CLONE_IO) failing after kernel commit commit ef2c41cf38a7 2020-05-05 9:15 ` Christian Brauner @ 2020-05-05 9:36 ` Florian Weimer 2020-05-05 9:58 ` Christian Brauner 0 siblings, 1 reply; 19+ messages in thread From: Florian Weimer @ 2020-05-05 9:36 UTC (permalink / raw) To: ltp * Christian Brauner: >> Have any flags been added recently? > > /* Flags for the clone3() syscall. */ > #define CLONE_CLEAR_SIGHAND 0x100000000ULL /* Clear any signal handler and reset to SIG_DFL. */ > #define CLONE_INTO_CGROUP 0x200000000ULL /* Clone into a specific cgroup given the right permissions. */ Are those flags expected to be compatible with the legacy clone interface on 64-bit architectures? >> > (Note, that CLONE_LEGACY_FLAGS is already defined as >> > #define CLONE_LEGACY_FLAGS 0xffffffffULL >> > and used in clone3().) >> > >> > So the better option might be to do what you suggested, Florian: >> > if (clone_flags & ~CLONE_LEGACY_FLAGS) >> > clone_flags = CLONE_LEGACY_FLAGS? >> > and move on? >> >> Not sure what you are suggesting here. Do you mean an unconditional >> masking of excess bits? >> >> clone_flags &= CLONE_LEGACY_FLAGS; >> >> I think I would prefer this: >> >> /* Userspace may have passed a sign-extended int value. */ >> if (clone_flags != (int) clone_flags) /* >> return -EINVAL; >> clone_flags = (unsigned) clone_flags; > > My worry is that this will cause regressions because clone() has never > failed on invalid flag values. I was looking for a way to not have this > problem. But given what you say below this change might be ok/worth > risking? I was under the impression that current kernels perform such a check, causing the problem with sign extension. ^ permalink raw reply [flat|nested] 19+ messages in thread
* [LTP] [bug?] clone(CLONE_IO) failing after kernel commit commit ef2c41cf38a7 2020-05-05 9:36 ` Florian Weimer @ 2020-05-05 9:58 ` Christian Brauner 2020-05-05 10:21 ` Christian Brauner 2020-05-05 11:08 ` Florian Weimer 0 siblings, 2 replies; 19+ messages in thread From: Christian Brauner @ 2020-05-05 9:58 UTC (permalink / raw) To: ltp On Tue, May 05, 2020 at 11:36:36AM +0200, Florian Weimer wrote: > * Christian Brauner: > >> Have any flags been added recently? > > > > /* Flags for the clone3() syscall. */ > > #define CLONE_CLEAR_SIGHAND 0x100000000ULL /* Clear any signal handler and reset to SIG_DFL. */ > > #define CLONE_INTO_CGROUP 0x200000000ULL /* Clone into a specific cgroup given the right permissions. */ > > Are those flags expected to be compatible with the legacy clone > interface on 64-bit architectures? No, they are clone3() only. clone() is deprecated wrt to new features. > > >> > (Note, that CLONE_LEGACY_FLAGS is already defined as > >> > #define CLONE_LEGACY_FLAGS 0xffffffffULL > >> > and used in clone3().) > >> > > >> > So the better option might be to do what you suggested, Florian: > >> > if (clone_flags & ~CLONE_LEGACY_FLAGS) > >> > clone_flags = CLONE_LEGACY_FLAGS? > >> > and move on? > >> > >> Not sure what you are suggesting here. Do you mean an unconditional > >> masking of excess bits? > >> > >> clone_flags &= CLONE_LEGACY_FLAGS; > >> > >> I think I would prefer this: > >> > >> /* Userspace may have passed a sign-extended int value. */ > >> if (clone_flags != (int) clone_flags) /* > >> return -EINVAL; > >> clone_flags = (unsigned) clone_flags; > > > > My worry is that this will cause regressions because clone() has never > > failed on invalid flag values. I was looking for a way to not have this > > problem. But given what you say below this change might be ok/worth > > risking? > > I was under the impression that current kernels perform such a check, > causing the problem with sign extension. No, it doesn't, it never did. It only does it for clone3(). Legacy clone() _never_ reported an error no matter if you passed garbage flags or not. That's why we can't re-use clone() flags that have essentially been removed in kernel version before I could even program. :) Unless I'm misunderstanding what check you're referring to. If I understood the original mail correctly, then the issue is caused by an interaction with sign extension and a the new flag value CLONE_INTO_CGROUP being defined. So from what I gather from Jan's initial mail is that when clone() is called on ppc64le with the CLONE_IO|SIGCHLD flag: clone(do_child, stack+1024*1024, CLONE_IO|SIGCHLD, NULL, NULL, NULL, NULL); that the sign extension causes bits to be set that raise the CLONE_INTO_CGROUP flag. And since the do_fork() codepath is the same for legacy clone() and clone3() the kernel will think that someone requested CLONE_INTO_CGROUP but hasn't passed a valid fd to a cgroup. If that is the only issue here then couldn't we just do: clone_flags &= ~CLONE3_ONLY_FLAGS? and move on, i.e. all future clone3() flags we'll just remove since we can assume that they have been accidently set. Even if they have been intentionally set we can just ignore them since that's in line with legacy clone()'s (questionable) tradition of ignoring unknown flags. Thoughts? Or am I missing some subtlety here? Christian ^ permalink raw reply [flat|nested] 19+ messages in thread
* [LTP] [bug?] clone(CLONE_IO) failing after kernel commit commit ef2c41cf38a7 2020-05-05 9:58 ` Christian Brauner @ 2020-05-05 10:21 ` Christian Brauner 2020-05-05 11:34 ` Florian Weimer 2020-05-05 11:35 ` Dmitry V. Levin 2020-05-05 11:08 ` Florian Weimer 1 sibling, 2 replies; 19+ messages in thread From: Christian Brauner @ 2020-05-05 10:21 UTC (permalink / raw) To: ltp On Tue, May 05, 2020 at 11:58:13AM +0200, Christian Brauner wrote: > On Tue, May 05, 2020 at 11:36:36AM +0200, Florian Weimer wrote: > > * Christian Brauner: > > >> Have any flags been added recently? > > > > > > /* Flags for the clone3() syscall. */ > > > #define CLONE_CLEAR_SIGHAND 0x100000000ULL /* Clear any signal handler and reset to SIG_DFL. */ > > > #define CLONE_INTO_CGROUP 0x200000000ULL /* Clone into a specific cgroup given the right permissions. */ > > > > Are those flags expected to be compatible with the legacy clone > > interface on 64-bit architectures? > > No, they are clone3() only. clone() is deprecated wrt to new features. > > > > > >> > (Note, that CLONE_LEGACY_FLAGS is already defined as > > >> > #define CLONE_LEGACY_FLAGS 0xffffffffULL > > >> > and used in clone3().) > > >> > > > >> > So the better option might be to do what you suggested, Florian: > > >> > if (clone_flags & ~CLONE_LEGACY_FLAGS) > > >> > clone_flags = CLONE_LEGACY_FLAGS? > > >> > and move on? > > >> > > >> Not sure what you are suggesting here. Do you mean an unconditional > > >> masking of excess bits? > > >> > > >> clone_flags &= CLONE_LEGACY_FLAGS; > > >> > > >> I think I would prefer this: > > >> > > >> /* Userspace may have passed a sign-extended int value. */ > > >> if (clone_flags != (int) clone_flags) /* > > >> return -EINVAL; > > >> clone_flags = (unsigned) clone_flags; > > > > > > My worry is that this will cause regressions because clone() has never > > > failed on invalid flag values. I was looking for a way to not have this > > > problem. But given what you say below this change might be ok/worth > > > risking? > > > > I was under the impression that current kernels perform such a check, > > causing the problem with sign extension. > > No, it doesn't, it never did. It only does it for clone3(). Legacy > clone() _never_ reported an error no matter if you passed garbage flags > or not. That's why we can't re-use clone() flags that have essentially > been removed in kernel version before I could even program. :) Unless > I'm misunderstanding what check you're referring to. > > If I understood the original mail correctly, then the issue is caused by > an interaction with sign extension and a the new flag value > CLONE_INTO_CGROUP being defined. > So from what I gather from Jan's initial mail is that when clone() is > called on ppc64le with the CLONE_IO|SIGCHLD flag: > clone(do_child, stack+1024*1024, CLONE_IO|SIGCHLD, NULL, NULL, NULL, NULL); > that the sign extension causes bits to be set that raise the > CLONE_INTO_CGROUP flag. And since the do_fork() codepath is the same for > legacy clone() and clone3() the kernel will think that someone requested > CLONE_INTO_CGROUP but hasn't passed a valid fd to a cgroup. If that is > the only issue here then couldn't we just do: > > clone_flags &= ~CLONE3_ONLY_FLAGS? > > and move on, i.e. all future clone3() flags we'll just remove since we > can assume that they have been accidently set. Even if they have been > intentionally set we can just ignore them since that's in line with > legacy clone()'s (questionable) tradition of ignoring unknown flags. > Thoughts? Or am I missing some subtlety here? So essentially: diff --git a/kernel/fork.c b/kernel/fork.c index 8c700f881d92..e192089f133e 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -2569,12 +2569,15 @@ SYSCALL_DEFINE5(clone, unsigned long, clone_flags, unsigned long, newsp, unsigned long, tls) #endif { + /* Ignore the upper 32 bits. */ + unsigned int flags = (clone_flags & 0xfffffff); + struct kernel_clone_args args = { - .flags = (clone_flags & ~CSIGNAL), + .flags = (flags & ~CSIGNAL), .pidfd = parent_tidptr, .child_tid = child_tidptr, .parent_tid = parent_tidptr, - .exit_signal = (clone_flags & CSIGNAL), + .exit_signal = (flags & CSIGNAL), .stack = newsp, .tls = tls, } (Note that kernel_clone_args->flags is a 64 bit unsigned integer.) Christian ^ permalink raw reply related [flat|nested] 19+ messages in thread
* [LTP] [bug?] clone(CLONE_IO) failing after kernel commit commit ef2c41cf38a7 2020-05-05 10:21 ` Christian Brauner @ 2020-05-05 11:34 ` Florian Weimer 2020-05-05 11:35 ` Dmitry V. Levin 1 sibling, 0 replies; 19+ messages in thread From: Florian Weimer @ 2020-05-05 11:34 UTC (permalink / raw) To: ltp * Christian Brauner: > diff --git a/kernel/fork.c b/kernel/fork.c > index 8c700f881d92..e192089f133e 100644 > --- a/kernel/fork.c > +++ b/kernel/fork.c > @@ -2569,12 +2569,15 @@ SYSCALL_DEFINE5(clone, unsigned long, clone_flags, unsigned long, newsp, > unsigned long, tls) > #endif > { > + /* Ignore the upper 32 bits. */ > + unsigned int flags = (clone_flags & 0xfffffff); > + > struct kernel_clone_args args = { > - .flags = (clone_flags & ~CSIGNAL), > + .flags = (flags & ~CSIGNAL), > .pidfd = parent_tidptr, > .child_tid = child_tidptr, > .parent_tid = parent_tidptr, > - .exit_signal = (clone_flags & CSIGNAL), > + .exit_signal = (flags & CSIGNAL), > .stack = newsp, > .tls = tls, > } > > (Note that kernel_clone_args->flags is a 64 bit unsigned integer.) This looks reasonable to me, but I have not tested it. I think it will restore the expected no-check behavior for clone flags. ^ permalink raw reply [flat|nested] 19+ messages in thread
* [LTP] [bug?] clone(CLONE_IO) failing after kernel commit commit ef2c41cf38a7 2020-05-05 10:21 ` Christian Brauner 2020-05-05 11:34 ` Florian Weimer @ 2020-05-05 11:35 ` Dmitry V. Levin 2020-05-05 11:43 ` Christian Brauner 1 sibling, 1 reply; 19+ messages in thread From: Dmitry V. Levin @ 2020-05-05 11:35 UTC (permalink / raw) To: ltp On Tue, May 05, 2020 at 12:21:54PM +0200, Christian Brauner wrote: > On Tue, May 05, 2020 at 11:58:13AM +0200, Christian Brauner wrote: > > On Tue, May 05, 2020 at 11:36:36AM +0200, Florian Weimer wrote: > > > * Christian Brauner: > > > >> Have any flags been added recently? > > > > > > > > /* Flags for the clone3() syscall. */ > > > > #define CLONE_CLEAR_SIGHAND 0x100000000ULL /* Clear any signal handler and reset to SIG_DFL. */ > > > > #define CLONE_INTO_CGROUP 0x200000000ULL /* Clone into a specific cgroup given the right permissions. */ > > > > > > Are those flags expected to be compatible with the legacy clone > > > interface on 64-bit architectures? > > > > No, they are clone3() only. clone() is deprecated wrt to new features. > > > > > > > > >> > (Note, that CLONE_LEGACY_FLAGS is already defined as > > > >> > #define CLONE_LEGACY_FLAGS 0xffffffffULL > > > >> > and used in clone3().) > > > >> > > > > >> > So the better option might be to do what you suggested, Florian: > > > >> > if (clone_flags & ~CLONE_LEGACY_FLAGS) > > > >> > clone_flags = CLONE_LEGACY_FLAGS? > > > >> > and move on? > > > >> > > > >> Not sure what you are suggesting here. Do you mean an unconditional > > > >> masking of excess bits? > > > >> > > > >> clone_flags &= CLONE_LEGACY_FLAGS; > > > >> > > > >> I think I would prefer this: > > > >> > > > >> /* Userspace may have passed a sign-extended int value. */ > > > >> if (clone_flags != (int) clone_flags) /* > > > >> return -EINVAL; > > > >> clone_flags = (unsigned) clone_flags; > > > > > > > > My worry is that this will cause regressions because clone() has never > > > > failed on invalid flag values. I was looking for a way to not have this > > > > problem. But given what you say below this change might be ok/worth > > > > risking? > > > > > > I was under the impression that current kernels perform such a check, > > > causing the problem with sign extension. > > > > No, it doesn't, it never did. It only does it for clone3(). Legacy > > clone() _never_ reported an error no matter if you passed garbage flags > > or not. That's why we can't re-use clone() flags that have essentially > > been removed in kernel version before I could even program. :) Unless > > I'm misunderstanding what check you're referring to. > > > > If I understood the original mail correctly, then the issue is caused by > > an interaction with sign extension and a the new flag value > > CLONE_INTO_CGROUP being defined. > > So from what I gather from Jan's initial mail is that when clone() is > > called on ppc64le with the CLONE_IO|SIGCHLD flag: > > clone(do_child, stack+1024*1024, CLONE_IO|SIGCHLD, NULL, NULL, NULL, NULL); > > that the sign extension causes bits to be set that raise the > > CLONE_INTO_CGROUP flag. And since the do_fork() codepath is the same for > > legacy clone() and clone3() the kernel will think that someone requested > > CLONE_INTO_CGROUP but hasn't passed a valid fd to a cgroup. If that is > > the only issue here then couldn't we just do: > > > > clone_flags &= ~CLONE3_ONLY_FLAGS? > > > > and move on, i.e. all future clone3() flags we'll just remove since we > > can assume that they have been accidently set. Even if they have been > > intentionally set we can just ignore them since that's in line with > > legacy clone()'s (questionable) tradition of ignoring unknown flags. > > Thoughts? Or am I missing some subtlety here? > > So essentially: > > diff --git a/kernel/fork.c b/kernel/fork.c > index 8c700f881d92..e192089f133e 100644 > --- a/kernel/fork.c > +++ b/kernel/fork.c > @@ -2569,12 +2569,15 @@ SYSCALL_DEFINE5(clone, unsigned long, clone_flags, unsigned long, newsp, > unsigned long, tls) > #endif > { > + /* Ignore the upper 32 bits. */ > + unsigned int flags = (clone_flags & 0xfffffff); Not enough f's. What about unsigned int flags = (unsigned int) clone_flags; instead? -- ldv ^ permalink raw reply [flat|nested] 19+ messages in thread
* [LTP] [bug?] clone(CLONE_IO) failing after kernel commit commit ef2c41cf38a7 2020-05-05 11:35 ` Dmitry V. Levin @ 2020-05-05 11:43 ` Christian Brauner 2020-05-05 11:49 ` Dmitry V. Levin 0 siblings, 1 reply; 19+ messages in thread From: Christian Brauner @ 2020-05-05 11:43 UTC (permalink / raw) To: ltp On Tue, May 05, 2020 at 02:35:14PM +0300, Dmitry V. Levin wrote: > On Tue, May 05, 2020 at 12:21:54PM +0200, Christian Brauner wrote: > > On Tue, May 05, 2020 at 11:58:13AM +0200, Christian Brauner wrote: > > > On Tue, May 05, 2020 at 11:36:36AM +0200, Florian Weimer wrote: > > > > * Christian Brauner: > > > > >> Have any flags been added recently? > > > > > > > > > > /* Flags for the clone3() syscall. */ > > > > > #define CLONE_CLEAR_SIGHAND 0x100000000ULL /* Clear any signal handler and reset to SIG_DFL. */ > > > > > #define CLONE_INTO_CGROUP 0x200000000ULL /* Clone into a specific cgroup given the right permissions. */ > > > > > > > > Are those flags expected to be compatible with the legacy clone > > > > interface on 64-bit architectures? > > > > > > No, they are clone3() only. clone() is deprecated wrt to new features. > > > > > > > > > > > >> > (Note, that CLONE_LEGACY_FLAGS is already defined as > > > > >> > #define CLONE_LEGACY_FLAGS 0xffffffffULL > > > > >> > and used in clone3().) > > > > >> > > > > > >> > So the better option might be to do what you suggested, Florian: > > > > >> > if (clone_flags & ~CLONE_LEGACY_FLAGS) > > > > >> > clone_flags = CLONE_LEGACY_FLAGS? > > > > >> > and move on? > > > > >> > > > > >> Not sure what you are suggesting here. Do you mean an unconditional > > > > >> masking of excess bits? > > > > >> > > > > >> clone_flags &= CLONE_LEGACY_FLAGS; > > > > >> > > > > >> I think I would prefer this: > > > > >> > > > > >> /* Userspace may have passed a sign-extended int value. */ > > > > >> if (clone_flags != (int) clone_flags) /* > > > > >> return -EINVAL; > > > > >> clone_flags = (unsigned) clone_flags; > > > > > > > > > > My worry is that this will cause regressions because clone() has never > > > > > failed on invalid flag values. I was looking for a way to not have this > > > > > problem. But given what you say below this change might be ok/worth > > > > > risking? > > > > > > > > I was under the impression that current kernels perform such a check, > > > > causing the problem with sign extension. > > > > > > No, it doesn't, it never did. It only does it for clone3(). Legacy > > > clone() _never_ reported an error no matter if you passed garbage flags > > > or not. That's why we can't re-use clone() flags that have essentially > > > been removed in kernel version before I could even program. :) Unless > > > I'm misunderstanding what check you're referring to. > > > > > > If I understood the original mail correctly, then the issue is caused by > > > an interaction with sign extension and a the new flag value > > > CLONE_INTO_CGROUP being defined. > > > So from what I gather from Jan's initial mail is that when clone() is > > > called on ppc64le with the CLONE_IO|SIGCHLD flag: > > > clone(do_child, stack+1024*1024, CLONE_IO|SIGCHLD, NULL, NULL, NULL, NULL); > > > that the sign extension causes bits to be set that raise the > > > CLONE_INTO_CGROUP flag. And since the do_fork() codepath is the same for > > > legacy clone() and clone3() the kernel will think that someone requested > > > CLONE_INTO_CGROUP but hasn't passed a valid fd to a cgroup. If that is > > > the only issue here then couldn't we just do: > > > > > > clone_flags &= ~CLONE3_ONLY_FLAGS? > > > > > > and move on, i.e. all future clone3() flags we'll just remove since we > > > can assume that they have been accidently set. Even if they have been > > > intentionally set we can just ignore them since that's in line with > > > legacy clone()'s (questionable) tradition of ignoring unknown flags. > > > Thoughts? Or am I missing some subtlety here? > > > > So essentially: > > > > diff --git a/kernel/fork.c b/kernel/fork.c > > index 8c700f881d92..e192089f133e 100644 > > --- a/kernel/fork.c > > +++ b/kernel/fork.c > > @@ -2569,12 +2569,15 @@ SYSCALL_DEFINE5(clone, unsigned long, clone_flags, unsigned long, newsp, > > unsigned long, tls) > > #endif > > { > > + /* Ignore the upper 32 bits. */ > > + unsigned int flags = (clone_flags & 0xfffffff); > > Not enough f's. What about > unsigned int flags = (unsigned int) clone_flags; > instead? Yeah, I guess that should do it. Though maybe: u32 flags = (u32)clone_flags; is more transparent since we're stating visually "we're capping this to 32 bits"? Christian ^ permalink raw reply [flat|nested] 19+ messages in thread
* [LTP] [bug?] clone(CLONE_IO) failing after kernel commit commit ef2c41cf38a7 2020-05-05 11:43 ` Christian Brauner @ 2020-05-05 11:49 ` Dmitry V. Levin 2020-05-05 11:57 ` Christian Brauner 0 siblings, 1 reply; 19+ messages in thread From: Dmitry V. Levin @ 2020-05-05 11:49 UTC (permalink / raw) To: ltp On Tue, May 05, 2020 at 01:43:32PM +0200, Christian Brauner wrote: > On Tue, May 05, 2020 at 02:35:14PM +0300, Dmitry V. Levin wrote: > > On Tue, May 05, 2020 at 12:21:54PM +0200, Christian Brauner wrote: > > > On Tue, May 05, 2020 at 11:58:13AM +0200, Christian Brauner wrote: > > > > On Tue, May 05, 2020 at 11:36:36AM +0200, Florian Weimer wrote: > > > > > * Christian Brauner: > > > > > >> Have any flags been added recently? > > > > > > > > > > > > /* Flags for the clone3() syscall. */ > > > > > > #define CLONE_CLEAR_SIGHAND 0x100000000ULL /* Clear any signal handler and reset to SIG_DFL. */ > > > > > > #define CLONE_INTO_CGROUP 0x200000000ULL /* Clone into a specific cgroup given the right permissions. */ > > > > > > > > > > Are those flags expected to be compatible with the legacy clone > > > > > interface on 64-bit architectures? > > > > > > > > No, they are clone3() only. clone() is deprecated wrt to new features. > > > > > > > > > > > > > > >> > (Note, that CLONE_LEGACY_FLAGS is already defined as > > > > > >> > #define CLONE_LEGACY_FLAGS 0xffffffffULL > > > > > >> > and used in clone3().) > > > > > >> > > > > > > >> > So the better option might be to do what you suggested, Florian: > > > > > >> > if (clone_flags & ~CLONE_LEGACY_FLAGS) > > > > > >> > clone_flags = CLONE_LEGACY_FLAGS? > > > > > >> > and move on? > > > > > >> > > > > > >> Not sure what you are suggesting here. Do you mean an unconditional > > > > > >> masking of excess bits? > > > > > >> > > > > > >> clone_flags &= CLONE_LEGACY_FLAGS; > > > > > >> > > > > > >> I think I would prefer this: > > > > > >> > > > > > >> /* Userspace may have passed a sign-extended int value. */ > > > > > >> if (clone_flags != (int) clone_flags) /* > > > > > >> return -EINVAL; > > > > > >> clone_flags = (unsigned) clone_flags; > > > > > > > > > > > > My worry is that this will cause regressions because clone() has never > > > > > > failed on invalid flag values. I was looking for a way to not have this > > > > > > problem. But given what you say below this change might be ok/worth > > > > > > risking? > > > > > > > > > > I was under the impression that current kernels perform such a check, > > > > > causing the problem with sign extension. > > > > > > > > No, it doesn't, it never did. It only does it for clone3(). Legacy > > > > clone() _never_ reported an error no matter if you passed garbage flags > > > > or not. That's why we can't re-use clone() flags that have essentially > > > > been removed in kernel version before I could even program. :) Unless > > > > I'm misunderstanding what check you're referring to. > > > > > > > > If I understood the original mail correctly, then the issue is caused by > > > > an interaction with sign extension and a the new flag value > > > > CLONE_INTO_CGROUP being defined. > > > > So from what I gather from Jan's initial mail is that when clone() is > > > > called on ppc64le with the CLONE_IO|SIGCHLD flag: > > > > clone(do_child, stack+1024*1024, CLONE_IO|SIGCHLD, NULL, NULL, NULL, NULL); > > > > that the sign extension causes bits to be set that raise the > > > > CLONE_INTO_CGROUP flag. And since the do_fork() codepath is the same for > > > > legacy clone() and clone3() the kernel will think that someone requested > > > > CLONE_INTO_CGROUP but hasn't passed a valid fd to a cgroup. If that is > > > > the only issue here then couldn't we just do: > > > > > > > > clone_flags &= ~CLONE3_ONLY_FLAGS? > > > > > > > > and move on, i.e. all future clone3() flags we'll just remove since we > > > > can assume that they have been accidently set. Even if they have been > > > > intentionally set we can just ignore them since that's in line with > > > > legacy clone()'s (questionable) tradition of ignoring unknown flags. > > > > Thoughts? Or am I missing some subtlety here? > > > > > > So essentially: > > > > > > diff --git a/kernel/fork.c b/kernel/fork.c > > > index 8c700f881d92..e192089f133e 100644 > > > --- a/kernel/fork.c > > > +++ b/kernel/fork.c > > > @@ -2569,12 +2569,15 @@ SYSCALL_DEFINE5(clone, unsigned long, clone_flags, unsigned long, newsp, > > > unsigned long, tls) > > > #endif > > > { > > > + /* Ignore the upper 32 bits. */ > > > + unsigned int flags = (clone_flags & 0xfffffff); > > > > Not enough f's. What about > > unsigned int flags = (unsigned int) clone_flags; > > instead? > > Yeah, I guess that should do it. Though maybe: > > u32 flags = (u32)clone_flags; > > is more transparent since we're stating visually "we're capping this to > 32 bits"? Yes, this should work as well. I wonder whether we could just change the type of clone_flags to unsigned int in this function. -- ldv ^ permalink raw reply [flat|nested] 19+ messages in thread
* [LTP] [bug?] clone(CLONE_IO) failing after kernel commit commit ef2c41cf38a7 2020-05-05 11:49 ` Dmitry V. Levin @ 2020-05-05 11:57 ` Christian Brauner 0 siblings, 0 replies; 19+ messages in thread From: Christian Brauner @ 2020-05-05 11:57 UTC (permalink / raw) To: ltp On Tue, May 05, 2020 at 02:49:49PM +0300, Dmitry V. Levin wrote: > On Tue, May 05, 2020 at 01:43:32PM +0200, Christian Brauner wrote: > > On Tue, May 05, 2020 at 02:35:14PM +0300, Dmitry V. Levin wrote: > > > On Tue, May 05, 2020 at 12:21:54PM +0200, Christian Brauner wrote: > > > > On Tue, May 05, 2020 at 11:58:13AM +0200, Christian Brauner wrote: > > > > > On Tue, May 05, 2020 at 11:36:36AM +0200, Florian Weimer wrote: > > > > > > * Christian Brauner: > > > > > > >> Have any flags been added recently? > > > > > > > > > > > > > > /* Flags for the clone3() syscall. */ > > > > > > > #define CLONE_CLEAR_SIGHAND 0x100000000ULL /* Clear any signal handler and reset to SIG_DFL. */ > > > > > > > #define CLONE_INTO_CGROUP 0x200000000ULL /* Clone into a specific cgroup given the right permissions. */ > > > > > > > > > > > > Are those flags expected to be compatible with the legacy clone > > > > > > interface on 64-bit architectures? > > > > > > > > > > No, they are clone3() only. clone() is deprecated wrt to new features. > > > > > > > > > > > > > > > > > >> > (Note, that CLONE_LEGACY_FLAGS is already defined as > > > > > > >> > #define CLONE_LEGACY_FLAGS 0xffffffffULL > > > > > > >> > and used in clone3().) > > > > > > >> > > > > > > > >> > So the better option might be to do what you suggested, Florian: > > > > > > >> > if (clone_flags & ~CLONE_LEGACY_FLAGS) > > > > > > >> > clone_flags = CLONE_LEGACY_FLAGS? > > > > > > >> > and move on? > > > > > > >> > > > > > > >> Not sure what you are suggesting here. Do you mean an unconditional > > > > > > >> masking of excess bits? > > > > > > >> > > > > > > >> clone_flags &= CLONE_LEGACY_FLAGS; > > > > > > >> > > > > > > >> I think I would prefer this: > > > > > > >> > > > > > > >> /* Userspace may have passed a sign-extended int value. */ > > > > > > >> if (clone_flags != (int) clone_flags) /* > > > > > > >> return -EINVAL; > > > > > > >> clone_flags = (unsigned) clone_flags; > > > > > > > > > > > > > > My worry is that this will cause regressions because clone() has never > > > > > > > failed on invalid flag values. I was looking for a way to not have this > > > > > > > problem. But given what you say below this change might be ok/worth > > > > > > > risking? > > > > > > > > > > > > I was under the impression that current kernels perform such a check, > > > > > > causing the problem with sign extension. > > > > > > > > > > No, it doesn't, it never did. It only does it for clone3(). Legacy > > > > > clone() _never_ reported an error no matter if you passed garbage flags > > > > > or not. That's why we can't re-use clone() flags that have essentially > > > > > been removed in kernel version before I could even program. :) Unless > > > > > I'm misunderstanding what check you're referring to. > > > > > > > > > > If I understood the original mail correctly, then the issue is caused by > > > > > an interaction with sign extension and a the new flag value > > > > > CLONE_INTO_CGROUP being defined. > > > > > So from what I gather from Jan's initial mail is that when clone() is > > > > > called on ppc64le with the CLONE_IO|SIGCHLD flag: > > > > > clone(do_child, stack+1024*1024, CLONE_IO|SIGCHLD, NULL, NULL, NULL, NULL); > > > > > that the sign extension causes bits to be set that raise the > > > > > CLONE_INTO_CGROUP flag. And since the do_fork() codepath is the same for > > > > > legacy clone() and clone3() the kernel will think that someone requested > > > > > CLONE_INTO_CGROUP but hasn't passed a valid fd to a cgroup. If that is > > > > > the only issue here then couldn't we just do: > > > > > > > > > > clone_flags &= ~CLONE3_ONLY_FLAGS? > > > > > > > > > > and move on, i.e. all future clone3() flags we'll just remove since we > > > > > can assume that they have been accidently set. Even if they have been > > > > > intentionally set we can just ignore them since that's in line with > > > > > legacy clone()'s (questionable) tradition of ignoring unknown flags. > > > > > Thoughts? Or am I missing some subtlety here? > > > > > > > > So essentially: > > > > > > > > diff --git a/kernel/fork.c b/kernel/fork.c > > > > index 8c700f881d92..e192089f133e 100644 > > > > --- a/kernel/fork.c > > > > +++ b/kernel/fork.c > > > > @@ -2569,12 +2569,15 @@ SYSCALL_DEFINE5(clone, unsigned long, clone_flags, unsigned long, newsp, > > > > unsigned long, tls) > > > > #endif > > > > { > > > > + /* Ignore the upper 32 bits. */ > > > > + unsigned int flags = (clone_flags & 0xfffffff); > > > > > > Not enough f's. What about > > > unsigned int flags = (unsigned int) clone_flags; > > > instead? > > > > Yeah, I guess that should do it. Though maybe: > > > > u32 flags = (u32)clone_flags; > > > > is more transparent since we're stating visually "we're capping this to > > 32 bits"? > > Yes, this should work as well. > > I wonder whether we could just change the type of clone_flags to unsigned int > in this function. I think we should go with capping the flags argument for now since it'll stop the bleeding and will likely be fairly uncontroversial. Then we can bring up changing the syscall signature later which I bet is going to meet more resistance. Christian ^ permalink raw reply [flat|nested] 19+ messages in thread
* [LTP] [bug?] clone(CLONE_IO) failing after kernel commit commit ef2c41cf38a7 2020-05-05 9:58 ` Christian Brauner 2020-05-05 10:21 ` Christian Brauner @ 2020-05-05 11:08 ` Florian Weimer 2020-05-05 11:26 ` Christian Brauner 1 sibling, 1 reply; 19+ messages in thread From: Florian Weimer @ 2020-05-05 11:08 UTC (permalink / raw) To: ltp * Christian Brauner: > On Tue, May 05, 2020 at 11:36:36AM +0200, Florian Weimer wrote: >> * Christian Brauner: >> >> Have any flags been added recently? >> > >> > /* Flags for the clone3() syscall. */ >> > #define CLONE_CLEAR_SIGHAND 0x100000000ULL /* Clear any signal handler and reset to SIG_DFL. */ >> > #define CLONE_INTO_CGROUP 0x200000000ULL /* Clone into a specific cgroup given the right permissions. */ >> >> Are those flags expected to be compatible with the legacy clone >> interface on 64-bit architectures? > > No, they are clone3() only. clone() is deprecated wrt to new features. But without masking the clone_flags argument, won't it be passed down to the implementation which is also used to back clone3? > If I understood the original mail correctly, then the issue is caused by > an interaction with sign extension and a the new flag value > CLONE_INTO_CGROUP being defined. Could be, but for that to happen, the flag would have to be passed down, no? > So from what I gather from Jan's initial mail is that when clone() is > called on ppc64le with the CLONE_IO|SIGCHLD flag: > clone(do_child, stack+1024*1024, CLONE_IO|SIGCHLD, NULL, NULL, NULL, NULL); > that the sign extension causes bits to be set that raise the > CLONE_INTO_CGROUP flag. And since the do_fork() codepath is the same for > legacy clone() and clone3() the kernel will think that someone requested > CLONE_INTO_CGROUP but hasn't passed a valid fd to a cgroup. If that is > the only issue here then couldn't we just do: > > clone_flags &= ~CLONE3_ONLY_FLAGS? > > and move on, i.e. all future clone3() flags we'll just remove since we > can assume that they have been accidently set. Even if they have been > intentionally set we can just ignore them since that's in line with > legacy clone()'s (questionable) tradition of ignoring unknown flags. > Thoughts? Or am I missing some subtlety here? What's the difference between clone_flags &= ~CLONE3_ONLY_FLAGS; and clone_flags &= (unsigned) -1; in this context? ^ permalink raw reply [flat|nested] 19+ messages in thread
* [LTP] [bug?] clone(CLONE_IO) failing after kernel commit commit ef2c41cf38a7 2020-05-05 11:08 ` Florian Weimer @ 2020-05-05 11:26 ` Christian Brauner 0 siblings, 0 replies; 19+ messages in thread From: Christian Brauner @ 2020-05-05 11:26 UTC (permalink / raw) To: ltp On Tue, May 05, 2020 at 01:08:02PM +0200, Florian Weimer wrote: > * Christian Brauner: > > > On Tue, May 05, 2020 at 11:36:36AM +0200, Florian Weimer wrote: > >> * Christian Brauner: > >> >> Have any flags been added recently? > >> > > >> > /* Flags for the clone3() syscall. */ > >> > #define CLONE_CLEAR_SIGHAND 0x100000000ULL /* Clear any signal handler and reset to SIG_DFL. */ > >> > #define CLONE_INTO_CGROUP 0x200000000ULL /* Clone into a specific cgroup given the right permissions. */ > >> > >> Are those flags expected to be compatible with the legacy clone > >> interface on 64-bit architectures? > > > > No, they are clone3() only. clone() is deprecated wrt to new features. > > But without masking the clone_flags argument, won't it be passed down > to the implementation which is also used to back clone3? Yeah, the masking fix should definitely go in. That problem wasn't on our radar initially at all. > > > If I understood the original mail correctly, then the issue is caused by > > an interaction with sign extension and a the new flag value > > CLONE_INTO_CGROUP being defined. > > Could be, but for that to happen, the flag would have to be passed > down, no? The strace output Jan posted: clone(child_stack=0x1011ffe0, flags=CLONE_IO|0xffffffff00000000|SIGCHLD) = -1 EBADF (Bad file descriptor) seems to suggest that 0xffffffff00000000 is filled in CLONE_IO|0xffffffff00000000|SIGCHLD which means int main(int argc, char *argv[]) { #define DUMMY_FLAG (CLONE_IO | 0xffffffff00000000 | SIGCHLD) #ifndef CLONE_INTO_CGROUP #define CLONE_INTO_CGROUP 0x200000000ULL #endif printf("%d\n", (DUMMY_FLAG & CLONE_INTO_CGROUP) > 0); exit(EXIT_SUCCESS); } will return 1. In fact, it will also return 1 with CLONE_CLEAR_SIGHAND, I believe. > > > So from what I gather from Jan's initial mail is that when clone() is > > called on ppc64le with the CLONE_IO|SIGCHLD flag: > > clone(do_child, stack+1024*1024, CLONE_IO|SIGCHLD, NULL, NULL, NULL, NULL); > > that the sign extension causes bits to be set that raise the > > CLONE_INTO_CGROUP flag. And since the do_fork() codepath is the same for > > legacy clone() and clone3() the kernel will think that someone requested > > CLONE_INTO_CGROUP but hasn't passed a valid fd to a cgroup. If that is > > the only issue here then couldn't we just do: > > > > clone_flags &= ~CLONE3_ONLY_FLAGS? > > > > and move on, i.e. all future clone3() flags we'll just remove since we > > can assume that they have been accidently set. Even if they have been > > intentionally set we can just ignore them since that's in line with > > legacy clone()'s (questionable) tradition of ignoring unknown flags. > > Thoughts? Or am I missing some subtlety here? > > What's the difference between > > clone_flags &= ~CLONE3_ONLY_FLAGS; > > and > > clone_flags &= (unsigned) -1; > > in this context? Yeah, I just used CLONE3_ONLY_FLAGS as more descriptive example. Wdyt, about the patch I proposed in the follow up mail? Christian ^ permalink raw reply [flat|nested] 19+ messages in thread
* [LTP] [bug?] clone(CLONE_IO) failing after kernel commit commit ef2c41cf38a7 2020-05-05 7:28 ` [LTP] [bug?] clone(CLONE_IO) failing after kernel commit commit ef2c41cf38a7 Jan Stancek 2020-05-05 7:49 ` Florian Weimer @ 2020-05-05 7:54 ` Andreas Schwab 1 sibling, 0 replies; 19+ messages in thread From: Andreas Schwab @ 2020-05-05 7:54 UTC (permalink / raw) To: ltp On Mai 05 2020, Jan Stancek via Libc-alpha wrote: > I'm seeing an issue with CLONE_IO and libc' clone() on ppc64le, > where flags parameter appears to be sign extended before it's passed > to kernel syscall. Looks like the flags argument should be declared as unsigned int, instead of int, both in glibc and in the kernel. #define CLONE_IO 0x80000000 /* Clone io context */ Andreas. -- Andreas Schwab, schwab@linux-m68k.org GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1 "And now for something completely different." ^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2020-05-05 11:57 UTC | newest] Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <100149681.11244932.1588661282331.JavaMail.zimbra@redhat.com> 2020-05-05 7:28 ` [LTP] [bug?] clone(CLONE_IO) failing after kernel commit commit ef2c41cf38a7 Jan Stancek 2020-05-05 7:49 ` Florian Weimer 2020-05-05 7:59 ` Christian Brauner 2020-05-05 8:02 ` Christian Brauner 2020-05-05 8:32 ` Christian Brauner 2020-05-05 8:58 ` Jan Stancek 2020-05-05 9:05 ` Florian Weimer 2020-05-05 9:15 ` Christian Brauner 2020-05-05 9:36 ` Florian Weimer 2020-05-05 9:58 ` Christian Brauner 2020-05-05 10:21 ` Christian Brauner 2020-05-05 11:34 ` Florian Weimer 2020-05-05 11:35 ` Dmitry V. Levin 2020-05-05 11:43 ` Christian Brauner 2020-05-05 11:49 ` Dmitry V. Levin 2020-05-05 11:57 ` Christian Brauner 2020-05-05 11:08 ` Florian Weimer 2020-05-05 11:26 ` Christian Brauner 2020-05-05 7:54 ` Andreas Schwab
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.