linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [next]: LTP: getxattr05.c:97: TFAIL: unshare(CLONE_NEWUSER) failed: ENOSPC (28)
@ 2022-01-12 11:45 Naresh Kamboju
  2022-01-12 13:18 ` Christian Brauner
  0 siblings, 1 reply; 8+ messages in thread
From: Naresh Kamboju @ 2022-01-12 11:45 UTC (permalink / raw)
  To: open list, Linux-Next Mailing List, lkft-triage, LTP List,
	linux-fsdevel, regressions
  Cc: Alexey Gladkov, Sven Schnelle, Eric W. Biederman, Alexander Viro,
	Arnd Bergmann

While testing LTP syscalls with Linux next 20220110 (and till date 20220112)
on x86_64, i386, arm and arm64 the following tests failed.

tst_test.c:1365: TINFO: Timeout per run is 0h 15m 00s
getxattr05.c:87: TPASS: Got same data when acquiring the value of
system.posix_acl_access twice
getxattr05.c:97: TFAIL: unshare(CLONE_NEWUSER) failed: ENOSPC (28)
tst_test.c:391: TBROK: Invalid child (13545) exit value 1

fanotify17.c:176: TINFO: Test #1: Global groups limit in privileged user ns
fanotify17.c:155: TFAIL: unshare(CLONE_NEWUSER) failed: ENOSPC (28)
tst_test.c:391: TBROK: Invalid child (14739) exit value 1

sendto03.c:48: TBROK: unshare(268435456) failed: ENOSPC (28)

setsockopt05.c:45: TBROK: unshare(268435456) failed: ENOSPC (28)

strace output:
--------------
[pid   481] wait4(-1, 0x7fff52f5ae8c, 0, NULL) = -1 ECHILD (No child processes)
[pid   481] clone(child_stack=NULL,
flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
child_tidptr=0x7f3af0fa7a10) = 483
strace: Process 483 attached
[pid   481] wait4(-1,  <unfinished ...>
[pid   483] unshare(CLONE_NEWUSER)      = -1 ENOSPC (No space left on device)

metadata:
  git branch: master
  git repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next
  git commit: 57c149e506d5bec1b845ad1a8a631063fcac1f6e
  git describe: next-20220110
  arch: x86
  toolchain: gcc-11

Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>

GOOD: next-20220107
 BAD:    next-20220110

Test logs:
https://lkft.validation.linaro.org/scheduler/job/4301888#L1474
https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20220110/testrun/7253656/suite/ltp-syscalls-tests/test/getxattr05/log

compare test history:
https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20220112/testrun/7277164/suite/ltp-syscalls-tests/test/getxattr05/history/

kernel-config:
https://builds.tuxbuild.com/23V6AwGvHW7H3kr6WxZZwueajVS/config

We are investigating this regression.

Steps to reproduce:
   # cd /opt/ltp
   # ./runltp -s getxattr05

--
Linaro LKFT
https://lkft.linaro.org

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [next]: LTP: getxattr05.c:97: TFAIL: unshare(CLONE_NEWUSER) failed: ENOSPC (28)
  2022-01-12 11:45 [next]: LTP: getxattr05.c:97: TFAIL: unshare(CLONE_NEWUSER) failed: ENOSPC (28) Naresh Kamboju
@ 2022-01-12 13:18 ` Christian Brauner
  2022-01-12 13:22   ` Anders Roxell
  0 siblings, 1 reply; 8+ messages in thread
From: Christian Brauner @ 2022-01-12 13:18 UTC (permalink / raw)
  To: Naresh Kamboju
  Cc: open list, Linux-Next Mailing List, lkft-triage, LTP List,
	linux-fsdevel, regressions, containers, Alexey Gladkov,
	Sven Schnelle, Eric W. Biederman, Alexander Viro, Arnd Bergmann

On Wed, Jan 12, 2022 at 05:15:37PM +0530, Naresh Kamboju wrote:
> While testing LTP syscalls with Linux next 20220110 (and till date 20220112)
> on x86_64, i386, arm and arm64 the following tests failed.
> 
> tst_test.c:1365: TINFO: Timeout per run is 0h 15m 00s
> getxattr05.c:87: TPASS: Got same data when acquiring the value of
> system.posix_acl_access twice
> getxattr05.c:97: TFAIL: unshare(CLONE_NEWUSER) failed: ENOSPC (28)
> tst_test.c:391: TBROK: Invalid child (13545) exit value 1
> 
> fanotify17.c:176: TINFO: Test #1: Global groups limit in privileged user ns
> fanotify17.c:155: TFAIL: unshare(CLONE_NEWUSER) failed: ENOSPC (28)
> tst_test.c:391: TBROK: Invalid child (14739) exit value 1
> 
> sendto03.c:48: TBROK: unshare(268435456) failed: ENOSPC (28)
> 
> setsockopt05.c:45: TBROK: unshare(268435456) failed: ENOSPC (28)
> 
> strace output:
> --------------
> [pid   481] wait4(-1, 0x7fff52f5ae8c, 0, NULL) = -1 ECHILD (No child processes)
> [pid   481] clone(child_stack=NULL,
> flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
> child_tidptr=0x7f3af0fa7a10) = 483
> strace: Process 483 attached
> [pid   481] wait4(-1,  <unfinished ...>
> [pid   483] unshare(CLONE_NEWUSER)      = -1 ENOSPC (No space left on device)

This looks like another regression in the ucount code. Reverting the
following commit fixes it and makes the getxattr05 test work again:

commit 0315b634f933b0f12cfa82660322f6186c1aa0f4
Author: Alexey Gladkov <legion@kernel.org>
Date:   Fri Dec 17 15:48:23 2021 +0100

    ucounts: Split rlimit and ucount values and max values

    Since the semantics of maximum rlimit values are different, it would be
    better not to mix ucount and rlimit values. This will prevent the error
    of using inc_count/dec_ucount for rlimit parameters.

    This patch also renames the functions to emphasize the lack of
    connection between rlimit and ucount.

    v2:
    - Fix the array-index-out-of-bounds that was found by the lkp project.

    Reported-by: kernel test robot <oliver.sang@intel.com>
    Signed-off-by: Alexey Gladkov <legion@kernel.org>
    Link: https://lkml.kernel.org/r/73ea569042babda5cee2092423da85027ceb471f.1639752364.git.legion@kernel.org
    Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>

The issue only surfaces if /proc/sys/user/max_user_namespaces is
actually written to.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [next]: LTP: getxattr05.c:97: TFAIL: unshare(CLONE_NEWUSER) failed: ENOSPC (28)
  2022-01-12 13:18 ` Christian Brauner
@ 2022-01-12 13:22   ` Anders Roxell
  2022-01-12 14:02     ` Christian Brauner
  0 siblings, 1 reply; 8+ messages in thread
From: Anders Roxell @ 2022-01-12 13:22 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Naresh Kamboju, open list, Linux-Next Mailing List, lkft-triage,
	LTP List, linux-fsdevel, regressions, containers, Alexey Gladkov,
	Sven Schnelle, Eric W. Biederman, Alexander Viro, Arnd Bergmann

On Wed, 12 Jan 2022 at 14:18, Christian Brauner
<christian.brauner@ubuntu.com> wrote:
>
> On Wed, Jan 12, 2022 at 05:15:37PM +0530, Naresh Kamboju wrote:
> > While testing LTP syscalls with Linux next 20220110 (and till date 20220112)
> > on x86_64, i386, arm and arm64 the following tests failed.
> >
> > tst_test.c:1365: TINFO: Timeout per run is 0h 15m 00s
> > getxattr05.c:87: TPASS: Got same data when acquiring the value of
> > system.posix_acl_access twice
> > getxattr05.c:97: TFAIL: unshare(CLONE_NEWUSER) failed: ENOSPC (28)
> > tst_test.c:391: TBROK: Invalid child (13545) exit value 1
> >
> > fanotify17.c:176: TINFO: Test #1: Global groups limit in privileged user ns
> > fanotify17.c:155: TFAIL: unshare(CLONE_NEWUSER) failed: ENOSPC (28)
> > tst_test.c:391: TBROK: Invalid child (14739) exit value 1
> >
> > sendto03.c:48: TBROK: unshare(268435456) failed: ENOSPC (28)
> >
> > setsockopt05.c:45: TBROK: unshare(268435456) failed: ENOSPC (28)
> >
> > strace output:
> > --------------
> > [pid   481] wait4(-1, 0x7fff52f5ae8c, 0, NULL) = -1 ECHILD (No child processes)
> > [pid   481] clone(child_stack=NULL,
> > flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
> > child_tidptr=0x7f3af0fa7a10) = 483
> > strace: Process 483 attached
> > [pid   481] wait4(-1,  <unfinished ...>
> > [pid   483] unshare(CLONE_NEWUSER)      = -1 ENOSPC (No space left on device)
>
> This looks like another regression in the ucount code. Reverting the
> following commit fixes it and makes the getxattr05 test work again:
>
> commit 0315b634f933b0f12cfa82660322f6186c1aa0f4
> Author: Alexey Gladkov <legion@kernel.org>
> Date:   Fri Dec 17 15:48:23 2021 +0100
>
>     ucounts: Split rlimit and ucount values and max values
>
>     Since the semantics of maximum rlimit values are different, it would be
>     better not to mix ucount and rlimit values. This will prevent the error
>     of using inc_count/dec_ucount for rlimit parameters.
>
>     This patch also renames the functions to emphasize the lack of
>     connection between rlimit and ucount.
>
>     v2:
>     - Fix the array-index-out-of-bounds that was found by the lkp project.
>
>     Reported-by: kernel test robot <oliver.sang@intel.com>
>     Signed-off-by: Alexey Gladkov <legion@kernel.org>
>     Link: https://lkml.kernel.org/r/73ea569042babda5cee2092423da85027ceb471f.1639752364.git.legion@kernel.org
>     Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
>
> The issue only surfaces if /proc/sys/user/max_user_namespaces is
> actually written to.

I did a git bisect and that pointed me to this patch too.

Cheers,
Anders

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [next]: LTP: getxattr05.c:97: TFAIL: unshare(CLONE_NEWUSER) failed: ENOSPC (28)
  2022-01-12 13:22   ` Anders Roxell
@ 2022-01-12 14:02     ` Christian Brauner
  2022-01-12 14:14       ` Alexey Gladkov
  0 siblings, 1 reply; 8+ messages in thread
From: Christian Brauner @ 2022-01-12 14:02 UTC (permalink / raw)
  To: Anders Roxell
  Cc: Naresh Kamboju, open list, Linux-Next Mailing List, lkft-triage,
	LTP List, linux-fsdevel, regressions, containers, Alexey Gladkov,
	Sven Schnelle, Eric W. Biederman, Alexander Viro, Arnd Bergmann

On Wed, Jan 12, 2022 at 02:22:42PM +0100, Anders Roxell wrote:
> On Wed, 12 Jan 2022 at 14:18, Christian Brauner
> <christian.brauner@ubuntu.com> wrote:
> >
> > On Wed, Jan 12, 2022 at 05:15:37PM +0530, Naresh Kamboju wrote:
> > > While testing LTP syscalls with Linux next 20220110 (and till date 20220112)
> > > on x86_64, i386, arm and arm64 the following tests failed.
> > >
> > > tst_test.c:1365: TINFO: Timeout per run is 0h 15m 00s
> > > getxattr05.c:87: TPASS: Got same data when acquiring the value of
> > > system.posix_acl_access twice
> > > getxattr05.c:97: TFAIL: unshare(CLONE_NEWUSER) failed: ENOSPC (28)
> > > tst_test.c:391: TBROK: Invalid child (13545) exit value 1
> > >
> > > fanotify17.c:176: TINFO: Test #1: Global groups limit in privileged user ns
> > > fanotify17.c:155: TFAIL: unshare(CLONE_NEWUSER) failed: ENOSPC (28)
> > > tst_test.c:391: TBROK: Invalid child (14739) exit value 1
> > >
> > > sendto03.c:48: TBROK: unshare(268435456) failed: ENOSPC (28)
> > >
> > > setsockopt05.c:45: TBROK: unshare(268435456) failed: ENOSPC (28)
> > >
> > > strace output:
> > > --------------
> > > [pid   481] wait4(-1, 0x7fff52f5ae8c, 0, NULL) = -1 ECHILD (No child processes)
> > > [pid   481] clone(child_stack=NULL,
> > > flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
> > > child_tidptr=0x7f3af0fa7a10) = 483
> > > strace: Process 483 attached
> > > [pid   481] wait4(-1,  <unfinished ...>
> > > [pid   483] unshare(CLONE_NEWUSER)      = -1 ENOSPC (No space left on device)
> >
> > This looks like another regression in the ucount code. Reverting the
> > following commit fixes it and makes the getxattr05 test work again:
> >
> > commit 0315b634f933b0f12cfa82660322f6186c1aa0f4
> > Author: Alexey Gladkov <legion@kernel.org>
> > Date:   Fri Dec 17 15:48:23 2021 +0100
> >
> >     ucounts: Split rlimit and ucount values and max values
> >
> >     Since the semantics of maximum rlimit values are different, it would be
> >     better not to mix ucount and rlimit values. This will prevent the error
> >     of using inc_count/dec_ucount for rlimit parameters.
> >
> >     This patch also renames the functions to emphasize the lack of
> >     connection between rlimit and ucount.
> >
> >     v2:
> >     - Fix the array-index-out-of-bounds that was found by the lkp project.
> >
> >     Reported-by: kernel test robot <oliver.sang@intel.com>
> >     Signed-off-by: Alexey Gladkov <legion@kernel.org>
> >     Link: https://lkml.kernel.org/r/73ea569042babda5cee2092423da85027ceb471f.1639752364.git.legion@kernel.org
> >     Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
> >
> > The issue only surfaces if /proc/sys/user/max_user_namespaces is
> > actually written to.
> 
> I did a git bisect and that pointed me to this patch too.

Uhm, doesn't this want to be:

diff --git a/kernel/ucount.c b/kernel/ucount.c
index 22070f004e97..108c6a879cd8 100644
--- a/kernel/ucount.c
+++ b/kernel/ucount.c
@@ -264,7 +264,7 @@ long inc_rlimit_ucounts(struct ucounts *ucounts, enum rlimit_type type, long v)
        long ret = 0;

        for (iter = ucounts; iter; iter = iter->ns->ucounts) {
-               long new = atomic_long_add_return(v, &iter->ucount[type]);
+               long new = atomic_long_add_return(v, &iter->rlimit[type]);
                if (new < 0 || new > max)
                        ret = LONG_MAX;
                else if (iter == ucounts)
@@ -279,7 +279,7 @@ bool dec_rlimit_ucounts(struct ucounts *ucounts, enum rlimit_type type, long v)
        struct ucounts *iter;
        long new = -1; /* Silence compiler warning */
        for (iter = ucounts; iter; iter = iter->ns->ucounts) {
-               long dec = atomic_long_sub_return(v, &iter->ucount[type]);
+               long dec = atomic_long_sub_return(v, &iter->rlimit[type]);
                WARN_ON_ONCE(dec < 0);
                if (iter == ucounts)
                        new = dec;


otherwise,

inc_rlimit_ucounts(ucounts, UCOUNT_RLIMIT_NPROC, 1)

means

long inc_rlimit_ucounts(struct ucounts *ucounts, UCOUNT_RLIMIT_NPROC, long v)
{
	struct ucounts *iter;
	long max = LONG_MAX;
	long ret = 0;

	for (iter = ucounts; iter; iter = iter->ns->ucounts) {
		long new = atomic_long_add_return(v, &iter->ucount[UCOUNT_RLIMIT_NPROC]);
		if (new < 0 || new > max)
			ret = LONG_MAX;
		else if (iter == ucounts)
			ret = new;
		max = get_userns_rlimit_max(iter->ns, UCOUNT_RLIMIT_NPROC);
	}
	return ret;
}

which means that UCOUNT_RLIMIT_NPROC overwrites ucount[UCOUNT_RLIMIT_NPROC]?

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [next]: LTP: getxattr05.c:97: TFAIL: unshare(CLONE_NEWUSER) failed: ENOSPC (28)
  2022-01-12 14:02     ` Christian Brauner
@ 2022-01-12 14:14       ` Alexey Gladkov
  2022-01-12 14:28         ` Alexey Gladkov
  0 siblings, 1 reply; 8+ messages in thread
From: Alexey Gladkov @ 2022-01-12 14:14 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Anders Roxell, Naresh Kamboju, open list,
	Linux-Next Mailing List, lkft-triage, LTP List, linux-fsdevel,
	regressions, containers, Sven Schnelle, Eric W. Biederman,
	Alexander Viro, Arnd Bergmann

On Wed, Jan 12, 2022 at 03:02:54PM +0100, Christian Brauner wrote:
> On Wed, Jan 12, 2022 at 02:22:42PM +0100, Anders Roxell wrote:
> > On Wed, 12 Jan 2022 at 14:18, Christian Brauner
> > <christian.brauner@ubuntu.com> wrote:
> > >
> > > On Wed, Jan 12, 2022 at 05:15:37PM +0530, Naresh Kamboju wrote:
> > > > While testing LTP syscalls with Linux next 20220110 (and till date 20220112)
> > > > on x86_64, i386, arm and arm64 the following tests failed.
> > > >
> > > > tst_test.c:1365: TINFO: Timeout per run is 0h 15m 00s
> > > > getxattr05.c:87: TPASS: Got same data when acquiring the value of
> > > > system.posix_acl_access twice
> > > > getxattr05.c:97: TFAIL: unshare(CLONE_NEWUSER) failed: ENOSPC (28)
> > > > tst_test.c:391: TBROK: Invalid child (13545) exit value 1
> > > >
> > > > fanotify17.c:176: TINFO: Test #1: Global groups limit in privileged user ns
> > > > fanotify17.c:155: TFAIL: unshare(CLONE_NEWUSER) failed: ENOSPC (28)
> > > > tst_test.c:391: TBROK: Invalid child (14739) exit value 1
> > > >
> > > > sendto03.c:48: TBROK: unshare(268435456) failed: ENOSPC (28)
> > > >
> > > > setsockopt05.c:45: TBROK: unshare(268435456) failed: ENOSPC (28)
> > > >
> > > > strace output:
> > > > --------------
> > > > [pid   481] wait4(-1, 0x7fff52f5ae8c, 0, NULL) = -1 ECHILD (No child processes)
> > > > [pid   481] clone(child_stack=NULL,
> > > > flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
> > > > child_tidptr=0x7f3af0fa7a10) = 483
> > > > strace: Process 483 attached
> > > > [pid   481] wait4(-1,  <unfinished ...>
> > > > [pid   483] unshare(CLONE_NEWUSER)      = -1 ENOSPC (No space left on device)
> > >
> > > This looks like another regression in the ucount code. Reverting the
> > > following commit fixes it and makes the getxattr05 test work again:
> > >
> > > commit 0315b634f933b0f12cfa82660322f6186c1aa0f4
> > > Author: Alexey Gladkov <legion@kernel.org>
> > > Date:   Fri Dec 17 15:48:23 2021 +0100
> > >
> > >     ucounts: Split rlimit and ucount values and max values
> > >
> > >     Since the semantics of maximum rlimit values are different, it would be
> > >     better not to mix ucount and rlimit values. This will prevent the error
> > >     of using inc_count/dec_ucount for rlimit parameters.
> > >
> > >     This patch also renames the functions to emphasize the lack of
> > >     connection between rlimit and ucount.
> > >
> > >     v2:
> > >     - Fix the array-index-out-of-bounds that was found by the lkp project.
> > >
> > >     Reported-by: kernel test robot <oliver.sang@intel.com>
> > >     Signed-off-by: Alexey Gladkov <legion@kernel.org>
> > >     Link: https://lkml.kernel.org/r/73ea569042babda5cee2092423da85027ceb471f.1639752364.git.legion@kernel.org
> > >     Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
> > >
> > > The issue only surfaces if /proc/sys/user/max_user_namespaces is
> > > actually written to.
> > 
> > I did a git bisect and that pointed me to this patch too.
> 
> Uhm, doesn't this want to be:

Yes. I miss it. I tried not to mix the logic, but I myself stepped on this
problem.

> 
> diff --git a/kernel/ucount.c b/kernel/ucount.c
> index 22070f004e97..108c6a879cd8 100644
> --- a/kernel/ucount.c
> +++ b/kernel/ucount.c
> @@ -264,7 +264,7 @@ long inc_rlimit_ucounts(struct ucounts *ucounts, enum rlimit_type type, long v)
>         long ret = 0;
> 
>         for (iter = ucounts; iter; iter = iter->ns->ucounts) {
> -               long new = atomic_long_add_return(v, &iter->ucount[type]);
> +               long new = atomic_long_add_return(v, &iter->rlimit[type]);
>                 if (new < 0 || new > max)
>                         ret = LONG_MAX;
>                 else if (iter == ucounts)
> @@ -279,7 +279,7 @@ bool dec_rlimit_ucounts(struct ucounts *ucounts, enum rlimit_type type, long v)
>         struct ucounts *iter;
>         long new = -1; /* Silence compiler warning */
>         for (iter = ucounts; iter; iter = iter->ns->ucounts) {
> -               long dec = atomic_long_sub_return(v, &iter->ucount[type]);
> +               long dec = atomic_long_sub_return(v, &iter->rlimit[type]);
>                 WARN_ON_ONCE(dec < 0);
>                 if (iter == ucounts)
>                         new = dec;
> 
> 
> otherwise,
> 
> inc_rlimit_ucounts(ucounts, UCOUNT_RLIMIT_NPROC, 1)
> 
> means
> 
> long inc_rlimit_ucounts(struct ucounts *ucounts, UCOUNT_RLIMIT_NPROC, long v)
> {
> 	struct ucounts *iter;
> 	long max = LONG_MAX;
> 	long ret = 0;
> 
> 	for (iter = ucounts; iter; iter = iter->ns->ucounts) {
> 		long new = atomic_long_add_return(v, &iter->ucount[UCOUNT_RLIMIT_NPROC]);
> 		if (new < 0 || new > max)
> 			ret = LONG_MAX;
> 		else if (iter == ucounts)
> 			ret = new;
> 		max = get_userns_rlimit_max(iter->ns, UCOUNT_RLIMIT_NPROC);
> 	}
> 	return ret;
> }
> 
> which means that UCOUNT_RLIMIT_NPROC overwrites ucount[UCOUNT_RLIMIT_NPROC]?
> 

-- 
Rgrds, legion


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [next]: LTP: getxattr05.c:97: TFAIL: unshare(CLONE_NEWUSER) failed: ENOSPC (28)
  2022-01-12 14:14       ` Alexey Gladkov
@ 2022-01-12 14:28         ` Alexey Gladkov
  2022-01-12 15:56           ` Anders Roxell
  0 siblings, 1 reply; 8+ messages in thread
From: Alexey Gladkov @ 2022-01-12 14:28 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Anders Roxell, Naresh Kamboju, open list,
	Linux-Next Mailing List, lkft-triage, LTP List, linux-fsdevel,
	regressions, containers, Sven Schnelle, Eric W. Biederman,
	Alexander Viro, Arnd Bergmann

On Wed, Jan 12, 2022 at 03:14:45PM +0100, Alexey Gladkov wrote:
> On Wed, Jan 12, 2022 at 03:02:54PM +0100, Christian Brauner wrote:
> > On Wed, Jan 12, 2022 at 02:22:42PM +0100, Anders Roxell wrote:
> > > On Wed, 12 Jan 2022 at 14:18, Christian Brauner
> > > <christian.brauner@ubuntu.com> wrote:
> > > >
> > > > On Wed, Jan 12, 2022 at 05:15:37PM +0530, Naresh Kamboju wrote:
> > > > > While testing LTP syscalls with Linux next 20220110 (and till date 20220112)
> > > > > on x86_64, i386, arm and arm64 the following tests failed.
> > > > >
> > > > > tst_test.c:1365: TINFO: Timeout per run is 0h 15m 00s
> > > > > getxattr05.c:87: TPASS: Got same data when acquiring the value of
> > > > > system.posix_acl_access twice
> > > > > getxattr05.c:97: TFAIL: unshare(CLONE_NEWUSER) failed: ENOSPC (28)
> > > > > tst_test.c:391: TBROK: Invalid child (13545) exit value 1
> > > > >
> > > > > fanotify17.c:176: TINFO: Test #1: Global groups limit in privileged user ns
> > > > > fanotify17.c:155: TFAIL: unshare(CLONE_NEWUSER) failed: ENOSPC (28)
> > > > > tst_test.c:391: TBROK: Invalid child (14739) exit value 1
> > > > >
> > > > > sendto03.c:48: TBROK: unshare(268435456) failed: ENOSPC (28)
> > > > >
> > > > > setsockopt05.c:45: TBROK: unshare(268435456) failed: ENOSPC (28)
> > > > >
> > > > > strace output:
> > > > > --------------
> > > > > [pid   481] wait4(-1, 0x7fff52f5ae8c, 0, NULL) = -1 ECHILD (No child processes)
> > > > > [pid   481] clone(child_stack=NULL,
> > > > > flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
> > > > > child_tidptr=0x7f3af0fa7a10) = 483
> > > > > strace: Process 483 attached
> > > > > [pid   481] wait4(-1,  <unfinished ...>
> > > > > [pid   483] unshare(CLONE_NEWUSER)      = -1 ENOSPC (No space left on device)
> > > >
> > > > This looks like another regression in the ucount code. Reverting the
> > > > following commit fixes it and makes the getxattr05 test work again:
> > > >
> > > > commit 0315b634f933b0f12cfa82660322f6186c1aa0f4
> > > > Author: Alexey Gladkov <legion@kernel.org>
> > > > Date:   Fri Dec 17 15:48:23 2021 +0100
> > > >
> > > >     ucounts: Split rlimit and ucount values and max values
> > > >
> > > >     Since the semantics of maximum rlimit values are different, it would be
> > > >     better not to mix ucount and rlimit values. This will prevent the error
> > > >     of using inc_count/dec_ucount for rlimit parameters.
> > > >
> > > >     This patch also renames the functions to emphasize the lack of
> > > >     connection between rlimit and ucount.
> > > >
> > > >     v2:
> > > >     - Fix the array-index-out-of-bounds that was found by the lkp project.
> > > >
> > > >     Reported-by: kernel test robot <oliver.sang@intel.com>
> > > >     Signed-off-by: Alexey Gladkov <legion@kernel.org>
> > > >     Link: https://lkml.kernel.org/r/73ea569042babda5cee2092423da85027ceb471f.1639752364.git.legion@kernel.org
> > > >     Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
> > > >
> > > > The issue only surfaces if /proc/sys/user/max_user_namespaces is
> > > > actually written to.
> > > 
> > > I did a git bisect and that pointed me to this patch too.
> > 
> > Uhm, doesn't this want to be:
> 
> Yes. I miss it. I tried not to mix the logic, but I myself stepped on this
> problem.

It should be fixed in the four places:

diff --git a/kernel/ucount.c b/kernel/ucount.c
index 22070f004e97..5c373a453f43 100644
--- a/kernel/ucount.c
+++ b/kernel/ucount.c
@@ -264,7 +264,7 @@ long inc_rlimit_ucounts(struct ucounts *ucounts, enum rlimit_type type, long v)
 	long ret = 0;
 
 	for (iter = ucounts; iter; iter = iter->ns->ucounts) {
-		long new = atomic_long_add_return(v, &iter->ucount[type]);
+		long new = atomic_long_add_return(v, &iter->rlimit[type]);
 		if (new < 0 || new > max)
 			ret = LONG_MAX;
 		else if (iter == ucounts)
@@ -279,7 +279,7 @@ bool dec_rlimit_ucounts(struct ucounts *ucounts, enum rlimit_type type, long v)
 	struct ucounts *iter;
 	long new = -1; /* Silence compiler warning */
 	for (iter = ucounts; iter; iter = iter->ns->ucounts) {
-		long dec = atomic_long_sub_return(v, &iter->ucount[type]);
+		long dec = atomic_long_sub_return(v, &iter->rlimit[type]);
 		WARN_ON_ONCE(dec < 0);
 		if (iter == ucounts)
 			new = dec;
@@ -292,7 +292,7 @@ static void do_dec_rlimit_put_ucounts(struct ucounts *ucounts,
 {
 	struct ucounts *iter, *next;
 	for (iter = ucounts; iter != last; iter = next) {
-		long dec = atomic_long_sub_return(1, &iter->ucount[type]);
+		long dec = atomic_long_sub_return(1, &iter->rlimit[type]);
 		WARN_ON_ONCE(dec < 0);
 		next = iter->ns->ucounts;
 		if (dec == 0)
@@ -313,7 +313,7 @@ long inc_rlimit_get_ucounts(struct ucounts *ucounts, enum rlimit_type type)
 	long dec, ret = 0;
 
 	for (iter = ucounts; iter; iter = iter->ns->ucounts) {
-		long new = atomic_long_add_return(1, &iter->ucount[type]);
+		long new = atomic_long_add_return(1, &iter->rlimit[type]);
 		if (new < 0 || new > max)
 			goto unwind;
 		if (iter == ucounts)
@@ -330,7 +330,7 @@ long inc_rlimit_get_ucounts(struct ucounts *ucounts, enum rlimit_type type)
 	}
 	return ret;
 dec_unwind:
-	dec = atomic_long_sub_return(1, &iter->ucount[type]);
+	dec = atomic_long_sub_return(1, &iter->rlimit[type]);
 	WARN_ON_ONCE(dec < 0);
 unwind:
 	do_dec_rlimit_put_ucounts(ucounts, iter, type);

-- 
Rgrds, legion


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [next]: LTP: getxattr05.c:97: TFAIL: unshare(CLONE_NEWUSER) failed: ENOSPC (28)
  2022-01-12 14:28         ` Alexey Gladkov
@ 2022-01-12 15:56           ` Anders Roxell
  2022-01-12 16:42             ` Eric W. Biederman
  0 siblings, 1 reply; 8+ messages in thread
From: Anders Roxell @ 2022-01-12 15:56 UTC (permalink / raw)
  To: Alexey Gladkov
  Cc: Christian Brauner, Naresh Kamboju, open list,
	Linux-Next Mailing List, lkft-triage, LTP List, linux-fsdevel,
	regressions, containers, Sven Schnelle, Eric W. Biederman,
	Alexander Viro, Arnd Bergmann

On Wed, 12 Jan 2022 at 15:28, Alexey Gladkov <legion@kernel.org> wrote:
>
> On Wed, Jan 12, 2022 at 03:14:45PM +0100, Alexey Gladkov wrote:
> > On Wed, Jan 12, 2022 at 03:02:54PM +0100, Christian Brauner wrote:
> > > On Wed, Jan 12, 2022 at 02:22:42PM +0100, Anders Roxell wrote:
> > > > On Wed, 12 Jan 2022 at 14:18, Christian Brauner
> > > > <christian.brauner@ubuntu.com> wrote:
> > > > >
> > > > > On Wed, Jan 12, 2022 at 05:15:37PM +0530, Naresh Kamboju wrote:
> > > > > > While testing LTP syscalls with Linux next 20220110 (and till date 20220112)
> > > > > > on x86_64, i386, arm and arm64 the following tests failed.
> > > > > >
> > > > > > tst_test.c:1365: TINFO: Timeout per run is 0h 15m 00s
> > > > > > getxattr05.c:87: TPASS: Got same data when acquiring the value of
> > > > > > system.posix_acl_access twice
> > > > > > getxattr05.c:97: TFAIL: unshare(CLONE_NEWUSER) failed: ENOSPC (28)
> > > > > > tst_test.c:391: TBROK: Invalid child (13545) exit value 1
> > > > > >
> > > > > > fanotify17.c:176: TINFO: Test #1: Global groups limit in privileged user ns
> > > > > > fanotify17.c:155: TFAIL: unshare(CLONE_NEWUSER) failed: ENOSPC (28)
> > > > > > tst_test.c:391: TBROK: Invalid child (14739) exit value 1
> > > > > >
> > > > > > sendto03.c:48: TBROK: unshare(268435456) failed: ENOSPC (28)
> > > > > >
> > > > > > setsockopt05.c:45: TBROK: unshare(268435456) failed: ENOSPC (28)
> > > > > >
> > > > > > strace output:
> > > > > > --------------
> > > > > > [pid   481] wait4(-1, 0x7fff52f5ae8c, 0, NULL) = -1 ECHILD (No child processes)
> > > > > > [pid   481] clone(child_stack=NULL,
> > > > > > flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
> > > > > > child_tidptr=0x7f3af0fa7a10) = 483
> > > > > > strace: Process 483 attached
> > > > > > [pid   481] wait4(-1,  <unfinished ...>
> > > > > > [pid   483] unshare(CLONE_NEWUSER)      = -1 ENOSPC (No space left on device)
> > > > >
> > > > > This looks like another regression in the ucount code. Reverting the
> > > > > following commit fixes it and makes the getxattr05 test work again:
> > > > >
> > > > > commit 0315b634f933b0f12cfa82660322f6186c1aa0f4
> > > > > Author: Alexey Gladkov <legion@kernel.org>
> > > > > Date:   Fri Dec 17 15:48:23 2021 +0100
> > > > >
> > > > >     ucounts: Split rlimit and ucount values and max values
> > > > >
> > > > >     Since the semantics of maximum rlimit values are different, it would be
> > > > >     better not to mix ucount and rlimit values. This will prevent the error
> > > > >     of using inc_count/dec_ucount for rlimit parameters.
> > > > >
> > > > >     This patch also renames the functions to emphasize the lack of
> > > > >     connection between rlimit and ucount.
> > > > >
> > > > >     v2:
> > > > >     - Fix the array-index-out-of-bounds that was found by the lkp project.
> > > > >
> > > > >     Reported-by: kernel test robot <oliver.sang@intel.com>
> > > > >     Signed-off-by: Alexey Gladkov <legion@kernel.org>
> > > > >     Link: https://lkml.kernel.org/r/73ea569042babda5cee2092423da85027ceb471f.1639752364.git.legion@kernel.org
> > > > >     Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
> > > > >
> > > > > The issue only surfaces if /proc/sys/user/max_user_namespaces is
> > > > > actually written to.
> > > >
> > > > I did a git bisect and that pointed me to this patch too.
> > >
> > > Uhm, doesn't this want to be:
> >
> > Yes. I miss it. I tried not to mix the logic, but I myself stepped on this
> > problem.
>
> It should be fixed in the four places:
>
> diff --git a/kernel/ucount.c b/kernel/ucount.c
> index 22070f004e97..5c373a453f43 100644
> --- a/kernel/ucount.c
> +++ b/kernel/ucount.c
> @@ -264,7 +264,7 @@ long inc_rlimit_ucounts(struct ucounts *ucounts, enum rlimit_type type, long v)
>         long ret = 0;
>
>         for (iter = ucounts; iter; iter = iter->ns->ucounts) {
> -               long new = atomic_long_add_return(v, &iter->ucount[type]);
> +               long new = atomic_long_add_return(v, &iter->rlimit[type]);
>                 if (new < 0 || new > max)
>                         ret = LONG_MAX;
>                 else if (iter == ucounts)
> @@ -279,7 +279,7 @@ bool dec_rlimit_ucounts(struct ucounts *ucounts, enum rlimit_type type, long v)
>         struct ucounts *iter;
>         long new = -1; /* Silence compiler warning */
>         for (iter = ucounts; iter; iter = iter->ns->ucounts) {
> -               long dec = atomic_long_sub_return(v, &iter->ucount[type]);
> +               long dec = atomic_long_sub_return(v, &iter->rlimit[type]);
>                 WARN_ON_ONCE(dec < 0);
>                 if (iter == ucounts)
>                         new = dec;
> @@ -292,7 +292,7 @@ static void do_dec_rlimit_put_ucounts(struct ucounts *ucounts,
>  {
>         struct ucounts *iter, *next;
>         for (iter = ucounts; iter != last; iter = next) {
> -               long dec = atomic_long_sub_return(1, &iter->ucount[type]);
> +               long dec = atomic_long_sub_return(1, &iter->rlimit[type]);
>                 WARN_ON_ONCE(dec < 0);
>                 next = iter->ns->ucounts;
>                 if (dec == 0)
> @@ -313,7 +313,7 @@ long inc_rlimit_get_ucounts(struct ucounts *ucounts, enum rlimit_type type)
>         long dec, ret = 0;
>
>         for (iter = ucounts; iter; iter = iter->ns->ucounts) {
> -               long new = atomic_long_add_return(1, &iter->ucount[type]);
> +               long new = atomic_long_add_return(1, &iter->rlimit[type]);
>                 if (new < 0 || new > max)
>                         goto unwind;
>                 if (iter == ucounts)
> @@ -330,7 +330,7 @@ long inc_rlimit_get_ucounts(struct ucounts *ucounts, enum rlimit_type type)
>         }
>         return ret;
>  dec_unwind:
> -       dec = atomic_long_sub_return(1, &iter->ucount[type]);
> +       dec = atomic_long_sub_return(1, &iter->rlimit[type]);
>         WARN_ON_ONCE(dec < 0);
>  unwind:
>         do_dec_rlimit_put_ucounts(ucounts, iter, type);
>

Thank you for the fix.
I applied this patch and built and ran it in qemu for arm64 and x86.
'./runltp -s getxattr05' passed on both architectures.

Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>


Cheers,
Anders

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [next]: LTP: getxattr05.c:97: TFAIL: unshare(CLONE_NEWUSER) failed: ENOSPC (28)
  2022-01-12 15:56           ` Anders Roxell
@ 2022-01-12 16:42             ` Eric W. Biederman
  0 siblings, 0 replies; 8+ messages in thread
From: Eric W. Biederman @ 2022-01-12 16:42 UTC (permalink / raw)
  To: Anders Roxell
  Cc: Alexey Gladkov, Christian Brauner, Naresh Kamboju, open list,
	Linux-Next Mailing List, lkft-triage, LTP List, linux-fsdevel,
	regressions, containers, Sven Schnelle, Alexander Viro,
	Arnd Bergmann

Anders Roxell <anders.roxell@linaro.org> writes:

> On Wed, 12 Jan 2022 at 15:28, Alexey Gladkov <legion@kernel.org> wrote:
>>
>> On Wed, Jan 12, 2022 at 03:14:45PM +0100, Alexey Gladkov wrote:
>> > On Wed, Jan 12, 2022 at 03:02:54PM +0100, Christian Brauner wrote:
>> > > On Wed, Jan 12, 2022 at 02:22:42PM +0100, Anders Roxell wrote:
>> > > > On Wed, 12 Jan 2022 at 14:18, Christian Brauner
>> > > > <christian.brauner@ubuntu.com> wrote:
>> > > > >
>> > > > > On Wed, Jan 12, 2022 at 05:15:37PM +0530, Naresh Kamboju wrote:
>> > > > > > While testing LTP syscalls with Linux next 20220110 (and till date 20220112)
>> > > > > > on x86_64, i386, arm and arm64 the following tests failed.
>> > > > > >
>> > > > > > tst_test.c:1365: TINFO: Timeout per run is 0h 15m 00s
>> > > > > > getxattr05.c:87: TPASS: Got same data when acquiring the value of
>> > > > > > system.posix_acl_access twice
>> > > > > > getxattr05.c:97: TFAIL: unshare(CLONE_NEWUSER) failed: ENOSPC (28)
>> > > > > > tst_test.c:391: TBROK: Invalid child (13545) exit value 1
>> > > > > >
>> > > > > > fanotify17.c:176: TINFO: Test #1: Global groups limit in privileged user ns
>> > > > > > fanotify17.c:155: TFAIL: unshare(CLONE_NEWUSER) failed: ENOSPC (28)
>> > > > > > tst_test.c:391: TBROK: Invalid child (14739) exit value 1
>> > > > > >
>> > > > > > sendto03.c:48: TBROK: unshare(268435456) failed: ENOSPC (28)
>> > > > > >
>> > > > > > setsockopt05.c:45: TBROK: unshare(268435456) failed: ENOSPC (28)
>> > > > > >
>> > > > > > strace output:
>> > > > > > --------------
>> > > > > > [pid   481] wait4(-1, 0x7fff52f5ae8c, 0, NULL) = -1 ECHILD (No child processes)
>> > > > > > [pid   481] clone(child_stack=NULL,
>> > > > > > flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
>> > > > > > child_tidptr=0x7f3af0fa7a10) = 483
>> > > > > > strace: Process 483 attached
>> > > > > > [pid   481] wait4(-1,  <unfinished ...>
>> > > > > > [pid   483] unshare(CLONE_NEWUSER)      = -1 ENOSPC (No space left on device)
>> > > > >
>> > > > > This looks like another regression in the ucount code. Reverting the
>> > > > > following commit fixes it and makes the getxattr05 test work again:
>> > > > >
>> > > > > commit 0315b634f933b0f12cfa82660322f6186c1aa0f4
>> > > > > Author: Alexey Gladkov <legion@kernel.org>
>> > > > > Date:   Fri Dec 17 15:48:23 2021 +0100
>> > > > >
>> > > > >     ucounts: Split rlimit and ucount values and max values
>> > > > >
>> > > > >     Since the semantics of maximum rlimit values are different, it would be
>> > > > >     better not to mix ucount and rlimit values. This will prevent the error
>> > > > >     of using inc_count/dec_ucount for rlimit parameters.
>> > > > >
>> > > > >     This patch also renames the functions to emphasize the lack of
>> > > > >     connection between rlimit and ucount.
>> > > > >
>> > > > >     v2:
>> > > > >     - Fix the array-index-out-of-bounds that was found by the lkp project.
>> > > > >
>> > > > >     Reported-by: kernel test robot <oliver.sang@intel.com>
>> > > > >     Signed-off-by: Alexey Gladkov <legion@kernel.org>
>> > > > >     Link: https://lkml.kernel.org/r/73ea569042babda5cee2092423da85027ceb471f.1639752364.git.legion@kernel.org
>> > > > >     Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
>> > > > >
>> > > > > The issue only surfaces if /proc/sys/user/max_user_namespaces is
>> > > > > actually written to.
>> > > >
>> > > > I did a git bisect and that pointed me to this patch too.
>> > >
>> > > Uhm, doesn't this want to be:
>> >
>> > Yes. I miss it. I tried not to mix the logic, but I myself stepped on this
>> > problem.
>>
>> It should be fixed in the four places:
>>
>> diff --git a/kernel/ucount.c b/kernel/ucount.c
>> index 22070f004e97..5c373a453f43 100644
>> --- a/kernel/ucount.c
>> +++ b/kernel/ucount.c
>> @@ -264,7 +264,7 @@ long inc_rlimit_ucounts(struct ucounts *ucounts, enum rlimit_type type, long v)
>>         long ret = 0;
>>
>>         for (iter = ucounts; iter; iter = iter->ns->ucounts) {
>> -               long new = atomic_long_add_return(v, &iter->ucount[type]);
>> +               long new = atomic_long_add_return(v, &iter->rlimit[type]);
>>                 if (new < 0 || new > max)
>>                         ret = LONG_MAX;
>>                 else if (iter == ucounts)
>> @@ -279,7 +279,7 @@ bool dec_rlimit_ucounts(struct ucounts *ucounts, enum rlimit_type type, long v)
>>         struct ucounts *iter;
>>         long new = -1; /* Silence compiler warning */
>>         for (iter = ucounts; iter; iter = iter->ns->ucounts) {
>> -               long dec = atomic_long_sub_return(v, &iter->ucount[type]);
>> +               long dec = atomic_long_sub_return(v, &iter->rlimit[type]);
>>                 WARN_ON_ONCE(dec < 0);
>>                 if (iter == ucounts)
>>                         new = dec;
>> @@ -292,7 +292,7 @@ static void do_dec_rlimit_put_ucounts(struct ucounts *ucounts,
>>  {
>>         struct ucounts *iter, *next;
>>         for (iter = ucounts; iter != last; iter = next) {
>> -               long dec = atomic_long_sub_return(1, &iter->ucount[type]);
>> +               long dec = atomic_long_sub_return(1, &iter->rlimit[type]);
>>                 WARN_ON_ONCE(dec < 0);
>>                 next = iter->ns->ucounts;
>>                 if (dec == 0)
>> @@ -313,7 +313,7 @@ long inc_rlimit_get_ucounts(struct ucounts *ucounts, enum rlimit_type type)
>>         long dec, ret = 0;
>>
>>         for (iter = ucounts; iter; iter = iter->ns->ucounts) {
>> -               long new = atomic_long_add_return(1, &iter->ucount[type]);
>> +               long new = atomic_long_add_return(1, &iter->rlimit[type]);
>>                 if (new < 0 || new > max)
>>                         goto unwind;
>>                 if (iter == ucounts)
>> @@ -330,7 +330,7 @@ long inc_rlimit_get_ucounts(struct ucounts *ucounts, enum rlimit_type type)
>>         }
>>         return ret;
>>  dec_unwind:
>> -       dec = atomic_long_sub_return(1, &iter->ucount[type]);
>> +       dec = atomic_long_sub_return(1, &iter->rlimit[type]);
>>         WARN_ON_ONCE(dec < 0);
>>  unwind:
>>         do_dec_rlimit_put_ucounts(ucounts, iter, type);
>>
>
> Thank you for the fix.
> I applied this patch and built and ran it in qemu for arm64 and x86.
> './runltp -s getxattr05' passed on both architectures.
>
> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>

Thank you all.

For now I have dropped this from linux-next.  I will add the fix and
will aim to get this cleanup in the next merge window.

Eric

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-01-12 16:43 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-12 11:45 [next]: LTP: getxattr05.c:97: TFAIL: unshare(CLONE_NEWUSER) failed: ENOSPC (28) Naresh Kamboju
2022-01-12 13:18 ` Christian Brauner
2022-01-12 13:22   ` Anders Roxell
2022-01-12 14:02     ` Christian Brauner
2022-01-12 14:14       ` Alexey Gladkov
2022-01-12 14:28         ` Alexey Gladkov
2022-01-12 15:56           ` Anders Roxell
2022-01-12 16:42             ` Eric W. Biederman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).