linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2] exec: load_script: don't blindly truncate shebang string
@ 2018-11-12 16:09 Oleg Nesterov
  2018-11-12 16:09 ` [PATCH 2/2] exec: increase BINPRM_BUF_SIZE to 256 Oleg Nesterov
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Oleg Nesterov @ 2018-11-12 16:09 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ben Woodard, Eric W. Biederman, Kees Cook, Michal Hocko, linux-kernel

load_script() simply truncates bprm->buf and this is very wrong if the
length of shebang string exceeds BINPRM_BUF_SIZE-2. This can silently
truncate i_arg or (worse) we can execute the wrong binary if buf[2:126]
happens to be the valid executable path.

Change load_script() to return ENOEXEC if it can't find '\n' or zero in
bprm->buf. Note that '\0' can come from either prepare_binprm()->memset()
or from kernel_read(), we do not care.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 fs/binfmt_script.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/fs/binfmt_script.c b/fs/binfmt_script.c
index 7cde3f4..d0078cb 100644
--- a/fs/binfmt_script.c
+++ b/fs/binfmt_script.c
@@ -42,10 +42,14 @@ static int load_script(struct linux_binprm *bprm)
 	fput(bprm->file);
 	bprm->file = NULL;
 
-	bprm->buf[BINPRM_BUF_SIZE - 1] = '\0';
-	if ((cp = strchr(bprm->buf, '\n')) == NULL)
-		cp = bprm->buf+BINPRM_BUF_SIZE-1;
+	for (cp = bprm->buf+2;; cp++) {
+		if (cp >= bprm->buf + BINPRM_BUF_SIZE)
+			return -ENOEXEC;
+		if (!*cp || (*cp == '\n'))
+			break;
+	}
 	*cp = '\0';
+
 	while (cp > bprm->buf) {
 		cp--;
 		if ((*cp == ' ') || (*cp == '\t'))
-- 
2.5.0



^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 2/2] exec: increase BINPRM_BUF_SIZE to 256
  2018-11-12 16:09 [PATCH 1/2] exec: load_script: don't blindly truncate shebang string Oleg Nesterov
@ 2018-11-12 16:09 ` Oleg Nesterov
  2018-11-12 23:52   ` Andrew Morton
                     ` (3 more replies)
  2018-11-13 10:27 ` [PATCH 1/2] exec: load_script: don't blindly truncate shebang string Michal Hocko
  2018-11-13 20:16 ` Kees Cook
  2 siblings, 4 replies; 19+ messages in thread
From: Oleg Nesterov @ 2018-11-12 16:09 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ben Woodard, Eric W. Biederman, Kees Cook, Michal Hocko, linux-kernel

Large enterprise clients often times run applications out of networked
file systems where the IT mandated layout of project volumes can end up
leading to paths that are longer than 128 characters. Bumping this up to
the next order of two solves this problem in all but the most egregious
case while still fitting into a 512b slab.

Reported-by: Ben Woodard <woodard@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
 include/uapi/linux/binfmts.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/uapi/linux/binfmts.h b/include/uapi/linux/binfmts.h
index 4abad03..689025d 100644
--- a/include/uapi/linux/binfmts.h
+++ b/include/uapi/linux/binfmts.h
@@ -16,6 +16,6 @@ struct pt_regs;
 #define MAX_ARG_STRINGS 0x7FFFFFFF
 
 /* sizeof(linux_binprm->buf) */
-#define BINPRM_BUF_SIZE 128
+#define BINPRM_BUF_SIZE 256
 
 #endif /* _UAPI_LINUX_BINFMTS_H */
-- 
2.5.0



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/2] exec: increase BINPRM_BUF_SIZE to 256
  2018-11-12 16:09 ` [PATCH 2/2] exec: increase BINPRM_BUF_SIZE to 256 Oleg Nesterov
@ 2018-11-12 23:52   ` Andrew Morton
  2018-11-13  5:03     ` Kees Cook
  2018-11-13 16:55     ` Oleg Nesterov
  2018-11-13 10:29   ` Michal Hocko
                     ` (2 subsequent siblings)
  3 siblings, 2 replies; 19+ messages in thread
From: Andrew Morton @ 2018-11-12 23:52 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ben Woodard, Eric W. Biederman, Kees Cook, Michal Hocko, linux-kernel

On Mon, 12 Nov 2018 17:09:56 +0100 Oleg Nesterov <oleg@redhat.com> wrote:

> Large enterprise clients often times run applications out of networked
> file systems where the IT mandated layout of project volumes can end up
> leading to paths that are longer than 128 characters. Bumping this up to
> the next order of two solves this problem in all but the most egregious
> case while still fitting into a 512b slab.
> 
> ...
>
> --- a/include/uapi/linux/binfmts.h
> +++ b/include/uapi/linux/binfmts.h
> @@ -16,6 +16,6 @@ struct pt_regs;
>  #define MAX_ARG_STRINGS 0x7FFFFFFF
>  
>  /* sizeof(linux_binprm->buf) */
> -#define BINPRM_BUF_SIZE 128
> +#define BINPRM_BUF_SIZE 256
>  
>  #endif /* _UAPI_LINUX_BINFMTS_H */

It does seem a rather silly restriction, and it's tempting to suggest
reworking the code so that linux_binprm.buf is dynamically sized to
accommodate even ludicrously large strings.

But obviously 128 bytes has been enough for all this time, so that's
going too far.  However it would be basically cost-free to increase
BINPRM_BUF_SIZE up to the point where sizeof(struct linux_binprm) ==
PAGE_SIZE?


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/2] exec: increase BINPRM_BUF_SIZE to 256
  2018-11-12 23:52   ` Andrew Morton
@ 2018-11-13  5:03     ` Kees Cook
  2018-11-13 16:55     ` Oleg Nesterov
  1 sibling, 0 replies; 19+ messages in thread
From: Kees Cook @ 2018-11-13  5:03 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Oleg Nesterov, Ben Woodard, Eric W. Biederman, Michal Hocko, LKML

On Mon, Nov 12, 2018 at 5:52 PM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> On Mon, 12 Nov 2018 17:09:56 +0100 Oleg Nesterov <oleg@redhat.com> wrote:
>
>> Large enterprise clients often times run applications out of networked
>> file systems where the IT mandated layout of project volumes can end up
>> leading to paths that are longer than 128 characters. Bumping this up to
>> the next order of two solves this problem in all but the most egregious
>> case while still fitting into a 512b slab.
>>
>> ...
>>
>> --- a/include/uapi/linux/binfmts.h
>> +++ b/include/uapi/linux/binfmts.h
>> @@ -16,6 +16,6 @@ struct pt_regs;
>>  #define MAX_ARG_STRINGS 0x7FFFFFFF
>>
>>  /* sizeof(linux_binprm->buf) */
>> -#define BINPRM_BUF_SIZE 128
>> +#define BINPRM_BUF_SIZE 256

This comment needs updating too:

fs/exec.c: * Check permissions, then read the first 128 (BINPRM_BUF_SIZE) bytes

>>  #endif /* _UAPI_LINUX_BINFMTS_H */
>
> It does seem a rather silly restriction, and it's tempting to suggest
> reworking the code so that linux_binprm.buf is dynamically sized to
> accommodate even ludicrously large strings.
>
> But obviously 128 bytes has been enough for all this time, so that's
> going too far.  However it would be basically cost-free to increase
> BINPRM_BUF_SIZE up to the point where sizeof(struct linux_binprm) ==
> PAGE_SIZE?

Yeah, and this might be a useful detail included in a comment above
the #define...

Regardless:

Acked-by: Kees Cook <keescook@chromium.org>


-- 
Kees Cook

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/2] exec: load_script: don't blindly truncate shebang string
  2018-11-12 16:09 [PATCH 1/2] exec: load_script: don't blindly truncate shebang string Oleg Nesterov
  2018-11-12 16:09 ` [PATCH 2/2] exec: increase BINPRM_BUF_SIZE to 256 Oleg Nesterov
@ 2018-11-13 10:27 ` Michal Hocko
  2018-11-13 16:41   ` Oleg Nesterov
  2018-11-13 20:16 ` Kees Cook
  2 siblings, 1 reply; 19+ messages in thread
From: Michal Hocko @ 2018-11-13 10:27 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Ben Woodard, Eric W. Biederman, Kees Cook, linux-kernel

On Mon 12-11-18 17:09:31, Oleg Nesterov wrote:
> load_script() simply truncates bprm->buf and this is very wrong if the
> length of shebang string exceeds BINPRM_BUF_SIZE-2. This can silently
> truncate i_arg or (worse) we can execute the wrong binary if buf[2:126]
> happens to be the valid executable path.
> 
> Change load_script() to return ENOEXEC if it can't find '\n' or zero in
> bprm->buf. Note that '\0' can come from either prepare_binprm()->memset()
> or from kernel_read(), we do not care.
> 
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

A bit cryptic to my taste but it looks correct. I have tried to come up
with something more tasty but I am afraid it would be just a matter of
taste.

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  fs/binfmt_script.c | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/binfmt_script.c b/fs/binfmt_script.c
> index 7cde3f4..d0078cb 100644
> --- a/fs/binfmt_script.c
> +++ b/fs/binfmt_script.c
> @@ -42,10 +42,14 @@ static int load_script(struct linux_binprm *bprm)
>  	fput(bprm->file);
>  	bprm->file = NULL;
>  
> -	bprm->buf[BINPRM_BUF_SIZE - 1] = '\0';
> -	if ((cp = strchr(bprm->buf, '\n')) == NULL)
> -		cp = bprm->buf+BINPRM_BUF_SIZE-1;
> +	for (cp = bprm->buf+2;; cp++) {
> +		if (cp >= bprm->buf + BINPRM_BUF_SIZE)
> +			return -ENOEXEC;
> +		if (!*cp || (*cp == '\n'))
> +			break;
> +	}
>  	*cp = '\0';
> +
>  	while (cp > bprm->buf) {
>  		cp--;
>  		if ((*cp == ' ') || (*cp == '\t'))
> -- 
> 2.5.0
> 
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/2] exec: increase BINPRM_BUF_SIZE to 256
  2018-11-12 16:09 ` [PATCH 2/2] exec: increase BINPRM_BUF_SIZE to 256 Oleg Nesterov
  2018-11-12 23:52   ` Andrew Morton
@ 2018-11-13 10:29   ` Michal Hocko
  2018-11-16 17:49   ` Alan Cox
  2019-02-18 19:37   ` [PATCH 2/2] exec: increase BINPRM_BUF_SIZE " Guenter Roeck
  3 siblings, 0 replies; 19+ messages in thread
From: Michal Hocko @ 2018-11-13 10:29 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Ben Woodard, Eric W. Biederman, Kees Cook, linux-kernel

On Mon 12-11-18 17:09:56, Oleg Nesterov wrote:
> Large enterprise clients often times run applications out of networked
> file systems where the IT mandated layout of project volumes can end up
> leading to paths that are longer than 128 characters. Bumping this up to
> the next order of two solves this problem in all but the most egregious
> case while still fitting into a 512b slab.
> 
> Reported-by: Ben Woodard <woodard@redhat.com>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Acked-by: Michal Hocko <mhocko@suse.com>

Increasing it to a larger value wouldn't hurt but I wouldn't bind it to
the page size because the layout might change and result in higher order
request.

> ---
>  include/uapi/linux/binfmts.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/uapi/linux/binfmts.h b/include/uapi/linux/binfmts.h
> index 4abad03..689025d 100644
> --- a/include/uapi/linux/binfmts.h
> +++ b/include/uapi/linux/binfmts.h
> @@ -16,6 +16,6 @@ struct pt_regs;
>  #define MAX_ARG_STRINGS 0x7FFFFFFF
>  
>  /* sizeof(linux_binprm->buf) */
> -#define BINPRM_BUF_SIZE 128
> +#define BINPRM_BUF_SIZE 256
>  
>  #endif /* _UAPI_LINUX_BINFMTS_H */
> -- 
> 2.5.0
> 
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/2] exec: load_script: don't blindly truncate shebang string
  2018-11-13 10:27 ` [PATCH 1/2] exec: load_script: don't blindly truncate shebang string Michal Hocko
@ 2018-11-13 16:41   ` Oleg Nesterov
  0 siblings, 0 replies; 19+ messages in thread
From: Oleg Nesterov @ 2018-11-13 16:41 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, Ben Woodard, Eric W. Biederman, Kees Cook, linux-kernel

On 11/13, Michal Hocko wrote:
>
> A bit cryptic to my taste

Yeeees, because I didn't want to touch the code below. We need to cleanup
the whole "parse bprm->buf" code, not only this part.

> but it looks correct. I have tried to come up
> with something more tasty but I am afraid it would be just a matter of
> taste.
>
> Acked-by: Michal Hocko <mhocko@suse.com>

Thanks!

Oleg.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/2] exec: increase BINPRM_BUF_SIZE to 256
  2018-11-12 23:52   ` Andrew Morton
  2018-11-13  5:03     ` Kees Cook
@ 2018-11-13 16:55     ` Oleg Nesterov
  2018-11-13 20:43       ` Andrew Morton
  1 sibling, 1 reply; 19+ messages in thread
From: Oleg Nesterov @ 2018-11-13 16:55 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ben Woodard, Eric W. Biederman, Kees Cook, Michal Hocko, linux-kernel

On 11/12, Andrew Morton wrote:

> On Mon, 12 Nov 2018 17:09:56 +0100 Oleg Nesterov <oleg@redhat.com> wrote:
>
> >  /* sizeof(linux_binprm->buf) */
> > -#define BINPRM_BUF_SIZE 128
> > +#define BINPRM_BUF_SIZE 256
> >
> >  #endif /* _UAPI_LINUX_BINFMTS_H */
>
> It does seem a rather silly restriction, and it's tempting to suggest
> reworking the code so that linux_binprm.buf is dynamically sized to
> accommodate even ludicrously large strings.

I actually tried to do this ;)

Of course this is possible, but we need some limits anyway, we need to
read the file until we find '\n' or '\0' in kmalloc/vmalloc'ed buffer,
then split and copy the strings to bprm->vma in reverse order.

So I decided to make the trivial change for now and (hopefully) forget
about this problem.

> However it would be basically cost-free to increase
> BINPRM_BUF_SIZE up to the point where sizeof(struct linux_binprm) ==
> PAGE_SIZE?

I don't think we should take sizeof(struct linux_binprm) into account, the
new members can come at any time and we can never decrease BINPRM_BUF_SIZE.

Oleg.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 1/2] exec: load_script: don't blindly truncate shebang string
  2018-11-12 16:09 [PATCH 1/2] exec: load_script: don't blindly truncate shebang string Oleg Nesterov
  2018-11-12 16:09 ` [PATCH 2/2] exec: increase BINPRM_BUF_SIZE to 256 Oleg Nesterov
  2018-11-13 10:27 ` [PATCH 1/2] exec: load_script: don't blindly truncate shebang string Michal Hocko
@ 2018-11-13 20:16 ` Kees Cook
  2 siblings, 0 replies; 19+ messages in thread
From: Kees Cook @ 2018-11-13 20:16 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Ben Woodard, Eric W. Biederman, Michal Hocko, LKML

On Mon, Nov 12, 2018 at 10:09 AM, Oleg Nesterov <oleg@redhat.com> wrote:
> load_script() simply truncates bprm->buf and this is very wrong if the
> length of shebang string exceeds BINPRM_BUF_SIZE-2. This can silently
> truncate i_arg or (worse) we can execute the wrong binary if buf[2:126]
> happens to be the valid executable path.
>
> Change load_script() to return ENOEXEC if it can't find '\n' or zero in
> bprm->buf. Note that '\0' can come from either prepare_binprm()->memset()
> or from kernel_read(), we do not care.
>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Acked-by: Kees Cook <keescook@chromium.org>

-Kees

> ---
>  fs/binfmt_script.c | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/fs/binfmt_script.c b/fs/binfmt_script.c
> index 7cde3f4..d0078cb 100644
> --- a/fs/binfmt_script.c
> +++ b/fs/binfmt_script.c
> @@ -42,10 +42,14 @@ static int load_script(struct linux_binprm *bprm)
>         fput(bprm->file);
>         bprm->file = NULL;
>
> -       bprm->buf[BINPRM_BUF_SIZE - 1] = '\0';
> -       if ((cp = strchr(bprm->buf, '\n')) == NULL)
> -               cp = bprm->buf+BINPRM_BUF_SIZE-1;
> +       for (cp = bprm->buf+2;; cp++) {
> +               if (cp >= bprm->buf + BINPRM_BUF_SIZE)
> +                       return -ENOEXEC;
> +               if (!*cp || (*cp == '\n'))
> +                       break;
> +       }
>         *cp = '\0';
> +
>         while (cp > bprm->buf) {
>                 cp--;
>                 if ((*cp == ' ') || (*cp == '\t'))
> --
> 2.5.0
>
>



-- 
Kees Cook

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/2] exec: increase BINPRM_BUF_SIZE to 256
  2018-11-13 16:55     ` Oleg Nesterov
@ 2018-11-13 20:43       ` Andrew Morton
  2018-11-14 15:54         ` Oleg Nesterov
  0 siblings, 1 reply; 19+ messages in thread
From: Andrew Morton @ 2018-11-13 20:43 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ben Woodard, Eric W. Biederman, Kees Cook, Michal Hocko, linux-kernel

On Tue, 13 Nov 2018 17:55:58 +0100 Oleg Nesterov <oleg@redhat.com> wrote:

> > However it would be basically cost-free to increase
> > BINPRM_BUF_SIZE up to the point where sizeof(struct linux_binprm) ==
> > PAGE_SIZE?
> 
> I don't think we should take sizeof(struct linux_binprm) into account, the
> new members can come at any time and we can never decrease BINPRM_BUF_SIZE.

My main point is..  why not make BINPRM_BUF_SIZE a lot larger than 256?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/2] exec: increase BINPRM_BUF_SIZE to 256
  2018-11-13 20:43       ` Andrew Morton
@ 2018-11-14 15:54         ` Oleg Nesterov
  2018-11-14 16:01           ` Michal Hocko
  0 siblings, 1 reply; 19+ messages in thread
From: Oleg Nesterov @ 2018-11-14 15:54 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Ben Woodard, Eric W. Biederman, Kees Cook, Michal Hocko, linux-kernel

On 11/13, Andrew Morton wrote:
>
> On Tue, 13 Nov 2018 17:55:58 +0100 Oleg Nesterov <oleg@redhat.com> wrote:
>
> > > However it would be basically cost-free to increase
> > > BINPRM_BUF_SIZE up to the point where sizeof(struct linux_binprm) ==
> > > PAGE_SIZE?
> >
> > I don't think we should take sizeof(struct linux_binprm) into account, the
> > new members can come at any time and we can never decrease BINPRM_BUF_SIZE.
>
> My main point is..  why not make BINPRM_BUF_SIZE a lot larger than 256?

Of course we can make it larger. And of course 256 is just another silly/random
value. Currently it seems to work, but if we have another bug report we should
probably rework load_script() to use vmalloc()'ed buffer. Perhaps we should do
this right now and I am just too lazy.

Oleg.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/2] exec: increase BINPRM_BUF_SIZE to 256
  2018-11-14 15:54         ` Oleg Nesterov
@ 2018-11-14 16:01           ` Michal Hocko
  0 siblings, 0 replies; 19+ messages in thread
From: Michal Hocko @ 2018-11-14 16:01 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Ben Woodard, Eric W. Biederman, Kees Cook, linux-kernel

On Wed 14-11-18 16:54:14, Oleg Nesterov wrote:
> On 11/13, Andrew Morton wrote:
> >
> > On Tue, 13 Nov 2018 17:55:58 +0100 Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > > > However it would be basically cost-free to increase
> > > > BINPRM_BUF_SIZE up to the point where sizeof(struct linux_binprm) ==
> > > > PAGE_SIZE?
> > >
> > > I don't think we should take sizeof(struct linux_binprm) into account, the
> > > new members can come at any time and we can never decrease BINPRM_BUF_SIZE.
> >
> > My main point is..  why not make BINPRM_BUF_SIZE a lot larger than 256?
> 
> Of course we can make it larger. And of course 256 is just another silly/random
> value. Currently it seems to work, but if we have another bug report we should
> probably rework load_script() to use vmalloc()'ed buffer. Perhaps we should do
> this right now and I am just too lazy.

I would rather not over-engineer this after a first bug. Even 256 path
seems like a torturing to me ;)

We would have to have some limit anyway and arbitrary value might just
not work for somebody crazy enough. Making it a part of of rlimit sounds
like opening a cane of worms to me.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/2] exec: increase BINPRM_BUF_SIZE to 256
  2018-11-12 16:09 ` [PATCH 2/2] exec: increase BINPRM_BUF_SIZE to 256 Oleg Nesterov
  2018-11-12 23:52   ` Andrew Morton
  2018-11-13 10:29   ` Michal Hocko
@ 2018-11-16 17:49   ` Alan Cox
  2018-11-22 12:15     ` Oleg Nesterov
       [not found]     ` <20181121160753.GA32685@asgard.redhat.com>
  2019-02-18 19:37   ` [PATCH 2/2] exec: increase BINPRM_BUF_SIZE " Guenter Roeck
  3 siblings, 2 replies; 19+ messages in thread
From: Alan Cox @ 2018-11-16 17:49 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Ben Woodard, Eric W. Biederman, Kees Cook,
	Michal Hocko, linux-kernel, linux-man

On Mon, 12 Nov 2018 17:09:56 +0100
Oleg Nesterov <oleg@redhat.com> wrote:

> Large enterprise clients often times run applications out of networked
> file systems where the IT mandated layout of project volumes can end up
> leading to paths that are longer than 128 characters. Bumping this up to
> the next order of two solves this problem in all but the most egregious
> case while still fitting into a 512b slab.

You also need to update the execve manual page as it explicitly documents
the 128 byte limit.

Alan

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/2] exec: increase BINPRM_BUF_SIZE to 256
  2018-11-16 17:49   ` Alan Cox
@ 2018-11-22 12:15     ` Oleg Nesterov
       [not found]     ` <20181121160753.GA32685@asgard.redhat.com>
  1 sibling, 0 replies; 19+ messages in thread
From: Oleg Nesterov @ 2018-11-22 12:15 UTC (permalink / raw)
  To: Alan Cox
  Cc: Andrew Morton, Ben Woodard, Eric W. Biederman, Kees Cook,
	Michal Hocko, linux-kernel, linux-man

On 11/16, Alan Cox wrote:
>
> On Mon, 12 Nov 2018 17:09:56 +0100
> Oleg Nesterov <oleg@redhat.com> wrote:
>
> > Large enterprise clients often times run applications out of networked
> > file systems where the IT mandated layout of project volumes can end up
> > leading to paths that are longer than 128 characters. Bumping this up to
> > the next order of two solves this problem in all but the most egregious
> > case while still fitting into a 512b slab.
>
> You also need to update the execve manual page as it explicitly documents
> the 128 byte limit.

Thanks, Eugene sent the patch:

	[PATCH] execve.2: document an effect of BINPRM_BUF_SIZE increase to 256
	https://marc.info/?l=linux-man&m=154281645706657

Oleg.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] execve.2: document an effect of BINPRM_BUF_SIZE increase to 256
       [not found]       ` <CAKgNAkhFikJXeOx3W3yL3EUKa9ruXtAw93m4M=N+3Kg-bXbPDQ@mail.gmail.com>
@ 2018-11-22 17:32         ` Eugene Syromiatnikov
  0 siblings, 0 replies; 19+ messages in thread
From: Eugene Syromiatnikov @ 2018-11-22 17:32 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: linux-man, Andrew Morton, Ben Woodard, Eric W. Biederman,
	Kees Cook, Michal Hocko, linux-kernel, Oleg Nesterov, Alan Cox

On Thu, Nov 22, 2018 at 04:59:24PM +0100, Michael Kerrisk (man-pages) wrote:
> Hi Eugene,
> 
> On Wed, 21 Nov 2018 at 17:07, Eugene Syromiatnikov <esyr@redhat.com> wrote:
> >
> > Increase of BINPRM_BUF_SIZE to 256 increases the limit on the possible
> > interpreter line length for scripts to 255.
> 
> I don't see this in the kernel source tree? Could you point me at the
> right source file / git commit?

Sorry, it's a corresponding man page update for a proposed kernel
change[1][2], and I failed to Cc: all the related parties properly.

[1] https://lkml.org/lkml/2018/11/12/2330
[2] https://lkml.org/lkml/2018/11/12/2328

> Thanks,
> 
> Michael
> 
> > Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com>
> > ---
> >  man2/execve.2 | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/man2/execve.2 b/man2/execve.2
> > index d2bb861..f4e37d6 100644
> > --- a/man2/execve.2
> > +++ b/man2/execve.2
> > @@ -633,8 +633,9 @@ prototype:
> >  .in
> >  .\"
> >  .SS Interpreter scripts
> > -A maximum line length of 127 characters is allowed for the first line in
> > -an interpreter script.
> > +Before Linux 4.20, a maximum line length of 127 characters was allowed
> > +for the first line in an interpreter script.
> > +In Linux 4.20 onwards, this limit was increased up to 255 characters.
> >  .PP
> >  The semantics of the
> >  .I optional-arg
> > --
> > 2.1.4
> >
> 
> 
> -- 
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/2] exec: increase BINPRM_BUF_SIZE to 256
  2018-11-12 16:09 ` [PATCH 2/2] exec: increase BINPRM_BUF_SIZE to 256 Oleg Nesterov
                     ` (2 preceding siblings ...)
  2018-11-16 17:49   ` Alan Cox
@ 2019-02-18 19:37   ` Guenter Roeck
  2019-02-19 12:37     ` Oleg Nesterov
  3 siblings, 1 reply; 19+ messages in thread
From: Guenter Roeck @ 2019-02-18 19:37 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Ben Woodard, Eric W. Biederman, Kees Cook,
	Michal Hocko, linux-kernel

Hi,

On Mon, Nov 12, 2018 at 05:09:56PM +0100, Oleg Nesterov wrote:
> Large enterprise clients often times run applications out of networked
> file systems where the IT mandated layout of project volumes can end up
> leading to paths that are longer than 128 characters. Bumping this up to
> the next order of two solves this problem in all but the most egregious
> case while still fitting into a 512b slab.
> 
> Reported-by: Ben Woodard <woodard@redhat.com>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> Acked-by: Kees Cook <keescook@chromium.org>
> Acked-by: Michal Hocko <mhocko@suse.com>

Unfortunately, this patch causes one of my qemu emulations to crash.
The crash is not always seen, but at least with every other boot attempt.
Reverting the patch fixes the problem. Crash log and bisect results
are attached below.

For reference, here is the qemu command line. This is with qemu 3.1.

qemu-system-ppc -kernel vmlinux -M virtex-ml507 -m 256 -no-reboot \
	-initrd rootfs.cpio \
	-dtb arch/powerpc/boot/dts/virtex440-ml507.dtb \
	--append 'rdinit=/sbin/init panic=-1 mem=256M console=ttyS0' \
	-monitor none -nographic

Configuration is 44x/virtex5_defconfig with various debug options
enabled.

Guenter

---
bisect:

# bad: [cb916fc5eabf8832e05f73c246eb467259846ef0] Add linux-next specific files for 20190218
# good: [d13937116f1e82bf508a6325111b322c30c85eb9] Linux 5.0-rc6
git bisect start 'HEAD' 'v5.0-rc6'
# good: [8a076e150433e8623c844aef43b38889df81a503] Merge remote-tracking branch 'nand/nand/next'
git bisect good 8a076e150433e8623c844aef43b38889df81a503
# good: [3db519b877efbfa9d58e141a1049774ea56fc90b] Merge remote-tracking branch 'spi/for-next'
git bisect good 3db519b877efbfa9d58e141a1049774ea56fc90b
# good: [8033390e270f04ee300a6358dcbff633a0b99e2c] Merge remote-tracking branch 'staging/staging-next'
git bisect good 8033390e270f04ee300a6358dcbff633a0b99e2c
# good: [e8d89c35e4c67686ebe2471962e1bda2449fbad2] Merge remote-tracking branch 'kselftest/next'
git bisect good e8d89c35e4c67686ebe2471962e1bda2449fbad2
# good: [adf037fa027a2cfa841d358bf40e62fb124674f5] epoll: use rwlock in order to reduce ep_poll_callback() contention
git bisect good adf037fa027a2cfa841d358bf40e62fb124674f5
# good: [a69d64f0ccb8dc334debd1681d0f5f507f35cb83] Merge remote-tracking branch 'nvmem/for-next'
git bisect good a69d64f0ccb8dc334debd1681d0f5f507f35cb83
# bad: [521ffb22b5e3eefc5f4a44d540acf746d6233cde] Merge branch 'akpm-current/current'
git bisect bad 521ffb22b5e3eefc5f4a44d540acf746d6233cde
# good: [f5f8a43b28ac49c9182d5ed86724550bbaeff3e9] Merge remote-tracking branch 'kspp-gustavo/for-next/kspp'
git bisect good f5f8a43b28ac49c9182d5ed86724550bbaeff3e9
# bad: [e0fdf04adb0928a941210ba919507450e9eaf196] sysctl: handle overflow for file-max
git bisect bad e0fdf04adb0928a941210ba919507450e9eaf196
# good: [c427946e32843fa6f9e8a34ff8daf4076ccaebd6] ptrace: take into account saved_sigmask in PTRACE_{GET,SET}SIGMASK
git bisect good c427946e32843fa6f9e8a34ff8daf4076ccaebd6
# bad: [e42c7d9007a96ee56813a304ca5b6b38f3ba181c] kernel: workqueue: clarify wq_worker_last_func() caller requirements
git bisect bad e42c7d9007a96ee56813a304ca5b6b38f3ba181c
# good: [afcd760e9725f4846d10c243f29931d34faa660a] fs/exec.c: replace opencoded set_mask_bits()
git bisect good afcd760e9725f4846d10c243f29931d34faa660a
# bad: [6328cf069588d413c7cd56fa55243cd2a05a0d9d] exec-increase-binprm_buf_size-to-256-fix
git bisect bad 6328cf069588d413c7cd56fa55243cd2a05a0d9d
# bad: [2342f07a7a01eb302fc101664255c3a09dcfdfe9] exec: increase BINPRM_BUF_SIZE to 256
git bisect bad 2342f07a7a01eb302fc101664255c3a09dcfdfe9
# first bad commit: [2342f07a7a01eb302fc101664255c3a09dcfdfe9] exec: increase BINPRM_BUF_SIZE to 256

---
crash log:

...
[    8.240183] squashfs: version 4.0 (2009/01/31) Phillip Lougher
[    8.243094] romfs: ROMFS MTD (C) 2007 Red Hat, Inc.
[    8.264478] ------------[ cut here ]------------
[    8.264517] DEBUG_LOCKS_WARN_ON(!current->hardirqs_enabled)
[    8.264794] WARNING: CPU: 0 PID: 354 at kernel/locking/lockdep.c:3754 check_flags.part.23+0x1cc/0x1f8
[    8.264832] Modules linked in:
[    8.264980] CPU: 0 PID: 354 Comm: modprobe Not tainted 5.0.0-rc6-next-20190218+ #2
[    8.265030] NIP:  c00771f4 LR: c00771f4 CTR: 00000000
[    8.265079] REGS: cfb23e00 TRAP: 0700   Not tainted  (5.0.0-rc6-next-20190218+)
[    8.265093] MSR:  00021000 <CE,ME>  CR: 28000222  XER: 20000000
[    8.265167] 
[    8.265167] GPR00: c00771f4 cfb23eb0 cf92a220 0000002f 00000001 c008b5e8 c118b816 0000002f 
[    8.265167] GPR08: 00021000 00000000 00000001 00000004 28000428 00000000 00000000 00000000 
[    8.265167] GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 bffd89b8 b7a2a688 
[    8.265167] GPR24: 00000000 00000001 00029000 ffffffff b79eb000 c0a1b1b4 cf92a220 c0a90000 
[    8.265271] NIP [c00771f4] check_flags.part.23+0x1cc/0x1f8
[    8.265289] LR [c00771f4] check_flags.part.23+0x1cc/0x1f8
[    8.265321] Call Trace:
[    8.265362] [cfb23eb0] [c00771f4] check_flags.part.23+0x1cc/0x1f8 (unreliable)
[    8.265472] [cfb23ec0] [c0079e18] lock_is_held_type+0x164/0x1a8
[    8.265521] [cfb23ee0] [c07aad64] __schedule+0x49c/0x870
[    8.265537] [cfb23f30] [c07ab16c] schedule+0x34/0x94
[    8.265551] [cfb23f40] [c000f95c] recheck+0x0/0x24
[    8.265633] --- interrupt: 901 at 0xb7a0281c
[    8.265633]     LR = 0xb7a2b854
[    8.265667] Instruction dump:
[    8.265783] 3c80c091 3c60c090 3884b100 38635d6c 4bfba3cd 0fe00000 4bffff6c 3c80c091 
[    8.265811] 3c60c090 3884b144 38635d6c 4bfba3b1 <0fe00000> 4bfffec8 3c80c091 3c60c090 
[    8.265864] irq event stamp: 354
[    8.265884] hardirqs last  enabled at (352): [<c00a472c>] rcu_core+0x6d8/0x738
[    8.265910] hardirqs last disabled at (353): [<c07b1d68>] __do_softirq+0x2f8/0x518
[    8.265928] softirqs last  enabled at (354): [<c07b1dd8>] __do_softirq+0x368/0x518
[    8.265949] softirqs last disabled at (327): [<c003717c>] irq_exit+0x128/0x134
[    8.265990] ---[ end trace 13c9ff51d41b9d3f ]---
[    8.266074] possible reason: unannotated irqs-on.
[    8.266091] irq event stamp: 354
[    8.266104] hardirqs last  enabled at (352): [<c00a472c>] rcu_core+0x6d8/0x738
[    8.266117] hardirqs last disabled at (353): [<c07b1d68>] __do_softirq+0x2f8/0x518
[    8.266129] softirqs last  enabled at (354): [<c07b1dd8>] __do_softirq+0x368/0x518
[    8.266142] softirqs last disabled at (327): [<c003717c>] irq_exit+0x128/0x134
[    8.376997] io scheduler mq-deadline registered
[    8.377091] io scheduler kyber registered
[    9.499525] String selftests succeeded
[    9.500387] test_firmware: interface ready
[    9.501040] test passed
[    9.502388] test_bitmap: test 13: input is '0-2047:128/256' OK, Time: 480
[    9.512115] test_bitmap: all 1524 tests passed
[    9.512492] test_uuid: all 18 tests passed
[    9.513657] crc32: CRC_LE_BITS = 64, CRC_BE BITS = 64
[    9.513680] crc32: self tests passed, processed 225944 bytes in 492400 nsec
[    9.514233] crc32c: CRC_LE_BITS = 64
[    9.514257] crc32c: self tests passed, processed 225944 bytes in 257883 nsec
[    9.562213] crc32_combine: 8373 self tests passed
[    9.608528] crc32c_combine: 8373 self tests passed
[    9.608847] glob: 64 self-tests passed, 0 failed
[    9.609064] rbtree testing
[    9.626316]  -> test 1 (latency of nnodes insert+delete): 6875 cycles
[    9.646472]  -> test 2 (latency of nnodes cached insert+delete): 8050 cycles
[    9.649627]  -> test 3 (latency of inorder traversal): 1241 cycles
[    9.649711]  -> test 4 (latency to fetch first node)
[    9.649730]         non-cached: 25 cycles
[    9.649748]         cached: 0 cycles
[    9.800745] augmented rbtree testing
[    9.836800]  -> test 1 (latency of nnodes insert+delete): 14408 cycles
[    9.873724]  -> test 2 (latency of nnodes cached insert+delete): 14749 cycles
[   10.097628] interval tree insert/remove
[   10.124548]  -> 10735 cycles
[   10.124572] interval tree search
[   10.345597]  -> 88388 cycles (2692 results)
[   10.570802] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[   10.579938] printk: console [ttyS0] disabled
[   10.581102] 83e00000.serial: ttyS0 at MMIO 0x83e01003 (irq = 20, base_baud = 6250000) is a 16550
[   10.615302] printk: console [ttyS0] enabled
[   10.658149] brd: module loaded
[   10.679108] loop: module loaded
[   10.681671] BUG: Kernel NULL pointer dereference at 0x00000040
[   10.681826] Faulting instruction address: 0xc0431480
[   10.682072] Oops: Kernel access of bad area, sig: 11 [#1]
[   10.682251] BE PAGE_SIZE=4K PREEMPT Xilinx Virtex440
[   10.682387] Modules linked in:
[   10.682528] CPU: 0 PID: 1 Comm: swapper Tainted: G        W         5.0.0-rc6-next-20190218+ #2
[   10.682733] NIP:  c0431480 LR: c043147c CTR: c0422ad8
[   10.682863] REGS: cf82fbe0 TRAP: 0300   Tainted: G        W          (5.0.0-rc6-next-20190218+)
[   10.683065] MSR:  00029000 <CE,EE,ME>  CR: 22000222  XER: 00000000
[   10.683236] DEAR: 00000040 ESR: 00000000 
[   10.683236] GPR00: c043147c cf82fc90 cf82ccc0 00000000 00000000 00000000 00000002 00000000 
[   10.683236] GPR08: 00000000 00000000 c04310bc 00000000 22000222 00000000 c0002c54 00000000 
[   10.683236] GPR16: 00000000 00000001 c09aa39c c09021b0 c09021dc 00000007 c0a68c08 00000000 
[   10.683236] GPR24: 00000001 ced6d400 ced6dcf0 c0815d9c 00000000 00000000 00000000 cedf0800 
[   10.684331] NIP [c0431480] blk_mq_run_hw_queue+0x28/0x114
[   10.684473] LR [c043147c] blk_mq_run_hw_queue+0x24/0x114
[   10.684602] Call Trace:
[   10.684671] [cf82fc90] [c043147c] blk_mq_run_hw_queue+0x24/0x114 (unreliable)
[   10.684854] [cf82fcc0] [c04315bc] blk_mq_run_hw_queues+0x50/0x7c
[   10.685002] [cf82fce0] [c0422b24] blk_set_queue_dying+0x30/0x68
[   10.685154] [cf82fcf0] [c0423ec0] blk_cleanup_queue+0x34/0x14c
[   10.685306] [cf82fd10] [c054d73c] ace_probe+0x3dc/0x508
[   10.685445] [cf82fd50] [c052d740] platform_drv_probe+0x4c/0xb8
[   10.685592] [cf82fd70] [c052abb0] really_probe+0x20c/0x32c
[   10.685728] [cf82fda0] [c052ae58] driver_probe_device+0x68/0x464
[   10.685877] [cf82fdc0] [c052b500] device_driver_attach+0xb4/0xe4
[   10.686024] [cf82fde0] [c052b5dc] __driver_attach+0xac/0xfc
[   10.686161] [cf82fe00] [c0528428] bus_for_each_dev+0x80/0xc0
[   10.686314] [cf82fe30] [c0529b3c] bus_add_driver+0x144/0x234
[   10.686457] [cf82fe50] [c052c46c] driver_register+0x88/0x15c
[   10.686610] [cf82fe60] [c09de288] ace_init+0x4c/0xac
[   10.686742] [cf82fe80] [c0002730] do_one_initcall+0xac/0x330
[   10.686888] [cf82fee0] [c09aafd0] kernel_init_freeable+0x34c/0x478
[   10.687043] [cf82ff30] [c0002c6c] kernel_init+0x18/0x114
[   10.687188] [cf82ff40] [c000f2f0] ret_from_kernel_thread+0x14/0x1c
[   10.687349] Instruction dump:
[   10.687435] 3863ffd4 4bfffd70 9421ffd0 7c0802a6 93c10028 7c9e2378 93e1002c 38810008 
[   10.687637] 7c7f1b78 90010034 4bfffc25 813f008c <81290040> 75290100 4182002c 80810008 
[   10.688056] ---[ end trace 13c9ff51d41b9d40 ]---
[   10.688230] 
[   11.688548] BUG: sleeping function called from invalid context at ./include/linux/cgroup-defs.h:726
[   11.688768] in_atomic(): 0, irqs_disabled(): 1, pid: 1, name: swapper
[   11.688913] INFO: lockdep is turned off.
[   11.689002] irq event stamp: 110526
[   11.689092] hardirqs last  enabled at (110525): [<c07b0fc4>] _raw_spin_unlock_irq+0x30/0x80
[   11.689277] hardirqs last disabled at (110526): [<c07aa964>] __schedule+0x9c/0x870
[   11.689460] softirqs last  enabled at (110390): [<c07b1dd8>] __do_softirq+0x368/0x518
[   11.689634] softirqs last disabled at (110361): [<c003717c>] irq_exit+0x128/0x134
[   11.689834] Preemption disabled at:
[   11.689848] [<c008b490>] wake_up_klogd+0x14/0x94
[   11.690084] CPU: 0 PID: 1 Comm: swapper Tainted: G      D W         5.0.0-rc6-next-20190218+ #2
[   11.690307] Call Trace:
[   11.690388] [cf82fb10] [c0060c38] ___might_sleep+0x210/0x284 (unreliable)
[   11.690560] [cf82fb30] [c0045198] exit_signals+0x28/0x140
[   11.690699] [cf82fb50] [c0033ff0] do_exit+0xb8/0xb20
[   11.690823] [cf82fba0] [c000b59c] die+0x214/0x274
[   11.690940] [cf82fbd0] [c000f44c] handle_page_fault+0x34/0x38
[   11.691085] --- interrupt: 300 at blk_mq_run_hw_queue+0x28/0x114
[   11.691085]     LR = blk_mq_run_hw_queue+0x24/0x114
[   11.691338] [cf82fcc0] [c04315bc] blk_mq_run_hw_queues+0x50/0x7c
[   11.691491] [cf82fce0] [c0422b24] blk_set_queue_dying+0x30/0x68
[   11.691641] [cf82fcf0] [c0423ec0] blk_cleanup_queue+0x34/0x14c
[   11.691794] [cf82fd10] [c054d73c] ace_probe+0x3dc/0x508
[   11.691999] [cf82fd50] [c052d740] platform_drv_probe+0x4c/0xb8
[   11.692175] [cf82fd70] [c052abb0] really_probe+0x20c/0x32c
[   11.692345] [cf82fda0] [c052ae58] driver_probe_device+0x68/0x464
[   11.692540] [cf82fdc0] [c052b500] device_driver_attach+0xb4/0xe4
[   11.692721] [cf82fde0] [c052b5dc] __driver_attach+0xac/0xfc
[   11.692890] [cf82fe00] [c0528428] bus_for_each_dev+0x80/0xc0
[   11.693058] [cf82fe30] [c0529b3c] bus_add_driver+0x144/0x234
[   11.693221] [cf82fe50] [c052c46c] driver_register+0x88/0x15c
[   11.693386] [cf82fe60] [c09de288] ace_init+0x4c/0xac
[   11.693531] [cf82fe80] [c0002730] do_one_initcall+0xac/0x330
[   11.693703] [cf82fee0] [c09aafd0] kernel_init_freeable+0x34c/0x478
[   11.693888] [cf82ff30] [c0002c6c] kernel_init+0x18/0x114
[   11.694043] [cf82ff40] [c000f2f0] ret_from_kernel_thread+0x14/0x1c
[   11.724430] BUG: sleeping function called from invalid context at kernel/nsproxy.c:221
[   11.724657] in_atomic(): 0, irqs_disabled(): 0, pid: 1, name: swapper
[   11.724821] INFO: lockdep is turned off.
[   11.724932] Preemption disabled at:
[   11.724977] [<00000000>]   (null)
[   11.725184] CPU: 0 PID: 1 Comm: swapper Tainted: G      D W         5.0.0-rc6-next-20190218+ #2
[   11.725430] Call Trace:
[   11.725521] [cf82fb10] [c0060c38] ___might_sleep+0x210/0x284 (unreliable)
[   11.725729] [cf82fb30] [c005a0cc] switch_task_namespaces+0x3c/0xb8
[   11.725905] [cf82fb50] [c0034264] do_exit+0x32c/0xb20
[   11.726049] [cf82fba0] [c000b59c] die+0x214/0x274
[   11.726195] [cf82fbd0] [c000f44c] handle_page_fault+0x34/0x38
[   11.726389] --- interrupt: 300 at blk_mq_run_hw_queue+0x28/0x114
[   11.726389]     LR = blk_mq_run_hw_queue+0x24/0x114
[   11.726730] [cf82fcc0] [c04315bc] blk_mq_run_hw_queues+0x50/0x7c
[   11.726907] [cf82fce0] [c0422b24] blk_set_queue_dying+0x30/0x68
[   11.727075] [cf82fcf0] [c0423ec0] blk_cleanup_queue+0x34/0x14c
[   11.727243] [cf82fd10] [c054d73c] ace_probe+0x3dc/0x508
[   11.727391] [cf82fd50] [c052d740] platform_drv_probe+0x4c/0xb8
[   11.727552] [cf82fd70] [c052abb0] really_probe+0x20c/0x32c
[   11.727702] [cf82fda0] [c052ae58] driver_probe_device+0x68/0x464
[   11.727910] [cf82fdc0] [c052b500] device_driver_attach+0xb4/0xe4
[   11.728095] [cf82fde0] [c052b5dc] __driver_attach+0xac/0xfc
[   11.728322] [cf82fe00] [c0528428] bus_for_each_dev+0x80/0xc0
[   11.728486] [cf82fe30] [c0529b3c] bus_add_driver+0x144/0x234
[   11.728639] [cf82fe50] [c052c46c] driver_register+0x88/0x15c
[   11.728806] [cf82fe60] [c09de288] ace_init+0x4c/0xac
[   11.728954] [cf82fe80] [c0002730] do_one_initcall+0xac/0x330
[   11.729135] [cf82fee0] [c09aafd0] kernel_init_freeable+0x34c/0x478
[   11.729319] [cf82ff30] [c0002c6c] kernel_init+0x18/0x114
[   11.729484] [cf82ff40] [c000f2f0] ret_from_kernel_thread+0x14/0x1c
[   11.729774] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
Reset PowerPC system

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/2] exec: increase BINPRM_BUF_SIZE to 256
  2019-02-18 19:37   ` [PATCH 2/2] exec: increase BINPRM_BUF_SIZE " Guenter Roeck
@ 2019-02-19 12:37     ` Oleg Nesterov
  2019-02-19 16:26       ` Guenter Roeck
  0 siblings, 1 reply; 19+ messages in thread
From: Oleg Nesterov @ 2019-02-19 12:37 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Andrew Morton, Ben Woodard, Eric W. Biederman, Kees Cook,
	Michal Hocko, linux-kernel

On 02/18, Guenter Roeck wrote:
>
> Unfortunately, this patch causes one of my qemu emulations to crash.
> The crash is not always seen, but at least with every other boot attempt.

Hmm. I can't imagine how this change can cause the null-ptr-deref in
blk_mq_run_hw_queue().

> Reverting the patch fixes the problem. Crash log and bisect results
> are attached below.

Do you mean that you applied the "revert" patch on top of linux-next ?

Or did you rely on git-bisect ?

> [   10.681671] BUG: Kernel NULL pointer dereference at 0x00000040
> [   10.681826] Faulting instruction address: 0xc0431480
> [   10.682072] Oops: Kernel access of bad area, sig: 11 [#1]
> [   10.682251] BE PAGE_SIZE=4K PREEMPT Xilinx Virtex440
> [   10.682387] Modules linked in:
> [   10.682528] CPU: 0 PID: 1 Comm: swapper Tainted: G        W         5.0.0-rc6-next-20190218+ #2
> [   10.682733] NIP:  c0431480 LR: c043147c CTR: c0422ad8
> [   10.682863] REGS: cf82fbe0 TRAP: 0300   Tainted: G        W          (5.0.0-rc6-next-20190218+)
> [   10.683065] MSR:  00029000 <CE,EE,ME>  CR: 22000222  XER: 00000000
> [   10.683236] DEAR: 00000040 ESR: 00000000 
> [   10.683236] GPR00: c043147c cf82fc90 cf82ccc0 00000000 00000000 00000000 00000002 00000000 
> [   10.683236] GPR08: 00000000 00000000 c04310bc 00000000 22000222 00000000 c0002c54 00000000 
> [   10.683236] GPR16: 00000000 00000001 c09aa39c c09021b0 c09021dc 00000007 c0a68c08 00000000 
> [   10.683236] GPR24: 00000001 ced6d400 ced6dcf0 c0815d9c 00000000 00000000 00000000 cedf0800 
> [   10.684331] NIP [c0431480] blk_mq_run_hw_queue+0x28/0x114
> [   10.684473] LR [c043147c] blk_mq_run_hw_queue+0x24/0x114
> [   10.684602] Call Trace:
> [   10.684671] [cf82fc90] [c043147c] blk_mq_run_hw_queue+0x24/0x114 (unreliable)
> [   10.684854] [cf82fcc0] [c04315bc] blk_mq_run_hw_queues+0x50/0x7c
> [   10.685002] [cf82fce0] [c0422b24] blk_set_queue_dying+0x30/0x68
> [   10.685154] [cf82fcf0] [c0423ec0] blk_cleanup_queue+0x34/0x14c
> [   10.685306] [cf82fd10] [c054d73c] ace_probe+0x3dc/0x508
> [   10.685445] [cf82fd50] [c052d740] platform_drv_probe+0x4c/0xb8
> [   10.685592] [cf82fd70] [c052abb0] really_probe+0x20c/0x32c
> [   10.685728] [cf82fda0] [c052ae58] driver_probe_device+0x68/0x464
> [   10.685877] [cf82fdc0] [c052b500] device_driver_attach+0xb4/0xe4
> [   10.686024] [cf82fde0] [c052b5dc] __driver_attach+0xac/0xfc
> [   10.686161] [cf82fe00] [c0528428] bus_for_each_dev+0x80/0xc0
> [   10.686314] [cf82fe30] [c0529b3c] bus_add_driver+0x144/0x234
> [   10.686457] [cf82fe50] [c052c46c] driver_register+0x88/0x15c
> [   10.686610] [cf82fe60] [c09de288] ace_init+0x4c/0xac
> [   10.686742] [cf82fe80] [c0002730] do_one_initcall+0xac/0x330
> [   10.686888] [cf82fee0] [c09aafd0] kernel_init_freeable+0x34c/0x478
> [   10.687043] [cf82ff30] [c0002c6c] kernel_init+0x18/0x114

looks unrelated...

Oleg.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/2] exec: increase BINPRM_BUF_SIZE to 256
  2019-02-19 12:37     ` Oleg Nesterov
@ 2019-02-19 16:26       ` Guenter Roeck
  2019-02-19 17:36         ` Oleg Nesterov
  0 siblings, 1 reply; 19+ messages in thread
From: Guenter Roeck @ 2019-02-19 16:26 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Ben Woodard, Eric W. Biederman, Kees Cook,
	Michal Hocko, linux-kernel

On Tue, Feb 19, 2019 at 01:37:57PM +0100, Oleg Nesterov wrote:
> On 02/18, Guenter Roeck wrote:
> >
> > Unfortunately, this patch causes one of my qemu emulations to crash.
> > The crash is not always seen, but at least with every other boot attempt.
> 
> Hmm. I can't imagine how this change can cause the null-ptr-deref in
> blk_mq_run_hw_queue().
> 
Me not either.

> > Reverting the patch fixes the problem. Crash log and bisect results
> > are attached below.
> 
> Do you mean that you applied the "revert" patch on top of linux-next ?
> 
I reverted to patch on top of linux-next (next-20190218, more specifically).
The problem was gone. I then reverted the revert and the probllem was back.

> Or did you rely on git-bisect ?
> 
Sorry, I don't understand the question. git bisect, unless I am missing
something, doesn't revert any patches.

> > [   10.681671] BUG: Kernel NULL pointer dereference at 0x00000040
> > [   10.681826] Faulting instruction address: 0xc0431480
> > [   10.682072] Oops: Kernel access of bad area, sig: 11 [#1]
> > [   10.682251] BE PAGE_SIZE=4K PREEMPT Xilinx Virtex440
> > [   10.682387] Modules linked in:
> > [   10.682528] CPU: 0 PID: 1 Comm: swapper Tainted: G        W         5.0.0-rc6-next-20190218+ #2
> > [   10.682733] NIP:  c0431480 LR: c043147c CTR: c0422ad8
> > [   10.682863] REGS: cf82fbe0 TRAP: 0300   Tainted: G        W          (5.0.0-rc6-next-20190218+)
> > [   10.683065] MSR:  00029000 <CE,EE,ME>  CR: 22000222  XER: 00000000
> > [   10.683236] DEAR: 00000040 ESR: 00000000 
> > [   10.683236] GPR00: c043147c cf82fc90 cf82ccc0 00000000 00000000 00000000 00000002 00000000 
> > [   10.683236] GPR08: 00000000 00000000 c04310bc 00000000 22000222 00000000 c0002c54 00000000 
> > [   10.683236] GPR16: 00000000 00000001 c09aa39c c09021b0 c09021dc 00000007 c0a68c08 00000000 
> > [   10.683236] GPR24: 00000001 ced6d400 ced6dcf0 c0815d9c 00000000 00000000 00000000 cedf0800 
> > [   10.684331] NIP [c0431480] blk_mq_run_hw_queue+0x28/0x114
> > [   10.684473] LR [c043147c] blk_mq_run_hw_queue+0x24/0x114
> > [   10.684602] Call Trace:
> > [   10.684671] [cf82fc90] [c043147c] blk_mq_run_hw_queue+0x24/0x114 (unreliable)
> > [   10.684854] [cf82fcc0] [c04315bc] blk_mq_run_hw_queues+0x50/0x7c
> > [   10.685002] [cf82fce0] [c0422b24] blk_set_queue_dying+0x30/0x68
> > [   10.685154] [cf82fcf0] [c0423ec0] blk_cleanup_queue+0x34/0x14c
> > [   10.685306] [cf82fd10] [c054d73c] ace_probe+0x3dc/0x508
> > [   10.685445] [cf82fd50] [c052d740] platform_drv_probe+0x4c/0xb8
> > [   10.685592] [cf82fd70] [c052abb0] really_probe+0x20c/0x32c
> > [   10.685728] [cf82fda0] [c052ae58] driver_probe_device+0x68/0x464
> > [   10.685877] [cf82fdc0] [c052b500] device_driver_attach+0xb4/0xe4
> > [   10.686024] [cf82fde0] [c052b5dc] __driver_attach+0xac/0xfc
> > [   10.686161] [cf82fe00] [c0528428] bus_for_each_dev+0x80/0xc0
> > [   10.686314] [cf82fe30] [c0529b3c] bus_add_driver+0x144/0x234
> > [   10.686457] [cf82fe50] [c052c46c] driver_register+0x88/0x15c
> > [   10.686610] [cf82fe60] [c09de288] ace_init+0x4c/0xac
> > [   10.686742] [cf82fe80] [c0002730] do_one_initcall+0xac/0x330
> > [   10.686888] [cf82fee0] [c09aafd0] kernel_init_freeable+0x34c/0x478
> > [   10.687043] [cf82ff30] [c0002c6c] kernel_init+0x18/0x114
> 
> looks unrelated...
> 

Indeed...

The underlying problem is in the error handling code of ace_setup(),
which calls put_disk() followed by blk_cleanup_queue(). put_disk()
calls disk_release(), which calls blk_put_queue(), which in turn
results in a call to blk_mq_hw_sysfs_release().

Added debug code, with your patch reverted, shows:

 ######### blk_mq_hw_sysfs_release hctx=cee4a800
 ...
 ######### blk_mq_run_hw_queue hctx=cee4a800

blk_mq_hw_sysfs_release() calls kfree(htcx), so accessing it later is most
definitely not a good idea.

No idea why this only causes problems with your patch applied.

I'll send a patch to fix the underlying problem.

Thanks,
Guenter

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/2] exec: increase BINPRM_BUF_SIZE to 256
  2019-02-19 16:26       ` Guenter Roeck
@ 2019-02-19 17:36         ` Oleg Nesterov
  0 siblings, 0 replies; 19+ messages in thread
From: Oleg Nesterov @ 2019-02-19 17:36 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Andrew Morton, Ben Woodard, Eric W. Biederman, Kees Cook,
	Michal Hocko, linux-kernel

On 02/19, Guenter Roeck wrote:
>
> On Tue, Feb 19, 2019 at 01:37:57PM +0100, Oleg Nesterov wrote:
> >
> > looks unrelated...
> >
>
> Indeed...
>
> The underlying problem is in the error handling code of ace_setup(),
> which calls put_disk() followed by blk_cleanup_queue(). put_disk()
> calls disk_release(), which calls blk_put_queue(), which in turn
> results in a call to blk_mq_hw_sysfs_release().
>
> Added debug code, with your patch reverted, shows:
>
>  ######### blk_mq_hw_sysfs_release hctx=cee4a800
>  ...
>  ######### blk_mq_run_hw_queue hctx=cee4a800
>
> blk_mq_hw_sysfs_release() calls kfree(htcx), so accessing it later is most
> definitely not a good idea.

Thanks!

> No idea why this only causes problems with your patch applied.

Well... blk_put_queue() may trigger kobject_uevent() which does call_usermodehelper.

So if one of the used-after-free datastructures was already re-allocated as
linux_binprm, then with my patch it can look "more corrupted"...

But honestly, I too have no idea.

Thanks Guenter.

Oleg.


^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2019-02-19 17:37 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-12 16:09 [PATCH 1/2] exec: load_script: don't blindly truncate shebang string Oleg Nesterov
2018-11-12 16:09 ` [PATCH 2/2] exec: increase BINPRM_BUF_SIZE to 256 Oleg Nesterov
2018-11-12 23:52   ` Andrew Morton
2018-11-13  5:03     ` Kees Cook
2018-11-13 16:55     ` Oleg Nesterov
2018-11-13 20:43       ` Andrew Morton
2018-11-14 15:54         ` Oleg Nesterov
2018-11-14 16:01           ` Michal Hocko
2018-11-13 10:29   ` Michal Hocko
2018-11-16 17:49   ` Alan Cox
2018-11-22 12:15     ` Oleg Nesterov
     [not found]     ` <20181121160753.GA32685@asgard.redhat.com>
     [not found]       ` <CAKgNAkhFikJXeOx3W3yL3EUKa9ruXtAw93m4M=N+3Kg-bXbPDQ@mail.gmail.com>
2018-11-22 17:32         ` [PATCH] execve.2: document an effect of BINPRM_BUF_SIZE increase " Eugene Syromiatnikov
2019-02-18 19:37   ` [PATCH 2/2] exec: increase BINPRM_BUF_SIZE " Guenter Roeck
2019-02-19 12:37     ` Oleg Nesterov
2019-02-19 16:26       ` Guenter Roeck
2019-02-19 17:36         ` Oleg Nesterov
2018-11-13 10:27 ` [PATCH 1/2] exec: load_script: don't blindly truncate shebang string Michal Hocko
2018-11-13 16:41   ` Oleg Nesterov
2018-11-13 20:16 ` Kees Cook

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).