linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH] iov_iter: separate direction from flavour
@ 2021-07-04 17:29 Guenter Roeck
  2021-07-04 18:31 ` Linus Torvalds
  0 siblings, 1 reply; 12+ messages in thread
From: Guenter Roeck @ 2021-07-04 17:29 UTC (permalink / raw)
  To: Al Viro
  Cc: Linus Torvalds, linux-fsdevel, linux-kernel, David Sterba,
	Miklos Szeredi, Anton Altaparmakov, David Howells,
	Matthew Wilcox, Pavel Begunkov

Hi,

On Thu, Apr 22, 2021 at 02:50:39PM -0400, Al Viro wrote:
> Instead of having them mixed in iter->type, use separate ->iter_type
> and ->data_source (u8 and bool resp.)  And don't bother with (pseudo-)
> bitmap for the former - microoptimizations from being able to check
> if the flavour is one of two values are not worth the confusion for
> optimizer.  It can't prove that we never get e.g. ITER_IOVEC | ITER_PIPE,
> so we end up with extra headache.
> 
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

This patch results in the following runtime warning on nommu systems.

[    8.567154] Run /init as init process
[    8.572112] ------------[ cut here ]------------
[    8.572248] WARNING: CPU: 0 PID: 1 at lib/iov_iter.c:468 iov_iter_init+0x35/0x58
[    8.572484] CPU: 0 PID: 1 Comm: init Not tainted 5.13.0-09606-g303392fd5c16 #1
[    8.572695] Hardware name: MPS2 (Device Tree Support)
[    8.573278] [<2100ae75>] (unwind_backtrace) from [<2100a2bb>] (show_stack+0xb/0xc)
[    8.573594] [<2100a2bb>] (show_stack) from [<2100da03>] (__warn+0x5f/0x80)
[    8.573738] [<2100da03>] (__warn) from [<2100da55>] (warn_slowpath_fmt+0x31/0x60)
[    8.573886] [<2100da55>] (warn_slowpath_fmt) from [<210d8e1d>] (iov_iter_init+0x35/0x58)
[    8.574044] [<210d8e1d>] (iov_iter_init) from [<21059cab>] (vfs_read+0x89/0xc6)
[    8.574191] [<21059cab>] (vfs_read) from [<2105d92b>] (read_code+0x15/0x2e)
[    8.574329] [<2105d92b>] (read_code) from [<21085a8d>] (load_flat_file+0x341/0x4f0)
[    8.574481] [<21085a8d>] (load_flat_file) from [<21085e03>] (load_flat_binary+0x47/0x2dc)
[    8.574639] [<21085e03>] (load_flat_binary) from [<2105d581>] (bprm_execve+0x1fd/0x32c)
[    8.574797] [<2105d581>] (bprm_execve) from [<2105dbb3>] (kernel_execve+0xa3/0xac)
[    8.574947] [<2105dbb3>] (kernel_execve) from [<211e7095>] (kernel_init+0x31/0xb0)
[    8.575099] [<211e7095>] (kernel_init) from [<2100814d>] (ret_from_fork+0x11/0x24)
[    8.575287] Exception stack(0x21429fb0 to 0x21429ff8)
[    8.575433] 9fa0:                                     00000000 00000000 00000000 00000000
[    8.575593] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[    8.575743] 9fe0: 00000000 00000000 00000000 00000000 00000000 00000000
[    8.575933] ---[ end trace ba15568c05035a77 ]---

This is with qemu's mps2-an385 emulation and and mps2_defconfig.
The same warning is also observed with m68k and mcf5208evb,
though the traceback isn't as nice.

WARNING: CPU: 0 PID: 1 at lib/iov_iter.c:468 0x40135e4e
...
Call Trace:
        [<402b0f42>] 0x402b0f42
 [<402b0fea>] 0x402b0fea
 [<40135e4e>] 0x40135e4e
 [<40135e4e>] 0x40135e4e
 [<4009c610>] 0x4009c610
...

Reverting this patch fixes the problem for both mps2-an385 and mcf5208evb.

Guenter

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] iov_iter: separate direction from flavour
  2021-07-04 17:29 [PATCH] iov_iter: separate direction from flavour Guenter Roeck
@ 2021-07-04 18:31 ` Linus Torvalds
  2021-07-04 18:54   ` Guenter Roeck
  0 siblings, 1 reply; 12+ messages in thread
From: Linus Torvalds @ 2021-07-04 18:31 UTC (permalink / raw)
  To: Guenter Roeck, Christoph Hellwig
  Cc: Al Viro, linux-fsdevel, Linux Kernel Mailing List, David Sterba,
	Miklos Szeredi, Anton Altaparmakov, David Howells,
	Matthew Wilcox, Pavel Begunkov

[ Added Christoph, since the issue technically goes further back than
when the warning appeared - it just used to be silent ]

On Sun, Jul 4, 2021 at 10:29 AM Guenter Roeck <linux@roeck-us.net> wrote:
>
> This patch results in the following runtime warning on nommu systems.

Ok, good, it actually found something.

> [    8.574191] [<21059cab>] (vfs_read) from [<2105d92b>] (read_code+0x15/0x2e)
> [    8.574329] [<2105d92b>] (read_code) from [<21085a8d>] (load_flat_file+0x341/0x4f0)
> [    8.574481] [<21085a8d>] (load_flat_file) from [<21085e03>] (load_flat_binary+0x47/0x2dc)
> [    8.574639] [<21085e03>] (load_flat_binary) from [<2105d581>] (bprm_execve+0x1fd/0x32c)

Hmm. That actually loads things into user space, so the problem isn't
that it shouldn't use vfs_read() or that iov_iter_init() would be
doing somethign wrong - the problem appears purely to be that we're in
an "uaccess_kernel()" region.

And yes, we're still in the early init code:

> [    8.574797] [<2105d581>] (bprm_execve) from [<2105dbb3>] (kernel_execve+0xa3/0xac)
> [    8.574947] [<2105dbb3>] (kernel_execve) from [<211e7095>] (kernel_init+0x31/0xb0)
> [    8.575099] [<211e7095>] (kernel_init) from [<2100814d>] (ret_from_fork+0x11/0x24)

which presumably runs with KERNEL_DS.

Which is kind of bogus in the new world order.

None of this *matters* for a nommu system, of course, which is why
that code used to work, and why it's now warning.

But for the same reason, it should still continue to work, despite the
warning. Because iov_iter_init() will actually be doing the right
thing and making it all about user pointers.

Can you verify that it otherwise looks ok apart from the new warning?

I *think* we should move to initializing the kernel state to
"set_fs(USER_DS)", and that would be the right model these days.

Of course, that could cause other things to pop up on architectures
that haven't been converted away from CONFIG_SET_FS.

The safer thing might be to move it earlier in kernel_execve(): it
does end up doing that "set_fs(USER_DS)" eventually, but it's done
fairly late in "begin_new_exec()" (and it's done as a
force_uaccess_begin(), not set_fs(), but in a CONFIG_SET_FS
configuration that ends up being what it does.

> The same warning is also observed with m68k and mcf5208evb,
> though the traceback isn't as nice.

Hmm. Either the m68k trace printing is just bad, or maybe it just
doesn't have CONFIG_KALLSYMS (or KALLSYMS_ALL) enabled.

Anyway, does a hacky patch something like this

   diff --git a/fs/exec.c b/fs/exec.c
   index 38f63451b928..26293bd7c502 100644
   --- a/fs/exec.c
   +++ b/fs/exec.c
   @@ -1934,6 +1934,10 @@ int kernel_execve(const char *kernel_filename,
        int fd = AT_FDCWD;
        int retval;

   +    // Make sure CONFIG_SET_FS architectures actually
   +    // do things into user space.
   +    force_uaccess_begin();
   +
        filename = getname_kernel(kernel_filename);
        if (IS_ERR(filename))
                return PTR_ERR(filename);

make the warning go away? I really would like to set USER_DS even
earlier, but this might be the least disruptive place.

Anything that accesses kernel space should *not* depend on KERNEL_DS
at this point, since that would make all the properly converted
architectures already fail.

And any architectures that haven't been converted away from
CONFIG_SET_FS would have been hitting that force_uaccess_begin() later
in  begin_new_exec(), so they can't depend on any KERNEL_DS games
after kernel_execve() either.

So that one-liner is hacky, but *feels* safe to me. Am I perhaps
missing something?

It probably means we could remove the force_uaccess_begin() in
begin_new_exec() entirely, but let's first see if the one-liner at
least makes the warning go away.

           Linus

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] iov_iter: separate direction from flavour
  2021-07-04 18:31 ` Linus Torvalds
@ 2021-07-04 18:54   ` Guenter Roeck
  2021-07-04 19:04     ` Linus Torvalds
  0 siblings, 1 reply; 12+ messages in thread
From: Guenter Roeck @ 2021-07-04 18:54 UTC (permalink / raw)
  To: Linus Torvalds, Christoph Hellwig
  Cc: Al Viro, linux-fsdevel, Linux Kernel Mailing List, David Sterba,
	Miklos Szeredi, Anton Altaparmakov, David Howells,
	Matthew Wilcox, Pavel Begunkov

On 7/4/21 11:31 AM, Linus Torvalds wrote:
> [ Added Christoph, since the issue technically goes further back than
> when the warning appeared - it just used to be silent ]
> 
> On Sun, Jul 4, 2021 at 10:29 AM Guenter Roeck <linux@roeck-us.net> wrote:
>>
>> This patch results in the following runtime warning on nommu systems.
> 
> Ok, good, it actually found something.
> 
>> [    8.574191] [<21059cab>] (vfs_read) from [<2105d92b>] (read_code+0x15/0x2e)
>> [    8.574329] [<2105d92b>] (read_code) from [<21085a8d>] (load_flat_file+0x341/0x4f0)
>> [    8.574481] [<21085a8d>] (load_flat_file) from [<21085e03>] (load_flat_binary+0x47/0x2dc)
>> [    8.574639] [<21085e03>] (load_flat_binary) from [<2105d581>] (bprm_execve+0x1fd/0x32c)
> 
> Hmm. That actually loads things into user space, so the problem isn't
> that it shouldn't use vfs_read() or that iov_iter_init() would be
> doing somethign wrong - the problem appears purely to be that we're in
> an "uaccess_kernel()" region.
> 
> And yes, we're still in the early init code:
> 
>> [    8.574797] [<2105d581>] (bprm_execve) from [<2105dbb3>] (kernel_execve+0xa3/0xac)
>> [    8.574947] [<2105dbb3>] (kernel_execve) from [<211e7095>] (kernel_init+0x31/0xb0)
>> [    8.575099] [<211e7095>] (kernel_init) from [<2100814d>] (ret_from_fork+0x11/0x24)
> 
> which presumably runs with KERNEL_DS.
> 
> Which is kind of bogus in the new world order.
> 
> None of this *matters* for a nommu system, of course, which is why
> that code used to work, and why it's now warning.
> 
> But for the same reason, it should still continue to work, despite the
> warning. Because iov_iter_init() will actually be doing the right
> thing and making it all about user pointers.
> 
> Can you verify that it otherwise looks ok apart from the new warning?
> 

Yes, it does, at least as far as I can see.

> I *think* we should move to initializing the kernel state to
> "set_fs(USER_DS)", and that would be the right model these days.
> 
> Of course, that could cause other things to pop up on architectures
> that haven't been converted away from CONFIG_SET_FS.
> 
> The safer thing might be to move it earlier in kernel_execve(): it
> does end up doing that "set_fs(USER_DS)" eventually, but it's done
> fairly late in "begin_new_exec()" (and it's done as a
> force_uaccess_begin(), not set_fs(), but in a CONFIG_SET_FS
> configuration that ends up being what it does.
> 
>> The same warning is also observed with m68k and mcf5208evb,
>> though the traceback isn't as nice.
> 
> Hmm. Either the m68k trace printing is just bad, or maybe it just
> doesn't have CONFIG_KALLSYMS (or KALLSYMS_ALL) enabled.
> 

You are correct. After enabling CONFIG_KALLSYMS:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at lib/iov_iter.c:468 iov_iter_init+0x82/0xe4
CPU: 0 PID: 1 Comm: init Not tainted 5.13.0-09608-g678b12cd4025 #1
Stack from 40835d8c:
         40835d8c 4030cee2 4030cee2 402b1a76 403122a4 00000000 000001d4 00000009
         40835e52 40835eb0 402b1b1e 403122a4 000001d4 401368e6 00000009 00000000
         00000000 00000000 00000000 00000001 00000000 00000000 00000007 40835e88
         401368e6 403122a4 000001d4 00000009 00000000 40835e52 408e2360 4009d050
         40835e52 00000000 40835e4a 00000001 000994b8 00000000 410994c0 0003d87c
         00005919 00027418 41000000 40c7ec00 400a4b18 000d6d34 000994b8 00004100
Call Trace: [<402b1a76>] __warn+0xac/0x112
  [<402b1b1e>] warn_slowpath_fmt+0x42/0x72
  [<401368e6>] iov_iter_init+0x82/0xe4
  [<401368e6>] iov_iter_init+0x82/0xe4
  [<4009d050>] vfs_read+0x1c8/0x31c
  [<400a4b18>] read_code+0x0/0x32
  [<4002dea4>] __flush_itimer_signals+0x0/0xcc
  [<40087082>] vm_mmap_pgoff+0x5c/0x84
  [<400a4b32>] read_code+0x1a/0x32
  [<400f513c>] load_flat_binary+0x596/0x92a
  [<4009cdf0>] kernel_read+0x0/0x98
  [<402b1060>] memset+0x0/0x70
  [<4009f64a>] fput+0x0/0x18
  [<400a4220>] bprm_execve+0x188/0x3be
  [<400a47da>] copy_strings_kernel+0x0/0x86
  [<400a4e4c>] kernel_execve+0xfc/0x18c
  [<40098278>] kfree+0x0/0x1dc
  [<402b6a34>] schedule+0x0/0xd0
  [<402b26b8>] printk+0x0/0x18
  [<402b11d8>] run_init_process+0x80/0x8c
  [<402b26b8>] printk+0x0/0x18
  [<402b5362>] kernel_init+0x0/0xe8
  [<402b53b2>] kernel_init+0x50/0xe8
  [<400208c4>] ret_from_kernel_thread+0xc/0x14

---[ end trace 4a67f75a6eb7dc98 ]---

> Anyway, does a hacky patch something like this
> 
>     diff --git a/fs/exec.c b/fs/exec.c
>     index 38f63451b928..26293bd7c502 100644
>     --- a/fs/exec.c
>     +++ b/fs/exec.c
>     @@ -1934,6 +1934,10 @@ int kernel_execve(const char *kernel_filename,
>          int fd = AT_FDCWD;
>          int retval;
> 
>     +    // Make sure CONFIG_SET_FS architectures actually
>     +    // do things into user space.
>     +    force_uaccess_begin();
>     +
>          filename = getname_kernel(kernel_filename);
>          if (IS_ERR(filename))
>                  return PTR_ERR(filename);
> 
> make the warning go away? I really would like to set USER_DS even
> earlier, but this might be the least disruptive place.
> 

No, I still see the same warning, with the same traceback. I did make sure
that the code is executed by adding a printk in front of it.

Guenter

> Anything that accesses kernel space should *not* depend on KERNEL_DS
> at this point, since that would make all the properly converted
> architectures already fail.
> 
> And any architectures that haven't been converted away from
> CONFIG_SET_FS would have been hitting that force_uaccess_begin() later
> in  begin_new_exec(), so they can't depend on any KERNEL_DS games
> after kernel_execve() either.
> 
> So that one-liner is hacky, but *feels* safe to me. Am I perhaps
> missing something?
> 
> It probably means we could remove the force_uaccess_begin() in
> begin_new_exec() entirely, but let's first see if the one-liner at
> least makes the warning go away.
> 
>             Linus
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] iov_iter: separate direction from flavour
  2021-07-04 18:54   ` Guenter Roeck
@ 2021-07-04 19:04     ` Linus Torvalds
  2021-07-04 20:28       ` Guenter Roeck
  0 siblings, 1 reply; 12+ messages in thread
From: Linus Torvalds @ 2021-07-04 19:04 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Christoph Hellwig, Al Viro, linux-fsdevel,
	Linux Kernel Mailing List, David Sterba, Miklos Szeredi,
	Anton Altaparmakov, David Howells, Matthew Wilcox,
	Pavel Begunkov

On Sun, Jul 4, 2021 at 11:54 AM Guenter Roeck <linux@roeck-us.net> wrote:
>
> No, I still see the same warning, with the same traceback. I did make sure
> that the code is executed by adding a printk in front of it.

And that printk() hits before the WARN_ON_ONCE() hits?

Funky. That sounds to me like something is then doing
set_fs(KERNEL_DS) again later, but it's also possible that I've been
dropped on my head a few too many times as a young child, and am
missing something completely obvious.

Can somebody put me out of my misery and say "Oh, Linus, please take
your meds - you're missing xyz..."

                 Linus

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] iov_iter: separate direction from flavour
  2021-07-04 19:04     ` Linus Torvalds
@ 2021-07-04 20:28       ` Guenter Roeck
  2021-07-04 20:41         ` Linus Torvalds
  0 siblings, 1 reply; 12+ messages in thread
From: Guenter Roeck @ 2021-07-04 20:28 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Christoph Hellwig, Al Viro, linux-fsdevel,
	Linux Kernel Mailing List, David Sterba, Miklos Szeredi,
	Anton Altaparmakov, David Howells, Matthew Wilcox,
	Pavel Begunkov

On 7/4/21 12:04 PM, Linus Torvalds wrote:
> On Sun, Jul 4, 2021 at 11:54 AM Guenter Roeck <linux@roeck-us.net> wrote:
>>
>> No, I still see the same warning, with the same traceback. I did make sure
>> that the code is executed by adding a printk in front of it.
> 
> And that printk() hits before the WARN_ON_ONCE() hits?
> 

Yes:

[    8.604785] Run /init as init process
[    8.604933] ##################### calling force_uaccess_begin()
[    8.609691] ------------[ cut here ]------------
[    8.609795] WARNING: CPU: 0 PID: 1 at lib/iov_iter.c:468 iov_iter_init+0x35/0x58
[    8.609979] CPU: 0 PID: 1 Comm: init Not tainted 5.13.0-09608-g678b12cd4025-dirty #1

Either case, the code doesn't do anything, because force_uaccess_begin() is
already called. With more added debugging:

##################### calling force_uaccess_begin()
############## force_uaccess_begin(), called from run_init_process+0x80/0x8c
############## force_uaccess_begin(), called from load_flat_binary+0x10e/0x92a

> Funky. That sounds to me like something is then doing
> set_fs(KERNEL_DS) again later, but it's also possible that I've been
> dropped on my head a few too many times as a young child, and am
> missing something completely obvious.
> 
> Can somebody put me out of my misery and say "Oh, Linus, please take
> your meds - you're missing xyz..."
> 

Turns out that, at least on m68k/nommu, USER_DS and KERNEL_DS are the same.

#define USER_DS         MAKE_MM_SEG(TASK_SIZE)
#define KERNEL_DS       MAKE_MM_SEG(0xFFFFFFFF)

and:

#define TASK_SIZE       (0xFFFFFFFFUL)

I didn't check mps2, but I strongly suspect the same is true there.

Guenter

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] iov_iter: separate direction from flavour
  2021-07-04 20:28       ` Guenter Roeck
@ 2021-07-04 20:41         ` Linus Torvalds
  2021-07-04 21:47           ` Guenter Roeck
  2021-07-05  5:17           ` Christoph Hellwig
  0 siblings, 2 replies; 12+ messages in thread
From: Linus Torvalds @ 2021-07-04 20:41 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Christoph Hellwig, Al Viro, linux-fsdevel,
	Linux Kernel Mailing List, David Sterba, Miklos Szeredi,
	Anton Altaparmakov, David Howells, Matthew Wilcox,
	Pavel Begunkov

On Sun, Jul 4, 2021 at 1:28 PM Guenter Roeck <linux@roeck-us.net> wrote:
>
> Turns out that, at least on m68k/nommu, USER_DS and KERNEL_DS are the same.
>
> #define USER_DS         MAKE_MM_SEG(TASK_SIZE)
> #define KERNEL_DS       MAKE_MM_SEG(0xFFFFFFFF)

Ahh. So the code is fine, it's just that "uaccess_kernel()" isn't
something that can be reliably even tested for, and it will always
return true on those nommu platforms.

And we don't have a "uaccess_user()" macro that would test if it
matches USER_DS (and that also would always return true on those
configurations), so we can't just change the

        WARN_ON_ONCE(uaccess_kernel());

into a

        WARN_ON_ONCE(!uaccess_user());

instead.

Very annoying. Basically, every single use of "uaccess_kernel()" is unreliable.

There aren't all that many of them, and most of them are irrelevant
for no-mmu anyway (like the bpf tracing ones, or mm/memory.c). So this
iov_iter.c case is likely the only one that would be an issue.

That warning is something that should go away eventually anyway, but I
_like_ that warning for now, just to get coverage. But apparently it's
just not going to be the case for these situations.

My inclination is to keep it around for a while - to see if it catches
anything else - but remove it for the final 5.14 release because of
these nommu issues.

Of course, I will almost certainly not remember to do that unless
somebody reminds me...

The other alternative would be to just make nommu platforms that have
KERNEL_DS==USER_DS simply do

    #define uaccess_kernel() (false)

and avoid it that way, since that's closer to what the modern
non-CONFIG_SET_FS world view is, and is what include/linux/uaccess.h
does for that case..

               Linus

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] iov_iter: separate direction from flavour
  2021-07-04 20:41         ` Linus Torvalds
@ 2021-07-04 21:47           ` Guenter Roeck
  2021-07-04 22:44             ` Linus Torvalds
  2021-07-05  5:17           ` Christoph Hellwig
  1 sibling, 1 reply; 12+ messages in thread
From: Guenter Roeck @ 2021-07-04 21:47 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Christoph Hellwig, Al Viro, linux-fsdevel,
	Linux Kernel Mailing List, David Sterba, Miklos Szeredi,
	Anton Altaparmakov, David Howells, Matthew Wilcox,
	Pavel Begunkov

On 7/4/21 1:41 PM, Linus Torvalds wrote:
> On Sun, Jul 4, 2021 at 1:28 PM Guenter Roeck <linux@roeck-us.net> wrote:
>>
>> Turns out that, at least on m68k/nommu, USER_DS and KERNEL_DS are the same.
>>
>> #define USER_DS         MAKE_MM_SEG(TASK_SIZE)
>> #define KERNEL_DS       MAKE_MM_SEG(0xFFFFFFFF)
> 
> Ahh. So the code is fine, it's just that "uaccess_kernel()" isn't
> something that can be reliably even tested for, and it will always
> return true on those nommu platforms.
> 
> And we don't have a "uaccess_user()" macro that would test if it
> matches USER_DS (and that also would always return true on those
> configurations), so we can't just change the
> 
>          WARN_ON_ONCE(uaccess_kernel());
> 
> into a
> 
>          WARN_ON_ONCE(!uaccess_user());
> 
> instead.
> 
> Very annoying. Basically, every single use of "uaccess_kernel()" is unreliable.
> 
> There aren't all that many of them, and most of them are irrelevant
> for no-mmu anyway (like the bpf tracing ones, or mm/memory.c). So this
> iov_iter.c case is likely the only one that would be an issue.
> 
> That warning is something that should go away eventually anyway, but I
> _like_ that warning for now, just to get coverage. But apparently it's
> just not going to be the case for these situations.
> 
> My inclination is to keep it around for a while - to see if it catches
> anything else - but remove it for the final 5.14 release because of
> these nommu issues.
> 
> Of course, I will almost certainly not remember to do that unless
> somebody reminds me...
> 
> The other alternative would be to just make nommu platforms that have
> KERNEL_DS==USER_DS simply do
> 
>      #define uaccess_kernel() (false)
> 
> and avoid it that way, since that's closer to what the modern
> non-CONFIG_SET_FS world view is, and is what include/linux/uaccess.h
> does for that case..
> 

Theoretically, but arm defines it as true with !CONFIG_MMU and then
uses it in user_addr_max():

#define user_addr_max() \
         (uaccess_kernel() ? ~0UL : get_fs())

with !CONFIG_MMU:

#define KERNEL_DS	0x00000000
#define get_fs()	(KERNEL_DS)

How about the following ?

	WARN_ON_ONCE(IS_ENABLED(CONFIG_MMU) && uaccess_kernel());

Guenter

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] iov_iter: separate direction from flavour
  2021-07-04 21:47           ` Guenter Roeck
@ 2021-07-04 22:44             ` Linus Torvalds
  2021-07-04 22:47               ` Matthew Wilcox
  0 siblings, 1 reply; 12+ messages in thread
From: Linus Torvalds @ 2021-07-04 22:44 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Christoph Hellwig, Al Viro, linux-fsdevel,
	Linux Kernel Mailing List, David Sterba, Miklos Szeredi,
	Anton Altaparmakov, David Howells, Matthew Wilcox,
	Pavel Begunkov

On Sun, Jul 4, 2021 at 2:47 PM Guenter Roeck <linux@roeck-us.net> wrote:
>
> How about the following ?
>
>         WARN_ON_ONCE(IS_ENABLED(CONFIG_MMU) && uaccess_kernel());

Nope, that doesn't work either, because there are no-MMU setups that
don't make the same mistake no-mmu arm and m68k do.

Example: xtensa. But afaik also generic-asm/uaccess.h unless the
architecture overrides something.

So this literally seems like just an arm/m68k bug.

         Linus

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] iov_iter: separate direction from flavour
  2021-07-04 22:44             ` Linus Torvalds
@ 2021-07-04 22:47               ` Matthew Wilcox
  2021-07-04 22:53                 ` Linus Torvalds
  0 siblings, 1 reply; 12+ messages in thread
From: Matthew Wilcox @ 2021-07-04 22:47 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Guenter Roeck, Christoph Hellwig, Al Viro, linux-fsdevel,
	Linux Kernel Mailing List, David Sterba, Miklos Szeredi,
	Anton Altaparmakov, David Howells, Pavel Begunkov

On Sun, Jul 04, 2021 at 03:44:17PM -0700, Linus Torvalds wrote:
> On Sun, Jul 4, 2021 at 2:47 PM Guenter Roeck <linux@roeck-us.net> wrote:
> >
> > How about the following ?
> >
> >         WARN_ON_ONCE(IS_ENABLED(CONFIG_MMU) && uaccess_kernel());
> 
> Nope, that doesn't work either, because there are no-MMU setups that
> don't make the same mistake no-mmu arm and m68k do.
> 
> Example: xtensa. But afaik also generic-asm/uaccess.h unless the
> architecture overrides something.
> 
> So this literally seems like just an arm/m68k bug.

We could slip:

#ifndef uaccess_user
#define uaccess_user() !uaccess_kernel()
#endif

into asm-generic, switch the test over and then make it arm/m68k's
problem to define uaccess_user() to true?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] iov_iter: separate direction from flavour
  2021-07-04 22:47               ` Matthew Wilcox
@ 2021-07-04 22:53                 ` Linus Torvalds
  2021-07-04 23:34                   ` Linus Torvalds
  0 siblings, 1 reply; 12+ messages in thread
From: Linus Torvalds @ 2021-07-04 22:53 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Guenter Roeck, Christoph Hellwig, Al Viro, linux-fsdevel,
	Linux Kernel Mailing List, David Sterba, Miklos Szeredi,
	Anton Altaparmakov, David Howells, Pavel Begunkov

On Sun, Jul 4, 2021 at 3:47 PM Matthew Wilcox <willy@infradead.org> wrote:
>
>
> We could slip:
>
> #ifndef uaccess_user
> #define uaccess_user() !uaccess_kernel()
> #endif
>
> into asm-generic, switch the test over and then make it arm/m68k's
> problem to define uaccess_user() to true?

Yeah, except at this point I suspect I'll just remove that WARN_ON_ONCE().

I liked it as long as it wasn't giving these false positives. It's
conceptually the right thing to do, but it's also the case that only
CONFIG_SET_FS architectures can trigger it, and none of them get any
real testing or matter much any more.

All the truly relevant architectures have been converted away from set_fs().

And it's actually fairly hard to get this wrong even if you do have
CONFIG_SET_FS - because no generic code does set_fs() any more, so you
*really* have to screw up the few cases where you do it in your own
architecture code.

If it had been easy to fix I'd have kept it, but this amount of pain
isn't worth it - I just don't want to add extra code for architectures
that do things wrong and don't really matter any more.

             Linus

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] iov_iter: separate direction from flavour
  2021-07-04 22:53                 ` Linus Torvalds
@ 2021-07-04 23:34                   ` Linus Torvalds
  0 siblings, 0 replies; 12+ messages in thread
From: Linus Torvalds @ 2021-07-04 23:34 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Guenter Roeck, Christoph Hellwig, Al Viro, linux-fsdevel,
	Linux Kernel Mailing List, David Sterba, Miklos Szeredi,
	Anton Altaparmakov, David Howells, Pavel Begunkov

On Sun, Jul 4, 2021 at 3:53 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> If it had been easy to fix I'd have kept it, but this amount of pain
> isn't worth it - I just don't want to add extra code for architectures
> that do things wrong and don't really matter any more.

Ok, removed:

   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a180bd1d7e16173d965b263c5a536aa40afa2a2a

where 99% of the work was writing that commit message ;)

              Linus

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] iov_iter: separate direction from flavour
  2021-07-04 20:41         ` Linus Torvalds
  2021-07-04 21:47           ` Guenter Roeck
@ 2021-07-05  5:17           ` Christoph Hellwig
  1 sibling, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2021-07-05  5:17 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Guenter Roeck, Christoph Hellwig, Al Viro, linux-fsdevel,
	Linux Kernel Mailing List, David Sterba, Miklos Szeredi,
	Anton Altaparmakov, David Howells, Matthew Wilcox,
	Pavel Begunkov

On Sun, Jul 04, 2021 at 01:41:51PM -0700, Linus Torvalds wrote:
> On Sun, Jul 4, 2021 at 1:28 PM Guenter Roeck <linux@roeck-us.net> wrote:
> >
> > Turns out that, at least on m68k/nommu, USER_DS and KERNEL_DS are the same.
> >
> > #define USER_DS         MAKE_MM_SEG(TASK_SIZE)
> > #define KERNEL_DS       MAKE_MM_SEG(0xFFFFFFFF)
> 
> Ahh. So the code is fine, it's just that "uaccess_kernel()" isn't
> something that can be reliably even tested for, and it will always
> return true on those nommu platforms.

Yes, I think m68knommu and armnommu have this problems.  They really
need to be converted to stop implementing set_fs ASAP, as there is no
point for them.

> And we don't have a "uaccess_user()" macro that would test if it
> matches USER_DS (and that also would always return true on those
> configurations), so we can't just change the
> 
>         WARN_ON_ONCE(uaccess_kernel());
> 
> into a
> 
>         WARN_ON_ONCE(!uaccess_user());
> 
> instead.
> 
> Very annoying. Basically, every single use of "uaccess_kernel()" is unreliable.

Yes.

> The other alternative would be to just make nommu platforms that have
> KERNEL_DS==USER_DS simply do
> 
>     #define uaccess_kernel() (false)
> 
> and avoid it that way, since that's closer to what the modern
> non-CONFIG_SET_FS world view is, and is what include/linux/uaccess.h
> does for that case..

Maybe that is the best short-term bandaid.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2021-07-05  5:17 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-04 17:29 [PATCH] iov_iter: separate direction from flavour Guenter Roeck
2021-07-04 18:31 ` Linus Torvalds
2021-07-04 18:54   ` Guenter Roeck
2021-07-04 19:04     ` Linus Torvalds
2021-07-04 20:28       ` Guenter Roeck
2021-07-04 20:41         ` Linus Torvalds
2021-07-04 21:47           ` Guenter Roeck
2021-07-04 22:44             ` Linus Torvalds
2021-07-04 22:47               ` Matthew Wilcox
2021-07-04 22:53                 ` Linus Torvalds
2021-07-04 23:34                   ` Linus Torvalds
2021-07-05  5:17           ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).