All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH -next] uaccess: fix __access_ok limit setup in compat mode
@ 2022-03-18  7:11 Chen Jiahao
  2022-03-18  7:44 ` Arnd Bergmann
  0 siblings, 1 reply; 6+ messages in thread
From: Chen Jiahao @ 2022-03-18  7:11 UTC (permalink / raw)
  To: arnd, ebiederm, sam, shorne, dinguyen, linux-arch, linux-kernel

In __access_ok, TASK_SIZE_MAX is used to check if a memory access
is in user address space, but some cases may get omitted in compat
mode.

For example, a 32-bit testcase calling pread64(fd, buf, -1, 1)
and running in x86-64 kernel, the obviously illegal size "-1" will
get ignored by __access_ok. Since from the kernel point of view,
32-bit userspace 0xffffffff is within the limit of 64-bit
TASK_SIZE_MAX.

Replacing the limit TASK_SIZE_MAX with TASK_SIZE in __access_ok
will fix the problem above.

Fixes: 967747bbc084 ("uaccess: remove CONFIG_SET_FS")
Signed-off-by: Chen Jiahao <chenjiahao16@huawei.com>
---
 include/asm-generic/access_ok.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/asm-generic/access_ok.h b/include/asm-generic/access_ok.h
index 2866ae61b1cd..824a6bf1c32f 100644
--- a/include/asm-generic/access_ok.h
+++ b/include/asm-generic/access_ok.h
@@ -30,7 +30,7 @@
  */
 static inline int __access_ok(const void __user *ptr, unsigned long size)
 {
-	unsigned long limit = TASK_SIZE_MAX;
+	unsigned long limit = TASK_SIZE;
 	unsigned long addr = (unsigned long)ptr;
 
 	if (IS_ENABLED(CONFIG_ALTERNATE_USER_ADDRESS_SPACE) ||
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH -next] uaccess: fix __access_ok limit setup in compat mode
  2022-03-18  7:11 [PATCH -next] uaccess: fix __access_ok limit setup in compat mode Chen Jiahao
@ 2022-03-18  7:44 ` Arnd Bergmann
  2022-03-22 12:55   ` chenjiahao (C)
  0 siblings, 1 reply; 6+ messages in thread
From: Arnd Bergmann @ 2022-03-18  7:44 UTC (permalink / raw)
  To: Chen Jiahao
  Cc: Arnd Bergmann, Eric W . Biederman, Sam Ravnborg, Stafford Horne,
	Dinh Nguyen, linux-arch, Linux Kernel Mailing List

On Fri, Mar 18, 2022 at 8:11 AM Chen Jiahao <chenjiahao16@huawei.com> wrote:
>
> In __access_ok, TASK_SIZE_MAX is used to check if a memory access
> is in user address space, but some cases may get omitted in compat
> mode.
>
> For example, a 32-bit testcase calling pread64(fd, buf, -1, 1)
> and running in x86-64 kernel, the obviously illegal size "-1" will
> get ignored by __access_ok. Since from the kernel point of view,
> 32-bit userspace 0xffffffff is within the limit of 64-bit
> TASK_SIZE_MAX.
>
> Replacing the limit TASK_SIZE_MAX with TASK_SIZE in __access_ok
> will fix the problem above.

I don't see what problem this fixes, the choice of TASK_SIZE_MAX in
__access_ok() is intentional, as this means we can use a compile-time
constant as the limit, which produces better code.

Any user pointer between COMPAT_TASK_SIZE and TASK_SIZE_MAX is
not accessible by a user process but will not let user space access
any kernel data either, which is the point of the check.

In your example of using '-1' as the pointer, access_ok() returns true,
so the kernel can go on to perform an unchecked __get_user() on
__put_user() on 0xffffffffull, which causes page fault that is intercepted
by the ex_table fixup.

This should not result in any user visible difference, in both cases
user process will see a -EFAULT return code from its system call.
Are you able to come up with a test case that shows an observable
difference in behavior?

      Arnd

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH -next] uaccess: fix __access_ok limit setup in compat mode
  2022-03-18  7:44 ` Arnd Bergmann
@ 2022-03-22 12:55   ` chenjiahao (C)
  2022-03-22 14:41     ` Arnd Bergmann
  0 siblings, 1 reply; 6+ messages in thread
From: chenjiahao (C) @ 2022-03-22 12:55 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Eric W . Biederman, Sam Ravnborg, Stafford Horne, Dinh Nguyen,
	linux-arch, Linux Kernel Mailing List


在 2022/3/18 15:44, Arnd Bergmann 写道:
> On Fri, Mar 18, 2022 at 8:11 AM Chen Jiahao <chenjiahao16@huawei.com> wrote:
>> In __access_ok, TASK_SIZE_MAX is used to check if a memory access
>> is in user address space, but some cases may get omitted in compat
>> mode.
>>
>> For example, a 32-bit testcase calling pread64(fd, buf, -1, 1)
>> and running in x86-64 kernel, the obviously illegal size "-1" will
>> get ignored by __access_ok. Since from the kernel point of view,
>> 32-bit userspace 0xffffffff is within the limit of 64-bit
>> TASK_SIZE_MAX.
>>
>> Replacing the limit TASK_SIZE_MAX with TASK_SIZE in __access_ok
>> will fix the problem above.
> I don't see what problem this fixes, the choice of TASK_SIZE_MAX in
> __access_ok() is intentional, as this means we can use a compile-time
> constant as the limit, which produces better code.
>
> Any user pointer between COMPAT_TASK_SIZE and TASK_SIZE_MAX is
> not accessible by a user process but will not let user space access
> any kernel data either, which is the point of the check.
>
> In your example of using '-1' as the pointer, access_ok() returns true,
> so the kernel can go on to perform an unchecked __get_user() on
> __put_user() on 0xffffffffull, which causes page fault that is intercepted
> by the ex_table fixup.
>
> This should not result in any user visible difference, in both cases
> user process will see a -EFAULT return code from its system call.
> Are you able to come up with a test case that shows an observable
> difference in behavior?
>
>        Arnd
>
> .

Actually, this patch do comes from a testcase failure, the code is 
pasted below:

#define TMPFILE "__1234567890"
#define BUF_SIZE    1024

int main()
{
     char buf[BUF_SIZE] = {0};
     int fd;
     int ret;
     int err;

     fd = open(TMPFILE, O_CREAT | O_RDWR);
     if(-1 == fd)
     {
         perror("open");
         return 1;
     }

     ret = pread64(fd, buf, -1, 1);
     if((-1 == ret) && (EFAULT == errno))
     {
         close(fd);
         unlink(TMPFILE);
         printf("PASS\n");
         return 0;
     }
     err = errno;
     perror("pread64");
     printf("err = %d\n", err);
     close(fd);
     unlink(TMPFILE);
     printf("FAIL\n");

     return 1;
  }

The expected result is:

PASS

but the result of 32-bit testcase running in x86-64 kernel with compat 
mode is:

pread64: Success
err = 0
FAIL


In my explanation, pread64 is called with count '0xffffffffull' and 
offset '1', which might still not trigger

page fault in 64-bit kernel.


This patch uses TASK_SIZE as the addr_limit to performance a stricter 
address check and intercepts

the illegal pointer address from 32-bit userspace at a very early time. 
Which is roughly the same

address limit check as __access_ok in arch/ia64.


This is why this fixes my testcase failure above, or have I missed 
anything else?


Jiahao


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH -next] uaccess: fix __access_ok limit setup in compat mode
  2022-03-22 12:55   ` chenjiahao (C)
@ 2022-03-22 14:41     ` Arnd Bergmann
  2022-03-23  1:35       ` David Laight
  2022-03-23 10:34       ` chenjiahao (C)
  0 siblings, 2 replies; 6+ messages in thread
From: Arnd Bergmann @ 2022-03-22 14:41 UTC (permalink / raw)
  To: chenjiahao (C)
  Cc: Arnd Bergmann, Eric W . Biederman, Sam Ravnborg, Stafford Horne,
	Dinh Nguyen, linux-arch, Linux Kernel Mailing List, Linux API,
	Linux FS-devel Mailing List, the arch/x86 maintainers

On Tue, Mar 22, 2022 at 1:55 PM chenjiahao (C) <chenjiahao16@huawei.com> wrote:
> 在 2022/3/18 15:44, Arnd Bergmann 写道:
> >
> > This should not result in any user visible difference, in both cases
> > user process will see a -EFAULT return code from its system call.
> > Are you able to come up with a test case that shows an observable
> > difference in behavior?
>
> Actually, this patch do comes from a testcase failure, the code is
> pasted below:

Thank you for the test case!

> #define TMPFILE "__1234567890"
> #define BUF_SIZE    1024
>
> int main()
> {
>      char buf[BUF_SIZE] = {0};
>      int fd;
>      int ret;
>      int err;
>
>      fd = open(TMPFILE, O_CREAT | O_RDWR);
>      if(-1 == fd)
>      {
>          perror("open");
>          return 1;
>      }
>
>      ret = pread64(fd, buf, -1, 1);
>      if((-1 == ret) && (EFAULT == errno))
>      {
>          close(fd);
>          unlink(TMPFILE);
>          printf("PASS\n");
>          return 0;
>      }
>      err = errno;
>      perror("pread64");
>      printf("err = %d\n", err);
>      close(fd);
>      unlink(TMPFILE);
>      printf("FAIL\n");
>
>      return 1;
>   }
>
> The expected result is:
>
> PASS
>
> but the result of 32-bit testcase running in x86-64 kernel with compat
> mode is:
>
> pread64: Success
> err = 0
> FAIL
>
>
> In my explanation, pread64 is called with count '0xffffffffull' and
> offset '1', which might still not trigger
>
> page fault in 64-bit kernel.
>
>
> This patch uses TASK_SIZE as the addr_limit to performance a stricter
> address check and intercepts

I see. So while the kernel behavior was not meant to change from
my patch, it clearly did, which may cause problems. However, I'm
not sure if the changed behavior is actually wrong.

> the illegal pointer address from 32-bit userspace at a very early time.
> Which is roughly the same
>
> address limit check as __access_ok in arch/ia64.
>
>
> This is why this fixes my testcase failure above, or have I missed
> anything else?

My interpretation of what is going on here is that the pread64() call
still behaves according to the documented behavior, returning a small
number of bytes from the file, up to the first faulting address.

As the manual page for pread64() describes,

       On  success,  pread()  returns  the  number  of  bytes read
       (a return of zero indicates end of file) and pwrite() returns
       the number of bytes written.
       Note that it is not an error for a successful call to transfer
       fewer bytes than requested  (see  read(2)
       and write(2)).

The difference after my patch is that originally it returned
-EFAULT because part of the buffer is outside of the
addressable memory, but now it returns success because
part of the buffer is inside the addressable memory ;-)

I'm also not sure about which patch caused the change in
behavior, can you double-check that? The one you cited,
967747bbc084 ("uaccess: remove CONFIG_SET_FS"), does
not actually touch the x86 implementation, and commit
36903abedfe8 ("x86: remove __range_not_ok()") does touch
x86 but the limit was already TASK_SIZE_MAX since commit
47058bb54b57 ("x86: remove address space overrides using
set_fs()") back in linux-5.10.

        Arnd

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: [PATCH -next] uaccess: fix __access_ok limit setup in compat mode
  2022-03-22 14:41     ` Arnd Bergmann
@ 2022-03-23  1:35       ` David Laight
  2022-03-23 10:34       ` chenjiahao (C)
  1 sibling, 0 replies; 6+ messages in thread
From: David Laight @ 2022-03-23  1:35 UTC (permalink / raw)
  To: 'Arnd Bergmann', chenjiahao (C)
  Cc: Eric W . Biederman, Sam Ravnborg, Stafford Horne, Dinh Nguyen,
	linux-arch, Linux Kernel Mailing List, Linux API,
	Linux FS-devel Mailing List, the arch/x86 maintainers

From: Arnd Bergmann
> Sent: 22 March 2022 14:41
> 
> On Tue, Mar 22, 2022 at 1:55 PM chenjiahao (C) <chenjiahao16@huawei.com> wrote:
> > 在 2022/3/18 15:44, Arnd Bergmann 写道:
> > >
> > > This should not result in any user visible difference, in both cases
> > > user process will see a -EFAULT return code from its system call.
> > > Are you able to come up with a test case that shows an observable
> > > difference in behavior?
> >
> > Actually, this patch do comes from a testcase failure, the code is
> > pasted below:
> 
> Thank you for the test case!
> 
...
> >      ret = pread64(fd, buf, -1, 1);
> >      if((-1 == ret) && (EFAULT == errno))
> >      {
...
> >          printf("PASS\n");
...
> > In my explanation, pread64 is called with count '0xffffffffull' and
> > offset '1', which might still not trigger
> >
> > page fault in 64-bit kernel.
> >
> >
> > This patch uses TASK_SIZE as the addr_limit to performance a stricter
> > address check and intercepts
> 
> I see. So while the kernel behavior was not meant to change from
> my patch, it clearly did, which may cause problems. However, I'm
> not sure if the changed behavior is actually wrong.

It isn't really any different from passing a length of (1 << 30)
(and a buffer at a low user address).
The entire buffer is valid user addresses, but most of it is
invalid because there is nothing mapped at the relevant addresses.
Unless you actually try to access one of the memory locations
you won't get a fault - and the correct return is then a partial read.

Similarly it is valid for the kernel to ensure there is an
unmapped page between user and kernel addresses and then
not check the buffer size at all - requiring the kernel code
do (adequately) sequential accesses.
Again your test 'fails'.

You could equally well argue that the 'old' behaviour is wrong!

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH -next] uaccess: fix __access_ok limit setup in compat mode
  2022-03-22 14:41     ` Arnd Bergmann
  2022-03-23  1:35       ` David Laight
@ 2022-03-23 10:34       ` chenjiahao (C)
  1 sibling, 0 replies; 6+ messages in thread
From: chenjiahao (C) @ 2022-03-23 10:34 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Eric W . Biederman, Sam Ravnborg, Stafford Horne, Dinh Nguyen,
	linux-arch, Linux Kernel Mailing List, Linux API,
	Linux FS-devel Mailing List, the arch/x86 maintainers


在 2022/3/22 22:41, Arnd Bergmann 写道:
> On Tue, Mar 22, 2022 at 1:55 PM chenjiahao (C) <chenjiahao16@huawei.com> wrote:
>> 在 2022/3/18 15:44, Arnd Bergmann 写道:
>>> This should not result in any user visible difference, in both cases
>>> user process will see a -EFAULT return code from its system call.
>>> Are you able to come up with a test case that shows an observable
>>> difference in behavior?
>> Actually, this patch do comes from a testcase failure, the code is
>> pasted below:
> Thank you for the test case!
>
>> #define TMPFILE "__1234567890"
>> #define BUF_SIZE    1024
>>
>> int main()
>> {
>>       char buf[BUF_SIZE] = {0};
>>       int fd;
>>       int ret;
>>       int err;
>>
>>       fd = open(TMPFILE, O_CREAT | O_RDWR);
>>       if(-1 == fd)
>>       {
>>           perror("open");
>>           return 1;
>>       }
>>
>>       ret = pread64(fd, buf, -1, 1);
>>       if((-1 == ret) && (EFAULT == errno))
>>       {
>>           close(fd);
>>           unlink(TMPFILE);
>>           printf("PASS\n");
>>           return 0;
>>       }
>>       err = errno;
>>       perror("pread64");
>>       printf("err = %d\n", err);
>>       close(fd);
>>       unlink(TMPFILE);
>>       printf("FAIL\n");
>>
>>       return 1;
>>    }
>>
>> The expected result is:
>>
>> PASS
>>
>> but the result of 32-bit testcase running in x86-64 kernel with compat
>> mode is:
>>
>> pread64: Success
>> err = 0
>> FAIL
>>
>>
>> In my explanation, pread64 is called with count '0xffffffffull' and
>> offset '1', which might still not trigger
>>
>> page fault in 64-bit kernel.
>>
>>
>> This patch uses TASK_SIZE as the addr_limit to performance a stricter
>> address check and intercepts
> I see. So while the kernel behavior was not meant to change from
> my patch, it clearly did, which may cause problems. However, I'm
> not sure if the changed behavior is actually wrong.
>
>> the illegal pointer address from 32-bit userspace at a very early time.
>> Which is roughly the same
>>
>> address limit check as __access_ok in arch/ia64.
>>
>>
>> This is why this fixes my testcase failure above, or have I missed
>> anything else?
> My interpretation of what is going on here is that the pread64() call
> still behaves according to the documented behavior, returning a small
> number of bytes from the file, up to the first faulting address.
>
> As the manual page for pread64() describes,
>
>         On  success,  pread()  returns  the  number  of  bytes read
>         (a return of zero indicates end of file) and pwrite() returns
>         the number of bytes written.
>         Note that it is not an error for a successful call to transfer
>         fewer bytes than requested  (see  read(2)
>         and write(2)).

I have really missed this point. The behavior here is and should

be aligned with the manual definition.

>
> The difference after my patch is that originally it returned
> -EFAULT because part of the buffer is outside of the
> addressable memory, but now it returns success because
> part of the buffer is inside the addressable memory ;-)
>
> I'm also not sure about which patch caused the change in
> behavior, can you double-check that? The one you cited,
> 967747bbc084 ("uaccess: remove CONFIG_SET_FS"), does
> not actually touch the x86 implementation, and commit
> 36903abedfe8 ("x86: remove __range_not_ok()") does touch
> x86 but the limit was already TASK_SIZE_MAX since commit
> 47058bb54b57 ("x86: remove address space overrides using
> set_fs()") back in linux-5.10.

I have performed the testcase on the same environment with

several old LTS kernel versions, all the results are "fault".

The behavior before and after your patches should be consistent.


So the fault should due to the disagreement between the

testcase's intention and the real pread64 definition. I have been

misled by the former one. Thanks for your interpretation.


Jiahao

>
>          Arnd
>
> .

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-03-23 10:35 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-18  7:11 [PATCH -next] uaccess: fix __access_ok limit setup in compat mode Chen Jiahao
2022-03-18  7:44 ` Arnd Bergmann
2022-03-22 12:55   ` chenjiahao (C)
2022-03-22 14:41     ` Arnd Bergmann
2022-03-23  1:35       ` David Laight
2022-03-23 10:34       ` chenjiahao (C)

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.