linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] x86: uaccess: fix regression in unsafe_get_user
@ 2019-02-15 23:59 baloo
  2019-02-16  4:20 ` Jann Horn
  0 siblings, 1 reply; 23+ messages in thread
From: baloo @ 2019-02-15 23:59 UTC (permalink / raw)
  To: x86
  Cc: Arthur Gautier, Jann Horn, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, linux-kernel, Pascal Bouchareine

From: Arthur Gautier <baloo@gandi.net>

When extracting an initramfs, a filename may be near an allocation boundary.
Should that happen, strncopy_from_user will invoke unsafe_get_user which
may cross the allocation boundary. Should that happen, unsafe_get_user will
trigger a page fault, and strncopy_from_user would then bailout to
byte_at_a_time behavior.

unsafe_get_user is unsafe by nature, and rely on pagefault to detect boundaries.
After 9da3f2b74054 ("x86/fault: BUG() when uaccess helpers fault on kernel addresses")
it may no longer rely on pagefault as the new page fault handler would
trigger a BUG().

This commit allows unsafe_get_user to explicitly trigger pagefaults and
handle them directly with the error target label.

Kernel bug:
[    0.965251] Unpacking initramfs...
[    1.797025] BUG: pagefault on kernel address 0xffffae80c0c7e000 in non-whitelisted uaccess
[    1.798992] BUG: unable to handle kernel paging request at ffffae80c0c7e000
[    1.798992] #PF error: [normal kernel read fault]
[    1.798992] PGD 68526067 P4D 68526067 PUD 68527067 PMD 67f0d067 PTE 0
[    1.798992] Oops: 0000 [#1] SMP PTI
[    1.798992] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc6+ #14
[    1.798992] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[    1.798992] RIP: 0010:strncpy_from_user+0x67/0xe0
[    1.798992] Code: fe fe 48 39 ca 49 ba 80 80 80 80 80 80 80 80 48 0f 46 ca 31 c0 45 31 db 49 89 c8 4c 89 c1 48 29 c1 48 83 f9 07 76 49 44 89 d9 <4c> 8b 0c 06 85 c9 75 3e 49 8d 0c 19 4c 89 0c 07 49 f7 d1 4c 21 c9
[    1.798992] RSP: 0000:ffffae80c031fc40 EFLAGS: 00050216
[    1.798992] RAX: 0000000000000040 RBX: fefefefefefefeff RCX: 0000000000000000
[    1.798992] RDX: 0000000000000fe0 RSI: ffffae80c0c7dfba RDI: ffff8b3d27cce020
[    1.798992] RBP: 00000000ffffff9c R08: 0000000000000fe0 R09: caccd29190978b86
[    1.798992] R10: 8080808080808080 R11: 0000000000000000 R12: ffffae80c0c7dfba
[    1.798992] R13: ffffae80c031fd00 R14: 0000000000000000 R15: 0000000000000000
[    1.798992] FS:  0000000000000000(0000) GS:ffff8b3d28a00000(0000) knlGS:0000000000000000
[    1.798992] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.798992] CR2: ffffae80c0c7e000 CR3: 000000003a60e001 CR4: 0000000000360eb0
[    1.798992] Call Trace:
[    1.798992]  getname_flags+0x69/0x187
[    1.798992]  user_path_at_empty+0x1e/0x41
[    1.798992]  vfs_statx+0x70/0xcc
[    1.798992]  clean_path+0x41/0xa2
[    1.798992]  ? parse_header+0x40/0x10a
[    1.798992]  do_name+0x78/0x2b5
[    1.798992]  write_buffer+0x27/0x37
[    1.798992]  flush_buffer+0x34/0x8b
[    1.798992]  ? md_run_setup+0x8a/0x8a
[    1.798992]  unlz4+0x20b/0x27c
[    1.798992]  ? write_buffer+0x37/0x37
[    1.798992]  ? decompress_method+0x80/0x80
[    1.798992]  unpack_to_rootfs+0x17a/0x2b7
[    1.798992]  ? md_run_setup+0x8a/0x8a
[    1.798992]  ? clean_rootfs+0x159/0x159
[    1.798992]  populate_rootfs+0x5d/0x105
[    1.798992]  do_one_initcall+0x86/0x169
[    1.798992]  ? do_early_param+0x8e/0x8e
[    1.798992]  kernel_init_freeable+0x16a/0x1f4
[    1.798992]  ? rest_init+0xaa/0xaa
[    1.798992]  kernel_init+0xa/0xfa
[    1.798992]  ret_from_fork+0x35/0x40

You may reproduce the issue with the following initrd:
  truncate -s 8388313 a
  SECONDFILENAME=bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
  truncate -s 10 $SECONDFILENAME
  echo "a\n$SECONDFILENAME" | cpio -o --format=newc | lz4 -l > initrd.img.lz4

This places the second filename in the cpio near the allocation boundary made
by lz4 decompression and should trigger the bug.

Fixes: 9da3f2b74054 ("x86/fault: BUG() when uaccess helpers fault on kernel addresses")

Cc: Jann Horn <jannh@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Pascal Bouchareine <pascal@gandi.net>
Signed-off-by: Arthur Gautier <baloo@gandi.net>
---
 arch/x86/include/asm/uaccess.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index 780f2b42c8efe..2c272dc43e05a 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -724,7 +724,9 @@ static __must_check inline bool user_access_begin(const void __user *ptr, size_t
 do {										\
 	int __gu_err;								\
 	__inttype(*(ptr)) __gu_val;						\
+	current->kernel_uaccess_faults_ok++;					\
 	__get_user_size(__gu_val, (ptr), sizeof(*(ptr)), __gu_err, -EFAULT);	\
+	current->kernel_uaccess_faults_ok--;					\
 	(x) = (__force __typeof__(*(ptr)))__gu_val;				\
 	if (unlikely(__gu_err)) goto err_label;					\
 } while (0)
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user
  2019-02-15 23:59 [PATCH] x86: uaccess: fix regression in unsafe_get_user baloo
@ 2019-02-16  4:20 ` Jann Horn
  2019-02-16 20:18   ` Thomas Gleixner
  0 siblings, 1 reply; 23+ messages in thread
From: Jann Horn @ 2019-02-16  4:20 UTC (permalink / raw)
  To: baloo, Andy Lutomirski
  Cc: the arch/x86 maintainers, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, kernel list, Pascal Bouchareine

+Andy Lutomirski

On Sat, Feb 16, 2019 at 12:59 AM <baloo@gandi.net> wrote:
>
> From: Arthur Gautier <baloo@gandi.net>
>
> When extracting an initramfs, a filename may be near an allocation boundary.
> Should that happen, strncopy_from_user will invoke unsafe_get_user which
> may cross the allocation boundary. Should that happen, unsafe_get_user will
> trigger a page fault, and strncopy_from_user would then bailout to
> byte_at_a_time behavior.
>
> unsafe_get_user is unsafe by nature, and rely on pagefault to detect boundaries.
> After 9da3f2b74054 ("x86/fault: BUG() when uaccess helpers fault on kernel addresses")
> it may no longer rely on pagefault as the new page fault handler would
> trigger a BUG().
>
> This commit allows unsafe_get_user to explicitly trigger pagefaults and
> handle them directly with the error target label.

Oof. So basically the init code is full of things that just call
syscalls instead of using VFS functions (which don't actually exist
for everything), and the VFS syscalls use getname_flags(), which uses
strncpy_from_user(), which can access out-of-bounds pages on
architectures that set CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS, and
that in summary means that all the init code is potentially prone to
tripping over this?

I don't particularly like this approach to fixing it, but I also don't
have any better ideas, so I guess unless someone else has a bright
idea, this patch might have to go in.

> Kernel bug:
> [    0.965251] Unpacking initramfs...
> [    1.797025] BUG: pagefault on kernel address 0xffffae80c0c7e000 in non-whitelisted uaccess
> [    1.798992] BUG: unable to handle kernel paging request at ffffae80c0c7e000
> [    1.798992] #PF error: [normal kernel read fault]
> [    1.798992] PGD 68526067 P4D 68526067 PUD 68527067 PMD 67f0d067 PTE 0
> [    1.798992] Oops: 0000 [#1] SMP PTI
> [    1.798992] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc6+ #14
> [    1.798992] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
> [    1.798992] RIP: 0010:strncpy_from_user+0x67/0xe0
> [    1.798992] Code: fe fe 48 39 ca 49 ba 80 80 80 80 80 80 80 80 48 0f 46 ca 31 c0 45 31 db 49 89 c8 4c 89 c1 48 29 c1 48 83 f9 07 76 49 44 89 d9 <4c> 8b 0c 06 85 c9 75 3e 49 8d 0c 19 4c 89 0c 07 49 f7 d1 4c 21 c9
> [    1.798992] RSP: 0000:ffffae80c031fc40 EFLAGS: 00050216
> [    1.798992] RAX: 0000000000000040 RBX: fefefefefefefeff RCX: 0000000000000000
> [    1.798992] RDX: 0000000000000fe0 RSI: ffffae80c0c7dfba RDI: ffff8b3d27cce020
> [    1.798992] RBP: 00000000ffffff9c R08: 0000000000000fe0 R09: caccd29190978b86
> [    1.798992] R10: 8080808080808080 R11: 0000000000000000 R12: ffffae80c0c7dfba
> [    1.798992] R13: ffffae80c031fd00 R14: 0000000000000000 R15: 0000000000000000
> [    1.798992] FS:  0000000000000000(0000) GS:ffff8b3d28a00000(0000) knlGS:0000000000000000
> [    1.798992] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    1.798992] CR2: ffffae80c0c7e000 CR3: 000000003a60e001 CR4: 0000000000360eb0
> [    1.798992] Call Trace:
> [    1.798992]  getname_flags+0x69/0x187
> [    1.798992]  user_path_at_empty+0x1e/0x41
> [    1.798992]  vfs_statx+0x70/0xcc
> [    1.798992]  clean_path+0x41/0xa2
> [    1.798992]  ? parse_header+0x40/0x10a
> [    1.798992]  do_name+0x78/0x2b5
> [    1.798992]  write_buffer+0x27/0x37
> [    1.798992]  flush_buffer+0x34/0x8b
> [    1.798992]  ? md_run_setup+0x8a/0x8a
> [    1.798992]  unlz4+0x20b/0x27c
> [    1.798992]  ? write_buffer+0x37/0x37
> [    1.798992]  ? decompress_method+0x80/0x80
> [    1.798992]  unpack_to_rootfs+0x17a/0x2b7
> [    1.798992]  ? md_run_setup+0x8a/0x8a
> [    1.798992]  ? clean_rootfs+0x159/0x159
> [    1.798992]  populate_rootfs+0x5d/0x105
> [    1.798992]  do_one_initcall+0x86/0x169
> [    1.798992]  ? do_early_param+0x8e/0x8e
> [    1.798992]  kernel_init_freeable+0x16a/0x1f4
> [    1.798992]  ? rest_init+0xaa/0xaa
> [    1.798992]  kernel_init+0xa/0xfa
> [    1.798992]  ret_from_fork+0x35/0x40
>
> You may reproduce the issue with the following initrd:
>   truncate -s 8388313 a
>   SECONDFILENAME=bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
>   truncate -s 10 $SECONDFILENAME
>   echo "a\n$SECONDFILENAME" | cpio -o --format=newc | lz4 -l > initrd.img.lz4
>
> This places the second filename in the cpio near the allocation boundary made
> by lz4 decompression and should trigger the bug.
>
> Fixes: 9da3f2b74054 ("x86/fault: BUG() when uaccess helpers fault on kernel addresses")
>
> Cc: Jann Horn <jannh@google.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: x86@kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Pascal Bouchareine <pascal@gandi.net>
> Signed-off-by: Arthur Gautier <baloo@gandi.net>
> ---
>  arch/x86/include/asm/uaccess.h | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
> index 780f2b42c8efe..2c272dc43e05a 100644
> --- a/arch/x86/include/asm/uaccess.h
> +++ b/arch/x86/include/asm/uaccess.h
> @@ -724,7 +724,9 @@ static __must_check inline bool user_access_begin(const void __user *ptr, size_t
>  do {                                                                           \
>         int __gu_err;                                                           \
>         __inttype(*(ptr)) __gu_val;                                             \
> +       current->kernel_uaccess_faults_ok++;                                    \
>         __get_user_size(__gu_val, (ptr), sizeof(*(ptr)), __gu_err, -EFAULT);    \
> +       current->kernel_uaccess_faults_ok--;                                    \
>         (x) = (__force __typeof__(*(ptr)))__gu_val;                             \
>         if (unlikely(__gu_err)) goto err_label;                                 \
>  } while (0)
> --
> 2.20.1
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user
  2019-02-16  4:20 ` Jann Horn
@ 2019-02-16 20:18   ` Thomas Gleixner
  2019-02-16 22:25     ` Thomas Gleixner
  2019-02-16 22:50     ` Andy Lutomirski
  0 siblings, 2 replies; 23+ messages in thread
From: Thomas Gleixner @ 2019-02-16 20:18 UTC (permalink / raw)
  To: Jann Horn
  Cc: baloo, Andy Lutomirski, the arch/x86 maintainers, Ingo Molnar,
	Borislav Petkov, kernel list, Pascal Bouchareine

On Sat, 16 Feb 2019, Jann Horn wrote:
> On Sat, Feb 16, 2019 at 12:59 AM <baloo@gandi.net> wrote:
> > When extracting an initramfs, a filename may be near an allocation boundary.
> > Should that happen, strncopy_from_user will invoke unsafe_get_user which
> > may cross the allocation boundary. Should that happen, unsafe_get_user will
> > trigger a page fault, and strncopy_from_user would then bailout to
> > byte_at_a_time behavior.
> >
> > unsafe_get_user is unsafe by nature, and rely on pagefault to detect boundaries.
> > After 9da3f2b74054 ("x86/fault: BUG() when uaccess helpers fault on kernel addresses")
> > it may no longer rely on pagefault as the new page fault handler would
> > trigger a BUG().
> >
> > This commit allows unsafe_get_user to explicitly trigger pagefaults and
> > handle them directly with the error target label.
> 
> Oof. So basically the init code is full of things that just call
> syscalls instead of using VFS functions (which don't actually exist
> for everything), and the VFS syscalls use getname_flags(), which uses
> strncpy_from_user(), which can access out-of-bounds pages on
> architectures that set CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS, and
> that in summary means that all the init code is potentially prone to
> tripping over this?

Not all init code. It should be only the initramfs decompression.

> I don't particularly like this approach to fixing it, but I also don't
> have any better ideas, so I guess unless someone else has a bright
> idea, this patch might have to go in.

So we know that this happens in the context of decompress() which calls
flush_buffer() for every chunk. flush_buffer() gets the start_address and
the length. We also know that the fault can only happen within:

    start_address <= fault_address < start_address + length + 8;

So something like the untested workaround below should cover the initramfs
oddity and avoid to weaken the protection for all other cases.

Thanks,

	tglx

8<---------------
--- a/arch/x86/mm/extable.c
+++ b/arch/x86/mm/extable.c
@@ -1,5 +1,6 @@
 #include <linux/extable.h>
 #include <linux/uaccess.h>
+#include <linux/initrd.h>
 #include <linux/sched/debug.h>
 #include <xen/xen.h>
 
@@ -161,6 +162,14 @@ static bool bogus_uaccess(struct pt_regs
 	if (current->kernel_uaccess_faults_ok)
 		return false;
 
+	/*
+	 * initramfs decompression can trigger a fault when
+	 * unsafe_get_user() goes over the boundary of the buffer. That's a
+	 * valid case for e.g. strncpy_from_user().
+	 */
+	if (initramfs_fault_in_decompress(fault_addr))
+		return false;
+
 	/* This is bad. Refuse the fixup so that we go into die(). */
 	if (trapnr == X86_TRAP_PF) {
 		pr_emerg("BUG: pagefault on kernel address 0x%lx in non-whitelisted uaccess\n",
--- a/include/linux/initrd.h
+++ b/include/linux/initrd.h
@@ -1,5 +1,8 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 
+#ifndef LINUX_INITRD_H
+#define LINUX_INITRD_H
+
 #define INITRD_MINOR 250 /* shouldn't collide with /dev/ram* too soon ... */
 
 /* 1 = load ramdisk, 0 = don't load */
@@ -25,3 +28,14 @@ extern phys_addr_t phys_initrd_start;
 extern unsigned long phys_initrd_size;
 
 extern unsigned int real_root_dev;
+
+#ifdef CONFIG_BLK_DEV_INITRD
+bool initramfs_fault_in_decompress(unsigned long addr);
+#else
+static inline bool initramfs_fault_in_decompress(unsigned long addr)
+{
+	return false;
+}
+#endif
+
+#endif
--- a/init/initramfs.c
+++ b/init/initramfs.c
@@ -403,13 +403,27 @@ static __initdata int (*actions[])(void)
 	[Reset]		= do_reset,
 };
 
+static unsigned long flush_start;
+static unsigned long flush_length;
+
+bool initramfs_fault_in_decompress(unsigned long addr)
+{
+	return addr >= flush_start && addr < flush_start + flush_length + 8;
+}
+
 static long __init write_buffer(char *buf, unsigned long len)
 {
+	/* Store address and length for uaccess fault handling */
+	flush_start = (unsigned long) buf;
+	flush_length = len;
+
 	byte_count = len;
 	victim = buf;
 
 	while (!actions[state]())
 		;
+	/* Clear the uaccess fault handling region */
+	flush_start = flush_length = 0;
 	return len - byte_count;
 }
 









^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user
  2019-02-16 20:18   ` Thomas Gleixner
@ 2019-02-16 22:25     ` Thomas Gleixner
  2019-02-16 22:50     ` Andy Lutomirski
  1 sibling, 0 replies; 23+ messages in thread
From: Thomas Gleixner @ 2019-02-16 22:25 UTC (permalink / raw)
  To: Jann Horn
  Cc: baloo, Andy Lutomirski, the arch/x86 maintainers, Ingo Molnar,
	Borislav Petkov, kernel list, Pascal Bouchareine

On Sat, 16 Feb 2019, Thomas Gleixner wrote:

> On Sat, 16 Feb 2019, Jann Horn wrote:
> > On Sat, Feb 16, 2019 at 12:59 AM <baloo@gandi.net> wrote:
> > > When extracting an initramfs, a filename may be near an allocation boundary.
> > > Should that happen, strncopy_from_user will invoke unsafe_get_user which
> > > may cross the allocation boundary. Should that happen, unsafe_get_user will
> > > trigger a page fault, and strncopy_from_user would then bailout to
> > > byte_at_a_time behavior.
> > >
> > > unsafe_get_user is unsafe by nature, and rely on pagefault to detect boundaries.
> > > After 9da3f2b74054 ("x86/fault: BUG() when uaccess helpers fault on kernel addresses")
> > > it may no longer rely on pagefault as the new page fault handler would
> > > trigger a BUG().
> > >
> > > This commit allows unsafe_get_user to explicitly trigger pagefaults and
> > > handle them directly with the error target label.
> > 
> > Oof. So basically the init code is full of things that just call
> > syscalls instead of using VFS functions (which don't actually exist
> > for everything), and the VFS syscalls use getname_flags(), which uses
> > strncpy_from_user(), which can access out-of-bounds pages on
> > architectures that set CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS, and
> > that in summary means that all the init code is potentially prone to
> > tripping over this?
> 
> Not all init code. It should be only the initramfs decompression.
> 
> > I don't particularly like this approach to fixing it, but I also don't
> > have any better ideas, so I guess unless someone else has a bright
> > idea, this patch might have to go in.
> 
> So we know that this happens in the context of decompress() which calls
> flush_buffer() for every chunk. flush_buffer() gets the start_address and
> the length. We also know that the fault can only happen within:
> 
>     start_address <= fault_address < start_address + length + 8;
> 
> So something like the untested workaround below should cover the initramfs
> oddity and avoid to weaken the protection for all other cases.

The other even simpler solution would be to force these functions into the
byte at a time code path during init. Too tired to hack that up now.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user
  2019-02-16 20:18   ` Thomas Gleixner
  2019-02-16 22:25     ` Thomas Gleixner
@ 2019-02-16 22:50     ` Andy Lutomirski
  2019-02-16 22:57       ` Andy Lutomirski
  2019-02-16 23:47       ` Al Viro
  1 sibling, 2 replies; 23+ messages in thread
From: Andy Lutomirski @ 2019-02-16 22:50 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Jann Horn, baloo, the arch/x86 maintainers, Ingo Molnar,
	Borislav Petkov, kernel list, Pascal Bouchareine



> On Feb 16, 2019, at 12:18 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> 
>> On Sat, 16 Feb 2019, Jann Horn wrote:
>>> On Sat, Feb 16, 2019 at 12:59 AM <baloo@gandi.net> wrote:
>>> When extracting an initramfs, a filename may be near an allocation boundary.
>>> Should that happen, strncopy_from_user will invoke unsafe_get_user which
>>> may cross the allocation boundary. Should that happen, unsafe_get_user will
>>> trigger a page fault, and strncopy_from_user would then bailout to
>>> byte_at_a_time behavior.
>>> 
>>> unsafe_get_user is unsafe by nature, and rely on pagefault to detect boundaries.
>>> After 9da3f2b74054 ("x86/fault: BUG() when uaccess helpers fault on kernel addresses")
>>> it may no longer rely on pagefault as the new page fault handler would
>>> trigger a BUG().
>>> 
>>> This commit allows unsafe_get_user to explicitly trigger pagefaults and
>>> handle them directly with the error target label.
>> 
>> Oof. So basically the init code is full of things that just call
>> syscalls instead of using VFS functions (which don't actually exist
>> for everything), and the VFS syscalls use getname_flags(), which uses
>> strncpy_from_user(), which can access out-of-bounds pages on
>> architectures that set CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS, and
>> that in summary means that all the init code is potentially prone to
>> tripping over this?
> 
> Not all init code. It should be only the initramfs decompression.
> 
>> I don't particularly like this approach to fixing it, but I also don't
>> have any better ideas, so I guess unless someone else has a bright
>> idea, this patch might have to go in.
> 
> So we know that this happens in the context of decompress() which calls
> flush_buffer() for every chunk. flush_buffer() gets the start_address and
> the length. We also know that the fault can only happen within:
> 
>    start_address <= fault_address < start_address + length + 8;
> 
> So something like the untested workaround below should cover the initramfs
> oddity and avoid to weaken the protection for all other cases.

What is the actual problem?  We’re not actually demand-faulting this data, are we?  Are we just overrunning the buffer because the from_user helpers are too clever?  Can we fix it for real by having the fancy helpers do *aligned* loads so that they don’t overrun the buffer?  Heck, this might be faster, too.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user
  2019-02-16 22:50     ` Andy Lutomirski
@ 2019-02-16 22:57       ` Andy Lutomirski
  2019-02-16 23:47       ` Al Viro
  1 sibling, 0 replies; 23+ messages in thread
From: Andy Lutomirski @ 2019-02-16 22:57 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Jann Horn, baloo, the arch/x86 maintainers, Ingo Molnar,
	Borislav Petkov, kernel list, Pascal Bouchareine


> On Feb 16, 2019, at 2:50 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> 
> 
> 
>>> On Feb 16, 2019, at 12:18 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
>>> 
>>>> On Sat, 16 Feb 2019, Jann Horn wrote:
>>>> On Sat, Feb 16, 2019 at 12:59 AM <baloo@gandi.net> wrote:
>>>> When extracting an initramfs, a filename may be near an allocation boundary.
>>>> Should that happen, strncopy_from_user will invoke unsafe_get_user which
>>>> may cross the allocation boundary. Should that happen, unsafe_get_user will
>>>> trigger a page fault, and strncopy_from_user would then bailout to
>>>> byte_at_a_time behavior.
>>>> 
>>>> unsafe_get_user is unsafe by nature, and rely on pagefault to detect boundaries.
>>>> After 9da3f2b74054 ("x86/fault: BUG() when uaccess helpers fault on kernel addresses")
>>>> it may no longer rely on pagefault as the new page fault handler would
>>>> trigger a BUG().
>>>> 
>>>> This commit allows unsafe_get_user to explicitly trigger pagefaults and
>>>> handle them directly with the error target label.
>>> 
>>> Oof. So basically the init code is full of things that just call
>>> syscalls instead of using VFS functions (which don't actually exist
>>> for everything), and the VFS syscalls use getname_flags(), which uses
>>> strncpy_from_user(), which can access out-of-bounds pages on
>>> architectures that set CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS, and
>>> that in summary means that all the init code is potentially prone to
>>> tripping over this?
>> 
>> Not all init code. It should be only the initramfs decompression.
>> 
>>> I don't particularly like this approach to fixing it, but I also don't
>>> have any better ideas, so I guess unless someone else has a bright
>>> idea, this patch might have to go in.
>> 
>> So we know that this happens in the context of decompress() which calls
>> flush_buffer() for every chunk. flush_buffer() gets the start_address and
>> the length. We also know that the fault can only happen within:
>> 
>>   start_address <= fault_address < start_address + length + 8;
>> 
>> So something like the untested workaround below should cover the initramfs
>> oddity and avoid to weaken the protection for all other cases.
> 
> What is the actual problem?  We’re not actually demand-faulting this data, are we?  Are we just overrunning the buffer because the from_user helpers are too clever?  Can we fix it for real by having the fancy helpers do *aligned* loads so that they don’t overrun the buffer?  Heck, this might be faster, too.

Indeed.  I would argue that the current code is a bug even in the normal case.  If I lay out my user address space so that I have f,o,o,o,\0 at the end of a page and I have non-side-effect-free memory after it (MMIO, userfaultfd, etc), then passing a pointer to that “fooo” string to a syscall should *not* overrun the buffer.

If I have some time this evening, I’ll see if I can whip up a credible fix.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user
  2019-02-16 22:50     ` Andy Lutomirski
  2019-02-16 22:57       ` Andy Lutomirski
@ 2019-02-16 23:47       ` Al Viro
  2019-02-17  0:02         ` Al Viro
                           ` (2 more replies)
  1 sibling, 3 replies; 23+ messages in thread
From: Al Viro @ 2019-02-16 23:47 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Thomas Gleixner, Jann Horn, baloo, the arch/x86 maintainers,
	Ingo Molnar, Borislav Petkov, kernel list, Pascal Bouchareine

On Sat, Feb 16, 2019 at 02:50:15PM -0800, Andy Lutomirski wrote:

> What is the actual problem?  We’re not actually demand-faulting this data, are we?  Are we just overrunning the buffer because the from_user helpers are too clever?  Can we fix it for real by having the fancy helpers do *aligned* loads so that they don’t overrun the buffer?  Heck, this might be faster, too.

Unaligned _stores_ are not any cheaper, and you'd get one hell of
extra arithmetics from trying to avoid both.  Check something
like e.g. memcpy() on alpha, where you really have to keep all
accesses aligned, both on load and on store side.

Can't we just pad the buffers a bit?  Making sure that name_buf
and symlink_buf are _not_ followed by unmapped pages shouldn't
be hard.  Both are allocated by kmalloc(), so...

What am I missing here?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user
  2019-02-16 23:47       ` Al Viro
@ 2019-02-17  0:02         ` Al Viro
  2019-02-17  2:36         ` Andy Lutomirski
  2019-02-17  3:41         ` Arthur Gautier
  2 siblings, 0 replies; 23+ messages in thread
From: Al Viro @ 2019-02-17  0:02 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Thomas Gleixner, Jann Horn, baloo, the arch/x86 maintainers,
	Ingo Molnar, Borislav Petkov, kernel list, Pascal Bouchareine

On Sat, Feb 16, 2019 at 11:47:02PM +0000, Al Viro wrote:
> On Sat, Feb 16, 2019 at 02:50:15PM -0800, Andy Lutomirski wrote:
> 
> > What is the actual problem?  We’re not actually demand-faulting this data, are we?  Are we just overrunning the buffer because the from_user helpers are too clever?  Can we fix it for real by having the fancy helpers do *aligned* loads so that they don’t overrun the buffer?  Heck, this might be faster, too.
> 
> Unaligned _stores_ are not any cheaper, and you'd get one hell of
> extra arithmetics from trying to avoid both.  Check something
> like e.g. memcpy() on alpha, where you really have to keep all
> accesses aligned, both on load and on store side.
> 
> Can't we just pad the buffers a bit?  Making sure that name_buf
> and symlink_buf are _not_ followed by unmapped pages shouldn't
> be hard.  Both are allocated by kmalloc(), so...
> 
> What am I missing here?

... the fact that read_info() might skip copying if everything it
wants is already in the buffer passed by decompressor ;-/  It's
been a while since I looked into that code...

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user
  2019-02-16 23:47       ` Al Viro
  2019-02-17  0:02         ` Al Viro
@ 2019-02-17  2:36         ` Andy Lutomirski
  2019-02-17  3:41         ` Arthur Gautier
  2 siblings, 0 replies; 23+ messages in thread
From: Andy Lutomirski @ 2019-02-17  2:36 UTC (permalink / raw)
  To: Al Viro
  Cc: Thomas Gleixner, Jann Horn, baloo, the arch/x86 maintainers,
	Ingo Molnar, Borislav Petkov, kernel list, Pascal Bouchareine



> On Feb 16, 2019, at 3:47 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> 
>> On Sat, Feb 16, 2019 at 02:50:15PM -0800, Andy Lutomirski wrote:
>> 
>> What is the actual problem?  We’re not actually demand-faulting this data, are we?  Are we just overrunning the buffer because the from_user helpers are too clever?  Can we fix it for real by having the fancy helpers do *aligned* loads so that they don’t overrun the buffer?  Heck, this might be faster, too.
> 
> Unaligned _stores_ are not any cheaper, and you'd get one hell of
> extra arithmetics from trying to avoid both.  Check something
> like e.g. memcpy() on alpha, where you really have to keep all
> accesses aligned, both on load and on store side.

I think we should avoid unaligned loads and do unaligned stores instead.

I would general expect that unaligned stores are a bit cheaper since they don’t need to complete for the comparisons to happen.

But I maintain my claim that this code should not be overrunning its input buffer into the next page, since it could have observable side effects.

> 
> Can't we just pad the buffers a bit?  Making sure that name_buf
> and symlink_buf are _not_ followed by unmapped pages shouldn't
> be hard.  Both are allocated by kmalloc(), so...
> 
> What am I missing here?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user
  2019-02-16 23:47       ` Al Viro
  2019-02-17  0:02         ` Al Viro
  2019-02-17  2:36         ` Andy Lutomirski
@ 2019-02-17  3:41         ` Arthur Gautier
  2019-02-17  4:22           ` Al Viro
  2 siblings, 1 reply; 23+ messages in thread
From: Arthur Gautier @ 2019-02-17  3:41 UTC (permalink / raw)
  To: Al Viro
  Cc: Andy Lutomirski, Thomas Gleixner, Jann Horn,
	the arch/x86 maintainers, Ingo Molnar, Borislav Petkov,
	kernel list, Pascal Bouchareine

On Sat, Feb 16, 2019 at 11:47:02PM +0000, Al Viro wrote:
> On Sat, Feb 16, 2019 at 02:50:15PM -0800, Andy Lutomirski wrote:
> 
> > What is the actual problem?  We’re not actually demand-faulting this data, are we?  Are we just overrunning the buffer because the from_user helpers are too clever?  Can we fix it for real by having the fancy helpers do *aligned* loads so that they don’t overrun the buffer?  Heck, this might be faster, too.
> 
> Unaligned _stores_ are not any cheaper, and you'd get one hell of
> extra arithmetics from trying to avoid both.  Check something
> like e.g. memcpy() on alpha, where you really have to keep all
> accesses aligned, both on load and on store side.
> 
> Can't we just pad the buffers a bit?  Making sure that name_buf
> and symlink_buf are _not_ followed by unmapped pages shouldn't
> be hard.  Both are allocated by kmalloc(), so...

We cannot change alignment rules here. The input buffer string we're
reading is coming from an cpio formated file and the format is
defined by cpio(5).
Nothing much we can do there I'm afraid. Input buffer is defined to
be 4-byte aligned.

-- 
\o/ Arthur
 G  Gandi.net

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user
  2019-02-17  3:41         ` Arthur Gautier
@ 2019-02-17  4:22           ` Al Viro
  2019-02-18 13:04             ` Thomas Gleixner
  0 siblings, 1 reply; 23+ messages in thread
From: Al Viro @ 2019-02-17  4:22 UTC (permalink / raw)
  To: Arthur Gautier
  Cc: Andy Lutomirski, Thomas Gleixner, Jann Horn,
	the arch/x86 maintainers, Ingo Molnar, Borislav Petkov,
	kernel list, Pascal Bouchareine

On Sun, Feb 17, 2019 at 03:41:21AM +0000, Arthur Gautier wrote:
> On Sat, Feb 16, 2019 at 11:47:02PM +0000, Al Viro wrote:
> > On Sat, Feb 16, 2019 at 02:50:15PM -0800, Andy Lutomirski wrote:
> > 
> > > What is the actual problem?  We’re not actually demand-faulting this data, are we?  Are we just overrunning the buffer because the from_user helpers are too clever?  Can we fix it for real by having the fancy helpers do *aligned* loads so that they don’t overrun the buffer?  Heck, this might be faster, too.
> > 
> > Unaligned _stores_ are not any cheaper, and you'd get one hell of
> > extra arithmetics from trying to avoid both.  Check something
> > like e.g. memcpy() on alpha, where you really have to keep all
> > accesses aligned, both on load and on store side.
> > 
> > Can't we just pad the buffers a bit?  Making sure that name_buf
> > and symlink_buf are _not_ followed by unmapped pages shouldn't
> > be hard.  Both are allocated by kmalloc(), so...
> 
> We cannot change alignment rules here. The input buffer string we're
> reading is coming from an cpio formated file and the format is
> defined by cpio(5).
> Nothing much we can do there I'm afraid. Input buffer is defined to
> be 4-byte aligned.

Who says anything about changing the format of the file?  At least
one trivial way to handle that would be this:

diff --git a/init/initramfs.c b/init/initramfs.c
index 7cea802d00ef..edbddfb73106 100644
--- a/init/initramfs.c
+++ b/init/initramfs.c
@@ -265,8 +265,12 @@ static int __init do_header(void)
 		state = Collect;
 		return 0;
 	}
-	if (S_ISREG(mode) || !body_len)
-		read_into(name_buf, N_ALIGN(name_len), GotName);
+	if (S_ISREG(mode) || !body_len) {
+		collect = collected = name_buf;
+		remains = N_ALIGN(name_len);
+		next_state = GotName;
+		state = Collect;
+	}
 	return 0;
 }
 
Another would be to have the buffer passed to flush_buffer() (i.e.
the callback of decompress_fn) allocated with 4 bytes of padding
past the part where the unpacked piece of data is placed for the
callback to find.  As in,

diff --git a/lib/decompress_inflate.c b/lib/decompress_inflate.c
index 63b4b7eee138..ca3f7ecc9b35 100644
--- a/lib/decompress_inflate.c
+++ b/lib/decompress_inflate.c
@@ -48,7 +48,7 @@ STATIC int INIT __gunzip(unsigned char *buf, long len,
 	rc = -1;
 	if (flush) {
 		out_len = 0x8000; /* 32 K */
-		out_buf = malloc(out_len);
+		out_buf = malloc(out_len + 4);
 	} else {
 		if (!out_len)
 			out_len = ((size_t)~0) - (size_t)out_buf; /* no limit */

for gunzip/decompress and similar ones for bzip2, etc.  The contents
layout doesn't have anything to do with that...

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user
  2019-02-17  4:22           ` Al Viro
@ 2019-02-18 13:04             ` Thomas Gleixner
  2019-02-18 19:15               ` Andy Lutomirski
  0 siblings, 1 reply; 23+ messages in thread
From: Thomas Gleixner @ 2019-02-18 13:04 UTC (permalink / raw)
  To: Al Viro
  Cc: Arthur Gautier, Andy Lutomirski, Jann Horn,
	the arch/x86 maintainers, Ingo Molnar, Borislav Petkov,
	kernel list, Pascal Bouchareine

On Sun, 17 Feb 2019, Al Viro wrote:
> On Sun, Feb 17, 2019 at 03:41:21AM +0000, Arthur Gautier wrote:
> Who says anything about changing the format of the file?  At least
> one trivial way to handle that would be this:
> 
> diff --git a/init/initramfs.c b/init/initramfs.c
> index 7cea802d00ef..edbddfb73106 100644
> --- a/init/initramfs.c
> +++ b/init/initramfs.c
> @@ -265,8 +265,12 @@ static int __init do_header(void)
>  		state = Collect;
>  		return 0;
>  	}
> -	if (S_ISREG(mode) || !body_len)
> -		read_into(name_buf, N_ALIGN(name_len), GotName);
> +	if (S_ISREG(mode) || !body_len) {
> +		collect = collected = name_buf;
> +		remains = N_ALIGN(name_len);
> +		next_state = GotName;
> +		state = Collect;
> +	}
>  	return 0;
>  }

That does not help much because that is exactly at the end of the
decompressed image and the decompressor is done. So nothing would collect
the remainder anymore.

> Another would be to have the buffer passed to flush_buffer() (i.e.
> the callback of decompress_fn) allocated with 4 bytes of padding
> past the part where the unpacked piece of data is placed for the
> callback to find.  As in,
> 
> diff --git a/lib/decompress_inflate.c b/lib/decompress_inflate.c
> index 63b4b7eee138..ca3f7ecc9b35 100644
> --- a/lib/decompress_inflate.c
> +++ b/lib/decompress_inflate.c
> @@ -48,7 +48,7 @@ STATIC int INIT __gunzip(unsigned char *buf, long len,
>  	rc = -1;
>  	if (flush) {
>  		out_len = 0x8000; /* 32 K */
> -		out_buf = malloc(out_len);
> +		out_buf = malloc(out_len + 4);

  +8 actually.

>  	} else {
>  		if (!out_len)
>  			out_len = ((size_t)~0) - (size_t)out_buf; /* no limit */
> 
> for gunzip/decompress and similar ones for bzip2, etc.  The contents
> layout doesn't have anything to do with that...

Right. That works nicely.

Thanks,

	tglx


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user
  2019-02-18 13:04             ` Thomas Gleixner
@ 2019-02-18 19:15               ` Andy Lutomirski
  2019-02-18 21:13                 ` Jann Horn
                                   ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: Andy Lutomirski @ 2019-02-18 19:15 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Al Viro, Arthur Gautier, Jann Horn, the arch/x86 maintainers,
	Ingo Molnar, Borislav Petkov, kernel list, Pascal Bouchareine

[-- Attachment #1: Type: text/plain, Size: 1863 bytes --]

On Mon, Feb 18, 2019 at 5:04 AM Thomas Gleixner <tglx@linutronix.de> wrote:

> > Another would be to have the buffer passed to flush_buffer() (i.e.
> > the callback of decompress_fn) allocated with 4 bytes of padding
> > past the part where the unpacked piece of data is placed for the
> > callback to find.  As in,
> >
> > diff --git a/lib/decompress_inflate.c b/lib/decompress_inflate.c
> > index 63b4b7eee138..ca3f7ecc9b35 100644
> > --- a/lib/decompress_inflate.c
> > +++ b/lib/decompress_inflate.c
> > @@ -48,7 +48,7 @@ STATIC int INIT __gunzip(unsigned char *buf, long len,
> >       rc = -1;
> >       if (flush) {
> >               out_len = 0x8000; /* 32 K */
> > -             out_buf = malloc(out_len);
> > +             out_buf = malloc(out_len + 4);
>
>   +8 actually.
>
> >       } else {
> >               if (!out_len)
> >                       out_len = ((size_t)~0) - (size_t)out_buf; /* no limit */
> >
> > for gunzip/decompress and similar ones for bzip2, etc.  The contents
> > layout doesn't have anything to do with that...
>
> Right. That works nicely.
>

This seems like it's just papering over the underlying problem: with
Jann's new checks in place, strncpy_from_user() is simply buggy.  Does
the patch below look decent?  It's only compile-tested, but it's
conceptually straightforward.  I was hoping I could get rid of the
check-maximum-address stuff, but it's needed for architectures where
the user range is adjacent to the kernel range (i.e. not x86_64).

Jann, I'm still unhappy that this code will write up to sizeof(long)-1
user-controlled garbage bytes in-bounds past the null-terminator in
the kernel buffer.  Do you think that's worth changing, too?  I don't
think it's a bug per se, but it seems like a nifty little wart for an
attacker to try to abuse.

On brief inspection, strnlen_user() does not have an equivalent bug.

[-- Attachment #2: strncpy.patch --]
[-- Type: text/x-patch, Size: 1994 bytes --]

diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c
index 58eacd41526c..709d6efe0d42 100644
--- a/lib/strncpy_from_user.c
+++ b/lib/strncpy_from_user.c
@@ -10,12 +10,7 @@
 #include <asm/byteorder.h>
 #include <asm/word-at-a-time.h>
 
-#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
-#define IS_UNALIGNED(src, dst)	0
-#else
-#define IS_UNALIGNED(src, dst)	\
-	(((long) dst | (long) src) & (sizeof(long) - 1))
-#endif
+#define IS_UNALIGNED(addr) (((long)(addr)) & (sizeof(long) - 1))
 
 /*
  * Do a strncpy, return length of string without final '\0'.
@@ -35,14 +30,39 @@ static inline long do_strncpy_from_user(char *dst, const char __user *src, long
 	if (max > count)
 		max = count;
 
-	if (IS_UNALIGNED(src, dst))
+	/*
+	 * First handle any unaligned prefix of src.
+	 */
+	while (max && IS_UNALIGNED(src+res)) {
+		char c;
+
+		unsafe_get_user(c, src+res, efault);
+		dst[res] = c;
+		if (!c)
+			return res;
+		res++;
+		max--;
+	}
+
+	/*
+	 * Now we know that src + res is aligned.  If dst is unaligned and
+	 * we don't have efficient unaligned access, then keep going one
+	 * byte at a time.  (This could be optimized, but it would make
+	 * the code more complicated.
+	 */
+#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+	if (IS_UNALIGNED(dst + res))
 		goto byte_at_a_time;
+#endif
 
 	while (max >= sizeof(unsigned long)) {
+		/*
+		 * src + res is aligned, so the reads in this loop will
+		 * not cross a page boundary.
+		 */
 		unsigned long c, data;
 
-		/* Fall back to byte-at-a-time if we get a page fault */
-		unsafe_get_user(c, (unsigned long __user *)(src+res), byte_at_a_time);
+		unsafe_get_user(c, (unsigned long __user *)(src+res), efault);
 
 		*(unsigned long *)(dst+res) = c;
 		if (has_zero(c, &data, &constants)) {
@@ -54,7 +74,9 @@ static inline long do_strncpy_from_user(char *dst, const char __user *src, long
 		max -= sizeof(unsigned long);
 	}
 
-byte_at_a_time:
+	/*
+	 * Finish the job one byte at a time.
+	 */
 	while (max) {
 		char c;
 

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user
  2019-02-18 19:15               ` Andy Lutomirski
@ 2019-02-18 21:13                 ` Jann Horn
  2019-02-18 21:51                 ` Arthur Gautier
  2019-09-26 14:08                 ` Borislav Petkov
  2 siblings, 0 replies; 23+ messages in thread
From: Jann Horn @ 2019-02-18 21:13 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Thomas Gleixner, Al Viro, Arthur Gautier,
	the arch/x86 maintainers, Ingo Molnar, Borislav Petkov,
	kernel list, Pascal Bouchareine

On Mon, Feb 18, 2019 at 8:15 PM Andy Lutomirski <luto@amacapital.net> wrote:
> On Mon, Feb 18, 2019 at 5:04 AM Thomas Gleixner <tglx@linutronix.de> wrote:
> > > Another would be to have the buffer passed to flush_buffer() (i.e.
> > > the callback of decompress_fn) allocated with 4 bytes of padding
> > > past the part where the unpacked piece of data is placed for the
> > > callback to find.  As in,
> > >
> > > diff --git a/lib/decompress_inflate.c b/lib/decompress_inflate.c
> > > index 63b4b7eee138..ca3f7ecc9b35 100644
> > > --- a/lib/decompress_inflate.c
> > > +++ b/lib/decompress_inflate.c
> > > @@ -48,7 +48,7 @@ STATIC int INIT __gunzip(unsigned char *buf, long len,
> > >       rc = -1;
> > >       if (flush) {
> > >               out_len = 0x8000; /* 32 K */
> > > -             out_buf = malloc(out_len);
> > > +             out_buf = malloc(out_len + 4);
> >
> >   +8 actually.
> >
> > >       } else {
> > >               if (!out_len)
> > >                       out_len = ((size_t)~0) - (size_t)out_buf; /* no limit */
> > >
> > > for gunzip/decompress and similar ones for bzip2, etc.  The contents
> > > layout doesn't have anything to do with that...
> >
> > Right. That works nicely.
> >
>
> This seems like it's just papering over the underlying problem: with
> Jann's new checks in place, strncpy_from_user() is simply buggy.  Does
> the patch below look decent?  It's only compile-tested, but it's
> conceptually straightforward.

Nit: The "This could be optimized" comment has unbalanced parentheses around it.
Nit: Maybe switch the order of "max" and "IS_UNALIGNED(src+res)" in
the loop for unaligned prefix, or tell the compiler that "max" is
likely to be non-zero? That might be a tiny bit more efficient in the
aligned case? But I don't know much about making code fast, so feel
free to ignore this.

I think there is still similar code in some other places in arch/x86;
back when I did the conversion to _ASM_EXTABLE_UA, I stumbled over a
few similar-looking things. In particular:

load_unaligned_zeropad() (used in some VFS code that looks pretty
performance-critical) still uses _ASM_EXTABLE for reading from kernel
memory; it has a comment saying:
/*
 * Load an unaligned word from kernel space.
 *
 * In the (very unlikely) case of the word being a page-crosser
 * and the next page not being mapped, take the exception and
 * return zeroes in the non-existing part.
 */

csum_partial_copy_generic() uses PREFETCHT0 together with the
following macro; I think this can be used on both kernel and user
addresses?
        /*
         * No _ASM_EXTABLE_UA; this is used for intentional prefetch on a
         * potentially unmapped kernel address.
         */
        .macro ignore L=.Lignore
30:
        _ASM_EXTABLE(30b, \L)
        .endm
(That comment says "kernel address", but I wrote that, and I think
what I really meant is "kernel or userspace address".)

> I was hoping I could get rid of the
> check-maximum-address stuff, but it's needed for architectures where
> the user range is adjacent to the kernel range (i.e. not x86_64).

> Jann, I'm still unhappy that this code will write up to sizeof(long)-1
> user-controlled garbage bytes in-bounds past the null-terminator in
> the kernel buffer.  Do you think that's worth changing, too?  I don't
> think it's a bug per se, but it seems like a nifty little wart for an
> attacker to try to abuse.

Hm. I guess it might be interesting if some code path first memsets a
kernel buffer to all-zeroes, then uses strncpy_from_user() to copy
into it, and then exposes the entire buffer to a different user task
(or, if strncpy_from_user() was called on kernel memory, to any user
task).

And if the source is kernel memory (might happen e.g. via splice,
there are VFS write handlers that use strncpy_from_user()), it could
potentially be used to make an uninitialized memory read bug
(constrained to a specific field in some slab object, or something
like that) more powerful, by shoving out-of-bounds kernel data around
such that it is aligned properly for the leak.

So, yeah, I think if it's not too much trouble, changing that would be nice.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user
  2019-02-18 19:15               ` Andy Lutomirski
  2019-02-18 21:13                 ` Jann Horn
@ 2019-02-18 21:51                 ` Arthur Gautier
  2019-09-26  9:58                   ` Arthur Gautier
  2019-09-26 14:08                 ` Borislav Petkov
  2 siblings, 1 reply; 23+ messages in thread
From: Arthur Gautier @ 2019-02-18 21:51 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Thomas Gleixner, Al Viro, Jann Horn, the arch/x86 maintainers,
	Ingo Molnar, Borislav Petkov, kernel list, Pascal Bouchareine

On Mon, Feb 18, 2019 at 11:15:44AM -0800, Andy Lutomirski wrote:
> This seems like it's just papering over the underlying problem: with
> Jann's new checks in place, strncpy_from_user() is simply buggy.  Does
> the patch below look decent?  It's only compile-tested, but it's
> conceptually straightforward.  I was hoping I could get rid of the
> check-maximum-address stuff, but it's needed for architectures where
> the user range is adjacent to the kernel range (i.e. not x86_64).

I'm unable to trigger the BUG I had with my initramfs with this patch
applied. Thanks!

-- 
\o/ Arthur
 G  Gandi.net

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user
  2019-02-18 21:51                 ` Arthur Gautier
@ 2019-09-26  9:58                   ` Arthur Gautier
  2019-09-26 14:09                     ` Borislav Petkov
  0 siblings, 1 reply; 23+ messages in thread
From: Arthur Gautier @ 2019-09-26  9:58 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Thomas Gleixner, Al Viro, Jann Horn, the arch/x86 maintainers,
	Ingo Molnar, Borislav Petkov, kernel list

On Mon, Feb 18, 2019 at 09:51:50PM +0000, Arthur Gautier wrote:
> On Mon, Feb 18, 2019 at 11:15:44AM -0800, Andy Lutomirski wrote:
> > This seems like it's just papering over the underlying problem: with
> > Jann's new checks in place, strncpy_from_user() is simply buggy.  Does
> > the patch below look decent?  It's only compile-tested, but it's
> > conceptually straightforward.  I was hoping I could get rid of the
> > check-maximum-address stuff, but it's needed for architectures where
> > the user range is adjacent to the kernel range (i.e. not x86_64).
> 
> I'm unable to trigger the BUG I had with my initramfs with this patch
> applied. Thanks!
> 

Hello All,

Just a followup on this issue, I'm still able to reproduce the original
issue with:
    truncate -s 8388313 a
    SECONDFILENAME=bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
    truncate -s 10 $SECONDFILENAME
    echo "a\n$SECONDFILENAME" | cpio -o --format=newc | lz4 -l > initrd.img.lz4

I think Andy submitted a patch Feb 25 2019, but I was not copied on it
(I believe it was sent to x86@kernel.org) and I don't know which fate it
had.

Any chance we could have a look again?

Thanks a lot!

-- 
\o/ Arthur
 G  Gandi.net

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user
  2019-02-18 19:15               ` Andy Lutomirski
  2019-02-18 21:13                 ` Jann Horn
  2019-02-18 21:51                 ` Arthur Gautier
@ 2019-09-26 14:08                 ` Borislav Petkov
  2 siblings, 0 replies; 23+ messages in thread
From: Borislav Petkov @ 2019-09-26 14:08 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Thomas Gleixner, Al Viro, Arthur Gautier, Jann Horn,
	the arch/x86 maintainers, Ingo Molnar, kernel list,
	Pascal Bouchareine

On Mon, Feb 18, 2019 at 11:15:44AM -0800, Andy Lutomirski wrote:
> diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c
> index 58eacd41526c..709d6efe0d42 100644
> --- a/lib/strncpy_from_user.c
> +++ b/lib/strncpy_from_user.c
> @@ -10,12 +10,7 @@
>  #include <asm/byteorder.h>
>  #include <asm/word-at-a-time.h>
>  
> -#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> -#define IS_UNALIGNED(src, dst)	0
> -#else
> -#define IS_UNALIGNED(src, dst)	\
> -	(((long) dst | (long) src) & (sizeof(long) - 1))
> -#endif
> +#define IS_UNALIGNED(addr) (((long)(addr)) & (sizeof(long) - 1))
>  
>  /*
>   * Do a strncpy, return length of string without final '\0'.
> @@ -35,14 +30,39 @@ static inline long do_strncpy_from_user(char *dst, const char __user *src, long
>  	if (max > count)
>  		max = count;
>  
> -	if (IS_UNALIGNED(src, dst))
> +	/*
> +	 * First handle any unaligned prefix of src.
> +	 */
> +	while (max && IS_UNALIGNED(src+res)) {

put spaces around the '+', below too.

> +		char c;
> +
> +		unsafe_get_user(c, src+res, efault);
> +		dst[res] = c;
> +		if (!c)
> +			return res;
> +		res++;
> +		max--;
> +	}
> +
> +	/*
> +	 * Now we know that src + res is aligned.  If dst is unaligned and
> +	 * we don't have efficient unaligned access, then keep going one
> +	 * byte at a time.  (This could be optimized, but it would make
> +	 * the code more complicated.
> +	 */
> +#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
> +	if (IS_UNALIGNED(dst + res))
>  		goto byte_at_a_time;
> +#endif
>  
>  	while (max >= sizeof(unsigned long)) {
> +		/*
> +		 * src + res is aligned, so the reads in this loop will
> +		 * not cross a page boundary.
> +		 */
>  		unsigned long c, data;
>  
> -		/* Fall back to byte-at-a-time if we get a page fault */
> -		unsafe_get_user(c, (unsigned long __user *)(src+res), byte_at_a_time);
> +		unsafe_get_user(c, (unsigned long __user *)(src+res), efault);
>  
>  		*(unsigned long *)(dst+res) = c;
>  		if (has_zero(c, &data, &constants)) {
> @@ -54,7 +74,9 @@ static inline long do_strncpy_from_user(char *dst, const char __user *src, long
>  		max -= sizeof(unsigned long);
>  	}
>  
> -byte_at_a_time:

You can't remove that label - the ifndef above.

> +	/*
> +	 * Finish the job one byte at a time.
> +	 */
>  	while (max) {
>  		char c;

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user
  2019-09-26  9:58                   ` Arthur Gautier
@ 2019-09-26 14:09                     ` Borislav Petkov
  2019-10-10 16:49                       ` Arthur Gautier
  0 siblings, 1 reply; 23+ messages in thread
From: Borislav Petkov @ 2019-09-26 14:09 UTC (permalink / raw)
  To: Arthur Gautier
  Cc: Andy Lutomirski, Thomas Gleixner, Al Viro, Jann Horn,
	the arch/x86 maintainers, Ingo Molnar, kernel list

On Thu, Sep 26, 2019 at 09:58:25AM +0000, Arthur Gautier wrote:
> I think Andy submitted a patch Feb 25 2019, but I was not copied on it
> (I believe it was sent to x86@kernel.org) and I don't know which fate it
> had.

I guess we're still waiting for Andy to do v2 with feedback incorporated
and proper commit message. :)

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user
  2019-09-26 14:09                     ` Borislav Petkov
@ 2019-10-10 16:49                       ` Arthur Gautier
  2019-11-05 14:11                         ` Borislav Petkov
  0 siblings, 1 reply; 23+ messages in thread
From: Arthur Gautier @ 2019-10-10 16:49 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Andy Lutomirski, Thomas Gleixner, Al Viro, Jann Horn,
	the arch/x86 maintainers, Ingo Molnar, kernel list

On Thu, Sep 26, 2019 at 04:09:39PM +0200, Borislav Petkov wrote:
> On Thu, Sep 26, 2019 at 09:58:25AM +0000, Arthur Gautier wrote:
> > I think Andy submitted a patch Feb 25 2019, but I was not copied on it
> > (I believe it was sent to x86@kernel.org) and I don't know which fate it
> > had.
> 
> I guess we're still waiting for Andy to do v2 with feedback incorporated
> and proper commit message. :)

Hello All,

I did not receive neither the patch Andy provided, nor the comments made
on it. But I'd be happy to help and/or take over to fix those if someone could
send me both.

I'd really like to have those problems fixed

Thanks!

-- 
\o/ Arthur
 G  Gandi.net

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user
  2019-10-10 16:49                       ` Arthur Gautier
@ 2019-11-05 14:11                         ` Borislav Petkov
  2019-11-05 16:05                           ` Arthur Gautier
  0 siblings, 1 reply; 23+ messages in thread
From: Borislav Petkov @ 2019-11-05 14:11 UTC (permalink / raw)
  To: Arthur Gautier
  Cc: Andy Lutomirski, Thomas Gleixner, Al Viro, Jann Horn,
	the arch/x86 maintainers, Ingo Molnar, kernel list

On Thu, Oct 10, 2019 at 04:49:51PM +0000, Arthur Gautier wrote:
> I did not receive neither the patch Andy provided, nor the comments made
> on it. But I'd be happy to help and/or take over to fix those if someone could
> send me both.

Yes, please do, it seems Andy's busy. You can find the whole thread here:

https://lore.kernel.org/lkml/20190215235901.23541-1-baloo@gandi.net/

and you can download it in mbox format.

Care to take Andy's patch, work in the comments I made to it, test it,
write a commit message, i.e., productize it?

So that we get this thing moving...

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user
  2019-11-05 14:11                         ` Borislav Petkov
@ 2019-11-05 16:05                           ` Arthur Gautier
  2019-11-06  0:21                             ` Andy Lutomirski
  0 siblings, 1 reply; 23+ messages in thread
From: Arthur Gautier @ 2019-11-05 16:05 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Andy Lutomirski, Thomas Gleixner, Al Viro, Jann Horn,
	the arch/x86 maintainers, Ingo Molnar, kernel list

On Tue, Nov 05, 2019 at 03:11:12PM +0100, Borislav Petkov wrote:
> On Thu, Oct 10, 2019 at 04:49:51PM +0000, Arthur Gautier wrote:
> > I did not receive neither the patch Andy provided, nor the comments made
> > on it. But I'd be happy to help and/or take over to fix those if someone could
> > send me both.
> 
> Yes, please do, it seems Andy's busy. You can find the whole thread here:
> 
> https://lore.kernel.org/lkml/20190215235901.23541-1-baloo@gandi.net/
> 
> and you can download it in mbox format.
> 
> Care to take Andy's patch, work in the comments I made to it, test it,
> write a commit message, i.e., productize it?
> 
> So that we get this thing moving...
> 
> Thx.
> 

Hello Boris,

Thank you! But I believe this is the patch I sent, I know Andy sent a
patchset with two patches, I believe privately (not copied to a public
ML) to some of the recepients here. I got a copy of the second patch
but not the first one.

I believe from discussions here, that comments have been made on those
patchset and because I was not Cc-ed on those patches, I do not have
neither the full patchset nor the comments.

I cannot take over the work, nor finish the patchset.

Would anyone have a copy of the thread and could send them my way?

Thank you so much!

-- 
\o/ Arthur
 G  Gandi.net

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user
  2019-11-05 16:05                           ` Arthur Gautier
@ 2019-11-06  0:21                             ` Andy Lutomirski
  2019-11-06  2:22                               ` Arthur Gautier
  0 siblings, 1 reply; 23+ messages in thread
From: Andy Lutomirski @ 2019-11-06  0:21 UTC (permalink / raw)
  To: Arthur Gautier
  Cc: Borislav Petkov, Thomas Gleixner, Al Viro, Jann Horn,
	the arch/x86 maintainers, Ingo Molnar, kernel list

On Tue, Nov 5, 2019 at 8:05 AM Arthur Gautier <baloo@gandi.net> wrote:
>
> On Tue, Nov 05, 2019 at 03:11:12PM +0100, Borislav Petkov wrote:
> > On Thu, Oct 10, 2019 at 04:49:51PM +0000, Arthur Gautier wrote:
> > > I did not receive neither the patch Andy provided, nor the comments made
> > > on it. But I'd be happy to help and/or take over to fix those if someone could
> > > send me both.
> >
> > Yes, please do, it seems Andy's busy. You can find the whole thread here:
> >
> > https://lore.kernel.org/lkml/20190215235901.23541-1-baloo@gandi.net/
> >
> > and you can download it in mbox format.
> >
> > Care to take Andy's patch, work in the comments I made to it, test it,
> > write a commit message, i.e., productize it?
> >
> > So that we get this thing moving...
> >
> > Thx.
> >
>
> Hello Boris,
>
> Thank you! But I believe this is the patch I sent, I know Andy sent a
> patchset with two patches, I believe privately (not copied to a public
> ML) to some of the recepients here. I got a copy of the second patch
> but not the first one.
>
> I believe from discussions here, that comments have been made on those
> patchset and because I was not Cc-ed on those patches, I do not have
> neither the full patchset nor the comments.
>
> I cannot take over the work, nor finish the patchset.
>
> Would anyone have a copy of the thread and could send them my way?
>

I forwarded it to you.

Here are the patches in git:

https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/log/?h=task_size

it's "unaligned", "fix test sparse warning", optionally
"strncpy_from_user: Zero any extra output bytes that get written",
"uaccess: Add a selftest for strncpy_from_user()", and
"strncpy_from_user: Don't overrun the input buffer onto the next page"

The basic summary is that Linus didn't like calling it bug fix, but it
might be acceptable as an improvement.  I also thought that the
KERNEL_DS oops was changed so it didn't trigger here.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] x86: uaccess: fix regression in unsafe_get_user
  2019-11-06  0:21                             ` Andy Lutomirski
@ 2019-11-06  2:22                               ` Arthur Gautier
  0 siblings, 0 replies; 23+ messages in thread
From: Arthur Gautier @ 2019-11-06  2:22 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Borislav Petkov, Thomas Gleixner, Al Viro, Jann Horn,
	the arch/x86 maintainers, Ingo Molnar, kernel list

On Tue, Nov 05, 2019 at 04:21:06PM -0800, Andy Lutomirski wrote:
> On Tue, Nov 5, 2019 at 8:05 AM Arthur Gautier <baloo@gandi.net> wrote:
> >
> > On Tue, Nov 05, 2019 at 03:11:12PM +0100, Borislav Petkov wrote:
> > > On Thu, Oct 10, 2019 at 04:49:51PM +0000, Arthur Gautier wrote:
> > > > I did not receive neither the patch Andy provided, nor the comments made
> > > > on it. But I'd be happy to help and/or take over to fix those if someone could
> > > > send me both.
> > >
> > > Yes, please do, it seems Andy's busy. You can find the whole thread here:
> > >
> > > https://lore.kernel.org/lkml/20190215235901.23541-1-baloo@gandi.net/
> > >
> > > and you can download it in mbox format.
> > >
> > > Care to take Andy's patch, work in the comments I made to it, test it,
> > > write a commit message, i.e., productize it?
> > >
> > > So that we get this thing moving...
> > >
> > > Thx.
> > >
> >
> > Hello Boris,
> >
> > Thank you! But I believe this is the patch I sent, I know Andy sent a
> > patchset with two patches, I believe privately (not copied to a public
> > ML) to some of the recepients here. I got a copy of the second patch
> > but not the first one.
> >
> > I believe from discussions here, that comments have been made on those
> > patchset and because I was not Cc-ed on those patches, I do not have
> > neither the full patchset nor the comments.
> >
> > I cannot take over the work, nor finish the patchset.
> >
> > Would anyone have a copy of the thread and could send them my way?
> >
> 
> I forwarded it to you.
> 
> Here are the patches in git:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/log/?h=task_size
> 
> it's "unaligned", "fix test sparse warning", optionally
> "strncpy_from_user: Zero any extra output bytes that get written",
> "uaccess: Add a selftest for strncpy_from_user()", and
> "strncpy_from_user: Don't overrun the input buffer onto the next page"
> 
> The basic summary is that Linus didn't like calling it bug fix, but it
> might be acceptable as an improvement.  I also thought that the
> KERNEL_DS oops was changed so it didn't trigger here.

Thank you so much!

-- 
\o/ Arthur
 G  Gandi.net

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2019-11-06  2:22 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-15 23:59 [PATCH] x86: uaccess: fix regression in unsafe_get_user baloo
2019-02-16  4:20 ` Jann Horn
2019-02-16 20:18   ` Thomas Gleixner
2019-02-16 22:25     ` Thomas Gleixner
2019-02-16 22:50     ` Andy Lutomirski
2019-02-16 22:57       ` Andy Lutomirski
2019-02-16 23:47       ` Al Viro
2019-02-17  0:02         ` Al Viro
2019-02-17  2:36         ` Andy Lutomirski
2019-02-17  3:41         ` Arthur Gautier
2019-02-17  4:22           ` Al Viro
2019-02-18 13:04             ` Thomas Gleixner
2019-02-18 19:15               ` Andy Lutomirski
2019-02-18 21:13                 ` Jann Horn
2019-02-18 21:51                 ` Arthur Gautier
2019-09-26  9:58                   ` Arthur Gautier
2019-09-26 14:09                     ` Borislav Petkov
2019-10-10 16:49                       ` Arthur Gautier
2019-11-05 14:11                         ` Borislav Petkov
2019-11-05 16:05                           ` Arthur Gautier
2019-11-06  0:21                             ` Andy Lutomirski
2019-11-06  2:22                               ` Arthur Gautier
2019-09-26 14:08                 ` Borislav Petkov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).