* [RFC PATCH -rc] gcov: Protect from uninitialized number of functions provided by GCC
@ 2020-08-27 13:39 Leon Romanovsky
  2020-08-27 15:13 ` Leon Romanovsky
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Leon Romanovsky @ 2020-08-27 13:39 UTC (permalink / raw)
  To: Peter Oberparleiter, Andrew Morton
  Cc: Leon Romanovsky, Colin Ian King, linux-kernel, linux-mm

From: Leon Romanovsky <leonro@nvidia.com>

The kernel compiled with GCC 10.2.1 and KASAN together with GCOV enabled
produces the following splats while reloading modules.

The first splat [1] is generated because gcov_info can be either a user or a
kernel pointer, so the memcpy() inside kmemdup() triggers the report.
As a possible solution, copy the fields manually.

The second splat [2] is seen because n_functions, provided by GCC through
__gcov_init(), is ridiculously high; in my case it was 2698213824.
IMHO it means that this field is not initialized, but I'm not sure.

[1]
 ==================================================================
 BUG: KASAN: global-out-of-bounds in kmemdup+0x43/0x70
 Read of size 120 at addr ffffffffa0d2c780 by task modprobe/296

 CPU: 0 PID: 296 Comm: modprobe Not tainted 5.9.0-rc1+ #1860
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04 /01/2014
 Call Trace:
  ? dump_stack+0x128/0x1af
  ? print_address_description.constprop.0+0x2c/0x3f0
  ? _raw_spin_lock_irqsave+0x34/0xa0
  ? __kasan_check_read+0x1d/0x30
  ? kmemdup+0x43/0x70
  ? kmemdup+0x43/0x70
  ? gcov_info_dup+0x2d/0x730
  ? __kasan_check_write+0x20/0x30
  ? __mutex_unlock_slowpath+0x10d/0x740
  ? gcov_event+0x88d/0xd30
  ? gcov_module_notifier+0xe9/0x100
  ? notifier_call_chain+0xeb/0x170
  ? blocking_notifier_call_chain+0x75/0xc0
  ? __x64_sys_delete_module+0x326/0x5a0
  ? do_init_module+0x810/0x810
  ? syscall_enter_from_user_mode+0x40/0x420
  ? trace_hardirqs_on+0x45/0xb0
  ? syscall_enter_from_user_mode+0x40/0x420
  ? do_syscall_64+0x45/0x70
  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9

 The buggy address belongs to the variable:
  __gcov_.uverbs_attr_get_obj+0x60/0xfffffffffff778e0 [mlx5_ib]

 Memory state around the buggy address:
  ffffffffa0d2c680: 00 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 f9 f9 f9
  ffffffffa0d2c700: f9 f9 f9 f9 00 00 00 00 00 f9 f9 f9 f9 f9 f9 f9
 >ffffffffa0d2c780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f9 f9
                                                              ^
  ffffffffa0d2c800: f9 f9 f9 f9 00 00 00 00 00 f9 f9 f9 f9 f9 f9 f9
  ffffffffa0d2c880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ==================================================================
 Disabling lock debugging due to kernel taint
 gcov: could not save data for '/home/leonro/src/kernel/drivers/infiniband/hw/mlx5/std_types.gcda' (out of memory)

[2]
Colin has a similar error [3].

 ------------[ cut here ]------------
 WARNING: CPU: 0 PID: 296 at mm/page_alloc.c:4859 __alloc_pages_nodemask+0x670/0x3190
 Modules linked in: mlx5_ib(-) mlx5_core mlxfw ptp ib_ipoib pps_core rdma_ucm rdma_cm iw_cm ib_cm ib_umad  ib_uverbs ib_core
 CPU: 0 PID: 296 Comm: modprobe Tainted: G    B             5.9.0-rc1+ #1860
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04 /01/2014
 RIP: 0010:__alloc_pages_nodemask+0x670/0x3190
 Code: e9 af fc ff ff 48 83 05 fd 28 90 05 01 81 e7 00 20 00 00 48 c7 44 24 28 00 00 00 00 0f 85 fb fd ff  ff 48 83 05 f0 28 90 05 01 <0f> 0b 48 83 05 ee 28 90 05 01 48 83 05 ee 28 90 05 01 e9 dc fd ff
 RSP: 0018:ffff88805f7ffa28 EFLAGS: 00010202
 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 1ffff1100befff5e
 RDX: 0000000000000000 RSI: 0000000000000017 RDI: 0000000000000000
 RBP: 000000050695a900 R08: ffff888060fc7900 R09: ffff888060fc793b
 R10: ffffed100c1f8f27 R11: ffffed100c1f8f28 R12: 0000000000040dc0
 R13: 000000050695a900 R14: 0000000000000017 R15: 0000000000000001
 FS:  00007f521f695740(0000) GS:ffff88806ce00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f31b013f000 CR3: 000000006637e001 CR4: 0000000000370eb0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  ? __kmalloc_track_caller+0x17a/0x570
  ? gcov_info_dup+0xfe/0x730
  ? gcov_event+0x88d/0xd30
  ? gcov_module_notifier+0xe9/0x100
  ? blocking_notifier_call_chain+0x75/0xc0
  ? __x64_sys_delete_module+0x326/0x5a0
  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
  ? mark_lock+0xba0/0xba0
  ? mark_lock+0xba0/0xba0
  ? notifier_call_chain+0xeb/0x170
  ? blocking_notifier_call_chain+0x75/0xc0
  ? __x64_sys_delete_module+0x326/0x5a0
  ? do_syscall_64+0x45/0x70
  ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
  ? warn_alloc+0x130/0x130
  ? lock_acquire+0x1f2/0xa30
  ? fs_reclaim_acquire+0x1f/0x70
  ? fs_reclaim_release+0x1f/0x50
  ? __kasan_check_read+0x1d/0x30
  ? reacquire_held_locks+0x420/0x420
  ? reacquire_held_locks+0x420/0x420
  kmalloc_order+0x3f/0xc0
  kmalloc_order_trace+0x24/0x220
  __kmalloc+0x41b/0x5a0
  ? gcov_info_dup+0xfe/0x730
  ? memcpy+0x73/0xa0
  gcov_info_dup+0x176/0x730
  gcov_event+0x88d/0xd30
  gcov_module_notifier+0xe9/0x100
  notifier_call_chain+0xeb/0x170
  blocking_notifier_call_chain+0x75/0xc0
  __x64_sys_delete_module+0x326/0x5a0
  ? do_init_module+0x810/0x810
  ? syscall_enter_from_user_mode+0x40/0x420
  ? trace_hardirqs_on+0x45/0xb0
  ? syscall_enter_from_user_mode+0x40/0x420
  do_syscall_64+0x45/0x70
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 RIP: 0033:0x7f521f7c531b
 Code: 73 01 c3 48 8b 0d 7d 0b 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 4d 0b 0c 00 f7 d8 64 89 01 48
 RSP: 002b:00007ffe1bd4af48 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
 RAX: ffffffffffffffda RBX: 0000561a3eae0910 RCX: 00007f521f7c531b
 RDX: 0000000000000000 RSI: 0000000000000800 RDI: 0000561a3eae0978
 RBP: 0000561a3eae0910 R08: 1999999999999999 R09: 0000000000000000
 R10: 00007f521f839ac0 R11: 0000000000000206 R12: 0000000000000000
 R13: 0000561a3eae0978 R14: 0000000000000000 R15: 0000561a3eae84d0
 irq event stamp: 326464
 hardirqs last  enabled at (326463): [<ffffffff832ecdde>] _raw_spin_unlock_irqrestore+0x8e/0xb0
 hardirqs last disabled at (326464): [<ffffffff832ec994>] _raw_spin_lock_irqsave+0x34/0xa0
 hardirqs last disabled at (326464): [<ffffffff832ec994>] _raw_spin_lock_irqsave+0x34/0xa0
 softirqs last  enabled at (320794): [<ffffffff83600931>] __do_softirq+0x931/0xbc4
 softirqs last disabled at (320789): [<ffffffff83400f2f>] asm_call_on_stack+0xf/0x20
 ---[ end trace 065ea9cc2ba144a6 ]---

[3] https://bugzilla.kernel.org/show_bug.cgi?id=208885#c1
Cc: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
I have a strong feeling that this solution is not correct, but I don't
know how to do it right. The problem exists and is reproducible in seconds.
---
 kernel/gcov/gcc_4_7.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/kernel/gcov/gcc_4_7.c b/kernel/gcov/gcc_4_7.c
index 908fdf5098c3..357ef839cdd3 100644
--- a/kernel/gcov/gcc_4_7.c
+++ b/kernel/gcov/gcc_4_7.c
@@ -275,20 +275,23 @@ struct gcov_info *gcov_info_dup(struct gcov_info *info)
 	size_t fi_size; /* function info size */
 	size_t cv_size; /* counter values size */

-	dup = kmemdup(info, sizeof(*dup), GFP_KERNEL);
+	dup = kzalloc(sizeof(*dup), GFP_KERNEL);
 	if (!dup)
 		return NULL;

-	dup->next = NULL;
-	dup->filename = NULL;
-	dup->functions = NULL;
+	dup->version = info->version;
+	dup->stamp = info->stamp;
+	for (fi_idx = 0; i < GCOV_COUNTERS; i++)
+		dup->merge[i] = info->merge[i];
+	dup->n_functions = info->n_functions;

-	dup->filename = kstrdup(info->filename, GFP_KERNEL);
+	dup->filename = kstrdup_const(info->filename, GFP_KERNEL);
 	if (!dup->filename)
 		goto err_free;

-	dup->functions = kcalloc(info->n_functions,
-				 sizeof(struct gcov_fn_info *), GFP_KERNEL);
+	dup->functions =
+		kcalloc(info->n_functions, sizeof(struct gcov_fn_info *),
+			GFP_KERNEL | __GFP_NOWARN);
 	if (!dup->functions)
 		goto err_free;

@@ -359,7 +362,7 @@ void gcov_info_free(struct gcov_info *info)

 free_info:
 	kfree(info->functions);
-	kfree(info->filename);
+	kfree_const(info->filename);
 	kfree(info);
 }

--
2.26.2



* Re: [RFC PATCH -rc] gcov: Protect from uninitialized number of functions provided by GCC
  2020-08-27 13:39 [RFC PATCH -rc] gcov: Protect from uninitialized number of functions provided by GCC Leon Romanovsky
@ 2020-08-27 15:13 ` Leon Romanovsky
  2020-08-27 21:08 ` kernel test robot
  2020-08-29 14:12 ` Colin Ian King
  2 siblings, 0 replies; 6+ messages in thread
From: Leon Romanovsky @ 2020-08-27 15:13 UTC (permalink / raw)
  To: Peter Oberparleiter, Andrew Morton; +Cc: Colin Ian King, linux-kernel, linux-mm

On Thu, Aug 27, 2020 at 04:39:32PM +0300, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@nvidia.com>
>
> The kernel compiled with GCC 10.2.1 and KASAN together with GCOV enabled
> produces the following splats while reloading modules.
>
> The first splat [1] is generated because gcov_info can be either a user or a
> kernel pointer, so the memcpy() inside kmemdup() triggers the report.
> As a possible solution, copy the fields manually.
>
> The second splat [2] is seen because n_functions, provided by GCC through
> __gcov_init(), is ridiculously high; in my case it was 2698213824.
> IMHO it means that this field is not initialized, but I'm not sure.
>
> [1]
>  ==================================================================
>  BUG: KASAN: global-out-of-bounds in kmemdup+0x43/0x70
>  Read of size 120 at addr ffffffffa0d2c780 by task modprobe/296
>
>  CPU: 0 PID: 296 Comm: modprobe Not tainted 5.9.0-rc1+ #1860
>  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04 /01/2014
>  Call Trace:
>   ? dump_stack+0x128/0x1af
>   ? print_address_description.constprop.0+0x2c/0x3f0
>   ? _raw_spin_lock_irqsave+0x34/0xa0
>   ? __kasan_check_read+0x1d/0x30
>   ? kmemdup+0x43/0x70
>   ? kmemdup+0x43/0x70
>   ? gcov_info_dup+0x2d/0x730
>   ? __kasan_check_write+0x20/0x30
>   ? __mutex_unlock_slowpath+0x10d/0x740
>   ? gcov_event+0x88d/0xd30
>   ? gcov_module_notifier+0xe9/0x100
>   ? notifier_call_chain+0xeb/0x170
>   ? blocking_notifier_call_chain+0x75/0xc0
>   ? __x64_sys_delete_module+0x326/0x5a0
>   ? do_init_module+0x810/0x810
>   ? syscall_enter_from_user_mode+0x40/0x420
>   ? trace_hardirqs_on+0x45/0xb0
>   ? syscall_enter_from_user_mode+0x40/0x420
>   ? do_syscall_64+0x45/0x70
>   ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
>  The buggy address belongs to the variable:
>   __gcov_.uverbs_attr_get_obj+0x60/0xfffffffffff778e0 [mlx5_ib]
>
>  Memory state around the buggy address:
>   ffffffffa0d2c680: 00 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 f9 f9 f9
>   ffffffffa0d2c700: f9 f9 f9 f9 00 00 00 00 00 f9 f9 f9 f9 f9 f9 f9
>  >ffffffffa0d2c780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f9 f9
>                                                               ^
>   ffffffffa0d2c800: f9 f9 f9 f9 00 00 00 00 00 f9 f9 f9 f9 f9 f9 f9
>   ffffffffa0d2c880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>  ==================================================================
>  Disabling lock debugging due to kernel taint
>  gcov: could not save data for '/home/leonro/src/kernel/drivers/infiniband/hw/mlx5/std_types.gcda' (out of memory)
>
> [2]
> Colin has a similar error [3].
>
>  ------------[ cut here ]------------
>  WARNING: CPU: 0 PID: 296 at mm/page_alloc.c:4859 __alloc_pages_nodemask+0x670/0x3190
>  Modules linked in: mlx5_ib(-) mlx5_core mlxfw ptp ib_ipoib pps_core rdma_ucm rdma_cm iw_cm ib_cm ib_umad  ib_uverbs ib_core
>  CPU: 0 PID: 296 Comm: modprobe Tainted: G    B             5.9.0-rc1+ #1860
>  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04 /01/2014
>  RIP: 0010:__alloc_pages_nodemask+0x670/0x3190
>  Code: e9 af fc ff ff 48 83 05 fd 28 90 05 01 81 e7 00 20 00 00 48 c7 44 24 28 00 00 00 00 0f 85 fb fd ff  ff 48 83 05 f0 28 90 05 01 <0f> 0b 48 83 05 ee 28 90 05 01 48 83 05 ee 28 90 05 01 e9 dc fd ff
>  RSP: 0018:ffff88805f7ffa28 EFLAGS: 00010202
>  RAX: 0000000000000000 RBX: 0000000000000000 RCX: 1ffff1100befff5e
>  RDX: 0000000000000000 RSI: 0000000000000017 RDI: 0000000000000000
>  RBP: 000000050695a900 R08: ffff888060fc7900 R09: ffff888060fc793b
>  R10: ffffed100c1f8f27 R11: ffffed100c1f8f28 R12: 0000000000040dc0
>  R13: 000000050695a900 R14: 0000000000000017 R15: 0000000000000001
>  FS:  00007f521f695740(0000) GS:ffff88806ce00000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  CR2: 00007f31b013f000 CR3: 000000006637e001 CR4: 0000000000370eb0
>  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>  Call Trace:
>   ? __kmalloc_track_caller+0x17a/0x570
>   ? gcov_info_dup+0xfe/0x730
>   ? gcov_event+0x88d/0xd30
>   ? gcov_module_notifier+0xe9/0x100
>   ? blocking_notifier_call_chain+0x75/0xc0
>   ? __x64_sys_delete_module+0x326/0x5a0
>   ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
>   ? mark_lock+0xba0/0xba0
>   ? mark_lock+0xba0/0xba0
>   ? notifier_call_chain+0xeb/0x170
>   ? blocking_notifier_call_chain+0x75/0xc0
>   ? __x64_sys_delete_module+0x326/0x5a0
>   ? do_syscall_64+0x45/0x70
>   ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
>   ? warn_alloc+0x130/0x130
>   ? lock_acquire+0x1f2/0xa30
>   ? fs_reclaim_acquire+0x1f/0x70
>   ? fs_reclaim_release+0x1f/0x50
>   ? __kasan_check_read+0x1d/0x30
>   ? reacquire_held_locks+0x420/0x420
>   ? reacquire_held_locks+0x420/0x420
>   kmalloc_order+0x3f/0xc0
>   kmalloc_order_trace+0x24/0x220
>   __kmalloc+0x41b/0x5a0
>   ? gcov_info_dup+0xfe/0x730
>   ? memcpy+0x73/0xa0
>   gcov_info_dup+0x176/0x730
>   gcov_event+0x88d/0xd30
>   gcov_module_notifier+0xe9/0x100
>   notifier_call_chain+0xeb/0x170
>   blocking_notifier_call_chain+0x75/0xc0
>   __x64_sys_delete_module+0x326/0x5a0
>   ? do_init_module+0x810/0x810
>   ? syscall_enter_from_user_mode+0x40/0x420
>   ? trace_hardirqs_on+0x45/0xb0
>   ? syscall_enter_from_user_mode+0x40/0x420
>   do_syscall_64+0x45/0x70
>   entry_SYSCALL_64_after_hwframe+0x44/0xa9
>  RIP: 0033:0x7f521f7c531b
>  Code: 73 01 c3 48 8b 0d 7d 0b 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 4d 0b 0c 00 f7 d8 64 89 01 48
>  RSP: 002b:00007ffe1bd4af48 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
>  RAX: ffffffffffffffda RBX: 0000561a3eae0910 RCX: 00007f521f7c531b
>  RDX: 0000000000000000 RSI: 0000000000000800 RDI: 0000561a3eae0978
>  RBP: 0000561a3eae0910 R08: 1999999999999999 R09: 0000000000000000
>  R10: 00007f521f839ac0 R11: 0000000000000206 R12: 0000000000000000
>  R13: 0000561a3eae0978 R14: 0000000000000000 R15: 0000561a3eae84d0
>  irq event stamp: 326464
>  hardirqs last  enabled at (326463): [<ffffffff832ecdde>] _raw_spin_unlock_irqrestore+0x8e/0xb0
>  hardirqs last disabled at (326464): [<ffffffff832ec994>] _raw_spin_lock_irqsave+0x34/0xa0
>  hardirqs last disabled at (326464): [<ffffffff832ec994>] _raw_spin_lock_irqsave+0x34/0xa0
>  softirqs last  enabled at (320794): [<ffffffff83600931>] __do_softirq+0x931/0xbc4
>  softirqs last disabled at (320789): [<ffffffff83400f2f>] asm_call_on_stack+0xf/0x20
>  ---[ end trace 065ea9cc2ba144a6 ]---
>
> [3] https://bugzilla.kernel.org/show_bug.cgi?id=208885#c1
> Cc: Colin Ian King <colin.king@canonical.com>
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> ---
> I have a strong feeling that this solution is not correct, but I don't
> know how to do it right. The problem exists and is reproducible in seconds.
> ---
>  kernel/gcov/gcc_4_7.c | 19 +++++++++++--------
>  1 file changed, 11 insertions(+), 8 deletions(-)
>
> diff --git a/kernel/gcov/gcc_4_7.c b/kernel/gcov/gcc_4_7.c
> index 908fdf5098c3..357ef839cdd3 100644
> --- a/kernel/gcov/gcc_4_7.c
> +++ b/kernel/gcov/gcc_4_7.c
> @@ -275,20 +275,23 @@ struct gcov_info *gcov_info_dup(struct gcov_info *info)
>  	size_t fi_size; /* function info size */
>  	size_t cv_size; /* counter values size */
>
> -	dup = kmemdup(info, sizeof(*dup), GFP_KERNEL);
> +	dup = kzalloc(sizeof(*dup), GFP_KERNEL);
>  	if (!dup)
>  		return NULL;
>
> -	dup->next = NULL;
> -	dup->filename = NULL;
> -	dup->functions = NULL;
> +	dup->version = info->version;
> +	dup->stamp = info->stamp;
> +	for (fi_idx = 0; i < GCOV_COUNTERS; i++)
> +		dup->merge[i] = info->merge[i];

And of course "i" should be replaced with "fi_idx".
But I'm confident that the solution is not right anyway.

Thanks


* Re: [RFC PATCH -rc] gcov: Protect from uninitialized number of functions provided by GCC
  2020-08-27 13:39 [RFC PATCH -rc] gcov: Protect from uninitialized number of functions provided by GCC Leon Romanovsky
  2020-08-27 15:13 ` Leon Romanovsky
@ 2020-08-27 21:08 ` kernel test robot
  2020-08-29 14:12 ` Colin Ian King
  2 siblings, 0 replies; 6+ messages in thread
From: kernel test robot @ 2020-08-27 21:08 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 4564 bytes --]

Hi Leon,

[FYI, it's a private test report for your RFC patch.]
[auto build test ERROR on hnaz-linux-mm/master]
[also build test ERROR on linux/master linus/master v5.9-rc2 next-20200827]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Leon-Romanovsky/gcov-Protect-from-uninitialized-number-of-functions-provided-by-GCC/20200827-214008
base:   https://github.com/hnaz/linux-mm master
config: m68k-randconfig-s032-20200827 (attached as .config)
compiler: m68k-linux-gcc (GCC) 9.3.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # apt-get install sparse
        # sparse version: v0.6.2-191-g10164920-dirty
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=m68k 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   kernel/gcov/gcc_4_7.c: In function 'gcov_info_dup':
>> kernel/gcov/gcc_4_7.c:284:19: error: 'i' undeclared (first use in this function)
     284 |  for (fi_idx = 0; i < GCOV_COUNTERS; i++)
         |                   ^
   kernel/gcov/gcc_4_7.c:284:19: note: each undeclared identifier is reported only once for each function it appears in

# https://github.com/0day-ci/linux/commit/9087d4a9ebf3be8ae595ecef237b57270d78b4cb
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Leon-Romanovsky/gcov-Protect-from-uninitialized-number-of-functions-provided-by-GCC/20200827-214008
git checkout 9087d4a9ebf3be8ae595ecef237b57270d78b4cb
vim +/i +284 kernel/gcov/gcc_4_7.c

   260	
   261	/**
   262	 * gcov_info_dup - duplicate profiling data set
   263	 * @info: profiling data set to duplicate
   264	 *
   265	 * Return newly allocated duplicate on success, %NULL on error.
   266	 */
   267	struct gcov_info *gcov_info_dup(struct gcov_info *info)
   268	{
   269		struct gcov_info *dup;
   270		struct gcov_ctr_info *dci_ptr; /* dst counter info */
   271		struct gcov_ctr_info *sci_ptr; /* src counter info */
   272		unsigned int active;
   273		unsigned int fi_idx; /* function info idx */
   274		unsigned int ct_idx; /* counter type idx */
   275		size_t fi_size; /* function info size */
   276		size_t cv_size; /* counter values size */
   277	
   278		dup = kzalloc(sizeof(*dup), GFP_KERNEL);
   279		if (!dup)
   280			return NULL;
   281	
   282		dup->version = info->version;
   283		dup->stamp = info->stamp;
 > 284		for (fi_idx = 0; i < GCOV_COUNTERS; i++)
   285			dup->merge[i] = info->merge[i];
   286		dup->n_functions = info->n_functions;
   287	
   288		dup->filename = kstrdup_const(info->filename, GFP_KERNEL);
   289		if (!dup->filename)
   290			goto err_free;
   291	
   292		dup->functions =
   293			kcalloc(info->n_functions, sizeof(struct gcov_fn_info *),
   294				GFP_KERNEL | __GFP_NOWARN);
   295		if (!dup->functions)
   296			goto err_free;
   297	
   298		active = num_counter_active(info);
   299		fi_size = sizeof(struct gcov_fn_info);
   300		fi_size += sizeof(struct gcov_ctr_info) * active;
   301	
   302		for (fi_idx = 0; fi_idx < info->n_functions; fi_idx++) {
   303			dup->functions[fi_idx] = kzalloc(fi_size, GFP_KERNEL);
   304			if (!dup->functions[fi_idx])
   305				goto err_free;
   306	
   307			*(dup->functions[fi_idx]) = *(info->functions[fi_idx]);
   308	
   309			sci_ptr = info->functions[fi_idx]->ctrs;
   310			dci_ptr = dup->functions[fi_idx]->ctrs;
   311	
   312			for (ct_idx = 0; ct_idx < active; ct_idx++) {
   313	
   314				cv_size = sizeof(gcov_type) * sci_ptr->num;
   315	
   316				dci_ptr->values = vmalloc(cv_size);
   317	
   318				if (!dci_ptr->values)
   319					goto err_free;
   320	
   321				dci_ptr->num = sci_ptr->num;
   322				memcpy(dci_ptr->values, sci_ptr->values, cv_size);
   323	
   324				sci_ptr++;
   325				dci_ptr++;
   326			}
   327		}
   328	
   329		return dup;
   330	err_free:
   331		gcov_info_free(dup);
   332		return NULL;
   333	}
   334	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 22846 bytes --]


* Re: [RFC PATCH -rc] gcov: Protect from uninitialized number of functions provided by GCC
  2020-08-27 13:39 [RFC PATCH -rc] gcov: Protect from uninitialized number of functions provided by GCC Leon Romanovsky
  2020-08-27 15:13 ` Leon Romanovsky
  2020-08-27 21:08 ` kernel test robot
@ 2020-08-29 14:12 ` Colin Ian King
  2020-08-30  6:45     ` Leon Romanovsky
  2 siblings, 1 reply; 6+ messages in thread
From: Colin Ian King @ 2020-08-29 14:12 UTC (permalink / raw)
  To: Leon Romanovsky, Peter Oberparleiter, Andrew Morton
  Cc: Leon Romanovsky, linux-kernel, linux-mm

On 27/08/2020 14:39, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@nvidia.com>
> 
> The kernel compiled with GCC 10.2.1 and KASAN together with GCOV enabled
> produces the following splats while reloading modules.
> 
> The first splat [1] is generated because gcov_info can be either a user or a
> kernel pointer, so the memcpy() inside kmemdup() triggers the report.
> As a possible solution, copy the fields manually.
> 
> The second splat [2] is seen because n_functions, provided by GCC through
> __gcov_init(), is ridiculously high; in my case it was 2698213824.
> IMHO it means that this field is not initialized, but I'm not sure.
> 
> [1]
>  ==================================================================
>  BUG: KASAN: global-out-of-bounds in kmemdup+0x43/0x70
>  Read of size 120 at addr ffffffffa0d2c780 by task modprobe/296
> 
>  CPU: 0 PID: 296 Comm: modprobe Not tainted 5.9.0-rc1+ #1860
>  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04 /01/2014
>  Call Trace:
>   ? dump_stack+0x128/0x1af
>   ? print_address_description.constprop.0+0x2c/0x3f0
>   ? _raw_spin_lock_irqsave+0x34/0xa0
>   ? __kasan_check_read+0x1d/0x30
>   ? kmemdup+0x43/0x70
>   ? kmemdup+0x43/0x70
>   ? gcov_info_dup+0x2d/0x730
>   ? __kasan_check_write+0x20/0x30
>   ? __mutex_unlock_slowpath+0x10d/0x740
>   ? gcov_event+0x88d/0xd30
>   ? gcov_module_notifier+0xe9/0x100
>   ? notifier_call_chain+0xeb/0x170
>   ? blocking_notifier_call_chain+0x75/0xc0
>   ? __x64_sys_delete_module+0x326/0x5a0
>   ? do_init_module+0x810/0x810
>   ? syscall_enter_from_user_mode+0x40/0x420
>   ? trace_hardirqs_on+0x45/0xb0
>   ? syscall_enter_from_user_mode+0x40/0x420
>   ? do_syscall_64+0x45/0x70
>   ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
>  The buggy address belongs to the variable:
>   __gcov_.uverbs_attr_get_obj+0x60/0xfffffffffff778e0 [mlx5_ib]
> 
>  Memory state around the buggy address:
>   ffffffffa0d2c680: 00 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 f9 f9 f9
>   ffffffffa0d2c700: f9 f9 f9 f9 00 00 00 00 00 f9 f9 f9 f9 f9 f9 f9
>  >ffffffffa0d2c780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f9 f9
>                                                               ^
>   ffffffffa0d2c800: f9 f9 f9 f9 00 00 00 00 00 f9 f9 f9 f9 f9 f9 f9
>   ffffffffa0d2c880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>  ==================================================================
>  Disabling lock debugging due to kernel taint
>  gcov: could not save data for '/home/leonro/src/kernel/drivers/infiniband/hw/mlx5/std_types.gcda' (out of memory)
> 
> [2]
> Colin has a similar error [3].
> 
>  ------------[ cut here ]------------
>  WARNING: CPU: 0 PID: 296 at mm/page_alloc.c:4859 __alloc_pages_nodemask+0x670/0x3190
>  Modules linked in: mlx5_ib(-) mlx5_core mlxfw ptp ib_ipoib pps_core rdma_ucm rdma_cm iw_cm ib_cm ib_umad  ib_uverbs ib_core
>  CPU: 0 PID: 296 Comm: modprobe Tainted: G    B             5.9.0-rc1+ #1860
>  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04 /01/2014
>  RIP: 0010:__alloc_pages_nodemask+0x670/0x3190
>  Code: e9 af fc ff ff 48 83 05 fd 28 90 05 01 81 e7 00 20 00 00 48 c7 44 24 28 00 00 00 00 0f 85 fb fd ff  ff 48 83 05 f0 28 90 05 01 <0f> 0b 48 83 05 ee 28 90 05 01 48 83 05 ee 28 90 05 01 e9 dc fd ff
>  RSP: 0018:ffff88805f7ffa28 EFLAGS: 00010202
>  RAX: 0000000000000000 RBX: 0000000000000000 RCX: 1ffff1100befff5e
>  RDX: 0000000000000000 RSI: 0000000000000017 RDI: 0000000000000000
>  RBP: 000000050695a900 R08: ffff888060fc7900 R09: ffff888060fc793b
>  R10: ffffed100c1f8f27 R11: ffffed100c1f8f28 R12: 0000000000040dc0
>  R13: 000000050695a900 R14: 0000000000000017 R15: 0000000000000001
>  FS:  00007f521f695740(0000) GS:ffff88806ce00000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  CR2: 00007f31b013f000 CR3: 000000006637e001 CR4: 0000000000370eb0
>  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>  Call Trace:
>   ? __kmalloc_track_caller+0x17a/0x570
>   ? gcov_info_dup+0xfe/0x730
>   ? gcov_event+0x88d/0xd30
>   ? gcov_module_notifier+0xe9/0x100
>   ? blocking_notifier_call_chain+0x75/0xc0
>   ? __x64_sys_delete_module+0x326/0x5a0
>   ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
>   ? mark_lock+0xba0/0xba0
>   ? mark_lock+0xba0/0xba0
>   ? notifier_call_chain+0xeb/0x170
>   ? blocking_notifier_call_chain+0x75/0xc0
>   ? __x64_sys_delete_module+0x326/0x5a0
>   ? do_syscall_64+0x45/0x70
>   ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
>   ? warn_alloc+0x130/0x130
>   ? lock_acquire+0x1f2/0xa30
>   ? fs_reclaim_acquire+0x1f/0x70
>   ? fs_reclaim_release+0x1f/0x50
>   ? __kasan_check_read+0x1d/0x30
>   ? reacquire_held_locks+0x420/0x420
>   ? reacquire_held_locks+0x420/0x420
>   kmalloc_order+0x3f/0xc0
>   kmalloc_order_trace+0x24/0x220
>   __kmalloc+0x41b/0x5a0
>   ? gcov_info_dup+0xfe/0x730
>   ? memcpy+0x73/0xa0
>   gcov_info_dup+0x176/0x730
>   gcov_event+0x88d/0xd30
>   gcov_module_notifier+0xe9/0x100
>   notifier_call_chain+0xeb/0x170
>   blocking_notifier_call_chain+0x75/0xc0
>   __x64_sys_delete_module+0x326/0x5a0
>   ? do_init_module+0x810/0x810
>   ? syscall_enter_from_user_mode+0x40/0x420
>   ? trace_hardirqs_on+0x45/0xb0
>   ? syscall_enter_from_user_mode+0x40/0x420
>   do_syscall_64+0x45/0x70
>   entry_SYSCALL_64_after_hwframe+0x44/0xa9
>  RIP: 0033:0x7f521f7c531b
>  Code: 73 01 c3 48 8b 0d 7d 0b 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 4d 0b 0c 00 f7 d8 64 89 01 48
>  RSP: 002b:00007ffe1bd4af48 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
>  RAX: ffffffffffffffda RBX: 0000561a3eae0910 RCX: 00007f521f7c531b
>  RDX: 0000000000000000 RSI: 0000000000000800 RDI: 0000561a3eae0978
>  RBP: 0000561a3eae0910 R08: 1999999999999999 R09: 0000000000000000
>  R10: 00007f521f839ac0 R11: 0000000000000206 R12: 0000000000000000
>  R13: 0000561a3eae0978 R14: 0000000000000000 R15: 0000561a3eae84d0
>  irq event stamp: 326464
>  hardirqs last  enabled at (326463): [<ffffffff832ecdde>] _raw_spin_unlock_irqrestore+0x8e/0xb0
>  hardirqs last disabled at (326464): [<ffffffff832ec994>] _raw_spin_lock_irqsave+0x34/0xa0
>  hardirqs last disabled at (326464): [<ffffffff832ec994>] _raw_spin_lock_irqsave+0x34/0xa0
>  softirqs last  enabled at (320794): [<ffffffff83600931>] __do_softirq+0x931/0xbc4
>  softirqs last disabled at (320789): [<ffffffff83400f2f>] asm_call_on_stack+0xf/0x20
>  ---[ end trace 065ea9cc2ba144a6 ]---
> 
> [3] https://bugzilla.kernel.org/show_bug.cgi?id=208885#c1
> Cc: Colin Ian King <colin.king@canonical.com>
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> ---
> I have a strong feeling that this solution is not correct, but I don't
> know how to do it right. The problem exists and is reproducible in seconds.
> ---
>  kernel/gcov/gcc_4_7.c | 19 +++++++++++--------
>  1 file changed, 11 insertions(+), 8 deletions(-)
> 
> diff --git a/kernel/gcov/gcc_4_7.c b/kernel/gcov/gcc_4_7.c
> index 908fdf5098c3..357ef839cdd3 100644
> --- a/kernel/gcov/gcc_4_7.c
> +++ b/kernel/gcov/gcc_4_7.c
> @@ -275,20 +275,23 @@ struct gcov_info *gcov_info_dup(struct gcov_info *info)
>  	size_t fi_size; /* function info size */
>  	size_t cv_size; /* counter values size */
> 
> -	dup = kmemdup(info, sizeof(*dup), GFP_KERNEL);
> +	dup = kzalloc(sizeof(*dup), GFP_KERNEL);
>  	if (!dup)
>  		return NULL;
> 
> -	dup->next = NULL;
> -	dup->filename = NULL;
> -	dup->functions = NULL;
> +	dup->version = info->version;
> +	dup->stamp = info->stamp;
> +	for (fi_idx = 0; i < GCOV_COUNTERS; i++)

This loop refers to i as a counter but it's not declared.


> +		dup->merge[i] = info->merge[i];
> +	dup->n_functions = info->n_functions;
> 
> -	dup->filename = kstrdup(info->filename, GFP_KERNEL);
> +	dup->filename = kstrdup_const(info->filename, GFP_KERNEL);
>  	if (!dup->filename)
>  		goto err_free;
> 
> -	dup->functions = kcalloc(info->n_functions,
> -				 sizeof(struct gcov_fn_info *), GFP_KERNEL);
> +	dup->functions =
> +		kcalloc(info->n_functions, sizeof(struct gcov_fn_info *),
> +			GFP_KERNEL | __GFP_NOWARN);
>  	if (!dup->functions)
>  		goto err_free;
> 
> @@ -359,7 +362,7 @@ void gcov_info_free(struct gcov_info *info)
> 
>  free_info:
>  	kfree(info->functions);
> -	kfree(info->filename);
> +	kfree_const(info->filename);
>  	kfree(info);
>  }
> 
> --
> 2.26.2
> 



* Re: [RFC PATCH -rc] gcov: Protect from uninitialized number of functions provided by GCC
  2020-08-29 14:12 ` Colin Ian King
@ 2020-08-30  6:45     ` Leon Romanovsky
  0 siblings, 0 replies; 6+ messages in thread
From: Leon Romanovsky @ 2020-08-30  6:45 UTC (permalink / raw)
  To: Colin Ian King
  Cc: Leon Romanovsky, Peter Oberparleiter, Andrew Morton,
	Leon Romanovsky, linux-kernel, Linux-MM

On Sat, Aug 29, 2020 at 5:12 PM Colin Ian King <colin.king@canonical.com> wrote:
>
> On 27/08/2020 14:39, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro@nvidia.com>
> >
> > The kernel compiled with GCC 10.2.1 and KASAN together with GCOV enabled
> > produces the following splats while reloading modules.
> >
> > The first splat [1] is generated because gcov_info can be either a user or a
> > kernel pointer, so the memcpy() inside kmemdup() triggers the report.
> > As a possible solution, copy the fields manually.
> >
> > Second splat [2] is seen because n_function provided by GCC through
> > __gcov_init() is ridiculously high, in my case it was 2698213824.
> > IMHO it means that this field is not initialized, but I'm not sure.
> >
> > [1]
> >  ==================================================================
> >  BUG: KASAN: global-out-of-bounds in kmemdup+0x43/0x70
> >  Read of size 120 at addr ffffffffa0d2c780 by task modprobe/296
> >
> >  CPU: 0 PID: 296 Comm: modprobe Not tainted 5.9.0-rc1+ #1860
> >  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04 /01/2014
> >  Call Trace:
> >   ? dump_stack+0x128/0x1af
> >   ? print_address_description.constprop.0+0x2c/0x3f0
> >   ? _raw_spin_lock_irqsave+0x34/0xa0
> >   ? __kasan_check_read+0x1d/0x30
> >   ? kmemdup+0x43/0x70
> >   ? kmemdup+0x43/0x70
> >   ? gcov_info_dup+0x2d/0x730
> >   ? __kasan_check_write+0x20/0x30
> >   ? __mutex_unlock_slowpath+0x10d/0x740
> >   ? gcov_event+0x88d/0xd30
> >   ? gcov_module_notifier+0xe9/0x100
> >   ? notifier_call_chain+0xeb/0x170
> >   ? blocking_notifier_call_chain+0x75/0xc0
> >   ? __x64_sys_delete_module+0x326/0x5a0
> >   ? do_init_module+0x810/0x810
> >   ? syscall_enter_from_user_mode+0x40/0x420
> >   ? trace_hardirqs_on+0x45/0xb0
> >   ? syscall_enter_from_user_mode+0x40/0x420
> >   ? do_syscall_64+0x45/0x70
> >   ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
> >
> >  The buggy address belongs to the variable:
> >   __gcov_.uverbs_attr_get_obj+0x60/0xfffffffffff778e0 [mlx5_ib]
> >
> >  Memory state around the buggy address:
> >   ffffffffa0d2c680: 00 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 f9 f9 f9
> >   ffffffffa0d2c700: f9 f9 f9 f9 00 00 00 00 00 f9 f9 f9 f9 f9 f9 f9
> >  >ffffffffa0d2c780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f9 f9
> >                                                               ^
> >   ffffffffa0d2c800: f9 f9 f9 f9 00 00 00 00 00 f9 f9 f9 f9 f9 f9 f9
> >   ffffffffa0d2c880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> >  ==================================================================
> >  Disabling lock debugging due to kernel taint
> >  gcov: could not save data for '/home/leonro/src/kernel/drivers/infiniband/hw/mlx5/std_types.gcda' (out of memory)
> >
> > [2]
> > Colin has similar error [3].
> >
> >  ------------[ cut here ]------------
> >  WARNING: CPU: 0 PID: 296 at mm/page_alloc.c:4859 __alloc_pages_nodemask+0x670/0x3190
> >  Modules linked in: mlx5_ib(-) mlx5_core mlxfw ptp ib_ipoib pps_core rdma_ucm rdma_cm iw_cm ib_cm ib_umad  ib_uverbs ib_core
> >  CPU: 0 PID: 296 Comm: modprobe Tainted: G    B             5.9.0-rc1+ #1860
> >  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04 /01/2014
> >  RIP: 0010:__alloc_pages_nodemask+0x670/0x3190
> >  Code: e9 af fc ff ff 48 83 05 fd 28 90 05 01 81 e7 00 20 00 00 48 c7 44 24 28 00 00 00 00 0f 85 fb fd ff  ff 48 83 05 f0 28 90 05 01 <0f> 0b 48 83 05 ee 28 90 05 01 48 83 05 ee 28 90 05 01 e9 dc fd ff
> >  RSP: 0018:ffff88805f7ffa28 EFLAGS: 00010202
> >  RAX: 0000000000000000 RBX: 0000000000000000 RCX: 1ffff1100befff5e
> >  RDX: 0000000000000000 RSI: 0000000000000017 RDI: 0000000000000000
> >  RBP: 000000050695a900 R08: ffff888060fc7900 R09: ffff888060fc793b
> >  R10: ffffed100c1f8f27 R11: ffffed100c1f8f28 R12: 0000000000040dc0
> >  R13: 000000050695a900 R14: 0000000000000017 R15: 0000000000000001
> >  FS:  00007f521f695740(0000) GS:ffff88806ce00000(0000) knlGS:0000000000000000
> >  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >  CR2: 00007f31b013f000 CR3: 000000006637e001 CR4: 0000000000370eb0
> >  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> >  Call Trace:
> >   ? __kmalloc_track_caller+0x17a/0x570
> >   ? gcov_info_dup+0xfe/0x730
> >   ? gcov_event+0x88d/0xd30
> >   ? gcov_module_notifier+0xe9/0x100
> >   ? blocking_notifier_call_chain+0x75/0xc0
> >   ? __x64_sys_delete_module+0x326/0x5a0
> >   ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
> >   ? mark_lock+0xba0/0xba0
> >   ? mark_lock+0xba0/0xba0
> >   ? notifier_call_chain+0xeb/0x170
> >   ? blocking_notifier_call_chain+0x75/0xc0
> >   ? __x64_sys_delete_module+0x326/0x5a0
> >   ? do_syscall_64+0x45/0x70
> >   ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
> >   ? warn_alloc+0x130/0x130
> >   ? lock_acquire+0x1f2/0xa30
> >   ? fs_reclaim_acquire+0x1f/0x70
> >   ? fs_reclaim_release+0x1f/0x50
> >   ? __kasan_check_read+0x1d/0x30
> >   ? reacquire_held_locks+0x420/0x420
> >   ? reacquire_held_locks+0x420/0x420
> >   kmalloc_order+0x3f/0xc0
> >   kmalloc_order_trace+0x24/0x220
> >   __kmalloc+0x41b/0x5a0
> >   ? gcov_info_dup+0xfe/0x730
> >   ? memcpy+0x73/0xa0
> >   gcov_info_dup+0x176/0x730
> >   gcov_event+0x88d/0xd30
> >   gcov_module_notifier+0xe9/0x100
> >   notifier_call_chain+0xeb/0x170
> >   blocking_notifier_call_chain+0x75/0xc0
> >   __x64_sys_delete_module+0x326/0x5a0
> >   ? do_init_module+0x810/0x810
> >   ? syscall_enter_from_user_mode+0x40/0x420
> >   ? trace_hardirqs_on+0x45/0xb0
> >   ? syscall_enter_from_user_mode+0x40/0x420
> >   do_syscall_64+0x45/0x70
> >   entry_SYSCALL_64_after_hwframe+0x44/0xa9
> >  RIP: 0033:0x7f521f7c531b
> >  Code: 73 01 c3 48 8b 0d 7d 0b 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 4d 0b 0c 00 f7 d8 64 89 01 48
> >  RSP: 002b:00007ffe1bd4af48 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
> >  RAX: ffffffffffffffda RBX: 0000561a3eae0910 RCX: 00007f521f7c531b
> >  RDX: 0000000000000000 RSI: 0000000000000800 RDI: 0000561a3eae0978
> >  RBP: 0000561a3eae0910 R08: 1999999999999999 R09: 0000000000000000
> >  R10: 00007f521f839ac0 R11: 0000000000000206 R12: 0000000000000000
> >  R13: 0000561a3eae0978 R14: 0000000000000000 R15: 0000561a3eae84d0
> >  irq event stamp: 326464
> >  hardirqs last  enabled at (326463): [<ffffffff832ecdde>] _raw_spin_unlock_irqrestore+0x8e/0xb0
> >  hardirqs last disabled at (326464): [<ffffffff832ec994>] _raw_spin_lock_irqsave+0x34/0xa0
> >  hardirqs last disabled at (326464): [<ffffffff832ec994>] _raw_spin_lock_irqsave+0x34/0xa0
> >  softirqs last  enabled at (320794): [<ffffffff83600931>] __do_softirq+0x931/0xbc4
> >  softirqs last disabled at (320789): [<ffffffff83400f2f>] asm_call_on_stack+0xf/0x20
> >  ---[ end trace 065ea9cc2ba144a6 ]---
> >
> > [3] https://bugzilla.kernel.org/show_bug.cgi?id=208885#c1
> > Cc: Colin Ian King <colin.king@canonical.com>
> > Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> > ---
> > I have a strong feeling that this solution is not correct, but don't
> > know how to do it right. The problem exists and reproducable in seconds.
> > ---
> >  kernel/gcov/gcc_4_7.c | 19 +++++++++++--------
> >  1 file changed, 11 insertions(+), 8 deletions(-)
> >
> > diff --git a/kernel/gcov/gcc_4_7.c b/kernel/gcov/gcc_4_7.c
> > index 908fdf5098c3..357ef839cdd3 100644
> > --- a/kernel/gcov/gcc_4_7.c
> > +++ b/kernel/gcov/gcc_4_7.c
> > @@ -275,20 +275,23 @@ struct gcov_info *gcov_info_dup(struct gcov_info *info)
> >       size_t fi_size; /* function info size */
> >       size_t cv_size; /* counter values size */
> >
> > -     dup = kmemdup(info, sizeof(*dup), GFP_KERNEL);
> > +     dup = kzalloc(sizeof(*dup), GFP_KERNEL);
> >       if (!dup)
> >               return NULL;
> >
> > -     dup->next = NULL;
> > -     dup->filename = NULL;
> > -     dup->functions = NULL;
> > +     dup->version = info->version;
> > +     dup->stamp = info->stamp;
> > +     for (fi_idx = 0; i < GCOV_COUNTERS; i++)
>
> This loop refers to i as a counter but it's not declared.

Thanks, I wrote it above.
https://lore.kernel.org/lkml/20200827151333.GB2909436@unreal

What I was really hoping for is feedback on whether this is a GCC bug
and, if so, how to solve it.
We tried GCC 9 and gcov worked there.

Thanks

^ permalink raw reply	[flat|nested] 6+ messages in thread


end of thread, other threads:[~2020-08-30  6:47 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-27 13:39 [RFC PATCH -rc] gcov: Protect from uninitialized number of functions provided by GCC Leon Romanovsky
2020-08-27 15:13 ` Leon Romanovsky
2020-08-27 21:08 ` kernel test robot
2020-08-29 14:12 ` Colin Ian King
2020-08-30  6:45   ` Leon Romanovsky
2020-08-30  6:45     ` Leon Romanovsky
