All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH rdma-rc 0/2] Two IB/core fixes
@ 2021-10-24  6:08 Leon Romanovsky
  2021-10-24  6:08 ` [PATCH rdma-rc 1/2] RDMA/sa_query: Use strscpy_pad instead of memcpy to copy a string Leon Romanovsky
  2021-10-24  6:08 ` [PATCH rdma-rc 2/2] RDMA/core: Initialize lock when allocate a rdma_hw_stats structure Leon Romanovsky
  0 siblings, 2 replies; 8+ messages in thread
From: Leon Romanovsky @ 2021-10-24  6:08 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, Ira Weiny, John Fleck, Kaike Wan, linux-kernel,
	linux-rdma, Maor Gottlieb, Mark Bloch, Mark Bloch, Mark Zhang

From: Leon Romanovsky <leonro@nvidia.com>

Two IB/core fixes from Mark.

Mark Zhang (2):
  RDMA/sa_query: Use strscpy_pad instead of memcpy to copy a string
  RDMA/core: Initialize lock when allocate a rdma_hw_stats structure

 drivers/infiniband/core/sa_query.c | 5 +++--
 drivers/infiniband/core/sysfs.c    | 2 --
 drivers/infiniband/core/verbs.c    | 1 +
 3 files changed, 4 insertions(+), 4 deletions(-)

-- 
2.31.1


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH rdma-rc 1/2] RDMA/sa_query: Use strscpy_pad instead of memcpy to copy a string
  2021-10-24  6:08 [PATCH rdma-rc 0/2] Two IB/core fixes Leon Romanovsky
@ 2021-10-24  6:08 ` Leon Romanovsky
  2021-10-25 17:02   ` Jason Gunthorpe
  2021-10-24  6:08 ` [PATCH rdma-rc 2/2] RDMA/core: Initialize lock when allocate a rdma_hw_stats structure Leon Romanovsky
  1 sibling, 1 reply; 8+ messages in thread
From: Leon Romanovsky @ 2021-10-24  6:08 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Mark Zhang, Ira Weiny, John Fleck, Kaike Wan, linux-kernel,
	linux-rdma, Mark Bloch, Mark Bloch

From: Mark Zhang <markzhang@nvidia.com>

When copy the device name, the length of data memcpy copied exceeds
the length of the source buffer, which cause the KASAN issue below.
Use strscpy_pad instead.

 BUG: KASAN: slab-out-of-bounds in ib_nl_set_path_rec_attrs+0x136/0x320 [ib_core]
 Read of size 64 at addr ffff88811a10f5e0 by task rping/140263
 CPU: 3 PID: 140263 Comm: rping Not tainted 5.15.0-rc1+ #1
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 Call Trace:
 dump_stack_lvl+0x57/0x7d
 print_address_description.constprop.0+0x1d/0xa0
 kasan_report+0xcb/0x110
 ? lock_downgrade+0xb0/0xc0
 ? ib_nl_set_path_rec_attrs+0x136/0x320 [ib_core]
 kasan_check_range+0x13d/0x180
 memcpy+0x20/0x60
 ib_nl_set_path_rec_attrs+0x136/0x320 [ib_core]
 ? init_mad+0xf0/0xf0 [ib_core]
 ? __nlmsg_put+0x9a/0xb0
 ? ibnl_put_msg+0x90/0xd0 [ib_core]
 ib_nl_make_request+0x1c6/0x380 [ib_core]
 ? ib_nl_set_path_rec_attrs+0x320/0x320 [ib_core]
 ? netlink_has_listeners+0x114/0x210
 send_mad+0x20a/0x220 [ib_core]
 ? ib_nl_make_request+0x380/0x380 [ib_core]
 ? memcpy+0x39/0x60
 ? value_read+0x20/0x80 [ib_core]
 ? ib_pack+0x140/0x2a0 [ib_core]
 ib_sa_path_rec_get+0x3e3/0x800 [ib_core]
 ? alloc_mad+0x390/0x390 [ib_core]
 ? __kasan_kmalloc+0x7c/0x90
 ? rdma_resolve_route+0x37b/0x3e0 [rdma_cm]
 ? ucma_resolve_route+0xe1/0x150 [rdma_ucm]
 ? ucma_write+0x17b/0x1f0 [rdma_ucm]
 ? vfs_write+0x142/0x4d0
 ? ksys_write+0x133/0x160
 ? do_syscall_64+0x43/0x90
 ? entry_SYSCALL_64_after_hwframe+0x44/0xae
 ? print_usage_bug+0x50/0x50
 ? lock_downgrade+0xc0/0xc0
 cma_query_ib_route+0x29b/0x390 [rdma_cm]
 ? rdma_set_ib_path+0x150/0x150 [rdma_cm]
 ? lockdep_hardirqs_on_prepare+0x12e/0x200
 ? rdma_create_user_id+0x80/0x80 [rdma_cm]
 ? rdma_resolve_route+0x37b/0x3e0 [rdma_cm]
 ? rdma_resolve_route+0x308/0x3e0 [rdma_cm]
 rdma_resolve_route+0x308/0x3e0 [rdma_cm]
 ucma_resolve_route+0xe1/0x150 [rdma_ucm]
 ? ucma_disconnect+0x140/0x140 [rdma_ucm]
 ucma_write+0x17b/0x1f0 [rdma_ucm]
 ? ucma_copy_ib_route+0x1a0/0x1a0 [rdma_ucm]
 ? __fget_files+0x146/0x240
 vfs_write+0x142/0x4d0
 ksys_write+0x133/0x160
 ? __ia32_sys_read+0x50/0x50
 ? lockdep_hardirqs_on_prepare+0x12e/0x200
 ? syscall_enter_from_user_mode+0x1d/0x50
 do_syscall_64+0x43/0x90
 entry_SYSCALL_64_after_hwframe+0x44/0xae
 RIP: 0033:0x7f26499aa90f
 Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 29 fd ff ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 5c fd ff ff 48
 RSP: 002b:00007f26495f2dc0 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
 RAX: ffffffffffffffda RBX: 00000000000007d0 RCX: 00007f26499aa90f
 RDX: 0000000000000010 RSI: 00007f26495f2e00 RDI: 0000000000000003
 RBP: 00005632a8315440 R08: 0000000000000000 R09: 0000000000000001
 R10: 0000000000000000 R11: 0000000000000293 R12: 00007f26495f2e00
 R13: 00005632a83154e0 R14: 00005632a8315440 R15: 00005632a830a810

 Allocated by task 131419:
 kasan_save_stack+0x1b/0x40
 __kasan_kmalloc+0x7c/0x90
 proc_self_get_link+0x8b/0x100
 pick_link+0x4f1/0x5c0
 step_into+0x2eb/0x3d0
 walk_component+0xc8/0x2c0
 link_path_walk+0x3b8/0x580
 path_openat+0x101/0x230
 do_filp_open+0x12e/0x240
 do_sys_openat2+0x115/0x280
 __x64_sys_openat+0xce/0x140
 do_syscall_64+0x43/0x90
 entry_SYSCALL_64_after_hwframe+0x44/0xae

 The buggy address belongs to the object at ffff88811a10f5e0
 kmalloc-16 of size 16
 The buggy address is located 0 bytes inside of
10f5e0, ffff88811a10f5f0)
 The buggy address belongs to the page:
 page:000000007b6da7b1 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88811a10f1e0 pfn:0x11a10f
 flags: 0x8000000000000200(slab|zone=2)
 raw: 8000000000000200 ffffea0004463040 0000001200000012 ffff8881000423c0
 raw: ffff88811a10f1e0 000000008080007f 00000001ffffffff 0000000000000000
 page dumped because: kasan: bad access detected

 Memory state around the buggy address:
 ffff88811a10f480: fa fb fc fc fa fb fc fc fa fb fc fc fa fb fc fc
 ffff88811a10f500: fa fb fc fc fa fb fc fc 00 00 fc fc 00 00 fc fc
 >ffff88811a10f580: 00 00 fc fc fa fb fc fc 00 00 fc fc 00 00 fc fc
 ^
 ffff88811a10f600: 00 00 fc fc fa fb fc fc fa fb fc fc 00 00 fc fc
 ffff88811a10f680: 00 00 fc fc 00 00 fc fc fa fb fc fc 00 00 fc fc

Fixes: 2ca546b92a02 ("IB/sa: Route SA pathrecord query through netlink")
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/infiniband/core/sa_query.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
index 4220a545387f..74ecd7456a11 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -706,8 +706,9 @@ static void ib_nl_set_path_rec_attrs(struct sk_buff *skb,
 
 	/* Construct the family header first */
 	header = skb_put(skb, NLMSG_ALIGN(sizeof(*header)));
-	memcpy(header->device_name, dev_name(&query->port->agent->device->dev),
-	       LS_DEVICE_NAME_MAX);
+	strscpy_pad(header->device_name,
+		    dev_name(&query->port->agent->device->dev),
+		    LS_DEVICE_NAME_MAX);
 	header->port_num = query->port->port_num;
 
 	if ((comp_mask & IB_SA_PATH_REC_REVERSIBLE) &&
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH rdma-rc 2/2] RDMA/core: Initialize lock when allocate a rdma_hw_stats structure
  2021-10-24  6:08 [PATCH rdma-rc 0/2] Two IB/core fixes Leon Romanovsky
  2021-10-24  6:08 ` [PATCH rdma-rc 1/2] RDMA/sa_query: Use strscpy_pad instead of memcpy to copy a string Leon Romanovsky
@ 2021-10-24  6:08 ` Leon Romanovsky
  2021-10-25 14:50   ` Jason Gunthorpe
  1 sibling, 1 reply; 8+ messages in thread
From: Leon Romanovsky @ 2021-10-24  6:08 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Mark Zhang, Ira Weiny, John Fleck, Kaike Wan, linux-kernel,
	linux-rdma, Maor Gottlieb, Mark Bloch, Mark Bloch

From: Mark Zhang <markzhang@nvidia.com>

Initialize the rdma_hw_stats "lock" field when do allocation, to fix the
warning below. Then we don't need to initialize it in sysfs, remove it.

 [ 1460.517202] DEBUG_LOCKS_WARN_ON(lock->magic != lock)
 [ 1460.517223] WARNING: CPU: 4 PID: 64464 at kernel/locking/mutex.c:575 __mutex_lock+0x9c3/0x12b0
 [ 1460.520539] Modules linked in: bonding ip_gre mlx5_ib geneve ib_ipoib ip6_gre gre nf_tables ip6_tunnel tunnel6 ib_umad mlx5_core rdma_ucm ib_uverbs ipip tunnel4 openvswitch nsh xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter rpcrdma ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_cm ib_core overlay fuse [last unloaded: nf_tables]
 [ 1460.528622] CPU: 4 PID: 64464 Comm: rdma Not tainted 5.15.0-rc5_for_upstream_debug_2021_10_14_11_06 #1
 [ 1460.530762] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
 [ 1460.533280] RIP: 0010:__mutex_lock+0x9c3/0x12b0
 [ 1460.534385] Code: 08 84 d2 0f 85 cd 08 00 00 8b 05 00 7e 70 01 85 c0 0f 85 51 f7 ff ff 48 c7 c6 00 a4 89 83 48 c7 c7 c0 a1 89 83 e8 b8 97 f0 ff <0f> 0b e9 37 f7 ff ff 48 8b 44 24 40 48 8d b8 f0 06 00 00 48 89 f8
 [ 1460.538488] RSP: 0018:ffff88822d016d18 EFLAGS: 00010282
 [ 1460.539733] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
 [ 1460.541308] RDX: 0000000000000027 RSI: 0000000000000004 RDI: ffffed1045a02d95
 [ 1460.542868] RBP: 0000000000000000 R08: 0000000000000001 R09: ffff8884d322f8fb
 [ 1460.544554] R10: ffffed109a645f1f R11: 0000000000000001 R12: dffffc0000000000
 [ 1460.546323] R13: ffff88819f846000 R14: ffff888102ad6060 R15: ffff88819f846000
 [ 1460.548037] FS:  00007f149943f800(0000) GS:ffff8884d3200000(0000) knlGS:0000000000000000
 [ 1460.550036] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 [ 1460.551446] CR2: 00007fff4fd8cff8 CR3: 000000017c108001 CR4: 0000000000370ea0
 [ 1460.553112] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 [ 1460.554836] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 [ 1460.556517] Call Trace:
 [ 1460.557231]  ? fill_res_counter_entry+0x6ee/0x1020 [ib_core]
 [ 1460.558885]  ? mutex_lock_io_nested+0x1130/0x1130
 [ 1460.560082]  ? fill_res_counter_entry+0x5c7/0x1020 [ib_core]
 [ 1460.561587]  ? lock_downgrade+0x6e0/0x6e0
 [ 1460.562646]  ? memcpy+0x39/0x60
 [ 1460.563529]  ? nla_put+0x15f/0x1c0
 [ 1460.564447]  fill_res_counter_entry+0x6ee/0x1020 [ib_core]
 [ 1460.565906]  ? fill_res_srq_entry+0x940/0x940 [ib_core]
 [ 1460.567346]  ? rdma_restrack_count+0x440/0x440 [ib_core]
 [ 1460.568763]  ? memcpy+0x39/0x60
 [ 1460.569441]  ? nla_put+0x15f/0x1c0
 [ 1460.570226]  res_get_common_dumpit+0x907/0x10a0 [ib_core]
 [ 1460.571251]  ? fill_res_srq_entry+0x940/0x940 [ib_core]
 [ 1460.572292]  ? _nldev_get_dumpit+0x4c0/0x4c0 [ib_core]
 [ 1460.573526]  ? mark_lock+0xf7/0x2e60
 [ 1460.574451]  ? nla_get_range_signed+0x540/0x540
 [ 1460.575505]  ? mark_lock+0xf7/0x2e60
 [ 1460.576192]  ? lock_chain_count+0x20/0x20
 [ 1460.576935]  nldev_stat_get_dumpit+0x20a/0x290 [ib_core]
 [ 1460.577948]  ? res_get_common_dumpit+0x10a0/0x10a0 [ib_core]
 [ 1460.579026]  ? memset+0x20/0x40
 [ 1460.579642]  ? __build_skb_around+0x1f8/0x2b0
 [ 1460.580452]  ? netlink_skb_set_owner_r+0xc6/0x1e0
 [ 1460.581327]  netlink_dump+0x451/0x1040
 [ 1460.582032]  ? netlink_deliver_tap+0xb10/0xb10
 [ 1460.582863]  ? __netlink_dump_start+0x222/0x830
 [ 1460.583686]  __netlink_dump_start+0x583/0x830
 [ 1460.584487]  rdma_nl_rcv_msg+0x3f3/0x7c0 [ib_core]
 [ 1460.585445]  ? rdma_nl_chk_listeners+0xb0/0xb0 [ib_core]
 [ 1460.595904]  ? res_get_common_dumpit+0x10a0/0x10a0 [ib_core]
 [ 1460.596962]  ? netlink_deliver_tap+0xbc/0xb10
 [ 1460.597765]  rdma_nl_rcv+0x264/0x410 [ib_core]
 [ 1460.598631]  ? rdma_nl_rcv_msg+0x7c0/0x7c0 [ib_core]
 [ 1460.599612]  ? netlink_deliver_tap+0x140/0xb10
 [ 1460.600415]  ? netlink_deliver_tap+0x14c/0xb10
 [ 1460.601210]  ? _copy_from_iter+0x282/0xbe0
 [ 1460.601964]  netlink_unicast+0x433/0x700
 [ 1460.602691]  ? netlink_attachskb+0x740/0x740
 [ 1460.603486]  ? __alloc_skb+0x117/0x2c0
 [ 1460.604184]  netlink_sendmsg+0x707/0xbf0
 [ 1460.604912]  ? netlink_unicast+0x700/0x700
 [ 1460.605669]  ? netlink_unicast+0x700/0x700
 [ 1460.606414]  sock_sendmsg+0xb0/0xe0
 [ 1460.607097]  __sys_sendto+0x193/0x240
 [ 1460.607788]  ? __x64_sys_getpeername+0xb0/0xb0
 [ 1460.608589]  ? _copy_to_user+0x94/0xb0
 [ 1460.609292]  ? __x64_sys_connect+0xb0/0xb0
 [ 1460.610039]  ? __x64_sys_socketpair+0xf0/0xf0
 [ 1460.610864]  ? __sys_socket+0x11a/0x1a0
 [ 1460.611594]  ? sock_ioctl+0x610/0x610
 [ 1460.612297]  __x64_sys_sendto+0xdd/0x1b0
 [ 1460.613041]  ? syscall_enter_from_user_mode+0x1d/0x50
 [ 1460.613965]  do_syscall_64+0x3d/0x90
 [ 1460.614648]  entry_SYSCALL_64_after_hwframe+0x44/0xae
 [ 1460.615582] RIP: 0033:0x7f1499622cba
 [ 1460.616275] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 76 c3 0f 1f 44 00 00 55 48 83 ec 30 44 89 4c
 [ 1460.619476] RSP: 002b:00007fff4fd8e9a8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
 [ 1460.620843] RAX: ffffffffffffffda RBX: 00007fff4fd8ec70 RCX: 00007f1499622cba
 [ 1460.622088] RDX: 0000000000000028 RSI: 0000000000c43910 RDI: 0000000000000004
 [ 1460.623341] RBP: 00007fff4fd8ec70 R08: 00007f14996ee000 R09: 000000000000000c
 [ 1460.624577] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000c44940
 [ 1460.625818] R13: 0000000000000000 R14: 00007fff4fd8ec90 R15: 00007fff4fd8ec70
 [ 1460.627074] irq event stamp: 44063
 [ 1460.627736] hardirqs last  enabled at (44063): [<ffffffff83349157>] _raw_spin_unlock_irqrestore+0x47/0x50
 [ 1460.629415] hardirqs last disabled at (44062): [<ffffffff83348f50>] _raw_spin_lock_irqsave+0x50/0x60
 [ 1460.631072] softirqs last  enabled at (43932): [<ffffffff82a74532>] netlink_insert+0x172/0x11d0
 [ 1460.632707] softirqs last disabled at (43930): [<ffffffff828490db>] release_sock+0x1b/0x170
 [ 1460.634227] ---[ end trace 48c63b35e252a166 ]---

Fixes: e945130b52be ("IB/core: Protect against concurrent access to hardware stats")
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/infiniband/core/sysfs.c | 2 --
 drivers/infiniband/core/verbs.c | 1 +
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
index 8626dfbf2199..a3f84b50c46a 100644
--- a/drivers/infiniband/core/sysfs.c
+++ b/drivers/infiniband/core/sysfs.c
@@ -911,7 +911,6 @@ alloc_hw_stats_device(struct ib_device *ibdev)
 	if (!data->group.attrs)
 		goto err_free_data;
 
-	mutex_init(&stats->lock);
 	data->group.name = "hw_counters";
 	data->stats = stats;
 	return data;
@@ -1018,7 +1017,6 @@ alloc_hw_stats_port(struct ib_port *port, struct attribute_group *group)
 	if (!group->attrs)
 		goto err_free_data;
 
-	mutex_init(&stats->lock);
 	group->name = "hw_counters";
 	data->stats = stats;
 	return data;
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 47cf273d0678..692d5ff657df 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -3002,6 +3002,7 @@ struct rdma_hw_stats *rdma_alloc_hw_stats_struct(
 	stats->descs = descs;
 	stats->num_counters = num_counters;
 	stats->lifespan = msecs_to_jiffies(lifespan);
+	mutex_init(&stats->lock);
 
 	return stats;
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH rdma-rc 2/2] RDMA/core: Initialize lock when allocate a rdma_hw_stats structure
  2021-10-24  6:08 ` [PATCH rdma-rc 2/2] RDMA/core: Initialize lock when allocate a rdma_hw_stats structure Leon Romanovsky
@ 2021-10-25 14:50   ` Jason Gunthorpe
  2021-10-26  8:27     ` Leon Romanovsky
  0 siblings, 1 reply; 8+ messages in thread
From: Jason Gunthorpe @ 2021-10-25 14:50 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Mark Zhang, Ira Weiny, John Fleck, Kaike Wan,
	linux-kernel, linux-rdma, Maor Gottlieb, Mark Bloch, Mark Bloch

On Sun, Oct 24, 2021 at 09:08:21AM +0300, Leon Romanovsky wrote:
> From: Mark Zhang <markzhang@nvidia.com>
> 
> Initialize the rdma_hw_stats "lock" field when do allocation, to fix the
> warning below. Then we don't need to initialize it in sysfs, remove it.

This is a fine cleanup, but this does not describe the bug properly,
or have the right fixes line..

The issue is here:

static struct rdma_counter *alloc_and_bind(struct ib_device *dev, u32 port,
					   struct ib_qp *qp,
					   enum rdma_nl_counter_mode mode)
{
	counter->stats = dev->ops.counter_alloc_stats(counter);
	if (!counter->stats)
		goto err_stats;

Which does not init counter->stat's mutex.

And trim the oops reports, don't include the usless ? fns, timestamps
or other junk.

Jason

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH rdma-rc 1/2] RDMA/sa_query: Use strscpy_pad instead of memcpy to copy a string
  2021-10-24  6:08 ` [PATCH rdma-rc 1/2] RDMA/sa_query: Use strscpy_pad instead of memcpy to copy a string Leon Romanovsky
@ 2021-10-25 17:02   ` Jason Gunthorpe
  0 siblings, 0 replies; 8+ messages in thread
From: Jason Gunthorpe @ 2021-10-25 17:02 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Mark Zhang, Ira Weiny, John Fleck, Kaike Wan,
	linux-kernel, linux-rdma, Mark Bloch, Mark Bloch

On Sun, Oct 24, 2021 at 09:08:20AM +0300, Leon Romanovsky wrote:
> From: Mark Zhang <markzhang@nvidia.com>
> 
> When copy the device name, the length of data memcpy copied exceeds
> the length of the source buffer, which cause the KASAN issue below.
> Use strscpy_pad instead.
> 
>  BUG: KASAN: slab-out-of-bounds in ib_nl_set_path_rec_attrs+0x136/0x320 [ib_core]
>  Read of size 64 at addr ffff88811a10f5e0 by task rping/140263
>  CPU: 3 PID: 140263 Comm: rping Not tainted 5.15.0-rc1+ #1
>  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
>  Call Trace:
>  dump_stack_lvl+0x57/0x7d
>  print_address_description.constprop.0+0x1d/0xa0
>  kasan_report+0xcb/0x110
>  ? lock_downgrade+0xb0/0xc0
>  ? ib_nl_set_path_rec_attrs+0x136/0x320 [ib_core]
>  kasan_check_range+0x13d/0x180
>  memcpy+0x20/0x60
>  ib_nl_set_path_rec_attrs+0x136/0x320 [ib_core]
>  ? init_mad+0xf0/0xf0 [ib_core]
>  ? __nlmsg_put+0x9a/0xb0
>  ? ibnl_put_msg+0x90/0xd0 [ib_core]
>  ib_nl_make_request+0x1c6/0x380 [ib_core]
>  ? ib_nl_set_path_rec_attrs+0x320/0x320 [ib_core]
>  ? netlink_has_listeners+0x114/0x210
>  send_mad+0x20a/0x220 [ib_core]
>  ? ib_nl_make_request+0x380/0x380 [ib_core]
>  ? memcpy+0x39/0x60
>  ? value_read+0x20/0x80 [ib_core]
>  ? ib_pack+0x140/0x2a0 [ib_core]
>  ib_sa_path_rec_get+0x3e3/0x800 [ib_core]
>  ? alloc_mad+0x390/0x390 [ib_core]
>  ? __kasan_kmalloc+0x7c/0x90
>  ? rdma_resolve_route+0x37b/0x3e0 [rdma_cm]
>  ? ucma_resolve_route+0xe1/0x150 [rdma_ucm]
>  ? ucma_write+0x17b/0x1f0 [rdma_ucm]
>  ? vfs_write+0x142/0x4d0
>  ? ksys_write+0x133/0x160
>  ? do_syscall_64+0x43/0x90
>  ? entry_SYSCALL_64_after_hwframe+0x44/0xae
>  ? print_usage_bug+0x50/0x50
>  ? lock_downgrade+0xc0/0xc0
>  cma_query_ib_route+0x29b/0x390 [rdma_cm]
>  ? rdma_set_ib_path+0x150/0x150 [rdma_cm]
>  ? lockdep_hardirqs_on_prepare+0x12e/0x200
>  ? rdma_create_user_id+0x80/0x80 [rdma_cm]
>  ? rdma_resolve_route+0x37b/0x3e0 [rdma_cm]
>  ? rdma_resolve_route+0x308/0x3e0 [rdma_cm]
>  rdma_resolve_route+0x308/0x3e0 [rdma_cm]
>  ucma_resolve_route+0xe1/0x150 [rdma_ucm]
>  ? ucma_disconnect+0x140/0x140 [rdma_ucm]
>  ucma_write+0x17b/0x1f0 [rdma_ucm]
>  ? ucma_copy_ib_route+0x1a0/0x1a0 [rdma_ucm]
>  ? __fget_files+0x146/0x240
>  vfs_write+0x142/0x4d0
>  ksys_write+0x133/0x160
>  ? __ia32_sys_read+0x50/0x50
>  ? lockdep_hardirqs_on_prepare+0x12e/0x200
>  ? syscall_enter_from_user_mode+0x1d/0x50
>  do_syscall_64+0x43/0x90
>  entry_SYSCALL_64_after_hwframe+0x44/0xae
>  RIP: 0033:0x7f26499aa90f
>  Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 29 fd ff ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 5c fd ff ff 48
>  RSP: 002b:00007f26495f2dc0 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
>  RAX: ffffffffffffffda RBX: 00000000000007d0 RCX: 00007f26499aa90f
>  RDX: 0000000000000010 RSI: 00007f26495f2e00 RDI: 0000000000000003
>  RBP: 00005632a8315440 R08: 0000000000000000 R09: 0000000000000001
>  R10: 0000000000000000 R11: 0000000000000293 R12: 00007f26495f2e00
>  R13: 00005632a83154e0 R14: 00005632a8315440 R15: 00005632a830a810
> 
>  Allocated by task 131419:
>  kasan_save_stack+0x1b/0x40
>  __kasan_kmalloc+0x7c/0x90
>  proc_self_get_link+0x8b/0x100
>  pick_link+0x4f1/0x5c0
>  step_into+0x2eb/0x3d0
>  walk_component+0xc8/0x2c0
>  link_path_walk+0x3b8/0x580
>  path_openat+0x101/0x230
>  do_filp_open+0x12e/0x240
>  do_sys_openat2+0x115/0x280
>  __x64_sys_openat+0xce/0x140
>  do_syscall_64+0x43/0x90
>  entry_SYSCALL_64_after_hwframe+0x44/0xae
> 
>  The buggy address belongs to the object at ffff88811a10f5e0
>  kmalloc-16 of size 16
>  The buggy address is located 0 bytes inside of
> 10f5e0, ffff88811a10f5f0)
>  The buggy address belongs to the page:
>  page:000000007b6da7b1 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88811a10f1e0 pfn:0x11a10f
>  flags: 0x8000000000000200(slab|zone=2)
>  raw: 8000000000000200 ffffea0004463040 0000001200000012 ffff8881000423c0
>  raw: ffff88811a10f1e0 000000008080007f 00000001ffffffff 0000000000000000
>  page dumped because: kasan: bad access detected
> 
>  Memory state around the buggy address:
>  ffff88811a10f480: fa fb fc fc fa fb fc fc fa fb fc fc fa fb fc fc
>  ffff88811a10f500: fa fb fc fc fa fb fc fc 00 00 fc fc 00 00 fc fc
>  >ffff88811a10f580: 00 00 fc fc fa fb fc fc 00 00 fc fc 00 00 fc fc
>  ^
>  ffff88811a10f600: 00 00 fc fc fa fb fc fc fa fb fc fc 00 00 fc fc
>  ffff88811a10f680: 00 00 fc fc 00 00 fc fc fa fb fc fc 00 00 fc fc
> 
> Fixes: 2ca546b92a02 ("IB/sa: Route SA pathrecord query through netlink")
> Signed-off-by: Mark Zhang <markzhang@nvidia.com>
> Reviewed-by: Mark Bloch <mbloch@nvidia.com>
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> ---
>  drivers/infiniband/core/sa_query.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)

Applied to for-rc, thanks

Jason

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH rdma-rc 2/2] RDMA/core: Initialize lock when allocate a rdma_hw_stats structure
  2021-10-25 14:50   ` Jason Gunthorpe
@ 2021-10-26  8:27     ` Leon Romanovsky
  2021-10-26 12:05       ` Jason Gunthorpe
  0 siblings, 1 reply; 8+ messages in thread
From: Leon Romanovsky @ 2021-10-26  8:27 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Doug Ledford, Mark Zhang, Ira Weiny, John Fleck, Kaike Wan,
	linux-kernel, linux-rdma, Maor Gottlieb, Mark Bloch, Mark Bloch

On Mon, Oct 25, 2021 at 11:50:43AM -0300, Jason Gunthorpe wrote:
> On Sun, Oct 24, 2021 at 09:08:21AM +0300, Leon Romanovsky wrote:
> > From: Mark Zhang <markzhang@nvidia.com>
> > 
> > Initialize the rdma_hw_stats "lock" field when do allocation, to fix the
> > warning below. Then we don't need to initialize it in sysfs, remove it.
> 
> This is a fine cleanup, but this does not describe the bug properly,
> or have the right fixes line..

I think that this Fixes line should be instead.
Fixes: 0a0800ce2a6a ("RDMA/core: Add a helper API rdma_free_hw_stats_struct")

> 
> The issue is here:
> 
> static struct rdma_counter *alloc_and_bind(struct ib_device *dev, u32 port,
> 					   struct ib_qp *qp,
> 					   enum rdma_nl_counter_mode mode)
> {
> 	counter->stats = dev->ops.counter_alloc_stats(counter);
> 	if (!counter->stats)
> 		goto err_stats;
> 
> Which does not init counter->stat's mutex.


This is exactly what Mark is doing here.

alloc_and_bind()
 -> dev->ops.counter_alloc_stats
  -> mlx5_ib_counter_alloc_stats
   -> do_alloc_stats()
    -> rdma_alloc_hw_stats_struct()
     -> mutex_init(&stats->lock); <- Mark's change.

> 
> And trim the oops reports, don't include the usless ? fns, timestamps
> or other junk.

I don't like when people "beatify" kernel reports, many times whey are
removing too much information.

Thanks

> 
> Jason

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH rdma-rc 2/2] RDMA/core: Initialize lock when allocate a rdma_hw_stats structure
  2021-10-26  8:27     ` Leon Romanovsky
@ 2021-10-26 12:05       ` Jason Gunthorpe
  2021-10-26 12:21         ` Leon Romanovsky
  0 siblings, 1 reply; 8+ messages in thread
From: Jason Gunthorpe @ 2021-10-26 12:05 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Mark Zhang, Ira Weiny, John Fleck, Kaike Wan,
	linux-kernel, linux-rdma, Maor Gottlieb, Mark Bloch, Mark Bloch

On Tue, Oct 26, 2021 at 11:27:57AM +0300, Leon Romanovsky wrote:
> On Mon, Oct 25, 2021 at 11:50:43AM -0300, Jason Gunthorpe wrote:
> > On Sun, Oct 24, 2021 at 09:08:21AM +0300, Leon Romanovsky wrote:
> > > From: Mark Zhang <markzhang@nvidia.com>
> > > 
> > > Initialize the rdma_hw_stats "lock" field when do allocation, to fix the
> > > warning below. Then we don't need to initialize it in sysfs, remove it.
> > 
> > This is a fine cleanup, but this does not describe the bug properly,
> > or have the right fixes line..
> 
> I think that this Fixes line should be instead.
> Fixes: 0a0800ce2a6a ("RDMA/core: Add a helper API rdma_free_hw_stats_struct")

No, I don't think so, it should be the commit that added
alloc_and_bind()

> > The issue is here:
> > 
> > static struct rdma_counter *alloc_and_bind(struct ib_device *dev, u32 port,
> > 					   struct ib_qp *qp,
> > 					   enum rdma_nl_counter_mode mode)
> > {
> > 	counter->stats = dev->ops.counter_alloc_stats(counter);
> > 	if (!counter->stats)
> > 		goto err_stats;
> > 
> > Which does not init counter->stat's mutex.
> 
> This is exactly what Mark is doing here.
> 
> alloc_and_bind()
>  -> dev->ops.counter_alloc_stats
>   -> mlx5_ib_counter_alloc_stats
>    -> do_alloc_stats()
>     -> rdma_alloc_hw_stats_struct()
>      -> mutex_init(&stats->lock); <- Mark's change.

Yes, I know, the patch is fine, the commit message just needs to be
accurate
 
> > And trim the oops reports, don't include the usless ? fns, timestamps
> > or other junk.
> 
> I don't like when people "beatify" kernel reports, many times whey are
> removing too much information.

There is too much junk in the raw oops messages

Jason

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH rdma-rc 2/2] RDMA/core: Initialize lock when allocate a rdma_hw_stats structure
  2021-10-26 12:05       ` Jason Gunthorpe
@ 2021-10-26 12:21         ` Leon Romanovsky
  0 siblings, 0 replies; 8+ messages in thread
From: Leon Romanovsky @ 2021-10-26 12:21 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Doug Ledford, Mark Zhang, Ira Weiny, John Fleck, Kaike Wan,
	linux-kernel, linux-rdma, Maor Gottlieb, Mark Bloch, Mark Bloch

On Tue, Oct 26, 2021 at 09:05:54AM -0300, Jason Gunthorpe wrote:
> On Tue, Oct 26, 2021 at 11:27:57AM +0300, Leon Romanovsky wrote:
> > On Mon, Oct 25, 2021 at 11:50:43AM -0300, Jason Gunthorpe wrote:
> > > On Sun, Oct 24, 2021 at 09:08:21AM +0300, Leon Romanovsky wrote:
> > > > From: Mark Zhang <markzhang@nvidia.com>
> > > > 
> > > > Initialize the rdma_hw_stats "lock" field when do allocation, to fix the
> > > > warning below. Then we don't need to initialize it in sysfs, remove it.
> > > 
> > > This is a fine cleanup, but this does not describe the bug properly,
> > > or have the right fixes line..
> > 
> > I think that this Fixes line should be instead.
> > Fixes: 0a0800ce2a6a ("RDMA/core: Add a helper API rdma_free_hw_stats_struct")
> 
> No, I don't think so, it should be the commit that added
> alloc_and_bind()

The alloc_and_bind() is a merge between other functions. The issue existed
even before. It is worth to stop at some point to dig and IMHO proposed Fixes
line is good enough (at least for me).

> 
> > > The issue is here:
> > > 
> > > static struct rdma_counter *alloc_and_bind(struct ib_device *dev, u32 port,
> > > 					   struct ib_qp *qp,
> > > 					   enum rdma_nl_counter_mode mode)
> > > {
> > > 	counter->stats = dev->ops.counter_alloc_stats(counter);
> > > 	if (!counter->stats)
> > > 		goto err_stats;
> > > 
> > > Which does not init counter->stat's mutex.
> > 
> > This is exactly what Mark is doing here.
> > 
> > alloc_and_bind()
> >  -> dev->ops.counter_alloc_stats
> >   -> mlx5_ib_counter_alloc_stats
> >    -> do_alloc_stats()
> >     -> rdma_alloc_hw_stats_struct()
> >      -> mutex_init(&stats->lock); <- Mark's change.
> 
> Yes, I know, the patch is fine, the commit message just needs to be
> accurate
>  
> > > And trim the oops reports, don't include the usless ? fns, timestamps
> > > or other junk.
> > 
> > I don't like when people "beatify" kernel reports, many times whey are
> > removing too much information.
> 
> There is too much junk in the raw oops messages

Right, but it is better to have more info than to much trimmed one.

Thanks

> 
> Jason

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-10-26 12:21 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-24  6:08 [PATCH rdma-rc 0/2] Two IB/core fixes Leon Romanovsky
2021-10-24  6:08 ` [PATCH rdma-rc 1/2] RDMA/sa_query: Use strscpy_pad instead of memcpy to copy a string Leon Romanovsky
2021-10-25 17:02   ` Jason Gunthorpe
2021-10-24  6:08 ` [PATCH rdma-rc 2/2] RDMA/core: Initialize lock when allocate a rdma_hw_stats structure Leon Romanovsky
2021-10-25 14:50   ` Jason Gunthorpe
2021-10-26  8:27     ` Leon Romanovsky
2021-10-26 12:05       ` Jason Gunthorpe
2021-10-26 12:21         ` Leon Romanovsky

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.