linux-rdma.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [syzbot] [rdma?] INFO: trying to register non-static key in rxe_cleanup_task (2)
@ 2023-02-28 15:14 syzbot
  2023-04-01  2:09 ` Zhu Yanjun
  0 siblings, 1 reply; 7+ messages in thread
From: syzbot @ 2023-02-28 15:14 UTC (permalink / raw)
  To: jgg, leon, linux-kernel, linux-rdma, syzkaller-bugs, zyjzyj2000

Hello,

syzbot found the following issue on:

HEAD commit:    982818426a0f Merge tag 'arm-fixes-6.3-1' of git://git.kern..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1194800f480000
kernel config:  https://syzkaller.appspot.com/x/.config?x=90a4de3f96747e3f
dashboard link: https://syzkaller.appspot.com/bug?extid=cfcc1a3c85be15a40cba
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/225d8c8e9264/disk-98281842.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/87a9e2a89842/vmlinux-98281842.xz
kernel image: https://storage.googleapis.com/syzbot-assets/39bdeb741f2e/bzImage-98281842.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+cfcc1a3c85be15a40cba@syzkaller.appspotmail.com

Node 1 hugepages_total=2 hugepages_free=2 hugepages_surp=0 hugepages_size=2048kB
14586 total pagecache pages
0 pages in swap cache
Free swap  = 0kB
Total swap = 0kB
2097051 pages RAM
0 pages HighMem/MovableOnly
392145 pages reserved
0 pages cma reserved
INFO: trying to register non-static key.
The code is fine but needs lockdep annotation, or maybe
you didn't initialize this object before use?
turning off the locking correctness validator.
CPU: 1 PID: 5486 Comm: syz-executor.2 Not tainted 6.2.0-syzkaller-12765-g982818426a0f #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/16/2023
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xd9/0x150 lib/dump_stack.c:106
 assign_lock_key kernel/locking/lockdep.c:982 [inline]
 register_lock_class+0xdb6/0x1120 kernel/locking/lockdep.c:1295
 __lock_acquire+0x108/0x5d40 kernel/locking/lockdep.c:4935
 lock_acquire kernel/locking/lockdep.c:5669 [inline]
 lock_acquire+0x1e3/0x670 kernel/locking/lockdep.c:5634
 __raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline]
 _raw_spin_lock_bh+0x33/0x40 kernel/locking/spinlock.c:178
 spin_lock_bh include/linux/spinlock.h:355 [inline]
 rxe_cleanup_task+0x73/0xc0 drivers/infiniband/sw/rxe/rxe_task.c:119
 rxe_qp_do_cleanup+0x8c/0x7c0 drivers/infiniband/sw/rxe/rxe_qp.c:776
 execute_in_process_context+0x3b/0x150 kernel/workqueue.c:3458
 __rxe_cleanup+0x21e/0x370 drivers/infiniband/sw/rxe/rxe_pool.c:233
 rxe_create_qp+0x2c2/0x340 drivers/infiniband/sw/rxe/rxe_verbs.c:430
 create_qp+0x5ac/0x970 drivers/infiniband/core/verbs.c:1233
 ib_create_qp_kernel+0xa1/0x310 drivers/infiniband/core/verbs.c:1344
 ib_create_qp include/rdma/ib_verbs.h:3743 [inline]
 create_mad_qp+0x177/0x380 drivers/infiniband/core/mad.c:2905
 ib_mad_port_open drivers/infiniband/core/mad.c:2986 [inline]
 ib_mad_init_device+0xf40/0x1a90 drivers/infiniband/core/mad.c:3077
 add_client_context+0x405/0x5e0 drivers/infiniband/core/device.c:721
 enable_device_and_get+0x1cd/0x3b0 drivers/infiniband/core/device.c:1332
 ib_register_device drivers/infiniband/core/device.c:1420 [inline]
 ib_register_device+0x8b1/0xbc0 drivers/infiniband/core/device.c:1366
 rxe_register_device+0x317/0x3f0 drivers/infiniband/sw/rxe/rxe_verbs.c:1096
 rxe_net_add+0x90/0xf0 drivers/infiniband/sw/rxe/rxe_net.c:524
 rxe_newlink+0xd5/0x140 drivers/infiniband/sw/rxe/rxe.c:195
 nldev_newlink+0x332/0x5e0 drivers/infiniband/core/nldev.c:1731
 rdma_nl_rcv_msg+0x371/0x6a0 drivers/infiniband/core/netlink.c:195
 rdma_nl_rcv_skb.constprop.0.isra.0+0x2fc/0x440 drivers/infiniband/core/netlink.c:239
 netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline]
 netlink_unicast+0x547/0x7f0 net/netlink/af_netlink.c:1365
 netlink_sendmsg+0x925/0xe30 net/netlink/af_netlink.c:1942
 sock_sendmsg_nosec net/socket.c:722 [inline]
 sock_sendmsg+0xde/0x190 net/socket.c:745
 ____sys_sendmsg+0x71c/0x900 net/socket.c:2504
 ___sys_sendmsg+0x110/0x1b0 net/socket.c:2558
 __sys_sendmsg+0xf7/0x1c0 net/socket.c:2587
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fd95868c0f9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fd9594b3168 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007fd9587abf80 RCX: 00007fd95868c0f9
RDX: 0000000000000000 RSI: 0000000020000200 RDI: 0000000000000003
RBP: 00007fd9586e7ae9 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fff9db7c9ef R14: 00007fd9594b3300 R15: 0000000000022000
 </TASK>
infiniband syz1: Couldn't create ib_mad QP1
infiniband syz1: Couldn't open port 1
RDS/IB: syz1: added
smc: adding ib device syz1 with port count 1
smc:    ib device syz1 port 1 has pnetid 
syz-executor.2 (5486) used greatest stack depth: 22840 bytes left


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [syzbot] [rdma?] INFO: trying to register non-static key in rxe_cleanup_task (2)
  2023-02-28 15:14 [syzbot] [rdma?] INFO: trying to register non-static key in rxe_cleanup_task (2) syzbot
@ 2023-04-01  2:09 ` Zhu Yanjun
  2023-04-01  2:44   ` [PATCH 1/1] RDMA/rxe: Fix the error "trying to register non-static key in rxe_cleanup_task" Zhu Yanjun
  0 siblings, 1 reply; 7+ messages in thread
From: Zhu Yanjun @ 2023-04-01  2:09 UTC (permalink / raw)
  To: syzbot, jgg, leon, linux-kernel, linux-rdma, syzkaller-bugs, zyjzyj2000

在 2023/2/28 23:14, syzbot 写道:
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:    982818426a0f Merge tag 'arm-fixes-6.3-1' of git://git.kern..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1194800f480000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=90a4de3f96747e3f
> dashboard link: https://syzkaller.appspot.com/bug?extid=cfcc1a3c85be15a40cba
> compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> 
> Unfortunately, I don't have any reproducer for this issue yet.
> 
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/225d8c8e9264/disk-98281842.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/87a9e2a89842/vmlinux-98281842.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/39bdeb741f2e/bzImage-98281842.xz
> 
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+cfcc1a3c85be15a40cba@syzkaller.appspotmail.com
> 
> Node 1 hugepages_total=2 hugepages_free=2 hugepages_surp=0 hugepages_size=2048kB
> 14586 total pagecache pages
> 0 pages in swap cache
> Free swap  = 0kB
> Total swap = 0kB
> 2097051 pages RAM
> 0 pages HighMem/MovableOnly
> 392145 pages reserved
> 0 pages cma reserved
> INFO: trying to register non-static key.
> The code is fine but needs lockdep annotation, or maybe
> you didn't initialize this object before use?
> turning off the locking correctness validator.
> CPU: 1 PID: 5486 Comm: syz-executor.2 Not tainted 6.2.0-syzkaller-12765-g982818426a0f #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/16/2023
> Call Trace:
>   <TASK>
>   __dump_stack lib/dump_stack.c:88 [inline]
>   dump_stack_lvl+0xd9/0x150 lib/dump_stack.c:106
>   assign_lock_key kernel/locking/lockdep.c:982 [inline]
>   register_lock_class+0xdb6/0x1120 kernel/locking/lockdep.c:1295
>   __lock_acquire+0x108/0x5d40 kernel/locking/lockdep.c:4935
>   lock_acquire kernel/locking/lockdep.c:5669 [inline]
>   lock_acquire+0x1e3/0x670 kernel/locking/lockdep.c:5634
>   __raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline]
>   _raw_spin_lock_bh+0x33/0x40 kernel/locking/spinlock.c:178
>   spin_lock_bh include/linux/spinlock.h:355 [inline]
>   rxe_cleanup_task+0x73/0xc0 drivers/infiniband/sw/rxe/rxe_task.c:119
>   rxe_qp_do_cleanup+0x8c/0x7c0 drivers/infiniband/sw/rxe/rxe_qp.c:776
>   execute_in_process_context+0x3b/0x150 kernel/workqueue.c:3458
>   __rxe_cleanup+0x21e/0x370 drivers/infiniband/sw/rxe/rxe_pool.c:233
>   rxe_create_qp+0x2c2/0x340 drivers/infiniband/sw/rxe/rxe_verbs.c:430

It took me a lot of time to reproduce this problem. And finally I can 
reproduce this problem.

The root cause is:

In this function rxe_create_qp, if some problem occur in 
rxe_qp_init_req, rxe_init_task functions are not executed.
But error handler of rxe_create_qp still calls rxe_cleanup_task. This 
problem will occur.

static int rxe_create_qp
{
   ...
   err = rxe_qp_from_init {
         ...
         err = rxe_qp_init_req {
         ... some errors occur <---This will go to the error handler of 
rxe_create_qp
         rxe_init_task(&qp->req.task, qp, rxe_requester); <---It is not 
executed.
         rxe_init_task(&qp->comp.task, qp, rxe_completer);<---It is not 
executed.
         ...
         }
         if (err)
           goto err1;
         ...
         err = rxe_qp_init_resp{
         ...
         rxe_init_task(&qp->resp.task, qp, rxe_responder);<---It is not 
executed.
         ...
         }
         if (err)
            goto err2;
         }
         if (err)
                 goto qp_init;
...
qp_init:
         rxe_cleanup(qp);  ---> rxe_cleanup_task
         return err;
}

Zhu Yanjun

>   create_qp+0x5ac/0x970 drivers/infiniband/core/verbs.c:1233
>   ib_create_qp_kernel+0xa1/0x310 drivers/infiniband/core/verbs.c:1344
>   ib_create_qp include/rdma/ib_verbs.h:3743 [inline]
>   create_mad_qp+0x177/0x380 drivers/infiniband/core/mad.c:2905
>   ib_mad_port_open drivers/infiniband/core/mad.c:2986 [inline]
>   ib_mad_init_device+0xf40/0x1a90 drivers/infiniband/core/mad.c:3077
>   add_client_context+0x405/0x5e0 drivers/infiniband/core/device.c:721
>   enable_device_and_get+0x1cd/0x3b0 drivers/infiniband/core/device.c:1332
>   ib_register_device drivers/infiniband/core/device.c:1420 [inline]
>   ib_register_device+0x8b1/0xbc0 drivers/infiniband/core/device.c:1366
>   rxe_register_device+0x317/0x3f0 drivers/infiniband/sw/rxe/rxe_verbs.c:1096
>   rxe_net_add+0x90/0xf0 drivers/infiniband/sw/rxe/rxe_net.c:524
>   rxe_newlink+0xd5/0x140 drivers/infiniband/sw/rxe/rxe.c:195
>   nldev_newlink+0x332/0x5e0 drivers/infiniband/core/nldev.c:1731
>   rdma_nl_rcv_msg+0x371/0x6a0 drivers/infiniband/core/netlink.c:195
>   rdma_nl_rcv_skb.constprop.0.isra.0+0x2fc/0x440 drivers/infiniband/core/netlink.c:239
>   netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline]
>   netlink_unicast+0x547/0x7f0 net/netlink/af_netlink.c:1365
>   netlink_sendmsg+0x925/0xe30 net/netlink/af_netlink.c:1942
>   sock_sendmsg_nosec net/socket.c:722 [inline]
>   sock_sendmsg+0xde/0x190 net/socket.c:745
>   ____sys_sendmsg+0x71c/0x900 net/socket.c:2504
>   ___sys_sendmsg+0x110/0x1b0 net/socket.c:2558
>   __sys_sendmsg+0xf7/0x1c0 net/socket.c:2587
>   do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>   do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
>   entry_SYSCALL_64_after_hwframe+0x63/0xcd
> RIP: 0033:0x7fd95868c0f9
> Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
> RSP: 002b:00007fd9594b3168 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> RAX: ffffffffffffffda RBX: 00007fd9587abf80 RCX: 00007fd95868c0f9
> RDX: 0000000000000000 RSI: 0000000020000200 RDI: 0000000000000003
> RBP: 00007fd9586e7ae9 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> R13: 00007fff9db7c9ef R14: 00007fd9594b3300 R15: 0000000000022000
>   </TASK>
> infiniband syz1: Couldn't create ib_mad QP1
> infiniband syz1: Couldn't open port 1
> RDS/IB: syz1: added
> smc: adding ib device syz1 with port count 1
> smc:    ib device syz1 port 1 has pnetid
> syz-executor.2 (5486) used greatest stack depth: 22840 bytes left
> 
> 
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
> 
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/1] RDMA/rxe: Fix the error "trying to register non-static key in rxe_cleanup_task"
  2023-04-01  2:09 ` Zhu Yanjun
@ 2023-04-01  2:44   ` Zhu Yanjun
  2023-04-03 18:10     ` Leon Romanovsky
  0 siblings, 1 reply; 7+ messages in thread
From: Zhu Yanjun @ 2023-04-01  2:44 UTC (permalink / raw)
  To: zyjzyj2000, jgg, leon, linux-rdma; +Cc: Zhu Yanjun, syzbot+cfcc1a3c85be15a40cba

From: Zhu Yanjun <yanjun.zhu@linux.dev>

In the function rxe_create_qp(), rxe_qp_from_init() is called to
initialize qp, internally things like rxe_init_task are not setup until
rxe_qp_init_req().

If an error occures before this point then the unwind will call
rxe_cleanup() and eventually to rxe_qp_do_cleanup()/rxe_cleanup_task()
which will oops when trying to access the uninitialized spinlock.

If rxe_init_task is not executed, rxe_cleanup_task will not be called.

Reported-by: syzbot+cfcc1a3c85be15a40cba@syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?id=fd85757b74b3eb59f904138486f755f71e090df8

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Fixes: 2d4b21e0a291 ("IB/rxe: Prevent from completer to operate on non valid QP")
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
---
 drivers/infiniband/sw/rxe/rxe_qp.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c
index ab72db68b58f..7856c02c1b46 100644
--- a/drivers/infiniband/sw/rxe/rxe_qp.c
+++ b/drivers/infiniband/sw/rxe/rxe_qp.c
@@ -176,6 +176,10 @@ static void rxe_qp_init_misc(struct rxe_dev *rxe, struct rxe_qp *qp,
 	spin_lock_init(&qp->rq.producer_lock);
 	spin_lock_init(&qp->rq.consumer_lock);
 
+	memset(&qp->req.task, 0, sizeof(struct rxe_task));
+	memset(&qp->comp.task, 0, sizeof(struct rxe_task));
+	memset(&qp->resp.task, 0, sizeof(struct rxe_task));
+
 	atomic_set(&qp->ssn, 0);
 	atomic_set(&qp->skb_out, 0);
 }
@@ -773,15 +777,20 @@ static void rxe_qp_do_cleanup(struct work_struct *work)
 
 	qp->valid = 0;
 	qp->qp_timeout_jiffies = 0;
-	rxe_cleanup_task(&qp->resp.task);
+
+	if (qp->resp.task.func)
+		rxe_cleanup_task(&qp->resp.task);
 
 	if (qp_type(qp) == IB_QPT_RC) {
 		del_timer_sync(&qp->retrans_timer);
 		del_timer_sync(&qp->rnr_nak_timer);
 	}
 
-	rxe_cleanup_task(&qp->req.task);
-	rxe_cleanup_task(&qp->comp.task);
+	if (qp->req.task.func)
+		rxe_cleanup_task(&qp->req.task);
+
+	if (qp->comp.task.func)
+		rxe_cleanup_task(&qp->comp.task);
 
 	/* flush out any receive wr's or pending requests */
 	if (qp->req.task.func)
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/1] RDMA/rxe: Fix the error "trying to register non-static key in rxe_cleanup_task"
  2023-04-01  2:44   ` [PATCH 1/1] RDMA/rxe: Fix the error "trying to register non-static key in rxe_cleanup_task" Zhu Yanjun
@ 2023-04-03 18:10     ` Leon Romanovsky
  2023-04-04  0:13       ` Zhu Yanjun
  0 siblings, 1 reply; 7+ messages in thread
From: Leon Romanovsky @ 2023-04-03 18:10 UTC (permalink / raw)
  To: Zhu Yanjun
  Cc: zyjzyj2000, jgg, linux-rdma, Zhu Yanjun, syzbot+cfcc1a3c85be15a40cba

On Sat, Apr 01, 2023 at 10:44:17AM +0800, Zhu Yanjun wrote:
> From: Zhu Yanjun <yanjun.zhu@linux.dev>
> 
> In the function rxe_create_qp(), rxe_qp_from_init() is called to
> initialize qp, internally things like rxe_init_task are not setup until
> rxe_qp_init_req().
> 
> If an error occures before this point then the unwind will call
> rxe_cleanup() and eventually to rxe_qp_do_cleanup()/rxe_cleanup_task()
> which will oops when trying to access the uninitialized spinlock.
> 
> If rxe_init_task is not executed, rxe_cleanup_task will not be called.
> 
> Reported-by: syzbot+cfcc1a3c85be15a40cba@syzkaller.appspotmail.com
> Link: https://syzkaller.appspot.com/bug?id=fd85757b74b3eb59f904138486f755f71e090df8
> 
> Fixes: 8700e3e7c485 ("Soft RoCE driver")
> Fixes: 2d4b21e0a291 ("IB/rxe: Prevent from completer to operate on non valid QP")
> Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
> ---
>  drivers/infiniband/sw/rxe/rxe_qp.c | 15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c
> index ab72db68b58f..7856c02c1b46 100644
> --- a/drivers/infiniband/sw/rxe/rxe_qp.c
> +++ b/drivers/infiniband/sw/rxe/rxe_qp.c
> @@ -176,6 +176,10 @@ static void rxe_qp_init_misc(struct rxe_dev *rxe, struct rxe_qp *qp,
>  	spin_lock_init(&qp->rq.producer_lock);
>  	spin_lock_init(&qp->rq.consumer_lock);
>  
> +	memset(&qp->req.task, 0, sizeof(struct rxe_task));
> +	memset(&qp->comp.task, 0, sizeof(struct rxe_task));
> +	memset(&qp->resp.task, 0, sizeof(struct rxe_task));

IMHO QP is already zeroed here.

Please don't send patches as reply-to.

Thanks

> +
>  	atomic_set(&qp->ssn, 0);
>  	atomic_set(&qp->skb_out, 0);
>  }
> @@ -773,15 +777,20 @@ static void rxe_qp_do_cleanup(struct work_struct *work)
>  
>  	qp->valid = 0;
>  	qp->qp_timeout_jiffies = 0;
> -	rxe_cleanup_task(&qp->resp.task);
> +
> +	if (qp->resp.task.func)
> +		rxe_cleanup_task(&qp->resp.task);
>  
>  	if (qp_type(qp) == IB_QPT_RC) {
>  		del_timer_sync(&qp->retrans_timer);
>  		del_timer_sync(&qp->rnr_nak_timer);
>  	}
>  
> -	rxe_cleanup_task(&qp->req.task);
> -	rxe_cleanup_task(&qp->comp.task);
> +	if (qp->req.task.func)
> +		rxe_cleanup_task(&qp->req.task);
> +
> +	if (qp->comp.task.func)
> +		rxe_cleanup_task(&qp->comp.task);
>  
>  	/* flush out any receive wr's or pending requests */
>  	if (qp->req.task.func)
> -- 
> 2.27.0
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/1] RDMA/rxe: Fix the error "trying to register non-static key in rxe_cleanup_task"
  2023-04-03 18:10     ` Leon Romanovsky
@ 2023-04-04  0:13       ` Zhu Yanjun
  2023-04-04  5:58         ` Leon Romanovsky
  0 siblings, 1 reply; 7+ messages in thread
From: Zhu Yanjun @ 2023-04-04  0:13 UTC (permalink / raw)
  To: Leon Romanovsky, Zhu Yanjun
  Cc: zyjzyj2000, jgg, linux-rdma, syzbot+cfcc1a3c85be15a40cba


在 2023/4/4 2:10, Leon Romanovsky 写道:
> On Sat, Apr 01, 2023 at 10:44:17AM +0800, Zhu Yanjun wrote:
>> From: Zhu Yanjun <yanjun.zhu@linux.dev>
>>
>> In the function rxe_create_qp(), rxe_qp_from_init() is called to
>> initialize qp, internally things like rxe_init_task are not setup until
>> rxe_qp_init_req().
>>
>> If an error occures before this point then the unwind will call
>> rxe_cleanup() and eventually to rxe_qp_do_cleanup()/rxe_cleanup_task()
>> which will oops when trying to access the uninitialized spinlock.
>>
>> If rxe_init_task is not executed, rxe_cleanup_task will not be called.
>>
>> Reported-by: syzbot+cfcc1a3c85be15a40cba@syzkaller.appspotmail.com
>> Link: https://syzkaller.appspot.com/bug?id=fd85757b74b3eb59f904138486f755f71e090df8
>>
>> Fixes: 8700e3e7c485 ("Soft RoCE driver")
>> Fixes: 2d4b21e0a291 ("IB/rxe: Prevent from completer to operate on non valid QP")
>> Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
>> ---
>>   drivers/infiniband/sw/rxe/rxe_qp.c | 15 ++++++++++++---
>>   1 file changed, 12 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c
>> index ab72db68b58f..7856c02c1b46 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_qp.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_qp.c
>> @@ -176,6 +176,10 @@ static void rxe_qp_init_misc(struct rxe_dev *rxe, struct rxe_qp *qp,
>>   	spin_lock_init(&qp->rq.producer_lock);
>>   	spin_lock_init(&qp->rq.consumer_lock);
>>   
>> +	memset(&qp->req.task, 0, sizeof(struct rxe_task));
>> +	memset(&qp->comp.task, 0, sizeof(struct rxe_task));
>> +	memset(&qp->resp.task, 0, sizeof(struct rxe_task));
> IMHO QP is already zeroed here.

Sure. Exactly. Here I just confirm that req.task, comp.task and 
resp.task are zeroed explicitly.

If you think it had better remove these memset functions, I will follow 
your advice.

Please let me know your advice.

> Please don't send patches as reply-to.

Got it. I will follow your advice.

Thanks,

Zhu Yanjun

>
> Thanks
>
>> +
>>   	atomic_set(&qp->ssn, 0);
>>   	atomic_set(&qp->skb_out, 0);
>>   }
>> @@ -773,15 +777,20 @@ static void rxe_qp_do_cleanup(struct work_struct *work)
>>   
>>   	qp->valid = 0;
>>   	qp->qp_timeout_jiffies = 0;
>> -	rxe_cleanup_task(&qp->resp.task);
>> +
>> +	if (qp->resp.task.func)
>> +		rxe_cleanup_task(&qp->resp.task);
>>   
>>   	if (qp_type(qp) == IB_QPT_RC) {
>>   		del_timer_sync(&qp->retrans_timer);
>>   		del_timer_sync(&qp->rnr_nak_timer);
>>   	}
>>   
>> -	rxe_cleanup_task(&qp->req.task);
>> -	rxe_cleanup_task(&qp->comp.task);
>> +	if (qp->req.task.func)
>> +		rxe_cleanup_task(&qp->req.task);
>> +
>> +	if (qp->comp.task.func)
>> +		rxe_cleanup_task(&qp->comp.task);
>>   
>>   	/* flush out any receive wr's or pending requests */
>>   	if (qp->req.task.func)
>> -- 
>> 2.27.0
>>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/1] RDMA/rxe: Fix the error "trying to register non-static key in rxe_cleanup_task"
  2023-04-04  0:13       ` Zhu Yanjun
@ 2023-04-04  5:58         ` Leon Romanovsky
  2023-04-04  6:42           ` Zhu Yanjun
  0 siblings, 1 reply; 7+ messages in thread
From: Leon Romanovsky @ 2023-04-04  5:58 UTC (permalink / raw)
  To: Zhu Yanjun
  Cc: Zhu Yanjun, zyjzyj2000, jgg, linux-rdma, syzbot+cfcc1a3c85be15a40cba

On Tue, Apr 04, 2023 at 08:13:22AM +0800, Zhu Yanjun wrote:
> 
> 在 2023/4/4 2:10, Leon Romanovsky 写道:
> > On Sat, Apr 01, 2023 at 10:44:17AM +0800, Zhu Yanjun wrote:
> > > From: Zhu Yanjun <yanjun.zhu@linux.dev>
> > > 
> > > In the function rxe_create_qp(), rxe_qp_from_init() is called to
> > > initialize qp, internally things like rxe_init_task are not setup until
> > > rxe_qp_init_req().
> > > 
> > > If an error occures before this point then the unwind will call
> > > rxe_cleanup() and eventually to rxe_qp_do_cleanup()/rxe_cleanup_task()
> > > which will oops when trying to access the uninitialized spinlock.
> > > 
> > > If rxe_init_task is not executed, rxe_cleanup_task will not be called.
> > > 
> > > Reported-by: syzbot+cfcc1a3c85be15a40cba@syzkaller.appspotmail.com
> > > Link: https://syzkaller.appspot.com/bug?id=fd85757b74b3eb59f904138486f755f71e090df8
> > > 
> > > Fixes: 8700e3e7c485 ("Soft RoCE driver")
> > > Fixes: 2d4b21e0a291 ("IB/rxe: Prevent from completer to operate on non valid QP")
> > > Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
> > > ---
> > >   drivers/infiniband/sw/rxe/rxe_qp.c | 15 ++++++++++++---
> > >   1 file changed, 12 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c
> > > index ab72db68b58f..7856c02c1b46 100644
> > > --- a/drivers/infiniband/sw/rxe/rxe_qp.c
> > > +++ b/drivers/infiniband/sw/rxe/rxe_qp.c
> > > @@ -176,6 +176,10 @@ static void rxe_qp_init_misc(struct rxe_dev *rxe, struct rxe_qp *qp,
> > >   	spin_lock_init(&qp->rq.producer_lock);
> > >   	spin_lock_init(&qp->rq.consumer_lock);
> > > +	memset(&qp->req.task, 0, sizeof(struct rxe_task));
> > > +	memset(&qp->comp.task, 0, sizeof(struct rxe_task));
> > > +	memset(&qp->resp.task, 0, sizeof(struct rxe_task));
> > IMHO QP is already zeroed here.
> 
> Sure. Exactly. Here I just confirm that req.task, comp.task and resp.task
> are zeroed explicitly.

There is no need to do so. It is quite misleading to read the code and
see these memset() functions as they give false impression that QP is
not zeroed.

> 
> If you think it had better remove these memset functions, I will follow your
> advice.

Yes, please.

Thanks

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/1] RDMA/rxe: Fix the error "trying to register non-static key in rxe_cleanup_task"
  2023-04-04  5:58         ` Leon Romanovsky
@ 2023-04-04  6:42           ` Zhu Yanjun
  0 siblings, 0 replies; 7+ messages in thread
From: Zhu Yanjun @ 2023-04-04  6:42 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Zhu Yanjun, zyjzyj2000, jgg, linux-rdma, syzbot+cfcc1a3c85be15a40cba


在 2023/4/4 13:58, Leon Romanovsky 写道:
> On Tue, Apr 04, 2023 at 08:13:22AM +0800, Zhu Yanjun wrote:
>> 在 2023/4/4 2:10, Leon Romanovsky 写道:
>>> On Sat, Apr 01, 2023 at 10:44:17AM +0800, Zhu Yanjun wrote:
>>>> From: Zhu Yanjun <yanjun.zhu@linux.dev>
>>>>
>>>> In the function rxe_create_qp(), rxe_qp_from_init() is called to
>>>> initialize qp, internally things like rxe_init_task are not setup until
>>>> rxe_qp_init_req().
>>>>
>>>> If an error occures before this point then the unwind will call
>>>> rxe_cleanup() and eventually to rxe_qp_do_cleanup()/rxe_cleanup_task()
>>>> which will oops when trying to access the uninitialized spinlock.
>>>>
>>>> If rxe_init_task is not executed, rxe_cleanup_task will not be called.
>>>>
>>>> Reported-by: syzbot+cfcc1a3c85be15a40cba@syzkaller.appspotmail.com
>>>> Link: https://syzkaller.appspot.com/bug?id=fd85757b74b3eb59f904138486f755f71e090df8
>>>>
>>>> Fixes: 8700e3e7c485 ("Soft RoCE driver")
>>>> Fixes: 2d4b21e0a291 ("IB/rxe: Prevent from completer to operate on non valid QP")
>>>> Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
>>>> ---
>>>>    drivers/infiniband/sw/rxe/rxe_qp.c | 15 ++++++++++++---
>>>>    1 file changed, 12 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c
>>>> index ab72db68b58f..7856c02c1b46 100644
>>>> --- a/drivers/infiniband/sw/rxe/rxe_qp.c
>>>> +++ b/drivers/infiniband/sw/rxe/rxe_qp.c
>>>> @@ -176,6 +176,10 @@ static void rxe_qp_init_misc(struct rxe_dev *rxe, struct rxe_qp *qp,
>>>>    	spin_lock_init(&qp->rq.producer_lock);
>>>>    	spin_lock_init(&qp->rq.consumer_lock);
>>>> +	memset(&qp->req.task, 0, sizeof(struct rxe_task));
>>>> +	memset(&qp->comp.task, 0, sizeof(struct rxe_task));
>>>> +	memset(&qp->resp.task, 0, sizeof(struct rxe_task));
>>> IMHO QP is already zeroed here.
>> Sure. Exactly. Here I just confirm that req.task, comp.task and resp.task
>> are zeroed explicitly.
> There is no need to do so. It is quite misleading to read the code and
> see these memset() functions as they give false impression that QP is
> not zeroed.

I will remove these memset function in the latest commit.

Thanks,

Zhu Yanjun

>
>> If you think it had better remove these memset functions, I will follow your
>> advice.
> Yes, please.
>
> Thanks

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2023-04-04  6:42 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-28 15:14 [syzbot] [rdma?] INFO: trying to register non-static key in rxe_cleanup_task (2) syzbot
2023-04-01  2:09 ` Zhu Yanjun
2023-04-01  2:44   ` [PATCH 1/1] RDMA/rxe: Fix the error "trying to register non-static key in rxe_cleanup_task" Zhu Yanjun
2023-04-03 18:10     ` Leon Romanovsky
2023-04-04  0:13       ` Zhu Yanjun
2023-04-04  5:58         ` Leon Romanovsky
2023-04-04  6:42           ` Zhu Yanjun

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).