* [PATCH] nvmet-tcp: finish receiving before send back response if nvmet_req_init() failed.
From: Hou Pu @ 2021-03-15 10:09 UTC
To: hch, sagi, chaitanya.kulkarni; +Cc: linux-nvme, Hou Pu, stable

When receiving a PDU, if nvmet_req_init() fails (for example, when a
namespace is not found), queue->rcv_state may have been moved to
NVMET_TCP_RECV_DATA by nvmet_tcp_handle_req_failure(). We should return
0 here so that nvmet_tcp_try_recv_one() continues to consume any
remaining inline write data. Otherwise, the response to this request is
sent and the iov is freed; on the next pass through
nvmet_tcp_try_recv_one() we enter the data-receiving phase and use the
freed iov again.

A panic happened with a 5.4 kernel installed, as shown below:

[  169.906573] nvmet_tcp: failed cmd 0000000027717054 id 106 opcode 1, data_len: 1024
[  169.908131] general protection fault: 0000 [#1] SMP NOPTI
[  169.908884] CPU: 0 PID: 122 Comm: kworker/0:1H Kdump: loaded Tainted: G OE [...]
[  169.910328] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), [...]
[  169.911943] Workqueue: nvmet_tcp_wq nvmet_tcp_io_work [nvmet_tcp]
[  169.912815] RIP: 0010:__memcpy+0x12/0x20
[  169.913393] Code: e3 97 ff 0f 31 48 c1 e2 20 48 09 d0 48 31 c3 e9 79 ff [...]
[  169.915819] RSP: 0018:ffffc9000026bbc8 EFLAGS: 00010246
[  169.916547] RAX: ebf4958c4fda661b RBX: ffff888119613096 RCX: 0000000000000080
[  169.917511] RDX: 0000000000000000 RSI: ffff888119613096 RDI: ebf4958c4fda661b
[  169.918469] RBP: 0000000000000400 R08: 0000000000000000 R09: 0000000000000400
[  169.919435] R10: 0000000000000000 R11: 000000000000003e R12: ffff888114244068
[  169.920398] R13: 0000000000000400 R14: 0000000000000400 R15: ffff888118c37cb0
[  169.921378] FS:  0000000000000000(0000) GS:ffff88813fc00000(0000) knlGS:0000000000000000
[  169.922473] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  169.923269] CR2: 0000555eb19943e0 CR3: 00000001186fc000 CR4: 00000000000006f0
[  169.924245] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  169.925214] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  169.926184] Call Trace:
[  169.926569]  _copy_to_iter+0x26f/0x440
[  169.927112]  ? __check_object_size+0x70/0x230
[  169.927724]  __skb_datagram_iter+0x15a/0x290
[  169.928327]  ? skb_kill_datagram+0x60/0x60
[  169.928921]  skb_copy_datagram_iter+0x3b/0x90
[  169.929545]  tcp_recvmsg+0x759/0xc50
[  169.930081]  ? ksoftirqd_running+0x2c/0x30
[  169.930668]  ? free_unref_page_commit+0x95/0x120
[  169.931331]  inet_recvmsg+0x55/0xc0
[  169.931834]  nvmet_tcp_io_work+0x685/0xb23 [nvmet_tcp]
[  169.932549]  process_one_work+0x18c/0x370
[  169.933118]  worker_thread+0x4f/0x3b0
[  169.933654]  ? rescuer_thread+0x340/0x340
[  169.934223]  kthread+0xf6/0x130
[  169.934682]  ? kthread_create_worker_on_cpu+0x70/0x70
[  169.935393]  ret_from_fork+0x1f/0x30

Cc: <stable@vger.kernel.org> # 5.0
Signed-off-by: Hou Pu <houpu.main@gmail.com>
---
 drivers/nvme/target/tcp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index 8b0485ada315..da1c667e21ba 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -961,7 +961,7 @@ static int nvmet_tcp_done_recv_pdu(struct nvmet_tcp_queue *queue)
 			le32_to_cpu(req->cmd->common.dptr.sgl.length));
 
 		nvmet_tcp_handle_req_failure(queue, queue->cmd, req);
-		return -EAGAIN;
+		return 0;
 	}
 
 	ret = nvmet_tcp_map_data(queue->cmd);
--
2.28.0

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme
* Re: [PATCH] nvmet-tcp: finish receiving before send back response if nvmet_req_init() failed.
From: Sagi Grimberg @ 2021-03-15 17:32 UTC
To: Hou Pu, hch, chaitanya.kulkarni; +Cc: linux-nvme, stable

> When receiving a PDU, if nvmet_req_init() fails (for example, when a
> namespace is not found), queue->rcv_state may have been moved to
> NVMET_TCP_RECV_DATA by nvmet_tcp_handle_req_failure(). We should
> return 0 here so that nvmet_tcp_try_recv_one() continues to consume
> any remaining inline write data. Otherwise, the response to this
> request is sent and the iov is freed; on the next pass through
> nvmet_tcp_try_recv_one() we enter the data-receiving phase and use
> the freed iov again.
>
> A panic happened with a 5.4 kernel installed, as shown below:

Can you please try to reproduce this with upstream? And with the latest
stable 5.4? There have been some fixes in this area. We may need to
help backport patches to stable if needed.

> [  169.906573] nvmet_tcp: failed cmd 0000000027717054 id 106 opcode 1, data_len: 1024
> [  169.908131] general protection fault: 0000 [#1] SMP NOPTI
[ full oops trace trimmed; see the original report ]

>  	nvmet_tcp_handle_req_failure(queue, queue->cmd, req);
> -	return -EAGAIN;
> +	return 0;

What guarantees that you will actually have more to consume?
* Re: [PATCH] nvmet-tcp: finish receiving before send back response if nvmet_req_init() failed.
From: Hou Pu @ 2021-03-16 5:24 UTC
To: Sagi Grimberg, hch, chaitanya.kulkarni; +Cc: linux-nvme, stable

On 2021/3/16 1:32 AM, Sagi Grimberg wrote:
>> A panic happened with a 5.4 kernel installed, as shown below:
>
> Can you please try to reproduce this with upstream? And with the
> latest stable 5.4? There have been some fixes in this area. We may
> need to help backport patches to stable if needed.

This can be reproduced on the latest stable (5.4.105). I tried with
upstream (5.12-rc3), where it could not be reproduced. But I think the
bug still exists upstream, so I added the following change to catch
such use-after-free problems:

diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index 8b0485ada315..46847ccf4395 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -217,6 +217,16 @@ static inline void nvmet_tcp_put_cmd(struct nvmet_tcp_cmd *cmd)
 	list_add_tail(&cmd->entry, &cmd->queue->free_list);
 }
 
+static inline bool nvmet_tcp_cmd_freed(struct nvmet_tcp_cmd *cmd)
+{
+	/* When a cmd is first taken from the free list, list_del_init()
+	 * is called, so the entry can be used to tell whether the cmd
+	 * is back on the free list.
+	 */
+	if (cmd->entry.next == &cmd->entry && cmd->entry.prev == &cmd->entry)
+		return false;
+	return true;
+}
+
 static inline int queue_cpu(struct nvmet_tcp_queue *queue)
 {
 	return queue->sock->sk->sk_incoming_cpu;
@@ -1088,6 +1098,9 @@ static int nvmet_tcp_try_recv_data(struct nvmet_tcp_queue *queue)
 	struct nvmet_tcp_cmd *cmd = queue->cmd;
 	int ret;
 
+	if (nvmet_tcp_cmd_freed(cmd))
+		pr_err("cmd=%p which is freed is used again\n", cmd);
+
 	while (msg_data_left(&cmd->recv_msg)) {
 		ret = sock_recvmsg(cmd->queue->sock, &cmd->recv_msg,
 				cmd->recv_msg.msg_flags);

dmesg shows:

[   80.058535] nvmet_tcp: failed cmd 0000000074710fda id 83 opcode 1, data_len: 1024
[   80.060225] nvmet_tcp: cmd=00000000b470d96a which is freed is used again
[   80.060903] nvmet_tcp: failed cmd 000000007d648e5e id 84 opcode 1, data_len: 1024
[   80.061377] nvmet_tcp: cmd=000000002f3194cb which is freed is used again
[   80.061656] nvmet_tcp: failed cmd 0000000047cc966a id 85 opcode 2, data_len: 1024
[   80.066657] nvmet_tcp: failed cmd 000000005319c9b2 id 86 opcode 1, data_len: 1024
[   80.067325] nvmet_tcp: cmd=00000000245c70d6 which is freed is used again

So I think upstream also has this bug. How to reproduce:

1. Run fio from the initiator:
   fio -bs=1024 -filename=/dev/nvme1n1 -rw=randrw -direct=1 \
       -ioengine=libaio -numjobs=4 -time_based=1 -runtime=600 \
       -iodepth=1 -name=t
2. Disable namespace 1 on the target:
   echo 0 > /sys/kernel/config/nvmet/subsystems/mysub/namespaces/1/enable

[..]

>>  	nvmet_tcp_handle_req_failure(queue, queue->cmd, req);
>> -	return -EAGAIN;
>> +	return 0;
>
> What guarantees that you will actually have more to consume?

In my case, a 1024-byte write request needs to receive inline data.
After receiving the command PDU, the following inline data still needs
to be read from the socket.

Thanks,
Hou