* kernel panic in nvmet_tcp when receiving malformed admin command
@ 2020-09-29 15:45 Anner, Ran
  2020-09-29 20:47 ` Sagi Grimberg
  0 siblings, 1 reply; 7+ messages in thread
From: Anner, Ran @ 2020-09-29 15:45 UTC (permalink / raw)
  To: linux-nvme; +Cc: Zinger, Eldad, Grupi, Elad, Engel, Amit

Hello all.

We are investigating a kernel panic that occurs when a malformed admin command with inline write data is sent from host to target over NVMe-oF TCP.
Following a code review of nvmet_tcp from the latest kernel, and after adding debug logs along the way, I came up with the following flow:

nvmet_tcp_try_recv_pdu passes an I/O with inline data to nvmet_req_init.
nvmet_req_init calls the parse function, which fails (e.g. because the host sent a malformed command).
nvmet_req_init calls __nvmet_req_complete, which in turn calls queue_response.
nvmet_tcp_queue_response adds the command to the queue's response list.
nvmet_req_init returns an error status, so nvmet_tcp_handle_req_failure is called.
nvmet_tcp_handle_req_failure sees that the command has inline data and allocates pages and iovs for it.
nvmet_tcp_handle_req_failure returns, and nvmet_tcp_done_recv_pdu returns -EAGAIN.
nvmet_tcp_try_recv_pdu returns -EAGAIN, in which case nvmet_tcp_try_recv_one returns 0.
When nvmet_tcp_try_recv_one returns 0, nvmet_tcp_try_recv breaks out of its loop and returns zero as well.
In nvmet_tcp_io_work we continue to nvmet_tcp_try_send when the return value is zero.
nvmet_tcp_try_send_one calls nvmet_tcp_fetch_cmd, which can assign snd_cmd to the failed command.
nvmet_tcp_need_data_out and nvmet_tcp_need_data_in return false, so nvmet_setup_response_pdu is called.
nvmet_setup_response_pdu sets the cmd state to NVMET_TCP_SEND_RESPONSE and returns.
Back in nvmet_tcp_try_send_one, cmd->state == NVMET_TCP_SEND_RESPONSE, so nvmet_try_send_response is called.
After kernel_sendpage we free the cmd's iov and sgl.
Eventually nvmet_tcp_try_recv_data is called, but the iov was already freed, and a general protection fault is triggered.
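
To make the hazard concrete, here is a minimal userspace toy model of the
pattern (illustrative only, not driver code): the send path frees the buffer
that the receive path still expects to fill.

/* Toy model of the flow above -- NOT driver code.  send_response() plays
 * the role of nvmet_try_send_response() (frees the iov), recv_data() plays
 * nvmet_tcp_try_recv_data() (copies incoming bytes into the iov). */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct cmd {
	char  *iov;	/* stands in for cmd->iov */
	size_t len;
};

static void send_response(struct cmd *c)
{
	free(c->iov);	/* buffers released once the response is sent */
	c->iov = NULL;	/* without this, recv_data() writes to freed memory */
}

static void recv_data(struct cmd *c, const char *data, size_t len)
{
	if (!c->iov) {	/* the buggy flow has no such guard */
		fprintf(stderr, "late inline data, dropping\n");
		return;
	}
	memcpy(c->iov, data, len < c->len ? len : c->len);
}

int main(void)
{
	struct cmd c = { .iov = malloc(16), .len = 16 };

	send_response(&c);		/* response goes out first... */
	recv_data(&c, "inline", 6);	/* ...then the inline data arrives */
	return 0;
}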

I was wondering whether this is a known issue or whether someone else has encountered it before.
A snippet of the kernel trace is included below.
It seems to me that one fix would be to not queue the response before all data has been received in this specific flow.
Alternatively, if we do not want to delay the response, we could free the iov only after both flows are done, perhaps using reference counting; a rough sketch of that idea follows.
I will be glad to hear your thoughts and comments.
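
For the refcounting variant, a rough kref-style sketch of the idea
(hypothetical, names invented for illustration -- not driver code):

/* Hypothetical sketch: recv and send paths each hold a reference;
 * the buffers are freed only when the last reference is dropped. */
struct cmd_buffers {
	struct kref ref;
	struct kvec *iov;
	struct scatterlist *sg;
};

static void cmd_buffers_release(struct kref *ref)
{
	struct cmd_buffers *b = container_of(ref, struct cmd_buffers, ref);

	kfree(b->iov);
	sgl_free(b->sg);
	kfree(b);
}

/* send path, after kernel_sendpage() of the response */
static void send_side_done(struct cmd_buffers *b)
{
	kref_put(&b->ref, cmd_buffers_release);
}

/* recv path, after the inline data has been drained off the socket */
static void recv_side_done(struct cmd_buffers *b)
{
	kref_put(&b->ref, cmd_buffers_release);
}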

Thank you,
Ran.

[13538.046364] general protection fault: 0000 [#1] SMP NOPTI
[13538.051762] CPU: 10 PID: 603 Comm: kworker/10:1H Kdump: loaded Tainted: G        W  OE     4.19.134-coreos-r9999.1600774095-1610 #1
[13538.063574] Hardware name: EMC 900-532-002/110-551-910C-01, BIOS 16.35 04/08/2020
[13538.071049] Workqueue: nvmet_tcp_wq nvmet_tcp_io_work [nvmet_tcp]
[13538.077142] RIP: 0010:memcpy_erms+0x6/0x10
[13538.081237] Code: 90 90 90 90 eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 <f3> a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e>
[13538.099981] RSP: 0018:ffffc9000781bc70 EFLAGS: 00010202
[13538.105199] RAX: 31d247ca1958f63f RBX: 0000000000000510 RCX: 0000000000000510
[13538.112322] RDX: 0000000000000510 RSI: ffff8886445f60ca RDI: 31d247ca1958f63f
[13538.119446] RBP: ffff888308c41920 R08: 0000000000000510 R09: ffff88850b96e2bc
[13538.126570] R10: ffff88850b96de48 R11: ffff8886445f60ca R12: 0000000000000510
[13538.133693] R13: 0000000000000510 R14: ffff8886445f65da R15: ffff8886902ddd90
[13538.140819] FS:  0000000000000000(0000) GS:ffff8897dfe80000(0000) knlGS:0000000000000000
[13538.148906] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[13538.154650] CR2: 00007f547169beb0 CR3: 000000000220a001 CR4: 00000000007606e0
[13538.161772] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[13538.168896] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[13538.176022] PKRU: 55555554
[13538.178726] Call Trace:
[13538.181171]  _copy_to_iter+0x30e/0x3b0
[13538.184925]  ? wake_up_klogd.part.24+0x30/0x40
[13538.189372]  skb_copy_datagram_iter+0x7b/0x260
[13538.193815]  tcp_recvmsg+0x238/0xc80
[13538.197395]  ? nvmet_tcp_io_work+0x400/0xa80 [nvmet_tcp]
[13538.202707]  ? release_sock+0x43/0x90
[13538.206373]  ? tcp_sendpage+0x41/0x50
[13538.210038]  inet_recvmsg+0x5b/0xd0
[13538.213522]  nvmet_tcp_io_work+0xea/0xa80 [nvmet_tcp]
[13538.218570]  process_one_work+0x206/0x400
[13538.222580]  worker_thread+0x2d/0x3e0
[13538.226243]  ? process_one_work+0x400/0x400
[13538.230421]  kthread+0x112/0x130
[13538.233647]  ? kthread_bind+0x20/0x20
[13538.237312]  ret_from_fork+0x1f/0x40

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: kernel panic in nvmet_tcp when receiving malformed admin command
  2020-09-29 15:45 kernel panic in nvmet_tcp when receiving malformed admin command Anner, Ran
@ 2020-09-29 20:47 ` Sagi Grimberg
  2020-09-30 11:28   ` Anner, Ran
  0 siblings, 1 reply; 7+ messages in thread
From: Sagi Grimberg @ 2020-09-29 20:47 UTC (permalink / raw)
  To: Anner, Ran, linux-nvme; +Cc: Zinger, Eldad, Grupi, Elad, Engel, Amit


> Hello all.

Hey Ran, thanks for reporting

> 
> We are investigating a kernel panic that occurs when a malformed admin command with inline write data is sent from host to target over NVMe-oF TCP.
> Following a code review of nvmet_tcp from the latest kernel, and after adding debug logs along the way, I came up with the following flow:
> 
> nvmet_tcp_try_recv_pdu passes an I/O with inline data to nvmet_req_init.
> nvmet_req_init calls the parse function, which fails (e.g. because the host sent a malformed command).
> nvmet_req_init calls __nvmet_req_complete, which in turn calls queue_response.
> nvmet_tcp_queue_response adds the command to the queue's response list.
> nvmet_req_init returns an error status, so nvmet_tcp_handle_req_failure is called.
> nvmet_tcp_handle_req_failure sees that the command has inline data and allocates pages and iovs for it.
> nvmet_tcp_handle_req_failure returns, and nvmet_tcp_done_recv_pdu returns -EAGAIN.
> nvmet_tcp_try_recv_pdu returns -EAGAIN, in which case nvmet_tcp_try_recv_one returns 0.
> When nvmet_tcp_try_recv_one returns 0, nvmet_tcp_try_recv breaks out of its loop and returns zero as well.
> In nvmet_tcp_io_work we continue to nvmet_tcp_try_send when the return value is zero.
> nvmet_tcp_try_send_one calls nvmet_tcp_fetch_cmd, which can assign snd_cmd to the failed command.
> nvmet_tcp_need_data_out and nvmet_tcp_need_data_in return false, so nvmet_setup_response_pdu is called.
> nvmet_setup_response_pdu sets the cmd state to NVMET_TCP_SEND_RESPONSE and returns.
> Back in nvmet_tcp_try_send_one, cmd->state == NVMET_TCP_SEND_RESPONSE, so nvmet_try_send_response is called.
> After kernel_sendpage we free the cmd's iov and sgl.
> Eventually nvmet_tcp_try_recv_data is called, but the iov was already freed, and a general protection fault is triggered.
> 
> I was wondering whether this is a known issue or whether someone else has encountered it before.
> A snippet of the kernel trace is included below.
> It seems to me that one fix would be to not queue the response before all data has been received in this specific flow.
> Alternatively, if we do not want to delay the response, we could free the iov only after both flows are done, perhaps using reference counting.
> I will be glad to hear your thoughts and comments.

This is indeed a bug. We definitely need a refcount here. I recall that
this used to work though..

What we probably need to do is to defer the response send until we
complete fetching the entire incoming data and then safely free the
command.

I'll try to spin a patch soon.
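
Something along these lines in the queue_response path, perhaps (rough
sketch only, untested; field and helper names assumed from the driver and
may differ):

/* Sketch: if the command being completed is the one we are still
 * receiving and it carries inline data, don't queue the response yet --
 * let the recv path queue it once the data has been drained. */
static void nvmet_tcp_queue_response(struct nvmet_req *req)
{
	struct nvmet_tcp_cmd *cmd =
		container_of(req, struct nvmet_tcp_cmd, req);
	struct nvmet_tcp_queue *queue = cmd->queue;

	if (unlikely(cmd == queue->cmd && nvmet_tcp_has_inline_data(cmd)))
		return;

	llist_add(&cmd->lentry, &queue->resp_list);
	queue_work_on(queue->cpu, nvmet_tcp_wq, &queue->io_work);
}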

> 
> Thank you,
> Ran.
> 
> [13538.046364] general protection fault: 0000 [#1] SMP NOPTI
> [13538.051762] CPU: 10 PID: 603 Comm: kworker/10:1H Kdump: loaded Tainted: G        W  OE     4.19.134-coreos-r9999.1600774095-1610 #1
> [13538.063574] Hardware name: EMC 900-532-002/110-551-910C-01, BIOS 16.35 04/08/2020
> [13538.071049] Workqueue: nvmet_tcp_wq nvmet_tcp_io_work [nvmet_tcp]
> [13538.077142] RIP: 0010:memcpy_erms+0x6/0x10
> [13538.081237] Code: 90 90 90 90 eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 <f3> a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e>
> [13538.099981] RSP: 0018:ffffc9000781bc70 EFLAGS: 00010202
> [13538.105199] RAX: 31d247ca1958f63f RBX: 0000000000000510 RCX: 0000000000000510
> [13538.112322] RDX: 0000000000000510 RSI: ffff8886445f60ca RDI: 31d247ca1958f63f
> [13538.119446] RBP: ffff888308c41920 R08: 0000000000000510 R09: ffff88850b96e2bc
> [13538.126570] R10: ffff88850b96de48 R11: ffff8886445f60ca R12: 0000000000000510
> [13538.133693] R13: 0000000000000510 R14: ffff8886445f65da R15: ffff8886902ddd90
> [13538.140819] FS:  0000000000000000(0000) GS:ffff8897dfe80000(0000) knlGS:0000000000000000
> [13538.148906] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [13538.154650] CR2: 00007f547169beb0 CR3: 000000000220a001 CR4: 00000000007606e0
> [13538.161772] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [13538.168896] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [13538.176022] PKRU: 55555554
> [13538.178726] Call Trace:
> [13538.181171]  _copy_to_iter+0x30e/0x3b0
> [13538.184925]  ? wake_up_klogd.part.24+0x30/0x40
> [13538.189372]  skb_copy_datagram_iter+0x7b/0x260
> [13538.193815]  tcp_recvmsg+0x238/0xc80
> [13538.197395]  ? nvmet_tcp_io_work+0x400/0xa80 [nvmet_tcp]
> [13538.202707]  ? release_sock+0x43/0x90
> [13538.206373]  ? tcp_sendpage+0x41/0x50
> [13538.210038]  inet_recvmsg+0x5b/0xd0
> [13538.213522]  nvmet_tcp_io_work+0xea/0xa80 [nvmet_tcp]
> [13538.218570]  process_one_work+0x206/0x400
> [13538.222580]  worker_thread+0x2d/0x3e0
> [13538.226243]  ? process_one_work+0x400/0x400
> [13538.230421]  kthread+0x112/0x130
> [13538.233647]  ? kthread_bind+0x20/0x20
> [13538.237312]  ret_from_fork+0x1f/0x40
> 


^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: kernel panic in nvmet_tcp when receiving malformed admin command
  2020-09-29 20:47 ` Sagi Grimberg
@ 2020-09-30 11:28   ` Anner, Ran
  2021-01-03  8:40     ` Engel, Amit
  0 siblings, 1 reply; 7+ messages in thread
From: Anner, Ran @ 2020-09-30 11:28 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme; +Cc: Zinger, Eldad, Grupi, Elad, Engel, Amit

Thank you, Sagi.

-----Original Message-----
From: Sagi Grimberg <sagi@grimberg.me> 
Sent: Tuesday, September 29, 2020 11:48 PM
To: Anner, Ran; linux-nvme@lists.infradead.org
Cc: Zinger, Eldad; Grupi, Elad; Engel, Amit
Subject: Re: kernel panic in nvmet_tcp when receiving malformed admin command


> Hello all.

Hey Ran, thanks for reporting

> [original report snipped]

This is indeed a bug. We definitely need a refcount here. I recall that this used to work though..

What we probably need to do is to defer the response send until we complete fetching the entire incoming data and then safely free the command.

I'll try to spin a patch soon.

> 
> Thank you,
> Ran.
> 
> [kernel trace snipped]
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: kernel panic in nvmet_tcp when receiving malformed admin command
  2020-09-30 11:28   ` Anner, Ran
@ 2021-01-03  8:40     ` Engel, Amit
  2021-01-05 19:45       ` Sagi Grimberg
  0 siblings, 1 reply; 7+ messages in thread
From: Engel, Amit @ 2021-01-03  8:40 UTC (permalink / raw)
  To: Anner, Ran, Sagi Grimberg, linux-nvme; +Cc: Zinger, Eldad, Grupi, Elad

Hi Sagi, is the patch addressing the below issue already out there?
-----Original Message-----
From: Anner, Ran <Ran.Anner@dell.com> 
Sent: Wednesday, September 30, 2020 2:28 PM
To: Sagi Grimberg; linux-nvme@lists.infradead.org
Cc: Zinger, Eldad; Grupi, Elad; Engel, Amit
Subject: RE: kernel panic in nvmet_tcp when receiving malformed admin command

Thank you, Sagi.

-----Original Message-----
From: Sagi Grimberg <sagi@grimberg.me> 
Sent: Tuesday, September 29, 2020 11:48 PM
To: Anner, Ran; linux-nvme@lists.infradead.org
Cc: Zinger, Eldad; Grupi, Elad; Engel, Amit
Subject: Re: kernel panic in nvmet_tcp when receiving malformed admin command


> Hello all.

Hey Ran, thanks for reporting

> [original report snipped]

This is indeed a bug. We definitely need a refcount here. I recall that this used to work though..

What we probably need to do is to defer the response send until we complete fetching the entire incoming data and then safely free the command.

I'll try to spin a patch soon.

> 
> Thank you,
> Ran.
> 
> [kernel trace snipped]
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: kernel panic in nvmet_tcp when receiving malformed admin command
  2021-01-03  8:40     ` Engel, Amit
@ 2021-01-05 19:45       ` Sagi Grimberg
  2021-01-08 20:23         ` Sagi Grimberg
  0 siblings, 1 reply; 7+ messages in thread
From: Sagi Grimberg @ 2021-01-05 19:45 UTC (permalink / raw)
  To: Engel, Amit, Anner, Ran, linux-nvme; +Cc: Zinger, Eldad, Grupi, Elad


> Hi Sagi, is the patch addressing the below issue already out there?

No, thanks for reminding me, I'll have a look into this.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: kernel panic in nvmet_tcp when receiving malformed admin command
  2021-01-05 19:45       ` Sagi Grimberg
@ 2021-01-08 20:23         ` Sagi Grimberg
  2021-01-14 11:49           ` Grupi, Elad
  0 siblings, 1 reply; 7+ messages in thread
From: Sagi Grimberg @ 2021-01-08 20:23 UTC (permalink / raw)
  To: Engel, Amit, Anner, Ran, linux-nvme; +Cc: Zinger, Eldad, Grupi, Elad


>> Hi Sagi, is the patch addressing the below issue already out there?
> 
> No, thanks for reminding me, I'll have a look into this.

Amit, Ran,

Does this solve your issue?
This needs to be split into patches, but it would be good to understand
whether this works.
--
diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index dc1f0f647189..c41902f7ce39 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -204,7 +204,6 @@ nvmet_tcp_get_cmd(struct nvmet_tcp_queue *queue)
         cmd->rbytes_done = cmd->wbytes_done = 0;
         cmd->pdu_len = 0;
         cmd->pdu_recv = 0;
-       cmd->iov = NULL;
         cmd->flags = 0;
         return cmd;
  }
@@ -294,6 +293,7 @@ static void nvmet_tcp_unmap_pdu_iovec(struct nvmet_tcp_cmd *cmd)

         for (i = 0; i < cmd->nr_mapped; i++)
                 kunmap(sg_page(&sg[i]));
+       cmd->nr_mapped = 0;
  }

  static void nvmet_tcp_map_pdu_iovec(struct nvmet_tcp_cmd *cmd)
@@ -341,6 +341,14 @@ static void nvmet_tcp_socket_error(struct nvmet_tcp_queue *queue, int status)
                 nvmet_tcp_fatal_error(queue);
  }

+static void nvmet_tcp_unmap_data(struct nvmet_tcp_cmd *cmd)
+{
+       kfree(cmd->iov);
+       cmd->iov = NULL;
+       sgl_free(cmd->req.sg);
+       cmd->req.sg = NULL;
+}
+
  static int nvmet_tcp_map_data(struct nvmet_tcp_cmd *cmd)
  {
         struct nvme_sgl_desc *sgl = &cmd->req.cmd->common.dptr.sgl;
@@ -375,6 +383,7 @@ static int nvmet_tcp_map_data(struct nvmet_tcp_cmd *cmd)
         return 0;
  err:
         sgl_free(cmd->req.sg);
+       cmd->req.sg = NULL;
         return NVME_SC_INTERNAL;
  }

@@ -571,17 +580,13 @@ static int nvmet_try_send_data(struct nvmet_tcp_cmd *cmd, bool last_in_batch)
         } else {
                 if (queue->nvme_sq.sqhd_disabled) {
                         cmd->queue->snd_cmd = NULL;
+                       nvmet_tcp_unmap_data(cmd);
                         nvmet_tcp_put_cmd(cmd);
                 } else {
                         nvmet_setup_response_pdu(cmd);
                 }
         }

-       if (queue->nvme_sq.sqhd_disabled) {
-               kfree(cmd->iov);
-               sgl_free(cmd->req.sg);
-       }
-
         return 1;

  }
@@ -609,9 +614,8 @@ static int nvmet_try_send_response(struct nvmet_tcp_cmd *cmd,
         if (left)
                 return -EAGAIN;

-       kfree(cmd->iov);
-       sgl_free(cmd->req.sg);
         cmd->queue->snd_cmd = NULL;
+       nvmet_tcp_unmap_data(cmd);
         nvmet_tcp_put_cmd(cmd);
         return 1;
  }
@@ -665,6 +669,7 @@ static int nvmet_try_send_ddgst(struct nvmet_tcp_cmd *cmd, bool last_in_batch)

         if (queue->nvme_sq.sqhd_disabled) {
                 cmd->queue->snd_cmd = NULL;
+               nvmet_tcp_unmap_data(cmd);
                 nvmet_tcp_put_cmd(cmd);
         } else {
                 nvmet_setup_response_pdu(cmd);
@@ -1082,15 +1087,16 @@ static int nvmet_tcp_try_recv_data(struct nvmet_tcp_queue *queue)

         nvmet_tcp_unmap_pdu_iovec(cmd);

-       if (!(cmd->flags & NVMET_TCP_F_INIT_FAILED) &&
-           cmd->rbytes_done == cmd->req.transfer_len) {
-               if (queue->data_digest) {
-                       nvmet_tcp_prep_recv_ddgst(cmd);
-                       return 0;
-               }
-               cmd->req.execute(&cmd->req);
+       if (queue->data_digest) {
+               nvmet_tcp_prep_recv_ddgst(cmd);
+               return 0;
         }

+       if (unlikely(cmd->flags & NVMET_TCP_F_INIT_FAILED))
+               nvmet_tcp_finish_cmd(cmd);
+       else if (cmd->rbytes_done == cmd->req.transfer_len)
+               cmd->req.execute(&cmd->req);
+
         nvmet_prepare_receive_pdu(queue);
         return 0;
  }
@@ -1120,14 +1126,16 @@ static int nvmet_tcp_try_recv_ddgst(struct nvmet_tcp_queue *queue)
                         queue->idx, cmd->req.cmd->common.command_id,
                         queue->pdu.cmd.hdr.type, le32_to_cpu(cmd->recv_ddgst),
                         le32_to_cpu(cmd->exp_ddgst));
+               nvmet_req_uninit(&cmd->req);
                 nvmet_tcp_finish_cmd(cmd);
                 nvmet_tcp_fatal_error(queue);
                 ret = -EPROTO;
                 goto out;
         }

-       if (!(cmd->flags & NVMET_TCP_F_INIT_FAILED) &&
-           cmd->rbytes_done == cmd->req.transfer_len)
+       if (unlikely(cmd->flags & NVMET_TCP_F_INIT_FAILED))
+               nvmet_tcp_finish_cmd(cmd);
+       else if (cmd->rbytes_done == cmd->req.transfer_len)
                 cmd->req.execute(&cmd->req);
         ret = 0;
  out:
@@ -1139,7 +1147,8 @@ static int nvmet_tcp_try_recv_one(struct nvmet_tcp_queue *queue)
  {
         int result = 0;

-       if (unlikely(queue->rcv_state == NVMET_TCP_RECV_ERR))
+       if (unlikely(queue->rcv_state == NVMET_TCP_RECV_ERR ||
+                    queue->state == NVMET_TCP_Q_DISCONNECTING))
                 return 0;

         if (queue->rcv_state == NVMET_TCP_RECV_PDU) {
@@ -1333,10 +1342,26 @@ static void nvmet_tcp_restore_socket_callbacks(struct nvmet_tcp_queue *queue)

  static void nvmet_tcp_finish_cmd(struct nvmet_tcp_cmd *cmd)
  {
-       nvmet_req_uninit(&cmd->req);
         nvmet_tcp_unmap_pdu_iovec(cmd);
-       kfree(cmd->iov);
-       sgl_free(cmd->req.sg);
+       nvmet_tcp_unmap_data(cmd);
+       nvmet_tcp_put_cmd(cmd);
+}
+
+static void nvmet_tcp_unmap_cmds(struct nvmet_tcp_queue *queue)
+{
+       struct nvmet_tcp_cmd *cmd = queue->cmds;
+       int i;
+
+       for (i = 0; i < queue->nr_cmds; i++, cmd++) {
+               nvmet_tcp_unmap_pdu_iovec(cmd);
+               nvmet_tcp_unmap_data(cmd);
+       }
+
+       if (!queue->nr_cmds) {
+               /* failed in connect */
+               nvmet_tcp_unmap_pdu_iovec(cmd);
+               nvmet_tcp_unmap_data(cmd);
+       }
  }

  static void nvmet_tcp_uninit_data_in_cmds(struct nvmet_tcp_queue *queue)
@@ -1346,12 +1371,12 @@ static void nvmet_tcp_uninit_data_in_cmds(struct nvmet_tcp_queue *queue)

         for (i = 0; i < queue->nr_cmds; i++, cmd++) {
                 if (nvmet_tcp_need_data_in(cmd))
-                       nvmet_tcp_finish_cmd(cmd);
+                       nvmet_req_uninit(&cmd->req);
         }

         if (!queue->nr_cmds && nvmet_tcp_need_data_in(&queue->connect)) {
                 /* failed in connect */
-               nvmet_tcp_finish_cmd(&queue->connect);
+               nvmet_req_uninit(&cmd->req);
         }
  }

@@ -1370,6 +1395,7 @@ static void nvmet_tcp_release_queue_work(struct work_struct *w)
         nvmet_tcp_uninit_data_in_cmds(queue);
         nvmet_sq_destroy(&queue->nvme_sq);
         cancel_work_sync(&queue->io_work);
+       nvmet_tcp_unmap_cmds(queue);
         sock_release(queue->sock);
         nvmet_tcp_free_cmds(queue);
         if (queue->hdr_digest || queue->data_digest)
--


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* RE: kernel panic in nvmet_tcp when receiving malformed admin command
  2021-01-08 20:23         ` Sagi Grimberg
@ 2021-01-14 11:49           ` Grupi, Elad
  0 siblings, 0 replies; 7+ messages in thread
From: Grupi, Elad @ 2021-01-14 11:49 UTC (permalink / raw)
  To: Sagi Grimberg, Engel, Amit, Anner, Ran, linux-nvme; +Cc: Zinger, Eldad

Hi Sagi

This patch did not solve the issue; we still see the same kernel panic.
We reproduce the issue by adding validation of the resv1 field in the connect command (see the sketch below).
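
Roughly like this (illustrative sketch of the validation, placed in
nvmet_parse_connect_cmd; the exact hack we used may differ):

	/* repro hack: fail parsing when the host sets a reserved field,
	 * so the error path runs while the connect command's inline
	 * data is still in flight */
	if (req->cmd->connect.resv1)
		return NVME_SC_INVALID_FIELD | NVME_SC_DNR;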

A few comments about your patch:

In nvmet_tcp_unmap_cmds, you need to use &queue->connect instead of cmd:
+               /* failed in connect */
+               nvmet_tcp_unmap_pdu_iovec(cmd);
+               nvmet_tcp_unmap_data(cmd);

The same issue exists in nvmet_tcp_uninit_data_in_cmds:
                 /* failed in connect */
-               nvmet_tcp_finish_cmd(&queue->connect);
+               nvmet_req_uninit(&cmd->req);
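
I.e., the corrected branches would look something like this (sketch):

	if (!queue->nr_cmds) {
		/* failed in connect */
		nvmet_tcp_unmap_pdu_iovec(&queue->connect);
		nvmet_tcp_unmap_data(&queue->connect);
	}

and:

	if (!queue->nr_cmds && nvmet_tcp_need_data_in(&queue->connect)) {
		/* failed in connect */
		nvmet_req_uninit(&queue->connect.req);
	}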

Elad

-----Original Message-----
From: Linux-nvme <linux-nvme-bounces@lists.infradead.org> On Behalf Of Sagi Grimberg
Sent: Friday, 8 January 2021 22:24
To: Engel, Amit; Anner, Ran; linux-nvme@lists.infradead.org
Cc: Zinger, Eldad; Grupi, Elad
Subject: Re: kernel panic in nvmet_tcp when receiving malformed admin command



>> Hi Sagi, is the patch addressing the below issue already out there?
> 
> No, thanks for reminding me, I'll have a look into this.

Amit, Ran,

Does this solve your issue?
This needs to be split into patches, but it would be good to understand whether this works.
[patch snipped]


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2021-01-14 11:49 UTC | newest]

Thread overview: 7+ messages
-- links below jump to the message on this page --
2020-09-29 15:45 kernel panic in nvmet_tcp when receiving malformed admin command Anner, Ran
2020-09-29 20:47 ` Sagi Grimberg
2020-09-30 11:28   ` Anner, Ran
2021-01-03  8:40     ` Engel, Amit
2021-01-05 19:45       ` Sagi Grimberg
2021-01-08 20:23         ` Sagi Grimberg
2021-01-14 11:49           ` Grupi, Elad
