* crash at nvme_tcp_init_iter with header digest enabled
@ 2022-08-19  7:51 Daniel Wagner
  2022-08-28 12:09 ` Sagi Grimberg
  0 siblings, 1 reply; 5+ messages in thread
From: Daniel Wagner @ 2022-08-19  7:51 UTC (permalink / raw)
  To: linux-nvme

Hi,

we got a customer bug report against our downstream kernel
when doing failover tests with header digest enabled.

The whole crash looks like a use-after-free bug, but
so far we have not been able to figure out where it happens.

  nvme nvme13: queue 1: header digest flag is cleared
  nvme nvme13: receive failed:  -71
  nvme nvme13: starting error recovery
  nvme nvme7: Reconnecting in 10 seconds...

  RIP: nvme_tcp_init_iter

  nvme_tcp_recv_skb
  ? tcp_mstamp_refresh
  ? nvme_tcp_submit_async_event
  tcp_read_sock
  nvme_tcp_try_recv
  nvme_tcp_io_work
  process_one_work
  ? process_one_work
  worker_thread
  ? process_one_work
  kthread
  ? set_kthread_struct
  ret_from_fork

In order to rule out that this is caused by a reuse of a command ID, I
added a test patch which always clears the request pointer (see below)
and hoped to see

   "got bad cqe.command_id %#x on queue %d\n"

but there was none. Instead, the crash disappeared. It looks like we are
not clearing the request in the error path, but so far I haven't figured
out how this is related to header digest being enabled.
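
For context, here is a rough sketch of the completion lookup that the
debug patch below hooks into. This is simplified and paraphrased from the
upstream driver, not our exact downstream code; the C2H data path that
ends up in nvme_tcp_init_iter() does an equivalent lookup:

--
static int nvme_tcp_process_nvme_cqe(struct nvme_tcp_queue *queue,
				     struct nvme_completion *cqe)
{
	struct nvme_tcp_request *req;
	struct request *rq;

	/* command_id -> tag -> struct request via the queue's tagset */
	rq = blk_mq_tag_to_rq(nvme_tcp_tagset(queue),
			      nvme_tag_from_cid(cqe->command_id));
	if (!rq) {
		dev_err(queue->ctrl->ctrl.device,
			"got bad cqe.command_id %#x on queue %d\n",
			cqe->command_id, nvme_tcp_queue_id(queue));
		nvme_tcp_error_recovery(&queue->ctrl->ctrl);
		return -EINVAL;
	}

	/*
	 * If tags->rqs[tag] still points at an already completed (freed)
	 * request, this lookup "succeeds" and the driver operates on freed
	 * memory instead of printing the message above - which would
	 * explain why clearing the entry makes the crash disappear rather
	 * than producing "got bad cqe.command_id".
	 */
	req = blk_mq_rq_to_pdu(rq);
	if (req->status == cpu_to_le16(NVME_SC_SUCCESS))
		req->status = cqe->status;

	if (!nvme_try_complete_req(rq, req->status, cqe->result))
		nvme_complete_rq(rq);

	return 0;
}
--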

Anyway, this is just an FYI; in case anyone has an idea where to poke,
I am listening.

Thanks,
Daniel



diff --git a/block/blk-mq.c b/block/blk-mq.c
index 98cc93d58575..bfadccb90be6 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -847,6 +847,13 @@ struct request *blk_mq_tag_to_rq(struct blk_mq_tags *tags, unsigned int tag)
 }
 EXPORT_SYMBOL(blk_mq_tag_to_rq);
 
+void blk_mq_tag_reset(struct blk_mq_tags *tags, unsigned int tag)
+{
+	struct request *rq = tags->rqs[tag];
+	cmpxchg(&tags->rqs[tag], rq, NULL);
+}
+EXPORT_SYMBOL(blk_mq_tag_reset);
+
 static bool blk_mq_rq_inflight(struct blk_mq_hw_ctx *hctx, struct request *rq,
 			       void *priv, bool reserved)
 {
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 78cfe97031ca..f9a641fb7353 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -504,6 +504,8 @@ static int nvme_tcp_process_nvme_cqe(struct nvme_tcp_queue *queue,
 		nvme_tcp_error_recovery(&queue->ctrl->ctrl);
 		return -EINVAL;
 	}
+	blk_mq_tag_reset(nvme_tcp_tagset(queue),
+			 nvme_tag_from_cid(cqe->command_id));
 
 	req = blk_mq_rq_to_pdu(rq);
 	if (req->status == cpu_to_le16(NVME_SC_SUCCESS))
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 1d18447ebebc..a338ec65f3c8 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -470,6 +470,7 @@ struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
 		unsigned int op, blk_mq_req_flags_t flags,
 		unsigned int hctx_idx);
 struct request *blk_mq_tag_to_rq(struct blk_mq_tags *tags, unsigned int tag);
+void blk_mq_tag_reset(struct blk_mq_tags *tags, unsigned int tag);
 
 enum {
 	BLK_MQ_UNIQUE_TAG_BITS = 16,




* Re: crash at nvme_tcp_init_iter with header digest enabled
  2022-08-19  7:51 crash at nvme_tcp_init_iter with header digest enabled Daniel Wagner
@ 2022-08-28 12:09 ` Sagi Grimberg
  2022-09-05  9:46   ` Sagi Grimberg
  0 siblings, 1 reply; 5+ messages in thread
From: Sagi Grimberg @ 2022-08-28 12:09 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme


> Hi,
> 
> we got a customer bug report against our downstream kernel
> when doing failover tests with header digest enabled.
> 
> The whole crash looks like a use-after-free bug, but
> so far we have not been able to figure out where it happens.
> 
>    nvme nvme13: queue 1: header digest flag is cleared
>    nvme nvme13: receive failed:  -71
>    nvme nvme13: starting error recovery
>    nvme nvme7: Reconnecting in 10 seconds...
> 
>    RIP: nvme_tcp_init_iter
> 
>    nvme_tcp_recv_skb
>    ? tcp_mstamp_refresh
>    ? nvme_tcp_submit_async_event
>    tcp_read_sock
>    nvme_tcp_try_recv
>    nvme_tcp_io_work
>    process_one_work
>    ? process_one_work
>    worker_thread
>    ? process_one_work
>    kthread
>    ? set_kthread_struct
>    ret_from_fork
> 
> In order to rule out that this is caused by a reuse of a command ID, I
> added a test patch which always clears the request pointer (see below)
> and hoped to see
> 
>     "got bad cqe.command_id %#x on queue %d\n"
> 
> but there was none. Instead, the crash disappeared. It looks like we are
> not clearing the request in the error path, but so far I haven't figured
> out how this is related to header digest being enabled.
> 
> Anyway, this is just an FYI; in case anyone has an idea where to poke,
> I am listening.

I think I see the problem. The stream is corrupted, and we keep
processing it.

The current logic says that once we hit a header-digest problem, we
immediately stop reading from the socket (rd_enabled=false) and trigger
error recovery.

When rd_enabled=false, we don't act on data_ready callbacks, as we know
we are tearing down the socket. However, we may keep reading from the
socket if the io_work continues and calls try_recv again (mainly because
our error from nvme_tcp_recv_skb is not propagated back).
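
To make the loop structure explicit, here is a rough sketch of the
receive side of nvme_tcp_io_work() as it stands today (paraphrased from
the upstream driver, send side elided):

--
	do {
		bool pending = false;
		int result;

		/* ... send side (nvme_tcp_try_send) elided ... */

		result = nvme_tcp_try_recv(queue);
		if (result > 0)
			pending = true;
		else if (unlikely(result < 0))
			return;

		/*
		 * A header digest failure inside nvme_tcp_recv_skb() marks
		 * the queue dead (rd_enabled = false) but is swallowed by
		 * tcp_read_sock(), so 'result' can still be >= 0 here and
		 * the loop keeps calling try_recv on the corrupted stream.
		 */
		if (!pending)
			return;

	} while (!time_after(jiffies, deadline)); /* quota is exhausted */
--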

I think that this will make the issue go away:
--
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index e82dcfcda29b..3e3ebde4eff5 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1229,7 +1229,7 @@ static void nvme_tcp_io_work(struct work_struct *w)
                 else if (unlikely(result < 0))
                         return;

-               if (!pending)
+               if (!pending || !queue->rd_enabled)
                         return;

         } while (!time_after(jiffies, deadline)); /* quota is exhausted */
--



* Re: crash at nvme_tcp_init_iter with header digest enabled
  2022-08-28 12:09 ` Sagi Grimberg
@ 2022-09-05  9:46   ` Sagi Grimberg
  2022-09-05 10:26     ` Daniel Wagner
  0 siblings, 1 reply; 5+ messages in thread
From: Sagi Grimberg @ 2022-09-05  9:46 UTC (permalink / raw)
  To: Daniel Wagner, linux-nvme


>> Hi,
>>
>> we got a customer bug report against our downstream kernel
>> when doing failover tests with header digest enabled.
>>
>> The whole crash looks like a use-after-free bug, but
>> so far we have not been able to figure out where it happens.
>>
>>    nvme nvme13: queue 1: header digest flag is cleared
>>    nvme nvme13: receive failed:  -71
>>    nvme nvme13: starting error recovery
>>    nvme nvme7: Reconnecting in 10 seconds...
>>
>>    RIP: nvme_tcp_init_iter
>>
>>    nvme_tcp_recv_skb
>>    ? tcp_mstamp_refresh
>>    ? nvme_tcp_submit_async_event
>>    tcp_read_sock
>>    nvme_tcp_try_recv
>>    nvme_tcp_io_work
>>    process_one_work
>>    ? process_one_work
>>    worker_thread
>>    ? process_one_work
>>    kthread
>>    ? set_kthread_struct
>>    ret_from_fork
>>
>> In order to rule out that this is caused by a reuse of a command ID, I
>> added a test patch which always clears the request pointer (see below)
>> and hoped to see
>>
>>     "got bad cqe.command_id %#x on queue %d\n"
>>
>> but there was none. Instead, the crash disappeared. It looks like we are
>> not clearing the request in the error path, but so far I haven't figured
>> out how this is related to header digest being enabled.
>>
>> Anyway, this is just an FYI; in case anyone has an idea where to poke,
>> I am listening.
> 
> I think I see the problem. The stream is corrupted, and we keep
> processing it.
> 
> The current logic says that once we hit a header-digest problem, we
> immediately stop reading from the socket (rd_enabled=false) and trigger
> error recovery.
> 
> When rd_enabled=false, we don't act on data_ready callbacks, as we know
> we are tearing down the socket. However, we may keep reading from the
> socket if the io_work continues and calls try_recv again (mainly because
> our error from nvme_tcp_recv_skb is not propagated back).
> 
> I think that this will make the issue go away:
> -- 
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index e82dcfcda29b..3e3ebde4eff5 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -1229,7 +1229,7 @@ static void nvme_tcp_io_work(struct work_struct *w)
>                  else if (unlikely(result < 0))
>                          return;
> 
> -               if (!pending)
> +               if (!pending || !queue->rd_enabled)
>                          return;
> 
>          } while (!time_after(jiffies, deadline)); /* quota is exhausted */
> -- 

Daniel, any input here?



* Re: crash at nvme_tcp_init_iter with header digest enabled
  2022-09-05  9:46   ` Sagi Grimberg
@ 2022-09-05 10:26     ` Daniel Wagner
  2022-09-05 10:53       ` Sagi Grimberg
  0 siblings, 1 reply; 5+ messages in thread
From: Daniel Wagner @ 2022-09-05 10:26 UTC (permalink / raw)
  To: Sagi Grimberg; +Cc: linux-nvme

On Mon, Sep 05, 2022 at 12:46:40PM +0300, Sagi Grimberg wrote:
> > diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> > index e82dcfcda29b..3e3ebde4eff5 100644
> > --- a/drivers/nvme/host/tcp.c
> > +++ b/drivers/nvme/host/tcp.c
> > @@ -1229,7 +1229,7 @@ static void nvme_tcp_io_work(struct work_struct *w)
> >                  else if (unlikely(result < 0))
> >                          return;
> > 
> > -               if (!pending)
> > +               if (!pending || !queue->rd_enabled)
> >                          return;
> > 
> >          } while (!time_after(jiffies, deadline)); /* quota is exhausted */
> > -- 
> 
> Daniel, any input here?

I haven't heard anything from our customer yet. Let me ping them again.



* Re: crash at nvme_tcp_init_iter with header digest enabled
  2022-09-05 10:26     ` Daniel Wagner
@ 2022-09-05 10:53       ` Sagi Grimberg
  0 siblings, 0 replies; 5+ messages in thread
From: Sagi Grimberg @ 2022-09-05 10:53 UTC (permalink / raw)
  To: Daniel Wagner; +Cc: linux-nvme


>>> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
>>> index e82dcfcda29b..3e3ebde4eff5 100644
>>> --- a/drivers/nvme/host/tcp.c
>>> +++ b/drivers/nvme/host/tcp.c
>>> @@ -1229,7 +1229,7 @@ static void nvme_tcp_io_work(struct work_struct *w)
>>>                   else if (unlikely(result < 0))
>>>                           return;
>>>
>>> -               if (!pending)
>>> +               if (!pending || !queue->rd_enabled)
>>>                           return;
>>>
>>>           } while (!time_after(jiffies, deadline)); /* quota is exhausted */
>>> -- 
>>
>> Daniel, any input here?
> 
> I haven't heard anything from our customer yet. Let me ping them again.

Most likely it addresses the issue, but the proposed patch is a fix
regardless.

I will send it as a proper patch and would love to get your Reviewed-by;
when you hear back from your customer, you can add a Tested-by tag.


