From: Max Gurtovoy <mgurtovoy@nvidia.com>
To: Yi Zhang <yi.zhang@redhat.com>, Sagi Grimberg <sagi@grimberg.me>
Cc: Max Gurtovoy <maxg@mellanox.com>,
	"open list:NVM EXPRESS DRIVER" <linux-nvme@lists.infradead.org>,
	RDMA mailing list <linux-rdma@vger.kernel.org>
Subject: Re: [bug report] NVMe/IB: reset_controller need more than 1min
Date: Sun, 20 Mar 2022 12:50:50 +0200
Message-ID: <3d3f7b64-1c74-d41c-4c60-055e9fa79080@nvidia.com>
In-Reply-To: <CAHj4cs_ff3TGnD2QJSzx3QJQKc1HkF=TJkh_MokqGK3n8NWyQQ@mail.gmail.com>


On 3/19/2022 9:29 AM, Yi Zhang wrote:
> On Wed, Mar 16, 2022 at 11:16 PM Sagi Grimberg <sagi@grimberg.me> wrote:
>>
>>>> Hi Yi Zhang,
>>>>
>>>> thanks for testing the patches.
>>>>
>>>> Can you provide more info on the time it took with both kernels ?
>>> Hi Max
>>> Sorry for the late response, here are the test results/dmesg on
>>> debug/non-debug kernel with your patch:
>>> debug kernel: timeout
>>> # time nvme connect -t rdma -a 172.31.0.202 -s 4420 -n testnqn
>>> real    0m16.956s
>>> user    0m0.000s
>>> sys     0m0.237s
>>> # time nvme reset /dev/nvme0
>>> real    1m33.623s
>>> user    0m0.000s
>>> sys     0m0.024s
>>> # time nvme disconnect-all
>>> real    1m26.640s
>>> user    0m0.000s
>>> sys     0m9.969s
>>>
>>> host dmesg:
>>> https://pastebin.com/8T3Lqtkn
>>> target dmesg:
>>> https://pastebin.com/KpFP7xG2
>>>
>>> non-debug kernel: no timeout issue, but still 12s for reset, and 8s
>>> for disconnect
>>> host:
>>> # time nvme connect -t rdma -a 172.31.0.202 -s 4420 -n testnqn
>>>
>>> real    0m4.579s
>>> user    0m0.000s
>>> sys     0m0.004s
>>> # time nvme reset /dev/nvme0
>>>
>>> real    0m12.778s
>>> user    0m0.000s
>>> sys     0m0.006s
>>> # time nvme reset /dev/nvme0
>>>
>>> real    0m12.793s
>>> user    0m0.000s
>>> sys     0m0.006s
>>> # time nvme reset /dev/nvme0
>>>
>>> real    0m12.808s
>>> user    0m0.000s
>>> sys     0m0.006s
>>> # time nvme disconnect-all
>>>
>>> real    0m8.348s
>>> user    0m0.000s
>>> sys     0m0.189s
>> These are very long times for a non-debug kernel...
>> Max, do you see the root cause for this?
>>
>> Yi, does this happen with rxe/siw as well?
> Hi Sagi
>
> rxe/siw will take less than 1s
> with rdma_rxe
> # time nvme reset /dev/nvme0
> real 0m0.094s
> user 0m0.000s
> sys 0m0.006s
>
> with siw
> # time nvme reset /dev/nvme0
> real 0m0.097s
> user 0m0.000s
> sys 0m0.006s
>
> This is only reproducible with mlx IB card, as I mentioned before, the
> reset operation time changed from 3s to 12s after the below commit,
> could you check this commit?
>
> commit 5ec5d3bddc6b912b7de9e3eb6c1f2397faeca2bc
> Author: Max Gurtovoy <maxg@mellanox.com>
> Date:   Tue May 19 17:05:56 2020 +0300
>
>      nvme-rdma: add metadata/T10-PI support
>
I couldn't repro these long reset times.

Nevertheless, the above commit added T10-PI offloads.

In this commit, for supported devices we create extra resources in HW 
(more memory keys per task).

I suggested making this configuration opt-in as part of the "nvme connect" 
command, so that by default we save this resource allocation, but during the 
review I was asked to make it the default behavior.
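
To give a rough feel for the resource cost described above, here is a minimal
sketch, not the driver's actual accounting. The function name, the queue
count, the queue depth, and the "one extra key per task" figure are all
assumptions chosen for illustration; the real numbers depend on the device
and the driver.

```python
def memory_keys(io_queues: int, queue_depth: int, pi_enabled: bool,
                extra_keys_per_task: int = 1) -> int:
    """Rough count of memory keys allocated across all I/O queues.

    Hypothetical model: every task (one per queue entry) needs one base
    key, plus `extra_keys_per_task` more when T10-PI resources are created.
    """
    base_keys_per_task = 1  # assumption: one registration key per task
    extra = extra_keys_per_task if pi_enabled else 0
    return io_queues * queue_depth * (base_keys_per_task + extra)


# Current default: PI resources are always created on supported devices.
default_behavior = memory_keys(io_queues=8, queue_depth=128, pi_enabled=True)

# Opt-in alternative: a connect that does not ask for metadata skips them.
opt_in_unused = memory_keys(io_queues=8, queue_depth=128, pi_enabled=False)

print(default_behavior, opt_in_unused)
```

Under these assumptions the extra allocation scales with the number of queues
times the queue depth, so an opt-in option would avoid that setup and teardown
work for controllers that never use metadata.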

Sagi/Christoph,

WDYT? Should we reconsider the "nvme connect --with_metadata" option?

>
