From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yi Zhang
Date: Tue, 1 Mar 2022 08:06:07 +0800
Subject: Re: [bug report] NVMe/IB: reset_controller need more than 1min
To: Max Gurtovoy, Max Gurtovoy
Cc: "open list:NVM EXPRESS DRIVER", RDMA mailing list, Sagi Grimberg
In-Reply-To: <6347079f-ec3b-c2e5-bb3b-43b539d6d3f1@nvidia.com>
References: <162ec7c5-9483-3f53-bd1c-502ff5ac9f87@nvidia.com>
 <3292547e-2453-0320-c2e7-e17dbc20bbdd@nvidia.com>
 <2D31D2FB-BC4B-476A-9717-C02E84542DFA@oracle.com>
 <4BB6D957-6C18-4E58-A622-0880007ECD9F@oracle.com>
 <6347079f-ec3b-c2e5-bb3b-43b539d6d3f1@nvidia.com>
X-Mailing-List: linux-rdma@vger.kernel.org

On Wed, Feb 23, 2022 at 6:04 PM Max Gurtovoy wrote:
>
> Hi Yi Zhang,
>
> thanks for testing the patches.
>
> Can you provide more info on the time it took with both kernels ?

Hi Max,

Sorry for the late response. Here are the test results and dmesg logs
from the debug and non-debug kernels with your patches applied:

debug kernel: timeout

# time nvme connect -t rdma -a 172.31.0.202 -s 4420 -n testnqn
real 0m16.956s
user 0m0.000s
sys 0m0.237s

# time nvme reset /dev/nvme0
real 1m33.623s
user 0m0.000s
sys 0m0.024s

# time nvme disconnect-all
real 1m26.640s
user 0m0.000s
sys 0m9.969s

host dmesg: https://pastebin.com/8T3Lqtkn
target dmesg: https://pastebin.com/KpFP7xG2

non-debug kernel: no timeout issue, but reset still takes 12s and
disconnect takes 8s

host:
# time nvme connect -t rdma -a 172.31.0.202 -s 4420 -n testnqn
real 0m4.579s
user 0m0.000s
sys 0m0.004s

# time nvme reset /dev/nvme0
real 0m12.778s
user 0m0.000s
sys 0m0.006s

# time nvme reset /dev/nvme0
real 0m12.793s
user 0m0.000s
sys 0m0.006s

# time nvme reset /dev/nvme0
real 0m12.808s
user 0m0.000s
sys 0m0.006s

# time nvme disconnect-all
real 0m8.348s
user 0m0.000s
sys 0m0.189s

>
> The patches don't intend to decrease this time but re-start the KA in
> early stage - as soon as we create the AQ.
>
> I guess we need to debug it offline.
>
> On 2/21/2022 12:00 PM, Yi Zhang wrote:
> > Hi Max
> >
> > The patch fixed the timeout issue when I use one non-debug kernel,
> > but when I tested on debug kernel with your patches, the timeout still
> > can be triggered with "nvme reset/nvme disconnect-all" operations.
> >
> > On Tue, Feb 15, 2022 at 10:31 PM Max Gurtovoy wrote:
> >> Thanks Yi Zhang.
> >>
> >> Few years ago I've sent some patches that were supposed to fix the KA
> >> mechanism but eventually they weren't accepted.
> >>
> >> I haven't tested it since but maybe you can run some tests with it.
> >>
> >> The attached patches are partial and include only rdma transport for
> >> your testing.
> >>
> >> If it work for you we can work on it again and argue for correctness.
> >>
> >> Please don't use the workaround we suggested earlier with these patches.
> >>
> >> -Max.
> >>
> >> On 2/15/2022 3:52 PM, Yi Zhang wrote:
> >>> Hi Sagi/Max
> >>>
> >>> Changing the value to 10 or 15 fixed the timeout issue.
> >>> And the reset operation still needs more than 12s on my environment, I
> >>> also tried disabling the pi_enable, the reset operation will be back
> >>> to 3s, so seems the added 9s was due to the PI enabled code path.
> >>>
> >>> On Mon, Feb 14, 2022 at 8:12 PM Max Gurtovoy wrote:
> >>>> On 2/14/2022 1:32 PM, Sagi Grimberg wrote:
> >>>>>> Hi Sagi/Max
> >>>>>> Here are more findings with the bisect:
> >>>>>>
> >>>>>> The time for reset operation changed from 3s[1] to 12s[2] after
> >>>>>> commit[3], and after commit[4], the reset operation timeout at the
> >>>>>> second reset[5], let me know if you need any testing for it, thanks.
> >>>>> Does this at least eliminate the timeout?
> >>>>> --
> >>>>> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> >>>>> index a162f6c6da6e..60e415078893 100644
> >>>>> --- a/drivers/nvme/host/nvme.h
> >>>>> +++ b/drivers/nvme/host/nvme.h
> >>>>> @@ -25,7 +25,7 @@ extern unsigned int nvme_io_timeout;
> >>>>>  extern unsigned int admin_timeout;
> >>>>>  #define NVME_ADMIN_TIMEOUT (admin_timeout * HZ)
> >>>>>
> >>>>> -#define NVME_DEFAULT_KATO 5
> >>>>> +#define NVME_DEFAULT_KATO 10
> >>>>>
> >>>>>  #ifdef CONFIG_ARCH_NO_SG_CHAIN
> >>>>>  #define NVME_INLINE_SG_CNT 0
> >>>>> --
> >>>>>
> >>>> or for the initial test you can use --keep-alive-tmo=<10 or 15> flag in
> >>>> the connect command
> >>>>
> >>>>>> [1]
> >>>>>> # time nvme reset /dev/nvme0
> >>>>>>
> >>>>>> real 0m3.049s
> >>>>>> user 0m0.000s
> >>>>>> sys 0m0.006s
> >>>>>> [2]
> >>>>>> # time nvme reset /dev/nvme0
> >>>>>>
> >>>>>> real 0m12.498s
> >>>>>> user 0m0.000s
> >>>>>> sys 0m0.006s
> >>>>>> [3]
> >>>>>> commit 5ec5d3bddc6b912b7de9e3eb6c1f2397faeca2bc (HEAD)
> >>>>>> Author: Max Gurtovoy
> >>>>>> Date: Tue May 19 17:05:56 2020 +0300
> >>>>>>
> >>>>>> nvme-rdma: add metadata/T10-PI support
> >>>>>>
> >>>>>> [4]
> >>>>>> commit a70b81bd4d9d2d6c05cfe6ef2a10bccc2e04357a (HEAD)
> >>>>>> Author: Hannes Reinecke
> >>>>>> Date: Fri Apr 16 13:46:20 2021 +0200
> >>>>>>
> >>>>>> nvme: sanitize KATO setting-
> >>>>> This change effectively changed the keep-alive timeout
> >>>>> from 15 to 5 and modified the host to send keepalives every
> >>>>> 2.5 seconds instead of 5.
> >>>>>
> >>>>> I guess that in combination that now it takes longer to
> >>>>> create and delete rdma resources (either qps or mrs)
> >>>>> it starts to timeout in setups where there are a lot of
> >>>>> queues.

-- 
Best Regards,
Yi Zhang
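
For reference, the keep-alive arithmetic Sagi describes above can be
sketched roughly as follows. This is only an illustration: the
"interval = KATO / 2" rule is an assumption inferred from his "every
2.5 seconds" remark for a 5 second KATO, the function names are
hypothetical, and the timings are the ones reported in this thread.
It does not model when keep-alive is stopped or restarted during a
reset, so it cannot predict whether a timeout actually fires.

```python
# Rough keep-alive budget arithmetic; not kernel code.

def keepalive_interval(kato_s: float) -> float:
    """Host keep-alive period in seconds, ASSUMING the host sends at KATO / 2."""
    return kato_s / 2


def intervals_spanned(duration_s: float, kato_s: float) -> float:
    """How many keep-alive periods elapse while an operation of duration_s runs."""
    return duration_s / keepalive_interval(kato_s)


# Wall-clock timings reported in this thread (seconds).
MEASURED = {
    "reset, before commit [3]": 3.049,
    "reset, after commit [3]": 12.498,
    "disconnect-all, non-debug kernel": 8.348,
}

if __name__ == "__main__":
    for kato in (5, 10, 15):  # KATO values discussed in the thread
        print(f"KATO={kato}s, keep-alive every {keepalive_interval(kato)}s")
        for name, seconds in MEASURED.items():
            spanned = intervals_spanned(seconds, kato)
            print(f"  {name}: {seconds}s spans {spanned:.1f} keep-alive periods")
```

The point is only that a shorter KATO shrinks the window in which a slow
queue teardown or setup has to finish before a keep-alive is due, which
matches the observation that raising the value to 10 or 15 hid the
timeout without making the reset itself any faster.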