From: Yi Zhang
Date: Sat, 19 Mar 2022 15:29:10 +0800
Subject: Re: [bug report] NVMe/IB: reset_controller need more than 1min
To: Sagi Grimberg
Cc: Max Gurtovoy, Max Gurtovoy, "open list:NVM EXPRESS DRIVER", RDMA mailing list
References: <3292547e-2453-0320-c2e7-e17dbc20bbdd@nvidia.com> <2D31D2FB-BC4B-476A-9717-C02E84542DFA@oracle.com> <4BB6D957-6C18-4E58-A622-0880007ECD9F@oracle.com> <6347079f-ec3b-c2e5-bb3b-43b539d6d3f1@nvidia.com> <6d8f4525-f663-18cc-8644-bfddd7d86bd0@grimberg.me>
In-Reply-To: <6d8f4525-f663-18cc-8644-bfddd7d86bd0@grimberg.me>
X-Mailing-List: linux-rdma@vger.kernel.org

On Wed, Mar 16, 2022 at 11:16 PM Sagi Grimberg wrote:
>
>
> >> Hi Yi Zhang,
> >>
> >> thanks for testing the patches.
> >>
> >> Can you provide more info on the time it took with both kernels ?
> >
> > Hi Max
> > Sorry for the late response, here are the test results/dmesg on
> > debug/non-debug kernel with your patch:
> > debug kernel: timeout
> > # time nvme connect -t rdma -a 172.31.0.202 -s 4420 -n testnqn
> > real 0m16.956s
> > user 0m0.000s
> > sys 0m0.237s
> > # time nvme reset /dev/nvme0
> > real 1m33.623s
> > user 0m0.000s
> > sys 0m0.024s
> > # time nvme disconnect-all
> > real 1m26.640s
> > user 0m0.000s
> > sys 0m9.969s
> >
> > host dmesg:
> > https://pastebin.com/8T3Lqtkn
> > target dmesg:
> > https://pastebin.com/KpFP7xG2
> >
> > non-debug kernel: no timeout issue, but still 12s for reset, and 8s
> > for disconnect
> > host:
> > # time nvme connect -t rdma -a 172.31.0.202 -s 4420 -n testnqn
> >
> > real 0m4.579s
> > user 0m0.000s
> > sys 0m0.004s
> > # time nvme reset /dev/nvme0
> >
> > real 0m12.778s
> > user 0m0.000s
> > sys 0m0.006s
> > # time nvme reset /dev/nvme0
> >
> > real 0m12.793s
> > user 0m0.000s
> > sys 0m0.006s
> > # time nvme reset /dev/nvme0
> >
> > real 0m12.808s
> > user 0m0.000s
> > sys 0m0.006s
> > # time nvme disconnect-all
> >
> > real 0m8.348s
> > user 0m0.000s
> > sys 0m0.189s
>
> These are very long times for a non-debug kernel...
> Max, do you see the root cause for this?
>
> Yi, does this happen with rxe/siw as well?

Hi Sagi

With rxe/siw, the reset takes less than 1s.

with rdma_rxe
# time nvme reset /dev/nvme0
real 0m0.094s
user 0m0.000s
sys 0m0.006s

with siw
# time nvme reset /dev/nvme0
real 0m0.097s
user 0m0.000s
sys 0m0.006s

This is only reproducible with the mlx IB card. As I mentioned before,
the reset operation time changed from 3s to 12s after the commit below.
Could you check this commit?

commit 5ec5d3bddc6b912b7de9e3eb6c1f2397faeca2bc
Author: Max Gurtovoy
Date: Tue May 19 17:05:56 2020 +0300

    nvme-rdma: add metadata/T10-PI support

>

--
Best Regards,
  Yi Zhang