From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-0.8 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CE957C0650F for ; Mon, 5 Aug 2019 19:14:06 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 9EA8F20B1F for ; Mon, 5 Aug 2019 19:14:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730300AbfHETOF (ORCPT ); Mon, 5 Aug 2019 15:14:05 -0400 Received: from youngberry.canonical.com ([91.189.89.112]:44103 "EHLO youngberry.canonical.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728870AbfHETOF (ORCPT ); Mon, 5 Aug 2019 15:14:05 -0400 Received: from mail-pf1-f200.google.com ([209.85.210.200]) by youngberry.canonical.com with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.76) (envelope-from ) id 1huiQk-0006XU-SH for linux-kernel@vger.kernel.org; Mon, 05 Aug 2019 19:14:03 +0000 Received: by mail-pf1-f200.google.com with SMTP id h27so54065614pfq.17 for ; Mon, 05 Aug 2019 12:14:02 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:subject:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to; bh=4dWsHtjse4Kk9Git6Bu5ITVptOPxfe5QlK/iNi9AwCA=; b=piVbfgpKqKGvsnC27JFSpdbvUmClPutqRx0h7TzrMlls6u+STijW+kkhlOi5siP6oX 6oi6Og0JWJYKbCT2CdjNjZ+EnTWJVjL/H9gYrFpJihttqQ6KBeYrgapjXN1y2sHGR9W4 6L97bn5LItOIxfDj8JKTtjCT9q1OCT4X5QDVvL0+jBk2TmP7ig8TvYI75w1mLGyhIP2N GfWqXRvS5etJj8QP4NHEG0UOTIaNjduLeNlZdVFHVDRP1n8mnsI9gjAis41O4HZ05nzu u/RqtrlDt78KRsuM9X5OYJjoAVD7x1idZcBkEOXtoE0hwizojzJBZ1ZDPDC1DYUZd3dt gr5A== X-Gm-Message-State: APjAAAXDdTRTtfvmCwZWJoXim2BU2K34QFd/Kiu21UROzrwUrZrYIwrg qntVgHsm2rrv4r5kI9Vrk48KnPHaOj7li3TKh9cAjPm+Am/igCAFtwXaBYgf5S0demHEBGjXs/5 odhoR49Lll3QahG/jGPHI203q/5Ckcxsn1IFMugNN2g== X-Received: by 2002:a62:1ccd:: with SMTP id c196mr75140049pfc.102.1565032441420; Mon, 05 Aug 2019 12:14:01 -0700 (PDT) X-Google-Smtp-Source: APXvYqzGckPxqY0vwW+zLWOZY31pM4wFsYRr748iKaZ96qdRFeNpy1KaPsLZQAVzoHYtt2hUH/LOmw== X-Received: by 2002:a62:1ccd:: with SMTP id c196mr75140014pfc.102.1565032441028; Mon, 05 Aug 2019 12:14:01 -0700 (PDT) Received: from 2001-b011-380f-37d3-6851-7bc4-3469-2fa7.dynamic-ip6.hinet.net (2001-b011-380f-37d3-6851-7bc4-3469-2fa7.dynamic-ip6.hinet.net. [2001:b011:380f:37d3:6851:7bc4:3469:2fa7]) by smtp.gmail.com with ESMTPSA id b14sm19802255pga.20.2019.08.05.12.13.58 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 05 Aug 2019 12:14:00 -0700 (PDT) Content-Type: text/plain; charset=utf-8; delsp=yes; format=flowed Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.11\)) Subject: Re: [Regression] Commit "nvme/pci: Use host managed power state for suspend" has problems From: Kai-Heng Feng In-Reply-To: Date: Tue, 6 Aug 2019 03:13:56 +0800 Cc: Mario Limonciello , Keith Busch , Keith Busch , Christoph Hellwig , Sagi Grimberg , linux-nvme , Linux PM , Linux Kernel Mailing List , Rajat Jain Content-Transfer-Encoding: 8bit Message-Id: <1FA3D56B-80C6-496C-8772-2F773AA8043C@canonical.com> References: <4323ed84dd07474eab65699b4d007aaf@AUSX13MPC105.AMER.DELL.COM> <47415939.KV5G6iaeJG@kreacher> <20190730144134.GA12844@localhost.localdomain> <100ba4aff1c6434a81e47774ab4acddc@AUSX13MPC105.AMER.DELL.COM> <8246360B-F7D9-42EB-94FC-82995A769E28@canonical.com> <20190730191934.GD13948@localhost.localdomain> <7d3e0b8ba1444194a153c93faa1cabb3@AUSX13MPC105.AMER.DELL.COM> <20190730213114.GK13948@localhost.localdomain> <20190731221956.GB15795@localhost.localdomain> <70D536BE-8DC7-4CA2-84A9-AFB067BA520E@canonical.com> <38d4b4b107154454a932781acde0fa5a@AUSX13MPC105.AMER.DELL.COM> <43A8DF53-8463-4314-9E8E-47A7D3C5A709@canonical.com> To: "Rafael J. Wysocki" X-Mailer: Apple Mail (2.3445.104.11) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org at 19:04, Rafael J. Wysocki wrote: > On Fri, Aug 2, 2019 at 12:55 PM Kai-Heng Feng > wrote: >> at 06:26, Rafael J. Wysocki wrote: >> >>> On Thu, Aug 1, 2019 at 9:05 PM wrote: >>>>> -----Original Message----- >>>>> From: Rafael J. Wysocki >>>>> Sent: Thursday, August 1, 2019 12:30 PM >>>>> To: Kai-Heng Feng; Keith Busch; Limonciello, Mario >>>>> Cc: Keith Busch; Christoph Hellwig; Sagi Grimberg; linux-nvme; Linux >>>>> PM; Linux >>>>> Kernel Mailing List; Rajat Jain >>>>> Subject: Re: [Regression] Commit "nvme/pci: Use host managed power >>>>> state for >>>>> suspend" has problems >>>>> >>>>> >>>>> [EXTERNAL EMAIL] >>>>> >>>>> On Thu, Aug 1, 2019 at 11:06 AM Kai-Heng Feng >>>>> wrote: >>>>>> at 06:33, Rafael J. Wysocki wrote: >>>>>> >>>>>>> On Thu, Aug 1, 2019 at 12:22 AM Keith Busch >>>>>>> wrote: >>>>>>>> On Wed, Jul 31, 2019 at 11:25:51PM +0200, Rafael J. Wysocki wrote: >>>>>>>>> A couple of remarks if you will. >>>>>>>>> >>>>>>>>> First, we don't know which case is the majority at this point. For >>>>>>>>> now, there is one example of each, but it may very well turn out >>>>>>>>> that >>>>>>>>> the SK Hynix BC501 above needs to be quirked. >>>>>>>>> >>>>>>>>> Second, the reference here really is 5.2, so if there are any >>>>>>>>> systems >>>>>>>>> that are not better off with 5.3-rc than they were with 5.2, >>>>>>>>> well, we >>>>>>>>> have not made progress. However, if there are systems that are >>>>>>>>> worse >>>>>>>>> off with 5.3, that's bad. In the face of the latest findings the >>>>>>>>> only >>>>>>>>> way to avoid that is to be backwards compatible with 5.2 and that's >>>>>>>>> where my patch is going. That cannot be achieved by quirking all >>>>>>>>> cases that are reported as "bad", because there still may be >>>>>>>>> unreported ones. >>>>>>>> >>>>>>>> I have to agree. I think your proposal may allow PCI D3cold, >>>>>>> >>>>>>> Yes, it may. >>>>>> >>>>>> Somehow the 9380 with Toshiba NVMe never hits SLP_S0 with or without >>>>>> Rafael’s patch. >>>>>> But the “real” s2idle power consumption does improve with the patch. >>>>> >>>>> Do you mean this patch: >>>>> >>>>> https://lore.kernel.org/linux-pm/70D536BE-8DC7-4CA2-84A9- >>>>> AFB067BA520E@canonical.com/T/#m456aa5c69973a3b68f2cdd4713a1ce83be5145 >>>>> 8f >>>>> >>>>> or the $subject one without the above? >>>>> >>>>>> Can we use a DMI based quirk for this platform? It seems like a >>>>>> platform >>>>>> specific issue. >>>>> >>>>> We seem to see too many "platform-specific issues" here. :-) >>>>> >>>>> To me, the status quo (ie. what we have in 5.3-rc2) is not defensible. >>>>> Something needs to be done to improve the situation. >>>> >>>> Rafael, would it be possible to try popping out PC401 from the 9380 and >>>> into a 9360 to >>>> confirm there actually being a platform impact or not? >>> >>> Not really, sorry. >>> >>>> I was hoping to have something useful from Hynix by now before >>>> responding, but oh well. >>>> >>>> In terms of what is the majority, I do know that between folks at Dell, >>>> Google, Compal, >>>> Wistron, Canonical, Micron, Hynix, Toshiba, LiteOn, and Western Digital >>>> we tested a wide >>>> variety of SSDs with this patch series. I would like to think that they >>>> are representative of >>>> what's being manufactured into machines now. >>> >>> Well, what about drives already in the field? My concern is mostly >>> about those ones. >>> >>>> Notably the LiteOn CL1 was tested with the HMB flushing support and >>>> and Hynix PC401 was tested with older firmware though. >>>> >>>>>>>> In which case we do need to reintroduce the HMB handling. >>>>>>> >>>>>>> Right. >>>>>> >>>>>> The patch alone doesn’t break HMB Toshiba NVMe I tested. But I think >>>>>> it’s >>>>>> still safer to do proper HMB handling. >>>>> >>>>> Well, so can anyone please propose something specific? Like an >>>>> alternative patch? >>>> >>>> This was proposed a few days ago: >>>> http://lists.infradead.org/pipermail/linux-nvme/2019-July/026056.html >>>> >>>> However we're still not sure why it is needed, and it will take some >>>> time to get >>>> a proper failure analysis from LiteOn regarding the CL1. >>> >>> Thanks for the update, but IMO we still need to do something before >>> final 5.3 while the investigation continues. >>> >>> Honestly, at this point I would vote for going back to the 5.2 >>> behavior at least by default and only running the new code on the >>> drives known to require it (because they will block PC10 otherwise). >>> >>> Possibly (ideally) with an option for users who can't get beyond PC3 >>> to test whether or not the new code helps them. >> >> I just found out that the XPS 9380 at my hand never reaches SLP_S0 but >> only >> PC10. > > That's the case for me too. > >> This happens with or without putting the device to D3. > > On my system, though, it only can get to PC3 without putting the NVMe > into D3 (as reported previously). I forgot to ask, what BIOS version does the system have? I don’t see this issue on BIOS v1.5.0. Kai-Heng From mboxrd@z Thu Jan 1 00:00:00 1970 From: kai.heng.feng@canonical.com (Kai-Heng Feng) Date: Tue, 6 Aug 2019 03:13:56 +0800 Subject: [Regression] Commit "nvme/pci: Use host managed power state for suspend" has problems In-Reply-To: References: <4323ed84dd07474eab65699b4d007aaf@AUSX13MPC105.AMER.DELL.COM> <47415939.KV5G6iaeJG@kreacher> <20190730144134.GA12844@localhost.localdomain> <100ba4aff1c6434a81e47774ab4acddc@AUSX13MPC105.AMER.DELL.COM> <8246360B-F7D9-42EB-94FC-82995A769E28@canonical.com> <20190730191934.GD13948@localhost.localdomain> <7d3e0b8ba1444194a153c93faa1cabb3@AUSX13MPC105.AMER.DELL.COM> <20190730213114.GK13948@localhost.localdomain> <20190731221956.GB15795@localhost.localdomain> <70D536BE-8DC7-4CA2-84A9-AFB067BA520E@canonical.com> <38d4b4b107154454a932781acde0fa5a@AUSX13MPC105.AMER.DELL.COM> <43A8DF53-8463-4314-9E8E-47A7D3C5A709@canonical.com> Message-ID: <1FA3D56B-80C6-496C-8772-2F773AA8043C@canonical.com> at 19:04, Rafael J. Wysocki wrote: > On Fri, Aug 2, 2019 at 12:55 PM Kai-Heng Feng > wrote: >>@06:26, Rafael J. Wysocki wrote: >> >>> On Thu, Aug 1, 2019@9:05 PM wrote: >>>>> -----Original Message----- >>>>> From: Rafael J. Wysocki >>>>> Sent: Thursday, August 1, 2019 12:30 PM >>>>> To: Kai-Heng Feng; Keith Busch; Limonciello, Mario >>>>> Cc: Keith Busch; Christoph Hellwig; Sagi Grimberg; linux-nvme; Linux >>>>> PM; Linux >>>>> Kernel Mailing List; Rajat Jain >>>>> Subject: Re: [Regression] Commit "nvme/pci: Use host managed power >>>>> state for >>>>> suspend" has problems >>>>> >>>>> >>>>> [EXTERNAL EMAIL] >>>>> >>>>> On Thu, Aug 1, 2019 at 11:06 AM Kai-Heng Feng >>>>> wrote: >>>>>>@06:33, Rafael J. Wysocki wrote: >>>>>> >>>>>>> On Thu, Aug 1, 2019 at 12:22 AM Keith Busch >>>>>>> wrote: >>>>>>>> On Wed, Jul 31, 2019@11:25:51PM +0200, Rafael J. Wysocki wrote: >>>>>>>>> A couple of remarks if you will. >>>>>>>>> >>>>>>>>> First, we don't know which case is the majority at this point. For >>>>>>>>> now, there is one example of each, but it may very well turn out >>>>>>>>> that >>>>>>>>> the SK Hynix BC501 above needs to be quirked. >>>>>>>>> >>>>>>>>> Second, the reference here really is 5.2, so if there are any >>>>>>>>> systems >>>>>>>>> that are not better off with 5.3-rc than they were with 5.2, >>>>>>>>> well, we >>>>>>>>> have not made progress. However, if there are systems that are >>>>>>>>> worse >>>>>>>>> off with 5.3, that's bad. In the face of the latest findings the >>>>>>>>> only >>>>>>>>> way to avoid that is to be backwards compatible with 5.2 and that's >>>>>>>>> where my patch is going. That cannot be achieved by quirking all >>>>>>>>> cases that are reported as "bad", because there still may be >>>>>>>>> unreported ones. >>>>>>>> >>>>>>>> I have to agree. I think your proposal may allow PCI D3cold, >>>>>>> >>>>>>> Yes, it may. >>>>>> >>>>>> Somehow the 9380 with Toshiba NVMe never hits SLP_S0 with or without >>>>>> Rafael?s patch. >>>>>> But the ?real? s2idle power consumption does improve with the patch. >>>>> >>>>> Do you mean this patch: >>>>> >>>>> https://lore.kernel.org/linux-pm/70D536BE-8DC7-4CA2-84A9- >>>>> AFB067BA520E at canonical.com/T/#m456aa5c69973a3b68f2cdd4713a1ce83be5145 >>>>> 8f >>>>> >>>>> or the $subject one without the above? >>>>> >>>>>> Can we use a DMI based quirk for this platform? It seems like a >>>>>> platform >>>>>> specific issue. >>>>> >>>>> We seem to see too many "platform-specific issues" here. :-) >>>>> >>>>> To me, the status quo (ie. what we have in 5.3-rc2) is not defensible. >>>>> Something needs to be done to improve the situation. >>>> >>>> Rafael, would it be possible to try popping out PC401 from the 9380 and >>>> into a 9360 to >>>> confirm there actually being a platform impact or not? >>> >>> Not really, sorry. >>> >>>> I was hoping to have something useful from Hynix by now before >>>> responding, but oh well. >>>> >>>> In terms of what is the majority, I do know that between folks at Dell, >>>> Google, Compal, >>>> Wistron, Canonical, Micron, Hynix, Toshiba, LiteOn, and Western Digital >>>> we tested a wide >>>> variety of SSDs with this patch series. I would like to think that they >>>> are representative of >>>> what's being manufactured into machines now. >>> >>> Well, what about drives already in the field? My concern is mostly >>> about those ones. >>> >>>> Notably the LiteOn CL1 was tested with the HMB flushing support and >>>> and Hynix PC401 was tested with older firmware though. >>>> >>>>>>>> In which case we do need to reintroduce the HMB handling. >>>>>>> >>>>>>> Right. >>>>>> >>>>>> The patch alone doesn?t break HMB Toshiba NVMe I tested. But I think >>>>>> it?s >>>>>> still safer to do proper HMB handling. >>>>> >>>>> Well, so can anyone please propose something specific? Like an >>>>> alternative patch? >>>> >>>> This was proposed a few days ago: >>>> http://lists.infradead.org/pipermail/linux-nvme/2019-July/026056.html >>>> >>>> However we're still not sure why it is needed, and it will take some >>>> time to get >>>> a proper failure analysis from LiteOn regarding the CL1. >>> >>> Thanks for the update, but IMO we still need to do something before >>> final 5.3 while the investigation continues. >>> >>> Honestly, at this point I would vote for going back to the 5.2 >>> behavior at least by default and only running the new code on the >>> drives known to require it (because they will block PC10 otherwise). >>> >>> Possibly (ideally) with an option for users who can't get beyond PC3 >>> to test whether or not the new code helps them. >> >> I just found out that the XPS 9380 at my hand never reaches SLP_S0 but >> only >> PC10. > > That's the case for me too. > >> This happens with or without putting the device to D3. > > On my system, though, it only can get to PC3 without putting the NVMe > into D3 (as reported previously). I forgot to ask, what BIOS version does the system have? I don?t see this issue on BIOS v1.5.0. Kai-Heng