From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.5 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D10F4C433E2 for ; Mon, 31 Aug 2020 18:25:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A46A32067C for ; Mon, 31 Aug 2020 18:25:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729360AbgHaSZ2 (ORCPT ); Mon, 31 Aug 2020 14:25:28 -0400 Received: from netrider.rowland.org ([192.131.102.5]:43663 "HELO netrider.rowland.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1726174AbgHaSZ1 (ORCPT ); Mon, 31 Aug 2020 14:25:27 -0400 Received: (qmail 560428 invoked by uid 1000); 31 Aug 2020 14:25:26 -0400 Date: Mon, 31 Aug 2020 14:25:26 -0400 From: Alan Stern To: Bart Van Assche Cc: linux-scsi@vger.kernel.org, Martin Kepplinger , Can Guo Subject: Re: [PATCH RFC 6/6] block, scsi, ide: Only submit power management requests in state RPM_SUSPENDED Message-ID: <20200831182526.GA558270@rowland.harvard.edu> References: <20200831025357.32700-1-bvanassche@acm.org> <20200831025357.32700-7-bvanassche@acm.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20200831025357.32700-7-bvanassche@acm.org> User-Agent: Mutt/1.10.1 (2018-07-13) Sender: linux-scsi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org On Sun, Aug 30, 2020 at 07:53:57PM -0700, Bart Van Assche wrote: > Recently Martin Kepplinger reported a problem with the SCSI runtime PM > code. Alan Stern root-caused that deadlock as follows: > > Thread 0 Thread 1 > -------- -------- > Start runtime suspend > blk_pre_runtime_suspend calls > blk_set_pm_only and sets > q->rpm_status to RPM_SUSPENDING > > Call sd_open -> ... -> scsi_test_unit_ready > -> __scsi_execute -> ... > -> blk_queue_enter > Sees BLK_MQ_REQ_PREEMPT set and > RPM_SUSPENDING queue status, so does > not postpone the request > > blk_post_runtime_suspend sets > q->rpm_status to RPM_SUSPENDED > The drive goes into runtime suspend > > Issues the TEST UNIT READY request > Request fails because the drive is suspended > > Fix that deadlock by only accepting power management requests while > suspended. Remove flag RQF_PREEMPT. Let me clarify this description. Firstly, the second-to-last sentence is ambiguous. The word "only" is all too easy to misuse through carelessness. In this case you meant to say "by accepting only power management requests while suspended", but what you actually wrote was equivalent to "by accepting power management requests only while suspended". And as it happens, both meanings are incorrect because we don't want to accept _any_ requests while the device is suspended -- not even power-management requests. The description should have said "by postponing all non-power-management requests while the device is suspending, suspended, or resuming." Secondly, the scenario described above is not a deadlock; it is a race leading to a command failure. Namely, the thread setting q->rpm_status to RPM_SUSPENDED races with the thread testing q->rpm_status. Thirdly, this race is _not_ the problem that Martin encountered. His problem was a simple bug (failure to postpone a request), not a race or a deadlock. Fourthly, the race illustrated above is, for now, theoretical. It cannot occur with the existing code base (mostly because the existing code is buggy). The advantage of your series over the patch I submitted on Aug. 23 ("block: Fix bug in runtime-resume handling") is that it fixes Martin's problem without introducing this new race. Alan Stern