From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-3.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 715A2C433F5 for ; Tue, 21 Sep 2021 02:17:40 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 57B7D60F9D for ; Tue, 21 Sep 2021 02:17:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1346017AbhIUCRa (ORCPT ); Mon, 20 Sep 2021 22:17:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33286 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236312AbhIUBuE (ORCPT ); Mon, 20 Sep 2021 21:50:04 -0400 Received: from mail-pg1-x529.google.com (mail-pg1-x529.google.com [IPv6:2607:f8b0:4864:20::529]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2A0EAC06AB09 for ; Mon, 20 Sep 2021 14:55:22 -0700 (PDT) Received: by mail-pg1-x529.google.com with SMTP id n18so18748374pgm.12 for ; Mon, 20 Sep 2021 14:55:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=intel-com.20210112.gappssmtp.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=amBkTo8paIO6KKoyr0sD3I13KXEgBWAgALAI9bRUOrY=; b=XL3OFLcRkLeLa9eXRmHa7cKmBwEZybbc9mdJnrjPUsNmyhIp9vj8F3iNcHNSqXTb1K L9OEBMzu/TTeVT/s4PldqIkTZ0HgtlJEgIwwbfok6X1NLdtuYuyvrzXuuS1QyCQHhDf3 FY2aphwijxMkpTX43r3twUk5IdFvYJi1wu5sZUQ0IQCbDLg80wgE3Q0dm575Bea3zf1f JfQJ5ifnA2ISAORhBlZA82ZPIyQ2zFi5MELZoyko92jJrrmD7LWhkltsG8XSY3PQ3mqC YdMiNP+ggNC5rUbR8bzVM/xfxgJFK13EPKdACqQQqNx9Bpea47bYvjeXMJ6UV3MCOnZ/ JrFg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=amBkTo8paIO6KKoyr0sD3I13KXEgBWAgALAI9bRUOrY=; b=hCq7nM564+WTJpkfr9IUIFalRxPo1ZtjhE3EE+gv43qA1ZVsjWfjGd2fUvYoHeUV7T Lt3GNTNz9uYA8M8dRdTDtFfvGQKKkDbVG/rTwXUz+P0efqkGMrVuF0AomY4McniWy8ep 4AcA8jnDjcMxCv6i8dBJuKpVxzi7TesJowarrtlOywLHl9Ax/GjxwQuIM57+jLR5dsy4 LYEmxWYGuLLaqQyyfiO2kA83fNcUWETN0dsz/POEw5N0Lj+qi9+XOZz9ksv2/SemPqNt jpUaZfMyMB//ANCVAnTwegnPyEPHCe6Wl2xNaJb3/Za0piq/H9KIqRaT74ujCFd5tQNJ u8kw== X-Gm-Message-State: AOAM533HmJ2+y9ImOn58HsY50cjEPpMcotRalAshOTfBXRkqIuctsDoj uyAv+/vO0b8hqgZxgyT13xTiLyR7M1IaA8eYnDN+4Q== X-Google-Smtp-Source: ABdhPJwBWi/pa4o5rVuzlrO7QoclxHzSHljmGejbBs+BPyI0/4Ad8yRwI26n/BXik+g+dmpD6uwLMi2rTRHApy0oOl0= X-Received: by 2002:a63:1e0e:: with SMTP id e14mr25231633pge.5.1632174921576; Mon, 20 Sep 2021 14:55:21 -0700 (PDT) MIME-Version: 1.0 References: <20210918050430.3671227-1-mcgrof@kernel.org> <20210918050430.3671227-10-mcgrof@kernel.org> In-Reply-To: From: Dan Williams Date: Mon, 20 Sep 2021 14:55:10 -0700 Message-ID: Subject: Re: [PATCH v7 09/12] sysfs: fix deadlock race with module removal To: Luis Chamberlain Cc: Tejun Heo , Greg KH , Andrew Morton , Minchan Kim , jeyu@kernel.org, shuah , Randy Dunlap , "Rafael J. Wysocki" , Masahiro Yamada , Nick Desaulniers , yzaikin@google.com, Nathan Chancellor , ojeda@kernel.org, Tetsuo Handa , vitor@massaru.org, elver@google.com, Jarkko Sakkinen , Alexander Potapenko , rf@opensource.cirrus.com, Stephen Hemminger , David Laight , bvanassche@acm.org, jolsa@kernel.org, Andy Shevchenko , trishalfonso@google.com, andreyknvl@gmail.com, Jiri Kosina , mbenes@suse.com, Nitin Gupta , Sergey Senozhatsky , Reinette Chatre , Fenghua Yu , Borislav Petkov , X86 ML , "H. Peter Anvin" , lizefan.x@bytedance.com, Johannes Weiner , Daniel Vetter , Bjorn Helgaas , =?UTF-8?Q?Krzysztof_Wilczy=C5=84ski?= , senozhatsky@chromium.org, Christoph Hellwig , Joe Perches , hkallweit1@gmail.com, Jens Axboe , Josh Poimboeuf , Thomas Gleixner , Kees Cook , Steven Rostedt , Peter Zijlstra , linux-spdx@vger.kernel.org, Linux Doc Mailing List , linux-block@vger.kernel.org, linux-fsdevel , linux-kselftest@vger.kernel.org, cgroups@vger.kernel.org, Linux Kernel Mailing List , copyleft-next@lists.fedorahosted.org Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org On Mon, Sep 20, 2021 at 2:17 PM Luis Chamberlain wrote: > > On Mon, Sep 20, 2021 at 01:52:21PM -0700, Dan Williams wrote: > > On Fri, Sep 17, 2021 at 10:05 PM Luis Chamberlain wrote: > > > This deadlock was first reported with the zram driver, however the live > > > patching folks have acknowledged they have observed this as well with > > > live patching, when a live patch is removed. I was then able to > > > reproduce easily by creating a dedicated selftests. > > > > > > A sketch of how this can happen follows: > > > > > > CPU A CPU B > > > whatever_store() > > > module_unload > > > mutex_lock(foo) > > > mutex_lock(foo) > > > del_gendisk(zram->disk); > > > device_del() > > > device_remove_groups() > > > > This flow seems possible to trigger with: > > > > echo $dev > /sys/bus/$bus/drivers/$driver/unbind > > > > I am missing why module pinning > > The aspect of try_module_get() which comes to value to prevent the > deadlock is it ensures kernfs ops do not run once exit is on the way. > > > is part of the solution when it's the > > device_del() path that is racing? > > But its not, the device_del() path will yield until the kernfs op > completes. It is fine to wait there. > > The deadlock happens if a module exit routine uses a lock which is > also used on a sysfs op. If the lock was first held by module exit, > and module exit is waiting for the kernfs op to complete, and the > kernfs op is waiting to hold the same lock then the exit will wait > forever. > > > Module removal is just a more coarse > > grained way to trigger unbind => device_del(). > > Right, but the device_del() path is not sharing a lock with the sysfs op. The deadlock in the example comes from holding a lock over device_del() that is also taken in a kernfs op. For example, the code above looks like something that runs from driver.remove(), not exclusively module_exit(). Yes, module_exit() may trigger driver.remove() via driver_unregister(), but there's other ways to trigger driver.remove() that do not involve module_exit(). > > Isn't the above a bug > > in the driver, not missing synchronization in kernfs? > > We can certainly take the position as an alternative: > > "thou shalt not use a lock on exit which is also used on a syfs op" > > However that seems counter intuitive, specially if we can resolve the > issue easily with a try_module_get(). Again, I don't see how try_module_get() can affect the ABBA failure case of holding a lock over device_del() that is also held inside sysfs op. I agree that the problem is subtle. Does lockdep not complain about this case? If it's going to be avoided in the core it seems try_module_get() does not completely cover the hole that unsuspecting driver writers might fall into.