From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A2B9CC433FE for ; Thu, 2 Sep 2021 21:50:20 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 5968C610E6 for ; Thu, 2 Sep 2021 21:50:20 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 5968C610E6 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id EC1F56B007D; Thu, 2 Sep 2021 17:50:19 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id E48B96B007E; Thu, 2 Sep 2021 17:50:19 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id CC2698D0001; Thu, 2 Sep 2021 17:50:19 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0030.hostedemail.com [216.40.44.30]) by kanga.kvack.org (Postfix) with ESMTP id BCDA16B007D for ; Thu, 2 Sep 2021 17:50:19 -0400 (EDT) Received: from smtpin16.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 63AE22AF08 for ; Thu, 2 Sep 2021 21:50:19 +0000 (UTC) X-FDA: 78543977358.16.95B2CD1 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf04.hostedemail.com (Postfix) with ESMTP id 1690C5000097 for ; Thu, 2 Sep 2021 21:50:18 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id E9842610D2; Thu, 2 Sep 2021 21:50:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1630619418; bh=tqaA7XDrR+7FTZxHRG8A/mlgf13gkLOrvDPmgum5ztU=; h=Date:From:To:Subject:In-Reply-To:From; b=bkKUW9waVuX1jbeKC+O7xCEwhrNLcI7z5GzhMSDMWiev6wqoPzLfOzalKdFan3zVd nBFbIC0npsEBBT0f78nWsjeYedWc5tzuhu0oAFGrbgulUTzgtRaoqvoG8Cf5UgNNtz U34KNRkaL87gS9LMESmVjbbJxrE+eJcEwcPGaQYg= Date: Thu, 02 Sep 2021 14:50:17 -0700 From: Andrew Morton To: akpm@linux-foundation.org, gechangwei@live.cn, ghe@suse.com, jlbec@evilplan.org, joseph.qi@linux.alibaba.com, junxiao.bi@oracle.com, linux-mm@kvack.org, mark@fasheh.com, mm-commits@vger.kernel.org, piaojun@huawei.com, torvalds@linux-foundation.org Subject: [patch 007/212] ocfs2: ocfs2_downconvert_lock failure results in deadlock Message-ID: <20210902215017.n_kxnxsoA%akpm@linux-foundation.org> In-Reply-To: <20210902144820.78957dff93d7bea620d55a89@linux-foundation.org> User-Agent: s-nail v14.8.16 X-Rspamd-Queue-Id: 1690C5000097 Authentication-Results: imf04.hostedemail.com; dkim=pass header.d=linux-foundation.org header.s=korg header.b=bkKUW9wa; dmarc=none; spf=pass (imf04.hostedemail.com: domain of akpm@linux-foundation.org designates 198.145.29.99 as permitted sender) smtp.mailfrom=akpm@linux-foundation.org X-Rspamd-Server: rspam01 X-Stat-Signature: 5z4it6itrm5sarfns6qig8cg1fg9hbba X-HE-Tag: 1630619418-169180 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Gang He Subject: ocfs2: ocfs2_downconvert_lock failure results in deadlock Usually, ocfs2_downconvert_lock() function always downconverts dlm lock to the expected level for satisfy dlm bast requests from the other nodes. But there is a rare situation. When dlm lock conversion is being canceled, ocfs2_downconvert_lock() function will return -EBUSY. You need to be aware that ocfs2_cancel_convert() function is asynchronous in fsdlm implementation. If we does not requeue this lockres entry, ocfs2 downconvert thread no longer handles this dlm lock bast request. Then, the other nodes will not get the dlm lock again, the current node's process will be blocked when acquire this dlm lock again. Link: https://lkml.kernel.org/r/20210830044621.12544-1-ghe@suse.com Signed-off-by: Gang He Reviewed-by: Joseph Qi Cc: Mark Fasheh Cc: Joel Becker Cc: Junxiao Bi Cc: Changwei Ge Cc: Gang He Cc: Jun Piao Signed-off-by: Andrew Morton --- fs/ocfs2/dlmglue.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) --- a/fs/ocfs2/dlmglue.c~ocfs2-ocfs2_downconvert_lock-failure-results-in-deadlock +++ a/fs/ocfs2/dlmglue.c @@ -16,6 +16,7 @@ #include #include #include +#include #include #include @@ -3912,6 +3913,17 @@ downconvert: spin_unlock_irqrestore(&lockres->l_lock, flags); ret = ocfs2_downconvert_lock(osb, lockres, new_level, set_lvb, gen); + /* The dlm lock convert is being cancelled in background, + * ocfs2_cancel_convert() is asynchronous in fs/dlm, + * requeue it, try again later. + */ + if (ret == -EBUSY) { + ctl->requeue = 1; + mlog(ML_BASTS, "lockres %s, ReQ: Downconvert busy\n", + lockres->l_name); + ret = 0; + msleep(20); + } leave: if (ret) _