From: Linus Torvalds
Date: Tue, 21 Jul 2020 08:33:33 -0700
Subject: Re: [RFC PATCH] mm: silence soft lockups from unlock_page
To: Michal Hocko
Cc: Linux-MM, LKML, Andrew Morton, Tim Chen, Michal Hocko
References: <20200721063258.17140-1-mhocko@kernel.org>
In-Reply-To: <20200721063258.17140-1-mhocko@kernel.org>

On Mon, Jul 20, 2020 at 11:33 PM Michal Hocko wrote:
>
> The lockup is in page_unlock in do_read_fault and I suspect that this is
> yet another effect of a very long waitqueue chain which has been
> addresses by 11a19c7b099f ("sched/wait: Introduce wakeup boomark in
> wake_up_page_bit") previously.

Hmm. I do not believe that you can actually get to the point where you
have a million waiters and it takes 20+ seconds to wake everybody up.

More likely, it's actually *caused* by that commit 11a19c7b099f, and
what might be happening is that other CPUs are just adding new waiters
to the list *while* we're waking things up, because somebody else
already got the page lock again.

Humor me.. Does something like this work instead?

It's whitespace-damaged because it's just a cut-and-paste, it's entirely
untested, and I haven't really thought about any memory ordering issues,
but I think it's ok.

The logic is that anybody who called wake_up_page_bit() _must_ have
cleared that bit before that. So if we ever see it set again (and
memory ordering doesn't matter), then clearly somebody else got access
to the page bit (whichever it was), and we should not

 (a) waste time waking up people who can't get the bit anyway

 (b) be in a livelock where other CPUs continually add themselves to
     the wait queue because somebody else got the bit.

and it's that (b) case that I think happens for you.

NOTE! Totally UNTESTED patch follows. I think it's good, but maybe
somebody sees some problem with this approach?

I realize that people can wait for bits other than the lock bit, and if
you're waiting for writeback to complete maybe you don't care if
somebody else then started writeback *AGAIN* on the page and you'd
actually want to be woken up regardless, but honestly, I don't think it
really matters.

                Linus

--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1054,6 +1054,15 @@ static void wake_up_page_bit(struct page *page, int bit_nr)
                 * from wait queue
                 */
                spin_unlock_irqrestore(&q->lock, flags);
+
+               /*
+                * If somebody else set the bit again, stop waking
+                * people up. It's now the responsibility of that
+                * other page bit owner to do so.
+                */
+               if (test_bit(bit_nr, &page->flags))
+                       return;
+
                cpu_relax();
                spin_lock_irqsave(&q->lock, flags);
                __wake_up_locked_key_bookmark(q, TASK_NORMAL, &key, &bookmark);
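
For anybody who wants to poke at the (b) case outside the kernel, here is
a rough user-space model of the batched wake loop. It is only a sketch
under stated assumptions: struct model_page, model_wake_up() and
model_relock_and_requeue() are made-up names, not kernel APIs, the "other
CPUs" are simulated by a callback that runs where the queue lock would be
dropped, and there is no real locking or memory ordering involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a page; NOT the kernel's struct page. */
struct model_page {
	bool bit_set;     /* stands in for the page flag bit being waited on */
	size_t waiters;   /* number of waiters currently queued */
};

/* Simulates what other CPUs do while the waker has dropped the queue lock. */
typedef void (*other_cpus_fn)(struct model_page *page);

/*
 * Hypothetical "other CPU" behaviour: somebody takes the bit again and a
 * fresh batch of waiters queues up behind it.
 */
void model_relock_and_requeue(struct model_page *page)
{
	page->bit_set = true;
	page->waiters += 64;
}

/*
 * Wake waiters in batches of 'batch', as a bookmark-style loop would.
 * Between batches the lock is (conceptually) dropped, so 'other_cpus' runs.
 * With 'bail_out' set, the waker stops as soon as it sees the bit set again.
 * 'max_rounds' only bounds the simulation so a livelock terminates here.
 * Returns the number of waiters woken.
 */
size_t model_wake_up(struct model_page *page, size_t batch, bool bail_out,
		     other_cpus_fn other_cpus, size_t max_rounds)
{
	size_t woken = 0;

	assert(batch > 0);

	for (size_t round = 0; round < max_rounds && page->waiters > 0; round++) {
		if (round > 0) {
			/* Lock dropped: other CPUs may retake the bit and queue up. */
			if (other_cpus)
				other_cpus(page);

			/* The proposed check: a new bit owner now does the waking. */
			if (bail_out && page->bit_set)
				break;
		}

		size_t n = page->waiters < batch ? page->waiters : batch;

		page->waiters -= n;
		woken += n;
	}

	return woken;
}
```

With 200 initial waiters, a batch of 64 and the requeue callback, the
model without the check keeps waking until max_rounds runs out and the
queue never drains, while the model with the check stops after the first
batch and leaves the rest to the new bit owner. With no callback at all,
both variants simply drain the queue.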