From: Linus Torvalds
Date: Sun, 26 Jul 2020 13:41:16 -0700
Subject: Re: [RFC PATCH] mm: silence soft lockups from unlock_page
To: Hugh Dickins
Cc: Oleg Nesterov, Michal Hocko, Linux-MM, LKML, Andrew Morton, Tim Chen, Michal Hocko
References: <20200723124749.GA7428@redhat.com> <20200724152424.GC17209@redhat.com> <20200725101445.GB3870@redhat.com>

On Sun, Jul 26, 2020 at 1:30 PM Hugh Dickins wrote:
>
> I've deduced nothing useful from the logs, will have to leave that
> to others here with more experience of them. But my assumption now
> is that you have successfully removed one bottleneck, so the tests
> get somewhat further and now stick in the next bottleneck, whatever
> that may be. Which shows up as "failure", where the unlock_page()
> wake_up_page_bit() bottleneck had allowed the tests to proceed in
> a more serially sedate way.

Well, that's the very optimistic reading.

As the optimistic and happy person I am (hah!) I'm going to agree with
you, and plan on just merging that patch early in the next merge
window. It may fix a real bug in the current tree, but it's much too
late to apply right now, particularly with your somewhat ambiguous
results.

Oleg's theoretical race has probably never been seen, and while the
watchdog triggering is clearly a real bug, it's also extreme enough
not to really be a strong argument for merging this out-of-window.

> The xhci handle_cmd_completion list_del bugs (on an older version
> of the driver): weird, nothing to do with page wakeups, I'll just
> have to assume that it's some driver bug exposed by the greater
> stress allowed down, and let driver people investigate (if it
> still manifests) when we take in your improvements.

Do you have the bug-report, just to google against anybody else
reporting something similar?

> One nice thing from the comparison runs without your patches:
> watchdog panic did crash one of those with exactly the unlock_page()
> wake_up_page_bit() softlockup symptom we've been fighting, that did
> not appear with your patches. So although the sample size is much
> too small to justify a conclusion, it does tend towards confirming
> your changes.

You win some, you lose some.

But yes, I'll take that as a tentative success and that the approach
is valid.

Thanks,
        Linus