From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.1 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 73226C433E0 for ; Fri, 7 Aug 2020 19:08:14 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 311B12177B for ; Fri, 7 Aug 2020 19:08:14 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=linux-foundation.org header.i=@linux-foundation.org header.b="IgxCxB4u" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 311B12177B Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=linux-foundation.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id C02048D0003; Fri, 7 Aug 2020 15:08:13 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id BB1386B000D; Fri, 7 Aug 2020 15:08:13 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A7A438D0003; Fri, 7 Aug 2020 15:08:13 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0030.hostedemail.com [216.40.44.30]) by kanga.kvack.org (Postfix) with ESMTP id 8D8F66B000C for ; Fri, 7 Aug 2020 15:08:13 -0400 (EDT) Received: from smtpin23.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 23D44181AEF09 for ; Fri, 7 Aug 2020 19:08:13 +0000 (UTC) X-FDA: 77124708066.23.month86_2314e9d26fc2 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin23.hostedemail.com (Postfix) with ESMTP id EFB5F37608 for ; Fri, 7 Aug 2020 19:08:12 +0000 (UTC) X-HE-Tag: month86_2314e9d26fc2 X-Filterd-Recvd-Size: 6922 Received: from mail-lj1-f194.google.com (mail-lj1-f194.google.com [209.85.208.194]) by imf01.hostedemail.com (Postfix) with ESMTP for ; Fri, 7 Aug 2020 19:08:12 +0000 (UTC) Received: by mail-lj1-f194.google.com with SMTP id w14so3318012ljj.4 for ; Fri, 07 Aug 2020 12:08:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux-foundation.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=c8tGfyphxRtiXyEdcvoYman9XgjfDP3kct92h+829I4=; b=IgxCxB4uub7VQqj+eQvlcRIhDoXI2u63xnHa6xhV72pL6k6VdF2Riprov2SytE5Be5 PCTWDk4SeGRy98Ba2QdQ8bI/ekpBd9OazBCyPQGu8VFQp11MxKlxQX1S3uRYaNw0+dMs Sa2DcnatE4QSVfijxi+fBGnSJF78tPUoiZFBU= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=c8tGfyphxRtiXyEdcvoYman9XgjfDP3kct92h+829I4=; b=PHTXWkha0tqCZedHVxfNxYTYUzrU+MPSb2vPzHPipi5yymTWIbYUmv7yps0uz5cIPd 5ZtabkezvC80DPQvAqF6L60H4izt+irMLMqiom8Hgqe/4t07Zpa4hpuRP40Do2jymnob Dgk63orua3Z7ytIBZUU9Ki4r+hnWT6sNMrth+Rue2aIXyjbWDTDbY8wdT30CJjX5ynJ8 9c+Tb7xCI8ixXjamvAMbbyoK7mue4OZhL58r4YbksHHMG+6g9QObBnKSKe3964njFp9/ M1Pr/hTw07znouZy1JULcwJVmR9KeGWxA2R7uniGyhcsEUcWenPcs88ibu2OTyiXaOVv qsRw== X-Gm-Message-State: AOAM532o82WzN2KS9biGy5qVrD+CkZA2U+B9DswbjjI6iKJRqUGrUO8U xsaQqMzF9X/Q5fQQnynmtjpU0ZtXB70= X-Google-Smtp-Source: ABdhPJx9jec6PvxDmAtM44rFWJspWgnwapX1z1EcEnfpiQq/fjGe0gGtqv5bqvJMxKOLYXflYA+tgQ== X-Received: by 2002:a2e:9b99:: with SMTP id z25mr7315247lji.226.1596827290478; Fri, 07 Aug 2020 12:08:10 -0700 (PDT) Received: from mail-lj1-f181.google.com (mail-lj1-f181.google.com. [209.85.208.181]) by smtp.gmail.com with ESMTPSA id y9sm4144986ljm.89.2020.08.07.12.08.04 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Fri, 07 Aug 2020 12:08:05 -0700 (PDT) Received: by mail-lj1-f181.google.com with SMTP id w14so3317699ljj.4 for ; Fri, 07 Aug 2020 12:08:04 -0700 (PDT) X-Received: by 2002:a05:651c:503:: with SMTP id o3mr4808734ljp.312.1596827284312; Fri, 07 Aug 2020 12:08:04 -0700 (PDT) MIME-Version: 1.0 References: <20200724152424.GC17209@redhat.com> <20200725101445.GB3870@redhat.com> <20200806180024.GB17456@casper.infradead.org> In-Reply-To: From: Linus Torvalds Date: Fri, 7 Aug 2020 12:07:47 -0700 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [RFC PATCH] mm: silence soft lockups from unlock_page To: Hugh Dickins Cc: Matthew Wilcox , Oleg Nesterov , Michal Hocko , Linux-MM , LKML , Andrew Morton , Tim Chen , Michal Hocko , Greg KH Content-Type: text/plain; charset="UTF-8" X-Rspamd-Queue-Id: EFB5F37608 X-Spamd-Result: default: False [0.00 / 100.00] X-Rspamd-Server: rspam02 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Fri, Aug 7, 2020 at 11:41 AM Hugh Dickins wrote: > > + > + /* > + * If we hoped to pass PG_locked on to the next locker, but found > + * no exclusive taker, then we must clear it before dropping q->lock. > + * Otherwise unlock_page() can race trylock_page_bit_common(), and if > + * PageWaiters was already set from before, then cmpxchg sees no > + * difference to send it back to wake_up_page_bit(). > + * > + * We may have already dropped and retaken q->lock on the way here, but > + * I think this works out because new entries are always added at tail. > + */ > + if (exclude && !exclusive) > + clear_bit(bit_nr, &page->flags); > + > spin_unlock_irqrestore(&q->lock, flags); Yeah, I started thinking about this, and that's when I decided to just throw away the patch. Because now it clears the bit *after* it has woken people up, and that just made me go "Eww". You did add a comment about the whole "always added to the tail" thing, and the spinlock should serialize everything, so I guess it's ok (because the spinlock should serialize things that care), but it just feels wrong. I also started worrying about other people waiting on the page lock (ie we now have that whole io_uring case), and interaction with the PG_writeback case etc, and it just ended up feeling very messy. I think it was all fine - other cases won't hit that exclusive case at all - but I had a hard time wrapping my little mind around the exact handoff rules, and the cmpxchg behavior when other bits changed (eg PG_waiters), so I gave up. > My home testing was, effectively, on top of c6fe44d96fc1 (v5.8 plus > your two patches): I did not have in the io_uring changes from the > current tree. In glancing there, I noticed one new and one previous > instance of SetPageWaiters() *after* __add_wait_queue_entry_tail(): > are those correct? I don't think SetPageWaiters() has any ordering constraints with the wait queue. Nobody looks at the waitqueue unless they already got to the slowpath. The only ordering constraint is with the testing of the bit we're going to wait on. Because doing if (test_and_set_bit()) SetPageWaiters(page); is wrong - there's a race in there when somebody can clear the bit, and not see that there are waiters. And obviously that needs to be done inside the spinlock, but once you do that, the ordering of the actual wait-queue is entirely irrelevant. The spinlock will order _that_ part. The only thing the spinlock won't order and serialize is the PG_lock bit and making sure we get to the slowpath in the first place. So if you're talking about __wait_on_page_locked_async(), then I think that part is ok. Or am I missing something? Linus