From: Linus Torvalds
Date: Wed, 22 Jul 2020 11:29:20 -0700
Subject: Re: [RFC PATCH] mm: silence soft lockups from unlock_page
To: Michal Hocko
Cc: Linux-MM, LKML, Andrew Morton, Tim Chen, Michal Hocko
References: <20200721063258.17140-1-mhocko@kernel.org>

On Tue, Jul 21, 2020 at 8:33 AM Linus Torvalds wrote:
>
> More likely, it's actually *caused* by that commit 11a19c7b099f, and
> what might be happening is that other CPU's are just adding new
> waiters to the list *while* we're waking things up, because somebody
> else already got the page lock again.
>
> Humor me.. Does something like this work instead?

I went back and looked at this, because it bothered me. And I'm no longer convinced it can possibly make a difference.

Why? Because __wake_up_locked_key_bookmark() just calls __wake_up_common(), and that one checks the return value of the wakeup function:

        ret = curr->func(curr, mode, wake_flags, key);
        if (ret < 0)
                break;

and will not add the bookmark back to the list if this triggers.

And the wakeup function does that same "stop walking" thing:

        if (test_bit(key->bit_nr, &key->page->flags))
                return -1;

So if somebody else took the page lock, I think we should already have stopped walking the list.

Of course, the page wait-queue hash table is very small. It's only 256 entries. So maybe the list is basically all aliases for another page entirely that is being hammered by that load, and we're just unlucky.

Because the wakeup function only does that "stop walking" if the page key matched. So wait queue entries for another page that just hashes to the same bucket (or even the same page, but a different bit in the page) will confuse that logic.

Hmm.

I still can't see how you'd get so many entries (without re-adding them) that you'd hit the softlockup timer. So I wonder if maybe we really do hit the "aliasing with a really hot page that gets re-added in the page wait table" case, but it seems a bit contrived.

So I think that patch is still worth testing, but I'm not quite as hopeful about it as I was originally.

I do wonder if we should make that PAGE_WAIT_TABLE_SIZE be larger.
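For reference, the wakeup function and the hashed wait table being discussed look roughly like this in mm/filemap.c of that era (paraphrased sketches with my own comments, not the exact upstream source):

        static int wake_page_function(wait_queue_entry_t *wait, unsigned mode,
                                      int sync, void *arg)
        {
                struct wait_page_key *key = arg;
                struct wait_page_queue *wait_page
                        = container_of(wait, struct wait_page_queue, wait);

                /* Entry for a different page that merely hashed to the
                 * same bucket: don't wake it, but keep walking the list. */
                if (wait_page->page != key->page)
                        return 0;
                key->page_match = 1;

                /* Same page, but a waiter for a different page bit:
                 * keep walking. */
                if (wait_page->bit_nr != key->bit_nr)
                        return 0;

                /* The bit got taken again (e.g. somebody re-locked the
                 * page): tell __wake_up_common() to stop the walk. */
                if (test_bit(key->bit_nr, &key->page->flags))
                        return -1;

                return autoremove_wake_function(wait, mode, sync, arg);
        }

and the table it walks:

        #define PAGE_WAIT_TABLE_BITS 8
        #define PAGE_WAIT_TABLE_SIZE (1 << PAGE_WAIT_TABLE_BITS)
        static wait_queue_head_t page_wait_table[PAGE_WAIT_TABLE_SIZE] __cacheline_aligned;

        static wait_queue_head_t *page_waitqueue(struct page *page)
        {
                /* Every page in the system shares these 256 queues. */
                return &page_wait_table[hash_ptr(page, PAGE_WAIT_TABLE_BITS)];
        }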
256 entries seems potentially ridiculously small, and aliasing not only increases the waitqueue length, it also potentially causes more contention on the waitqueue spinlock (which is already likely seeing some false sharing on a cacheline basis due to the fairly dense array of waitqueue entries: wait_queue_head is intentionally fairly small and dense unless you have lots of spinlock debugging options enabled).

That hashed wait-queue size is an independent issue, though. But it might be part of "some loads can get into some really nasty behavior in corner cases".

             Linus
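For reference on that false-sharing point: with spinlock debugging off, a wait_queue_head is roughly just a spinlock plus a list head (on the order of 24 bytes on 64-bit), so several adjacent entries of that 256-entry table share each 64-byte cache line, since only the array as a whole is cacheline-aligned. A simplified sketch of the structure from <linux/wait.h>:

        /* Simplified: lockdep/debug fields omitted. */
        struct wait_queue_head {
                spinlock_t              lock;
                struct list_head        head;
        };
        typedef struct wait_queue_head wait_queue_head_t;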