From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.1 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0513AC433DF for ; Wed, 22 Jul 2020 22:11:07 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D1C0F20825 for ; Wed, 22 Jul 2020 22:11:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1595455866; bh=rrD1MaIcckKimmVcrd51dA0aWUpllFClYsv/eT1nyCw=; h=References:In-Reply-To:From:Date:Subject:To:Cc:List-ID:From; b=dDfo+AN3mf6bZHqEtf1czryYExxNQQ/+ut1QxnoQ3i6KYMqfCP+n13VwwR17QrJKf 8dYT77rUWL5+c1AzBMfWuliH/wNKSCXIdy5stH4WdR4CtewrDdO6X5RaHJX47kCKJt yA4hrCBESDabJd+rZ+bwFxirni5/gCDwPEryYbJU= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732998AbgGVWLF (ORCPT ); Wed, 22 Jul 2020 18:11:05 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39068 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726642AbgGVWLF (ORCPT ); Wed, 22 Jul 2020 18:11:05 -0400 Received: from mail-lf1-x141.google.com (mail-lf1-x141.google.com [IPv6:2a00:1450:4864:20::141]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8921DC0619DC for ; Wed, 22 Jul 2020 15:11:04 -0700 (PDT) Received: by mail-lf1-x141.google.com with SMTP id u12so2213426lff.2 for ; Wed, 22 Jul 2020 15:11:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux-foundation.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=mKcnks++BVJckGT27RdKRDX4KPgvtTbjBQBJPx6EVtE=; 
b=HfjxMSNXOBFs72NZQwzgLB+n1HTdUhb2cUmrKtKW+8eiwn4OoCtnYZdYaBLyuKrcQk UfOFr2okT+3ozHOJ2IfVHuWgMZEiY+vOc8DmYBkD/QLBRZQfVAs94sB2DMASW5ElGu2q P25unJrMLZmaAUjrKbNr8RMGKLunq02CX9d4s= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=mKcnks++BVJckGT27RdKRDX4KPgvtTbjBQBJPx6EVtE=; b=Jf+e5iFKL5zj4MqLUaAmMLorS76CiFnrJw1SUPZwlxkhaet9Pauj4FHrlFIRmKWACJ Q7ShBZTef0lSL7nlG+OTP29KolTlpx8A5BpYCPBCtVqcUpDDoL38M8IoRxi3RfXcFlmG kcmMfhNzDROIZEoGaYWlGOR3r//CJx91ZfEiVJEZ/kTh37hIq1D4vDC4WkOLDmJZKv8k nctOblvJdABifkUsSvFIYbdxUgqb1YVcsU6/d1K/AQcjE7bau/9S/fp1rAclnGNgjjcU q97WhBSfY+ebeNdMl/zkFJB+akSMGqnJ7grxyvE5k+uKhckJHNtVg05Uv6xHIKh+szV+ Cpqg== X-Gm-Message-State: AOAM530gmuoWAH5INaXlBvsYtA/ctE9TJei1xuIm8KwqATgB9n2e0oGZ HRE+EG8awONpc9FipgJB9lWaPtM+5K4= X-Google-Smtp-Source: ABdhPJxosU6FFBca4Dy+IoqpqG0yo77BQOCFrM+6ZRl7XhjF/jxsCAkYPKSmF/NQPqcgGZ2345Sivw== X-Received: by 2002:a05:6512:74b:: with SMTP id c11mr654342lfs.119.1595455862650; Wed, 22 Jul 2020 15:11:02 -0700 (PDT) Received: from mail-lf1-f53.google.com (mail-lf1-f53.google.com. 
[209.85.167.53]) by smtp.gmail.com with ESMTPSA id 2sm893935lfr.48.2020.07.22.15.11.01 for (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 22 Jul 2020 15:11:01 -0700 (PDT) Received: by mail-lf1-f53.google.com with SMTP id b30so2185137lfj.12 for ; Wed, 22 Jul 2020 15:11:01 -0700 (PDT) X-Received: by 2002:ac2:58d5:: with SMTP id u21mr673351lfo.31.1595455860915; Wed, 22 Jul 2020 15:11:00 -0700 (PDT) MIME-Version: 1.0 References: <20200721063258.17140-1-mhocko@kernel.org> In-Reply-To: From: Linus Torvalds Date: Wed, 22 Jul 2020 15:10:44 -0700 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: [RFC PATCH] mm: silence soft lockups from unlock_page To: Hugh Dickins Cc: Michal Hocko , Linux-MM , LKML , Andrew Morton , Tim Chen , Michal Hocko Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jul 22, 2020 at 2:29 PM Hugh Dickins wrote:
>
> -#define PAGE_WAIT_TABLE_BITS 8
> +#define PAGE_WAIT_TABLE_BITS 10

Well, that seems harmless even on small machines.

> +	bool first_time = true;
>  	bool thrashing = false;
>  	bool delayacct = false;
>  	unsigned long pflags;
> @@ -1134,7 +1135,12 @@ static inline int wait_on_page_bit_commo
>  		spin_lock_irq(&q->lock);
>
>  		if (likely(list_empty(&wait->entry))) {
> -			__add_wait_queue_entry_tail(q, wait);
> +			if (first_time) {
> +				__add_wait_queue_entry_tail(q, wait);
> +				first_time = false;
> +			} else {
> +				__add_wait_queue(q, wait);
> +			}
>  			SetPageWaiters(page);
>  		}

This seems very hacky.

And in fact, looking closer, I'd say that there are more serious
problems here.

Look at that WQ_FLAG_EXCLUSIVE thing: non-exclusive waits should
always go at the head (because they're not going to steal the bit,
they just want to know when it got cleared), and exclusive waits
should always go at the tail (because of fairness).

But that's not at all what we do.

Your patch adds even more confusion to this nasty area.
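A minimal userspace sketch of the ordering rule described above, not the kernel code: non-exclusive waiters are queued at the head, exclusive waiters at the tail, and a wakeup walks from the head, waking every non-exclusive waiter and at most one exclusive one. The names `struct waiter`, `struct wait_queue`, `wq_add()` and `wq_wake()` are hypothetical and exist only for this illustration; the `exclusive` field merely stands in for WQ_FLAG_EXCLUSIVE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; not the kernel's wait queue types. */
struct waiter {
	int id;
	bool exclusive;              /* stands in for WQ_FLAG_EXCLUSIVE */
	struct waiter *prev, *next;
};

struct wait_queue {
	struct waiter head;          /* sentinel of a circular doubly linked list */
};

static void wq_init(struct wait_queue *q)
{
	q->head.prev = q->head.next = &q->head;
}

static void insert_between(struct waiter *w, struct waiter *prev,
			   struct waiter *next)
{
	w->prev = prev;
	w->next = next;
	prev->next = w;
	next->prev = w;
}

/* Non-exclusive waiters go to the head, exclusive waiters to the tail. */
static void wq_add(struct wait_queue *q, struct waiter *w)
{
	assert(w != NULL);
	if (w->exclusive)
		insert_between(w, q->head.prev, &q->head);
	else
		insert_between(w, &q->head, q->head.next);
}

/*
 * Walk from the head, unlinking and "waking" each waiter by recording
 * its id in out[]. Stops after the first exclusive waiter.
 * Returns the number of waiters woken.
 */
static size_t wq_wake(struct wait_queue *q, int *out, size_t cap)
{
	size_t n = 0;
	struct waiter *w = q->head.next;

	while (w != &q->head && n < cap) {
		struct waiter *next = w->next;

		w->prev->next = w->next;
		w->next->prev = w->prev;
		w->prev = w->next = w;

		out[n++] = w->id;
		if (w->exclusive)
			break;
		w = next;
	}
	return n;
}
```

With this placement, the waiters that only want to know the bit got cleared are all seen before any waiter that will take the bit, and exclusive waiters are served in arrival order.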
And your third one:

> +		if (ret)
> +			woken++;
>
> -		if (bookmark && (++cnt > WAITQUEUE_WALK_BREAK_CNT) &&
> +		if (bookmark && (++cnt > WAITQUEUE_WALK_BREAK_CNT) && woken &&

I've got two reactions to this

 (a) we should not need a new "woken" variable, we should just set a
high bit of "cnt" and make WAITQUEUE_WALK_BREAK_CNT contain that high
bit

     (Tune "high bit" to whatever you want: it could be either the
_real_ high bit of the variable, or it could be something like "128",
which would mean that you'd break out after 128 non-waking entries).

 (b) Ugh, what hackery and magic behavior regardless

I'm really starting to hate that wait_on_page_bit_common() function.
See a few weeks ago how the function looks buggy to begin with

  https://lore.kernel.org/lkml/CAHk-=wjJA2Z3kUFb-5s=6+n0qbTs8ELqKFt9B3pH85a8fGD73w@mail.gmail.com/

and that never got resolved either (but probably never happens in practice).

              Linus
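One possible reading of suggestion (a), sketched in userspace C under stated assumptions: instead of a separate "woken" variable, a successful wake sets the top bit of "cnt", and the break threshold contains that same bit, so the comparison can only succeed once something has been woken and more than 64 entries have been scanned. `WALK_WOKEN_BIT`, `WALK_BREAK_CNT` and `walk_entries()` are hypothetical names, the value 64 is an assumption for the illustration, and the real walk's bookmark and end-of-list checks are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants for this sketch only. */
#define WALK_WOKEN_BIT  (UINT32_C(1) << 31)       /* "something was woken" flag */
#define WALK_BREAK_CNT  (WALK_WOKEN_BIT | 64u)    /* threshold contains the flag */

/*
 * Scan n entries; woke[i] says whether waking entry i succeeded.
 * Returns how many entries were scanned before the walk broke out,
 * or n if it never did.
 */
static size_t walk_entries(const bool *woke, size_t n)
{
	uint32_t cnt = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (woke[i])
			cnt |= WALK_WOKEN_BIT;   /* replaces "woken++" */

		/*
		 * Same effect as "(++cnt > 64) && woken": without the
		 * flag bit, cnt cannot exceed the threshold here.
		 */
		if (++cnt > WALK_BREAK_CNT)
			return i + 1;
	}
	return n;
}
```

In this sketch a walk over 200 entries that wakes nobody scans all 200, a wake on the first entry breaks out after 65 entries, and a single wake at index 100 breaks out after 101. Using a smaller flag such as 128, as the mail also suggests, would additionally bound the walk when nothing is woken.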