Date: Fri, 17 Jan 2020 16:29:47 +0100
From: Michal Hocko
To: David Hildenbrand
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
    Greg Kroah-Hartman, "Rafael J.
    Wysocki", Andrew Morton, Leonardo Bras, Nathan Lynch, Allison Randal,
    Nathan Fontenot, Thomas Gleixner, Dan Williams, Stephen Rothwell,
    Anshuman Khandual, lantianyu1986@gmail.com,
    linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH RFC v1] mm: is_mem_section_removable() overhaul
Message-ID: <20200117152947.GK19428@dhcp22.suse.cz>
References: <20200117105759.27905-1-david@redhat.com>
    <20200117113353.GT19428@dhcp22.suse.cz>
    <20200117145233.GB19428@dhcp22.suse.cz>
    <65606e2e-1cf7-de3b-10b1-33653cb41a52@redhat.com>
In-Reply-To: <65606e2e-1cf7-de3b-10b1-33653cb41a52@redhat.com>

On Fri 17-01-20 15:58:26, David Hildenbrand wrote:
> On 17.01.20 15:52, Michal Hocko wrote:
> > On Fri 17-01-20 14:08:06, David Hildenbrand wrote:
> >> On 17.01.20 12:33, Michal Hocko wrote:
> >>> On Fri 17-01-20 11:57:59, David Hildenbrand wrote:
> >>>> Let's refactor that code. We want to check if we can offline memory
> >>>> blocks. Add a new function is_mem_section_offlineable() for that and
> >>>> make it call is_mem_section_offlineable() for each contained section.
> >>>> Within is_mem_section_offlineable(), add some more sanity checks and
> >>>> directly bail out if the section contains holes or if it spans multiple
> >>>> zones.
> >>>
> >>> I didn't read the patch (yet) but I am wondering. If we want to touch
> >>> this code, can we simply always return true there? I mean whoever
> >>> depends on this check is racy and the failure can happen even after
> >>> the sysfs says good to go, right? The check is essentially as expensive
> >>> as calling the offlining code itself. So the only usecase I can think of
> >>> is a dumb driver to crawl over blocks and check which is removable and
> >>> try to hotremove it. But just trying to offline one block after another
> >>> is essentially going to achieve the same.
> >>
> >> Some thoughts:
> >>
> >> 1. It allows you to check if memory is likely to be offlineable without
> >> doing expensive locking and trying to isolate pages (meaning:
> >> zone->lock, mem_hotplug_lock. but also, calling drain_all_pages()
> >> when isolating)
> >>
> >> 2. There are use cases that want to identify a memory block/DIMM to
> >> unplug. One example is PPC DLPAR code (see this patch). Going over all
> >> memory block trying to offline them is an expensive operation.
> >>
> >> 3. powerpc-utils (https://github.com/ibm-power-utilities/powerpc-utils)
> >> makes use of /sys/.../removable to speed up the search AFAIK.
> >
> > Well, while I do see those points I am not really sure they are worth
> > having a broken (by-definition) interface.
>
> It's a pure speedup. And for that, the interface has been working
> perfectly fine for years?
>
> >
> >> 4. lsmem displays/groups by "removable".
> >
> > Is anybody really using that?
>
> Well at least I am using that when testing to identify which
> (ZONE_NORMAL!) block I can easily offline/re-online (e.g., to validate
> all the zone shrinking stuff I have been fixing)
>
> So there is at least one user ;)

Fair enough. But I would argue that there are better ways to do the same
solely for testing purposes than keeping subtly broken code to maintain.

> >
> >>> Or does anybody see any reasonable usecase that would break if we did
> >>> that unconditional behavior?
> >>
> >> If we would return always "true", then the whole reason the
> >> interface originally was introduced would be "broken" (meaning, less
> >> performant as you would try to offline any memory block).
> >
> > I would argue that the whole interface is broken ;). Not the first time
> > in the kernel development history and not the last time either. What I
> > am trying to say here is that unless there are _real_ usecases depending
> > on knowing that something surely is _not_ offlineable then I would just
> > try to drop the functionality while preserving the interface and see
> > what happens.
>
> I can see that, but I can perfectly well understand why - especially
> powerpc - wants a fast way to sense which blocks actually sense to try
> to online.
>
> The original patch correctly states
> "which sections of
> memory are likely to be removable before attempting the potentially
> expensive operation."
>
> It works as designed I would say.

Then I would just keep it crippled the same way it has been for years
without anybody noticing.
-- 
Michal Hocko
SUSE Labs