Date: Wed, 16 Oct 2019 13:43:21 +0200
From: Michal Hocko
To: David Hildenbrand
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	virtualization@lists.linux-foundation.org, Andrea Arcangeli,
	Andrew Morton, Juergen Gross, Pavel Tatashin, Alexander Duyck,
	Anthony Yznaga, Vlastimil Babka, Johannes Weiner, Oscar Salvador,
	Pingfan Liu, Qian Cai, Dan Williams, Mel Gorman, Mike Rapoport,
	Wei Yang, Alexander Potapenko, Anshuman Khandual, Jason Gunthorpe,
	Stephen Rothwell, Mauro Carvalho Chehab, Matthew Wilcox, Yu Zhao,
	Minchan Kim, Yang Shi, Ira Weiny, Andrey Ryabinin
Subject: Re: [PATCH RFC v3 6/9] mm: Allow to offline PageOffline() pages with a reference count of 0
Message-ID: <20191016114321.GX317@dhcp22.suse.cz>
References: <20190919142228.5483-1-david@redhat.com> <20190919142228.5483-7-david@redhat.com>
In-Reply-To: <20190919142228.5483-7-david@redhat.com>

On Thu 19-09-19 16:22:25, David Hildenbrand wrote:
> virtio-mem wants to allow to offline memory blocks of which some parts
> were unplugged, especially, to later offline and remove completely
> unplugged memory blocks. The important part is that PageOffline() has
> to remain set until the section is offline, so these pages will never
> get accessed (e.g., when dumping). The pages should not be handed
> back to the buddy (which would require clearing PageOffline() and
> result in issues if offlining fails and the pages are suddenly in the
> buddy).
>
> Let's use "PageOffline() + reference count = 0" as a sign to
> memory offlining code that these pages can simply be skipped when
> offlining, similar to free or HWPoison pages.
>
> Pass flags to test_pages_isolated(), similar as already done for
> has_unmovable_pages(). Use a new flag to indicate the
> requirement of memory offlining to skip over these special pages.
>
> In has_unmovable_pages(), make sure the pages won't be detected as
> movable. This is not strictly necessary, however makes e.g.,
> alloc_contig_range() stop early, trying to isolate such page blocks -
> compared to failing later when testing if all pages were isolated.
>
> Also, make sure that when a reference to a PageOffline() page is
> dropped, that the page will not be returned to the buddy.
>
> memory devices (like virtio-mem) that want to make use of this
> functionality have to make sure to synchronize against memory offlining,
> using the memory hotplug notifier.
>
> Alternative: Allow to offline with a reference count of 1
> and use some other sign in the struct page that offlining is permitted.

A few questions. I do not see the onlining code taking care of this
special case. What should happen when offline && online?
Should we allow try_remove_memory() to succeed with these pages?
Do we really have to hook into __put_page? Why do we even care about the
reference count of those pages? Wouldn't it be more consistent to
elevate the reference count (I guess this is what you suggest in the
last paragraph) and have the virtio driver return the page to the
buddy via a regular put_page()? This is also related to the above
question about physical memory removal.

[...]
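To make the two options concrete, here is a toy user-space model in C. It
is only a sketch: struct toy_page, toy_put_page() and
toy_driver_return_page() are names made up for illustration and are not
kernel APIs. The first function models the behaviour the patch
description asks for (dropping the last reference to an offline page does
not free it to the buddy); the second models the alternative, where the
driver holds a reference and gives the page back through the ordinary put
path after clearing the offline marker.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct page; NOT the kernel's definition. */
struct toy_page {
	int refcount;	/* models page_ref_count() */
	bool offline;	/* models PageOffline() */
	bool in_buddy;	/* true once handed back to the free lists */
};

/*
 * Hypothetical model of dropping one reference. An ordinary page is
 * freed when the count reaches zero; a page still marked offline is
 * kept out of the buddy, as the patch description requires.
 */
void toy_put_page(struct toy_page *page)
{
	assert(page->refcount > 0);
	if (--page->refcount > 0)
		return;			/* other references remain */
	if (page->offline)
		return;			/* special case: never freed while offline */
	page->in_buddy = true;		/* regular path: back to the allocator */
}

/*
 * Alternative discussed in the reply: the driver owns a reference
 * (count of 1) and returns the page via the regular put path, which
 * means the offline marker has to be cleared first.
 */
void toy_driver_return_page(struct toy_page *page)
{
	page->offline = false;
	toy_put_page(page);
}
```

In this model the special case lives entirely in toy_put_page(); with the
alternative, the put path needs no knowledge of offline pages at all,
which is the consistency argument made above.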
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index d5d7944954b3..fef74720d8b4 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -8221,6 +8221,15 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
>  		if (!page_ref_count(page)) {
>  			if (PageBuddy(page))
>  				iter += (1 << page_order(page)) - 1;
> +			/*
> +			 * Memory devices allow to offline a page if it is
> +			 * marked PG_offline and has a reference count of 0.
> +			 * However, the pages are not movable as it would be
> +			 * required e.g., for alloc_contig_range().
> +			 */
> +			if (PageOffline(page) && !(flags & SKIP_OFFLINE))
> +				if (++found > count)
> +					goto unmovable;
>  			continue;
>  		}

Do we really need to distinguish offline and hwpoison pages? They are
both unmovable for allocator purposes and offlineable for the hotplug,
right? Should we just hide them behind a helper and use it rather than
an explicit SKIP_$FOO?
-- 
Michal Hocko
SUSE Labs
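
A rough sketch of what such a helper could look like, written as a toy
user-space model rather than kernel code. Everything here is hypothetical:
struct sketch_page, enum sketch_isolate_reason and both function names are
invented for illustration, and the rule encoded is only the one stated
above (hwpoison pages and offline pages with a reference count of 0 are
unmovable for the allocator but can be skipped by memory offlining).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct page; NOT the kernel's definition. */
struct sketch_page {
	int refcount;	/* models page_ref_count() */
	bool hwpoison;	/* models PageHWPoison() */
	bool offline;	/* models PageOffline() */
};

/* Why the caller is isolating the range (hypothetical enum). */
enum sketch_isolate_reason {
	SKETCH_ISOLATE_ALLOC,	/* e.g. an alloc_contig_range()-style user */
	SKETCH_ISOLATE_OFFLINE,	/* memory offlining */
};

/* Pages treated alike: hwpoison, or offline with a reference count of 0. */
bool sketch_page_is_special(const struct sketch_page *page)
{
	assert(page->refcount >= 0);
	return page->hwpoison || (page->offline && page->refcount == 0);
}

/*
 * One helper instead of per-type SKIP_* flags: a special page counts
 * as unmovable for allocator users and is skipped when offlining.
 * Non-special pages are outside the scope of this helper.
 */
bool sketch_special_page_is_unmovable(const struct sketch_page *page,
				      enum sketch_isolate_reason why)
{
	return sketch_page_is_special(page) && why == SKETCH_ISOLATE_ALLOC;
}
```

With a helper of this shape the callers would pass the reason for the
isolation once, and the page-type distinction would stay in one place.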