Subject: Re: [PATCH RFC v3 6/9] mm: Allow to offline PageOffline() pages with a reference count of 0
To: Michal Hocko
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 virtualization@lists.linux-foundation.org, Andrea Arcangeli,
 Andrew Morton, Juergen Gross, Pavel Tatashin, Alexander Duyck,
 Anthony Yznaga, Vlastimil Babka, Johannes Weiner, Oscar Salvador,
 Pingfan Liu, Qian Cai, Dan Williams, Mel Gorman, Mike Rapoport,
 Wei Yang, Alexander Potapenko, Anshuman Khandual, Jason Gunthorpe,
 Stephen Rothwell, Mauro Carvalho Chehab, Matthew Wilcox, Yu Zhao,
 Minchan Kim, Yang Shi, Ira Weiny, Andrey Ryabinin
References: <20190919142228.5483-1-david@redhat.com>
 <20190919142228.5483-7-david@redhat.com>
 <20191016114321.GX317@dhcp22.suse.cz>
From: David Hildenbrand
Organization: Red Hat GmbH
Message-ID: <36fef317-78e3-0500-43ba-f537f9a6fea4@redhat.com>
Date: Wed, 16 Oct 2019 15:45:06 +0200
In-Reply-To: <20191016114321.GX317@dhcp22.suse.cz>

On 16.10.19 13:43, Michal Hocko wrote:
> On Thu 19-09-19 16:22:25, David Hildenbrand wrote:
>> virtio-mem wants to allow to offline memory blocks of which some parts
>> were unplugged, especially, to later offline and remove completely
>> unplugged memory blocks. The important part is that PageOffline() has
>> to remain set until the section is offline, so these pages will never
>> get accessed (e.g., when dumping). The pages should not be handed
>> back to the buddy (which would require clearing PageOffline() and
>> result in issues if offlining fails and the pages are suddenly in the
>> buddy).
>>
>> Let's use "PageOffline() + reference count = 0" as a sign to
>> memory offlining code that these pages can simply be skipped when
>> offlining, similar to free or HWPoison pages.
>>
>> Pass flags to test_pages_isolated(), similar as already done for
>> has_unmovable_pages(). Use a new flag to indicate the
>> requirement of memory offlining to skip over these special pages.
>>
>> In has_unmovable_pages(), make sure the pages won't be detected as
>> movable. This is not strictly necessary, however makes e.g.,
>> alloc_contig_range() stop early, trying to isolate such page blocks -
>> compared to failing later when testing if all pages were isolated.
>>
>> Also, make sure that when a reference to a PageOffline() page is
>> dropped, that the page will not be returned to the buddy.
>>
>> memory devices (like virtio-mem) that want to make use of this
>> functionality have to make sure to synchronize against memory offlining,
>> using the memory hotplug notifier.
>>
>> Alternative: Allow to offline with a reference count of 1
>> and use some other sign in the struct page that offlining is permitted.
>
> Few questions. I do not see onlining code to take care of this special
> case. What should happen when offline && online?
> Should we allow to try_remove_memory to succeed with these pages?
> Do we really have hook into __put_page? Why do we even care about the
> reference count of those pages?

Oh, I forgot to answer these questions. The __put_page() change is
necessary for the following race I identified:

Page has a refcount of 1 (e.g., allocated by virtio-mem using
alloc_contig_range()).

a) kernel: get_page_unless_zero(page): refcount = 2
b) virtio-mem: set page PG_offline, reduce refcount: refcount = 1
c) kernel: put_page(page): refcount = 0

The page would suddenly be given to the buddy, which is bad.

-- 

Thanks,

David / dhildenb