From: Michal Hocko <mhocko@kernel.org>
To: David Hildenbrand <david@redhat.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	virtualization@lists.linux-foundation.org,
	Andrea Arcangeli <aarcange@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Juergen Gross <jgross@suse.com>,
	Pavel Tatashin <pavel.tatashin@microsoft.com>,
	Alexander Duyck <alexander.h.duyck@linux.intel.com>,
	Anthony Yznaga <anthony.yznaga@oracle.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Oscar Salvador <osalvador@suse.de>,
	Pingfan Liu <kernelfans@gmail.com>, Qian Cai <cai@lca.pw>,
	Dan Williams <dan.j.williams@intel.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	Mike Rapoport <rppt@linux.vnet.ibm.com>,
	Wei Yang <richardw.yang@linux.intel.com>,
	Alexander Potapenko <glider@google.com>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	Stephen Rothwell <sfr@canb.auug.org.au>,
	Mauro Carvalho Chehab <mchehab+samsung@kernel.org>,
	Matthew Wilcox <willy@infradead.org>, Yu Zhao <yuzhao@google.com>,
	Minchan Kim <minchan@kernel.org>,
	Yang Shi <yang.shi@linux.alibaba.com>,
	Ira Weiny <ira.weiny@intel.com>,
	Andrey Ryabinin <aryabinin@virtuozzo.com>
Subject: Re: [PATCH RFC v3 6/9] mm: Allow to offline PageOffline() pages with a reference count of 0
Date: Wed, 16 Oct 2019 16:03:50 +0200
Message-ID: <20191016140350.GD317@dhcp22.suse.cz>
In-Reply-To: <36fef317-78e3-0500-43ba-f537f9a6fea4@redhat.com>

On Wed 16-10-19 15:45:06, David Hildenbrand wrote:
> On 16.10.19 13:43, Michal Hocko wrote:
> > On Thu 19-09-19 16:22:25, David Hildenbrand wrote:
> > > virtio-mem wants to allow to offline memory blocks of which some parts
> > > were unplugged, especially, to later offline and remove completely
> > > unplugged memory blocks. The important part is that PageOffline() has
> > > to remain set until the section is offline, so these pages will never
> > > get accessed (e.g., when dumping). The pages should not be handed
> > > back to the buddy (which would require clearing PageOffline() and
> > > result in issues if offlining fails and the pages are suddenly in the
> > > buddy).
> > > 
> > > Let's use "PageOffline() + reference count = 0" as a sign to
> > > memory offlining code that these pages can simply be skipped when
> > > offlining, similar to free or HWPoison pages.
> > > 
> > > Pass flags to test_pages_isolated(), similar as already done for
> > > has_unmovable_pages(). Use a new flag to indicate the
> > > requirement of memory offlining to skip over these special pages.
> > > 
> > > In has_unmovable_pages(), make sure the pages won't be detected as
> > > movable. This is not strictly necessary, however makes e.g.,
> > > alloc_contig_range() stop early, trying to isolate such page blocks -
> > > compared to failing later when testing if all pages were isolated.
> > > 
> > > Also, make sure that when a reference to a PageOffline() page is
> > > dropped, that the page will not be returned to the buddy.
> > > 
> > > memory devices (like virtio-mem) that want to make use of this
> > > functionality have to make sure to synchronize against memory offlining,
> > > using the memory hotplug notifier.
> > > 
> > > Alternative: Allow to offline with a reference count of 1
> > > and use some other sign in the struct page that offlining is permitted.
> > 
> > A few questions. I do not see the onlining code taking care of this
> > special case. What should happen on offline && online?
> > Should we allow try_remove_memory to succeed with these pages?
> > Do we really have to hook into __put_page? Why do we even care about the
> > reference count of those pages?
> 
> Oh, I forgot to answer these questions. The __put_page() change is necessary
> for the following race I identified:
> 
> Page has a refcount of 1 (e.g., allocated by virtio-mem using
> alloc_contig_range()).
> 
> a) kernel: get_page_unless_zero(page): refcount = 2
> b) virtio-mem: set page PG_offline, reduce refcount: refcount = 1
> c) kernel: put_page(page): refcount = 0
> 
> The page would suddenly be given to the buddy, which is bad.
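
The interleaving quoted above can be replayed in a small user-space model.
This is only a sketch: struct toy_page and the toy_* helpers are
hypothetical stand-ins, not the kernel's struct page, get_page_unless_zero()
or put_page(), and the steps run sequentially in one thread instead of
racing. It just shows that, without a hook in the final put, dropping the
last reference frees a page that is still marked offline.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; not the kernel's definition. */
struct toy_page {
	int refcount;
	bool offline;	/* models PG_offline */
	bool in_buddy;	/* true once the page was handed to the allocator */
};

/* Models get_page_unless_zero(): take a reference only if one exists. */
static bool toy_get_page_unless_zero(struct toy_page *p)
{
	if (p->refcount == 0)
		return false;
	p->refcount++;
	return true;
}

/* Models put_page() without any special handling of offline pages. */
static void toy_put_page(struct toy_page *p)
{
	assert(p->refcount > 0);
	if (--p->refcount == 0)
		p->in_buddy = true;	/* last reference gone: page is freed */
}

/* Replays steps a) to c); returns true if an offline page got freed. */
static bool toy_replay_race(void)
{
	struct toy_page page = { .refcount = 1, .offline = false, .in_buddy = false };

	toy_get_page_unless_zero(&page);	/* a) kernel: refcount = 2 */
	page.offline = true;			/* b) virtio-mem: set PG_offline ... */
	page.refcount--;			/*    ... and reduce refcount: 1 */
	toy_put_page(&page);			/* c) kernel: refcount = 0 */

	return page.offline && page.in_buddy;
}
```

Calling toy_replay_race() returns true in this model, i.e. the page ends
up both offline and in the (modelled) buddy.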

But why can't you keep the reference count at 1 (do get_page when
offlining the page)? In other words, as long as the driver knows the page
has been returned to the host, it keeps the ref count at 1. Once the page
is returned to the guest for whatever reason, the driver can free it to
the system by clearing the offline state and calling put_page.

An elevated ref count could help to detect that the memory hotremove is
not safe until the driver removes all potential metadata it might still
hold. You also know that memory online should skip such a page.

All in all your page is still in use by the driver and the life cycle is
controlled by that driver.
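
A rough user-space sketch of that life cycle, under the same kind of
assumptions as any toy model: struct toy_page and all toy_* functions are
hypothetical names made up for illustration, not kernel or virtio-mem API,
and everything runs sequentially. The point is only that a transient
get/put pair from elsewhere in the kernel cannot drop the count to zero
while the driver keeps its own reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; not the kernel's definition. */
struct toy_page {
	int refcount;
	bool offline;	/* models PG_offline */
	bool in_buddy;	/* true once the page was handed to the allocator */
};

static bool toy_get_page_unless_zero(struct toy_page *p)
{
	if (p->refcount == 0)
		return false;
	p->refcount++;
	return true;
}

static void toy_put_page(struct toy_page *p)
{
	assert(p->refcount > 0);
	if (--p->refcount == 0)
		p->in_buddy = true;
}

/* Driver returns the page to the host: mark it, keep the reference. */
static void toy_driver_unplug(struct toy_page *p)
{
	p->offline = true;
}

/* Driver gets the page back: clear the offline state, then put. */
static void toy_driver_replug(struct toy_page *p)
{
	p->offline = false;
	toy_put_page(p);
}

/* A transient reference comes and goes while the page is offline. */
static bool toy_offline_page_stays_held(void)
{
	struct toy_page page = { .refcount = 1, .offline = false, .in_buddy = false };

	toy_get_page_unless_zero(&page);	/* transient user: 2 */
	toy_driver_unplug(&page);		/* offline, driver ref kept: 2 */
	toy_put_page(&page);			/* transient user gone: 1 */

	return page.offline && !page.in_buddy && page.refcount == 1;
}

/* Only the driver's final put frees the page, and it is not offline then. */
static bool toy_replug_frees_page(void)
{
	struct toy_page page = { .refcount = 1, .offline = false, .in_buddy = false };

	toy_driver_unplug(&page);
	toy_driver_replug(&page);

	return !page.offline && page.in_buddy && page.refcount == 0;
}
```

Both toy_offline_page_stays_held() and toy_replug_frees_page() return true
in this model. Whether memory offlining can cope with the elevated count is
a separate question the sketch does not address.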

Or am I missing something?
-- 
Michal Hocko
SUSE Labs
