From: David Hildenbrand <david@redhat.com>
To: Wei Yang <richard.weiyang@linux.alibaba.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	virtualization@lists.linux-foundation.org,
	Andrew Morton <akpm@linux-foundation.org>,
	"Michael S . Tsirkin" <mst@redhat.com>,
	Jason Wang <jasowang@redhat.com>,
	Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Subject: Re: [PATCH v1 05/29] virtio-mem: generalize check for added memory
Date: Fri, 16 Oct 2020 11:11:24 +0200
Message-ID: <5caec772-295c-436a-2b19-ca261ea1ad0c@redhat.com>
In-Reply-To: <20201016021651.GI86495@L-31X9LVDL-1304.local>

>> That's an interesting corner case. Assume you have a 128MB memory block
>> but only 64MB are plugged.
> 
> Since we just plug a part of memory block, this state is OFFLINE_PARTIAL
> first. But then we would add these memory and online it. This means the state
> of this memory block is ONLINE_PARTIAL.
> 
> When this state is changed to OFFLINE_PARTIAL again?

Please note that memory onlining is *completely* controllable by user
space. User space can offline/online memory blocks as it wants. I'm not
saying that doing so would be the right thing to do, but we cannot
trust user space to do the right thing.

So at any point in time, you have to assume that

a) added memory might not get onlined
b) previously onlined memory might get offlined
c) previously offline memory might get onlined

> 
>>
>> As long as we have our online_pages callback in place, we can hinder the
>> unplugged 64MB from getting exposed to the buddy
>> (virtio_mem_online_page_cb()). However, once we unloaded the driver,
> 
> Yes,
> 
> virtio_mem_set_fake_offline() would __SetPageOffline() to those pages.
> 
>> this is no longer the case. If someone would online that memory block,
>> we would expose unplugged memory to the buddy - very bad.
>>
> 
> Per my understanding, at this point of time, the memory block is at online
> state. Even part of it is set to *fake* offline.
> 
> So how could user trigger another online from sysfs interface?

Assume we added a partially plugged memory block, which is now offline.
Further assume user space did not online the memory block (e.g., no udev
rules).

User space could happily online the block after unloading the driver.
Again, we have to assume user space could do crazy things.
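
As a rough illustration of the scenario above, here is a toy user-space
model, not kernel code. All names (struct model_block,
model_online_block(), MODEL_PAGES_PER_BLOCK) are hypothetical and only
mimic the idea that the driver's online_pages callback keeps unplugged
pages away from the buddy while it is registered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy user-space model; none of these names are kernel APIs. */
#define MODEL_PAGES_PER_BLOCK 8 /* tiny on purpose, a real block has far more pages */

struct model_block {
	bool plugged[MODEL_PAGES_PER_BLOCK];  /* backed by the hypervisor */
	bool in_buddy[MODEL_PAGES_PER_BLOCK]; /* handed to the page allocator */
	bool online;
};

/* Build an offline block with only its first half plugged. */
static struct model_block model_half_plugged_block(void)
{
	struct model_block b = { .online = false };

	for (size_t i = 0; i < MODEL_PAGES_PER_BLOCK / 2; i++)
		b.plugged[i] = true;
	return b;
}

/*
 * Online the block, as user space may request at any time.
 * Returns how many unplugged pages ended up in the buddy.
 */
static size_t model_online_block(struct model_block *b, bool driver_cb_registered)
{
	size_t exposed_unplugged = 0;

	assert(!b->online); /* only offline blocks can be onlined */
	for (size_t i = 0; i < MODEL_PAGES_PER_BLOCK; i++) {
		if (!b->plugged[i] && driver_cb_registered)
			continue; /* the driver's callback keeps this page back */
		b->in_buddy[i] = true;
		if (!b->plugged[i])
			exposed_unplugged++;
	}
	b->online = true;
	return exposed_unplugged;
}
```

With the callback registered, onlining the half-plugged block exposes no
unplugged pages. Without it (driver unloaded), every unplugged page of
the block lands in the buddy, which is the "very bad" case.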

> 
>> So we have to remove these partially plugged, offline memory blocks when
>> losing control over them.
>>
>> I tried to document that via:
>>
>> "After we unregistered our callbacks, user space can online partially
>> plugged offline blocks. Make sure to remove them."
>>
>>>
>>> Also, during virtio_mem_remove(), we just handle OFFLINE_PARTIAL memory block.
>>> How about memory block in other states? It is not necessary to remove
>>> ONLINE[_PARTIAL] memory blocks?
>>
>> Blocks that are fully plugged (ONLINE or OFFLINE) can get
>> onlined/offlined without us having to care. Works fine - we only have to
>> care about partially plugged blocks.
>>
>> While we *could* unplug OFFLINE blocks, there is no way we can
>> deterministically offline+remove ONLINE blocks. So that memory has to
>> stay, even after we unloaded the driver (similar to the dax/kmem driver).
> 
> For OFFLINE memory blocks, would that leave the situation:
> 
> Guest doesn't need those pages, while host still maps them?

Yes, but the guest could online the memory and make use of it.

(again, whoever decides to unload the driver had better know what they
are doing)

To do it even more cleanly, we would

a) Have to remove completely plugged offline blocks (not done)
b) Have to remove partially plugged offline blocks (done)
c) Actually send unplug requests to the hypervisor

Right now, only b) is done, because leaving such blocks around might
actually cause harm (as discussed). The problem, however, is that c)
might actually fail.

Long story short: we could add a) if it turns out to be a real issue.
But then, unloading the driver isn't really recommended; the current
implementation just "keeps it working without crashes" - and I guess
that's good enough for now.
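
The per-state handling on driver unload described above can be
summarized in a small sketch. This is a user-space model under the
assumptions of this thread, not the driver's code: the enum and
model_remove_on_unload() are hypothetical names that merely mirror the
OFFLINE, OFFLINE_PARTIAL, ONLINE and ONLINE_PARTIAL states discussed
here.

```c
#include <stdbool.h>

/* Hypothetical mirror of the memory block states named in this thread. */
enum model_mb_state {
	MODEL_MB_OFFLINE,         /* fully plugged, offline */
	MODEL_MB_OFFLINE_PARTIAL, /* partially plugged, offline */
	MODEL_MB_ONLINE,          /* fully plugged, online */
	MODEL_MB_ONLINE_PARTIAL,  /* partially plugged, online */
};

/* Does the unload path, as described in this mail, remove the block? */
static bool model_remove_on_unload(enum model_mb_state state)
{
	switch (state) {
	case MODEL_MB_OFFLINE_PARTIAL:
		/* b) done: user space could otherwise online unplugged memory */
		return true;
	case MODEL_MB_OFFLINE:
		/* a) not done: harmless, the guest may still online and use it */
		return false;
	case MODEL_MB_ONLINE:
		/* cannot be offlined+removed deterministically, so it stays */
		return false;
	case MODEL_MB_ONLINE_PARTIAL:
		/* can no longer get offlined, so it stays as well */
		return false;
	}
	return false;
}
```

Only the partially plugged, offline state is treated as dangerous
enough to act on; everything else is left in place.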

> 
>>
>> ONLINE_PARTIAL is already taken care of: it cannot get offlined anymore,
>> as we still hold references to these struct pages
>> (virtio_mem_set_fake_offline()), and as we no longer have the memory
>> notifier in place, we can no longer agree to offline this memory (when
>> going_offline).
>>
> 
> Ok, I seem to understand the logic now.
> 
> But how we prevent ONLINE_PARTIAL memory block get offlined? There are three
> calls in virtio_mem_set_fake_offline(), while all of them adjust page's flag.
> How they hold reference to struct page?

Sorry, I should have given you the right pointer. (similar to my other
reply)

We hold a reference either via

1. alloc_contig_range()
2. memmap init code, when not calling generic_online_page().

So these fake-offline pages can never be actually offlined, because we
no longer have the memory notifier registered to fix that up.
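
The reasoning can be condensed into a tiny model. This is a simplified
sketch and an assumption about how to picture it, not the actual
offlining code: model_can_offline() is a hypothetical helper, and the
only rule it encodes is that a block can be offlined once nobody holds
references to its fake-offline pages, which only the (registered)
memory notifier can arrange.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model: can a block with 'fake_offline_pages' referenced,
 * fake-offline pages be offlined?
 */
static bool model_can_offline(size_t fake_offline_pages, bool notifier_registered)
{
	size_t still_referenced = fake_offline_pages;

	/*
	 * While registered, the memory notifier can fix things up when the
	 * block is going offline; model that as dropping all references.
	 */
	if (notifier_registered)
		still_referenced = 0;

	return still_referenced == 0;
}
```

Under this model, an ONLINE_PARTIAL block (at least one fake-offline
page) is stuck online as soon as the notifier is gone, while a fully
plugged ONLINE block (no fake-offline pages) is unaffected.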

-- 
Thanks,

David / dhildenb
