linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: ebiederm@xmission.com (Eric W. Biederman)
To: James Morse <james.morse@arm.com>
Cc: kexec@lists.infradead.org,  linux-mm@kvack.org,
	 linux-arm-kernel@lists.infradead.org,
	 Anshuman Khandual <anshuman.khandual@arm.com>,
	 Catalin Marinas <catalin.marinas@arm.com>,
	 Bhupesh Sharma <bhsharma@redhat.com>,
	 Andrew Morton <akpm@linux-foundation.org>,
	 Will Deacon <will@kernel.org>
Subject: Re: [PATCH 0/3] kexec/memory_hotplug: Prevent removal and accidental use
Date: Wed, 22 Apr 2020 08:04:21 -0500	[thread overview]
Message-ID: <87a7333c4q.fsf@x220.int.ebiederm.org> (raw)
In-Reply-To: <59b74cc5-89aa-83fa-5532-8e64d6382fdd@arm.com> (James Morse's message of "Wed, 22 Apr 2020 13:14:10 +0100")

James Morse <james.morse@arm.com> writes:

> Hi Eric,
>
> On 15/04/2020 21:29, Eric W. Biederman wrote:
>> James Morse <james.morse@arm.com> writes:
>> 
>>> Hello!
>>>
>>> arm64 recently queued support for memory hotremove, which led to some
>>> new corner cases for kexec.
>>>
>>> If the kexec segments are loaded for a removable region, that region may
>>> be removed before kexec actually occurs. This causes the first kernel to
>>> lockup when applying the relocations. (I've triggered this on x86 too).
>>>
>>> The first patch adds a memory notifier for kexec so that it can refuse
>>> to allow in-use regions to be taken offline.
>>>
>>>
>>> This doesn't solve the problem for arm64, where the new kernel must
>>> initially rely on the data structures from the first boot to describe
>>> memory. These don't describe hotpluggable memory.
>>> If kexec places the kernel in one of these regions, it must also provide
>>> a DT that describes the region in which the kernel was mapped as memory.
>>> (and somehow ensure its always present in the future...)
>>>
>>> To prevent this from happening accidentally with unaware user-space,
>>> patches two and three allow arm64 to give these regions a different
>>> name.
>>>
>>> This is a change in behaviour for arm64 as memory hotadd and hotremove
>>> were added separately.
>>>
>>>
>>> I haven't tried kdump.
>>> Unaware kdump from user-space probably won't describe the hotplug
>>> regions if the name is different, which saves us from problems if
>>> the memory is no longer present at kdump time, but means the vmcore
>>> is incomplete.
>>>
>>>
>>> These patches are based on arm64's for-next/core branch, but can all
>>> be merged independently.
>> 
>> So I just looked through these quickly and I think there are real
>> problems here we can fix, and that are worth fixing.
>> 
>> However I am not thrilled with the fixes you propose.
>
> Sure. Unfortunately /proc/iomem is the only trick arm64 has to keep the existing
> kexec-tools working.
> (We've had 'unthrilling' patches like this before to prevent user-space from loading the
> kernel over the top of the in-memory firmware tables.)
>
> arm64 expects the description of memory to come from firmware, be that UEFI for memory
> present at boot, or the ACPI AML methods for memory that was added
> later.
>
> On arm64 there is no standard location for memory. The kernel has to be handed a pointer
> to the firmware tables that describe it. The kernel expects to boot from memory that was
> present at boot.

What do you do when the firmware is wrong?  Does arm64 support the
mem=xxx@yyy kernel command line options?

If you want to handle the general case of memory hotplug having a
limitation that you have to boot from memory that was present at boot is
a bug, because the memory might not be there.

> Modifying the firmware tables at runtime doesn't solve the problem as we may need to move
> the firmware-reserved memory region that describes memory. User-space may still load and
> kexec either side of that update.
>
> Even if we could modify the structures at runtime, we can't update a loaded kexec image.
> We have no idea which blob from userspace is the DT. It may not even be linux that has
> been loaded.

What can be done and very reasonably so is on memory hotplug:
- Unloaded any loaded kexec image.
- Block loading any new image until the hotplug operation completes.

That is simple and generic, and can be done for all architectures.

This doesn't apply to kexec on panic kernel because it fundamentally
needs to figure out how to limp along (or reliably stop) when it has the
wrong memory map.

> We can't emulate parts of UEFI's handover because kexec's purgatory
> isn't an EFI program.

Plus much of EFI is unusable after ExitBootServices is called.

> I can't see a path through all this. If we have to modify existing user-space, I'd rather
> leave it broken. We can detect the problem in the arch code and print a warning at load time.

The weirdest thing to me in all of this is that you have been wanting to
handle memory hotplug.  But you don't want to change or deal with the
memory map changing when hotplug occurs.  The memory map changing is
fundamentally memory hotplug does.

So I think it is fundamental to figure out how to pass the updated
memory map.  Either through command line mem=xxx@yyy command line
options or through another option.

If you really want to keep the limitation that you have to have the
kernel in the initial memory map you can compare that map to the
efi tables when selecting the load address.

Expecting userspace to reload the loaded kernel after memory hotplug is
completely reasonable.

Unless I am mistaken memory hotplug is expected to be a rare event not
something that happens every day, certainly not something that happens
every minute.

Eric




  reply	other threads:[~2020-04-22 13:07 UTC|newest]

Thread overview: 92+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-26 18:07 [PATCH 0/3] kexec/memory_hotplug: Prevent removal and accidental use James Morse
2020-03-26 18:07 ` [PATCH 1/3] kexec: Prevent removal of memory in use by a loaded kexec image James Morse
2020-03-27  0:43   ` Anshuman Khandual
2020-03-27  2:54     ` Baoquan He
2020-03-27 15:46     ` James Morse
2020-03-27  2:34   ` Baoquan He
2020-03-27  9:30   ` David Hildenbrand
2020-03-27 16:56     ` James Morse
2020-03-27 17:06       ` David Hildenbrand
2020-03-27 18:07         ` James Morse
2020-03-27 18:52           ` David Hildenbrand
2020-03-30 13:00             ` James Morse
2020-03-30 13:13               ` David Hildenbrand
2020-03-30 17:17                 ` James Morse
2020-03-30 18:14                   ` David Hildenbrand
2020-04-10 19:10                     ` Andrew Morton
2020-04-11  3:44                       ` Baoquan He
2020-04-11  9:30                         ` Russell King - ARM Linux admin
2020-04-11  9:58                           ` David Hildenbrand
2020-04-12  5:35                           ` Baoquan He
2020-04-12  8:08                             ` Russell King - ARM Linux admin
2020-04-12 19:52                               ` Eric W. Biederman
2020-04-12 20:37                                 ` Bhupesh SHARMA
2020-04-13  2:37                                 ` Baoquan He
2020-04-13 13:15                                   ` Eric W. Biederman
2020-04-13 23:01                                     ` Andrew Morton
2020-04-14  6:13                                       ` Eric W. Biederman
2020-04-14  6:40                                     ` Baoquan He
2020-04-14  6:51                                       ` Baoquan He
2020-04-14  8:00                                       ` David Hildenbrand
2020-04-14  9:22                                         ` Baoquan He
2020-04-14  9:37                                           ` David Hildenbrand
2020-04-14 14:39                                             ` Baoquan He
2020-04-14 14:49                                               ` David Hildenbrand
2020-04-15  2:35                                                 ` Baoquan He
2020-04-16 13:31                                                   ` David Hildenbrand
2020-04-16 14:02                                                     ` Baoquan He
2020-04-16 14:09                                                       ` David Hildenbrand
2020-04-16 14:36                                                         ` Baoquan He
2020-04-16 14:47                                                           ` David Hildenbrand
2020-04-21 13:29                                                             ` David Hildenbrand
2020-04-21 13:57                                                               ` David Hildenbrand
2020-04-21 13:59                                                               ` Eric W. Biederman
2020-04-21 14:30                                                                 ` David Hildenbrand
2020-04-22  9:17                                                               ` Baoquan He
2020-04-22  9:24                                                                 ` David Hildenbrand
2020-04-22  9:57                                                                   ` Baoquan He
2020-04-22 10:05                                                                     ` David Hildenbrand
2020-04-22 10:36                                                                       ` Baoquan He
2020-04-14  9:16                                     ` Dave Young
2020-04-14  9:38                                       ` Dave Young
2020-04-14  7:05                       ` David Hildenbrand
2020-04-14 16:55                         ` James Morse
2020-04-14 17:41                           ` David Hildenbrand
2020-04-15 20:33   ` Eric W. Biederman
2020-04-22 12:28     ` James Morse
2020-04-22 15:25       ` Eric W. Biederman
2020-04-22 16:40         ` David Hildenbrand
2020-04-23 16:29           ` Eric W. Biederman
2020-04-24  7:39             ` David Hildenbrand
2020-04-24  7:41               ` David Hildenbrand
2020-05-01 16:55           ` James Morse
2020-03-26 18:07 ` [PATCH 2/3] mm/memory_hotplug: Allow arch override of non boot memory resource names James Morse
2020-03-27  9:59   ` David Hildenbrand
2020-03-27 15:39     ` James Morse
2020-03-30 13:23       ` David Hildenbrand
2020-03-30 17:17         ` James Morse
2020-04-02  5:49   ` Dave Young
2020-04-02  6:12     ` piliu
2020-04-14 17:21       ` James Morse
2020-04-15 20:36   ` Eric W. Biederman
2020-04-22 12:14     ` James Morse
2020-05-09  0:45   ` Andrew Morton
2020-05-11  8:35     ` David Hildenbrand
2020-03-26 18:07 ` [PATCH 3/3] arm64: memory: Give hotplug memory a different resource name James Morse
2020-03-30 19:01   ` David Hildenbrand
2020-04-15 20:37   ` Eric W. Biederman
2020-04-22 12:14     ` James Morse
2020-03-27  2:11 ` [PATCH 0/3] kexec/memory_hotplug: Prevent removal and accidental use Baoquan He
2020-03-27 15:40   ` James Morse
2020-03-27  9:27 ` David Hildenbrand
2020-03-27 15:42   ` James Morse
2020-03-30 13:18     ` David Hildenbrand
2020-03-30 13:55 ` Baoquan He
2020-03-30 17:17   ` James Morse
2020-03-31  3:46     ` Dave Young
2020-04-14 17:31       ` James Morse
2020-03-31  3:38 ` Dave Young
2020-04-15 20:29 ` Eric W. Biederman
2020-04-22 12:14   ` James Morse
2020-04-22 13:04     ` Eric W. Biederman [this message]
2020-04-22 15:40       ` James Morse

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87a7333c4q.fsf@x220.int.ebiederm.org \
    --to=ebiederm@xmission.com \
    --cc=akpm@linux-foundation.org \
    --cc=anshuman.khandual@arm.com \
    --cc=bhsharma@redhat.com \
    --cc=catalin.marinas@arm.com \
    --cc=james.morse@arm.com \
    --cc=kexec@lists.infradead.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-mm@kvack.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).