From: James Morse <james.morse@arm.com>
To: David Hildenbrand <david@redhat.com>
Cc: kexec@lists.infradead.org, linux-mm@kvack.org,
linux-arm-kernel@lists.infradead.org,
Eric Biederman <ebiederm@xmission.com>,
Andrew Morton <akpm@linux-foundation.org>,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>,
Anshuman Khandual <anshuman.khandual@arm.com>,
Bhupesh Sharma <bhsharma@redhat.com>
Subject: Re: [PATCH 1/3] kexec: Prevent removal of memory in use by a loaded kexec image
Date: Mon, 30 Mar 2020 14:00:47 +0100 [thread overview]
Message-ID: <34274b02-60ba-eb78-eacd-6dc1146ed3cd@arm.com> (raw)
In-Reply-To: <72672e2c-a57a-8df9-0cff-8035cbce7740@redhat.com>
Hi David,
On 3/27/20 6:52 PM, David Hildenbrand wrote:
>>> 2. You do the kexec. The kexec kernel will only operate on a reserved
>>> memory region (reserved via e.g., kernel cmdline crashkernel=128M).
>>
>> I think you are merging the kexec and kdump behaviours.
>> (Wrong terminology? The things behind 'kexec -l Image' and 'kexec -p Image')
>
> Oh, I see - I think your example below clarifies things. Something like
> that should go in the cover letter if we end up in this patch being
> required :)
Do you mean the commit message? I think its far too long...
Adding a sentence about the way kexec load works may help, the first paragraph
would read:
| Kexec allows user-space to specify the address that the kexec image should be
| loaded to. Because this memory may be in use, an image loaded for kexec is not
| stored in place, instead its segments are scattered through memory, and are
| re-assembled when needed. In the meantime, the target memory may have been
| removed.
Do you think thats clearer?
> (I missed that the problematic part is "random" addresses passed by user
> space to the kernel, where it wants data to be loaded to on kexec -e)
[...]
>> Load kexec:
>> | root@vm:/sys/devices/system/memory# kexec -l /root/bzImage --reuse-cmdline
>>
>
> I assume this will trigger
>
> kexec_load -> do_kexec_load -> kimage_load_segment ->
> kimage_load_normal_segment -> kimage_alloc_page -> kimage_alloc_pages
>
> Which will just allocate a bunch of pages and mark them reserved.
>
> Now, AFAIKs, all allocations will be unmovable. So none of the kexec
> segment allocations will actually end up on your DIMM (as it is onlined
> online_movable).
>
> So, the loaded image (with its segments) from user won't be problematic
> and not get placed on your DIMM.
>
>
> Now, the problematic part is (via man kexec_load) "mem and memsz specify
> a physical address range that is the target of the copy."
>
> So the place where the image will be "assembled" at when doing the
> reboot. Understood :)
Yup.
[...]
> I wonder if we should instead make the "kexec -e" fail. It tries to
> touch random system memory.
Heh, isn't touching random system memory what kexec does?!
Its all described to user-space as 'System RAM'. Teaching it to probe
/sys/devices/memory/... would require a user-space change.
> Denying to offline MOVABLE memory should be avoided - and what kexec
> does here sounds dangerous to me (allowing it to write random system
> memory).
> Roughly what I am thinking is this:
>
> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> index ba1d91e868ca..70c39a5307e5 100644
> --- a/kernel/kexec_core.c
> +++ b/kernel/kexec_core.c
> @@ -1135,6 +1135,10 @@ int kernel_kexec(void)
> error = -EINVAL;
> goto Unlock;
> }
> + if (!kexec_image_validate()) {
> + error = -EINVAL;
> + goto Unlock;
> + }
>
> #ifdef CONFIG_KEXEC_JUMP
> if (kexec_image->preserve_context) {
>
>
> kexec_image_validate() would go over all segments and validate that the
> involved pages are actual valid memory (pfn_to_online_page()).
>
> All we have to do is protect from memory hotplug until we switch to the
> new kernel.
(migrate_to_reboot_cpu() can sleep), I think you'd end up with something like
this patch, but only while kexec_in_progress. I don't think letting kexec fail
if the events occur in a different order is good for user-space.
> Will probably need some thought. But it will actually also bail out when
> user space passes wrong physical memory addresses, instead of
> triple-faulting silently.
With this change, the reboot(LINUX_REBOOT_CMD_KEXEC), call would fail. This
thing doesn't usually return, so we're likely to trigger error-handling that has
never run before.
(Last time I debugged one of these, it turned out kexec had taken the network
interfaces down, meaning the nfsroot was no longer accessible)
How can user-space know whether kexec is going to succeed, or fail like this?
Any loaded kexec kernel could secretly be in this broken state.
Can user-space know what caused this to become unreliable? (without reading the
kernel source)
Given kexec can be unloaded by user-space, I think its better to prevent us
getting into the broken state, preferably giving the hint that kexec us using
that memory. The user can 'kexec -u', then retry removing the memory.
I think forbidding the memory-offline is simpler for user-space to deal with.
Thanks,
James
next prev parent reply other threads:[~2020-03-30 13:01 UTC|newest]
Thread overview: 92+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-03-26 18:07 [PATCH 0/3] kexec/memory_hotplug: Prevent removal and accidental use James Morse
2020-03-26 18:07 ` [PATCH 1/3] kexec: Prevent removal of memory in use by a loaded kexec image James Morse
2020-03-27 0:43 ` Anshuman Khandual
2020-03-27 2:54 ` Baoquan He
2020-03-27 15:46 ` James Morse
2020-03-27 2:34 ` Baoquan He
2020-03-27 9:30 ` David Hildenbrand
2020-03-27 16:56 ` James Morse
2020-03-27 17:06 ` David Hildenbrand
2020-03-27 18:07 ` James Morse
2020-03-27 18:52 ` David Hildenbrand
2020-03-30 13:00 ` James Morse [this message]
2020-03-30 13:13 ` David Hildenbrand
2020-03-30 17:17 ` James Morse
2020-03-30 18:14 ` David Hildenbrand
2020-04-10 19:10 ` Andrew Morton
2020-04-11 3:44 ` Baoquan He
2020-04-11 9:30 ` Russell King - ARM Linux admin
2020-04-11 9:58 ` David Hildenbrand
2020-04-12 5:35 ` Baoquan He
2020-04-12 8:08 ` Russell King - ARM Linux admin
2020-04-12 19:52 ` Eric W. Biederman
2020-04-12 20:37 ` Bhupesh SHARMA
2020-04-13 2:37 ` Baoquan He
2020-04-13 13:15 ` Eric W. Biederman
2020-04-13 23:01 ` Andrew Morton
2020-04-14 6:13 ` Eric W. Biederman
2020-04-14 6:40 ` Baoquan He
2020-04-14 6:51 ` Baoquan He
2020-04-14 8:00 ` David Hildenbrand
2020-04-14 9:22 ` Baoquan He
2020-04-14 9:37 ` David Hildenbrand
2020-04-14 14:39 ` Baoquan He
2020-04-14 14:49 ` David Hildenbrand
2020-04-15 2:35 ` Baoquan He
2020-04-16 13:31 ` David Hildenbrand
2020-04-16 14:02 ` Baoquan He
2020-04-16 14:09 ` David Hildenbrand
2020-04-16 14:36 ` Baoquan He
2020-04-16 14:47 ` David Hildenbrand
2020-04-21 13:29 ` David Hildenbrand
2020-04-21 13:57 ` David Hildenbrand
2020-04-21 13:59 ` Eric W. Biederman
2020-04-21 14:30 ` David Hildenbrand
2020-04-22 9:17 ` Baoquan He
2020-04-22 9:24 ` David Hildenbrand
2020-04-22 9:57 ` Baoquan He
2020-04-22 10:05 ` David Hildenbrand
2020-04-22 10:36 ` Baoquan He
2020-04-14 9:16 ` Dave Young
2020-04-14 9:38 ` Dave Young
2020-04-14 7:05 ` David Hildenbrand
2020-04-14 16:55 ` James Morse
2020-04-14 17:41 ` David Hildenbrand
2020-04-15 20:33 ` Eric W. Biederman
2020-04-22 12:28 ` James Morse
2020-04-22 15:25 ` Eric W. Biederman
2020-04-22 16:40 ` David Hildenbrand
2020-04-23 16:29 ` Eric W. Biederman
2020-04-24 7:39 ` David Hildenbrand
2020-04-24 7:41 ` David Hildenbrand
2020-05-01 16:55 ` James Morse
2020-03-26 18:07 ` [PATCH 2/3] mm/memory_hotplug: Allow arch override of non boot memory resource names James Morse
2020-03-27 9:59 ` David Hildenbrand
2020-03-27 15:39 ` James Morse
2020-03-30 13:23 ` David Hildenbrand
2020-03-30 17:17 ` James Morse
2020-04-02 5:49 ` Dave Young
2020-04-02 6:12 ` piliu
2020-04-14 17:21 ` James Morse
2020-04-15 20:36 ` Eric W. Biederman
2020-04-22 12:14 ` James Morse
2020-05-09 0:45 ` Andrew Morton
2020-05-11 8:35 ` David Hildenbrand
2020-03-26 18:07 ` [PATCH 3/3] arm64: memory: Give hotplug memory a different resource name James Morse
2020-03-30 19:01 ` David Hildenbrand
2020-04-15 20:37 ` Eric W. Biederman
2020-04-22 12:14 ` James Morse
2020-03-27 2:11 ` [PATCH 0/3] kexec/memory_hotplug: Prevent removal and accidental use Baoquan He
2020-03-27 15:40 ` James Morse
2020-03-27 9:27 ` David Hildenbrand
2020-03-27 15:42 ` James Morse
2020-03-30 13:18 ` David Hildenbrand
2020-03-30 13:55 ` Baoquan He
2020-03-30 17:17 ` James Morse
2020-03-31 3:46 ` Dave Young
2020-04-14 17:31 ` James Morse
2020-03-31 3:38 ` Dave Young
2020-04-15 20:29 ` Eric W. Biederman
2020-04-22 12:14 ` James Morse
2020-04-22 13:04 ` Eric W. Biederman
2020-04-22 15:40 ` James Morse
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=34274b02-60ba-eb78-eacd-6dc1146ed3cd@arm.com \
--to=james.morse@arm.com \
--cc=akpm@linux-foundation.org \
--cc=anshuman.khandual@arm.com \
--cc=bhsharma@redhat.com \
--cc=catalin.marinas@arm.com \
--cc=david@redhat.com \
--cc=ebiederm@xmission.com \
--cc=kexec@lists.infradead.org \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-mm@kvack.org \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).