linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: James Morse <james.morse@arm.com>
To: David Hildenbrand <david@redhat.com>
Cc: kexec@lists.infradead.org, linux-mm@kvack.org,
	linux-arm-kernel@lists.infradead.org,
	Eric Biederman <ebiederm@xmission.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	Bhupesh Sharma <bhsharma@redhat.com>
Subject: Re: [PATCH 1/3] kexec: Prevent removal of memory in use by a loaded kexec image
Date: Mon, 30 Mar 2020 14:00:47 +0100	[thread overview]
Message-ID: <34274b02-60ba-eb78-eacd-6dc1146ed3cd@arm.com> (raw)
In-Reply-To: <72672e2c-a57a-8df9-0cff-8035cbce7740@redhat.com>

Hi David,

On 3/27/20 6:52 PM, David Hildenbrand wrote:
>>> 2. You do the kexec. The kexec kernel will only operate on a reserved
>>> memory region (reserved via e.g., kernel cmdline crashkernel=128M).
>>
>> I think you are merging the kexec and kdump behaviours.
>> (Wrong terminology? The things behind 'kexec -l Image' and 'kexec -p Image')
> 
> Oh, I see - I think your example below clarifies things. Something like
> that should go in the cover letter if we end up in this patch being
> required :)

Do you mean the commit message? I think its far too long...

Adding a sentence about the way kexec load works may help, the first paragraph
would read:

| Kexec allows user-space to specify the address that the kexec image should be
| loaded to. Because this memory may be in use, an image loaded for kexec is not
| stored in place, instead its segments are scattered through memory, and are
| re-assembled when needed. In the meantime, the target memory may have been
| removed.

Do you think thats clearer?


> (I missed that the problematic part is "random" addresses passed by user
> space to the kernel, where it wants data to be loaded to on kexec -e)

[...]

>> Load kexec:
>> | root@vm:/sys/devices/system/memory# kexec -l /root/bzImage --reuse-cmdline
>>
> 
> I assume this will trigger
> 
> kexec_load -> do_kexec_load -> kimage_load_segment ->
> kimage_load_normal_segment -> kimage_alloc_page -> kimage_alloc_pages
> 
> Which will just allocate a bunch of pages and mark them reserved.
> 
> Now, AFAIKs, all allocations will be unmovable. So none of the kexec
> segment allocations will actually end up on your DIMM (as it is onlined
> online_movable).
> 
> So, the loaded image (with its segments) from user won't be problematic
> and not get placed on your DIMM.
> 
> 
> Now, the problematic part is (via man kexec_load) "mem and memsz specify
> a physical address range that is the target of the copy."
> 
> So the place where the image will be "assembled" at when doing the
> reboot. Understood :)

Yup.

[...]

> I wonder if we should instead make the "kexec -e" fail. It tries to
> touch random system memory.

Heh, isn't touching random system memory what kexec does?!

Its all described to user-space as 'System RAM'. Teaching it to probe
/sys/devices/memory/... would require a user-space change.


> Denying to offline MOVABLE memory should be avoided - and what kexec
> does here sounds dangerous to me (allowing it to write random system
> memory).

> Roughly what I am thinking is this:
> 
> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> index ba1d91e868ca..70c39a5307e5 100644
> --- a/kernel/kexec_core.c
> +++ b/kernel/kexec_core.c
> @@ -1135,6 +1135,10 @@ int kernel_kexec(void)
>                 error = -EINVAL;
>                 goto Unlock;
>         }
> +       if (!kexec_image_validate()) {
> +               error = -EINVAL;
> +               goto Unlock;
> +       }
> 
>  #ifdef CONFIG_KEXEC_JUMP
>         if (kexec_image->preserve_context) {
> 
> 
> kexec_image_validate() would go over all segments and validate that the
> involved pages are actual valid memory (pfn_to_online_page()).
> 
> All we have to do is protect from memory hotplug until we switch to the
> new kernel.

(migrate_to_reboot_cpu() can sleep), I think you'd end up with something like
this patch, but only while kexec_in_progress. I don't think letting kexec fail
if the events occur in a different order is good for user-space.


> Will probably need some thought. But it will actually also bail out when
> user space passes wrong physical memory addresses, instead of
> triple-faulting silently.

With this change, the reboot(LINUX_REBOOT_CMD_KEXEC), call would fail. This
thing doesn't usually return, so we're likely to trigger error-handling that has
never run before.

(Last time I debugged one of these, it turned out kexec had taken the network
interfaces down, meaning the nfsroot was no longer accessible)

How can user-space know whether kexec is going to succeed, or fail like this?
Any loaded kexec kernel could secretly be in this broken state.

Can user-space know what caused this to become unreliable? (without reading the
kernel source)


Given kexec can be unloaded by user-space, I think its better to prevent us
getting into the broken state, preferably giving the hint that kexec us using
that memory. The user can 'kexec -u', then retry removing the memory.

I think forbidding the memory-offline is simpler for user-space to deal with.


Thanks,

James


  reply	other threads:[~2020-03-30 13:01 UTC|newest]

Thread overview: 92+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-26 18:07 [PATCH 0/3] kexec/memory_hotplug: Prevent removal and accidental use James Morse
2020-03-26 18:07 ` [PATCH 1/3] kexec: Prevent removal of memory in use by a loaded kexec image James Morse
2020-03-27  0:43   ` Anshuman Khandual
2020-03-27  2:54     ` Baoquan He
2020-03-27 15:46     ` James Morse
2020-03-27  2:34   ` Baoquan He
2020-03-27  9:30   ` David Hildenbrand
2020-03-27 16:56     ` James Morse
2020-03-27 17:06       ` David Hildenbrand
2020-03-27 18:07         ` James Morse
2020-03-27 18:52           ` David Hildenbrand
2020-03-30 13:00             ` James Morse [this message]
2020-03-30 13:13               ` David Hildenbrand
2020-03-30 17:17                 ` James Morse
2020-03-30 18:14                   ` David Hildenbrand
2020-04-10 19:10                     ` Andrew Morton
2020-04-11  3:44                       ` Baoquan He
2020-04-11  9:30                         ` Russell King - ARM Linux admin
2020-04-11  9:58                           ` David Hildenbrand
2020-04-12  5:35                           ` Baoquan He
2020-04-12  8:08                             ` Russell King - ARM Linux admin
2020-04-12 19:52                               ` Eric W. Biederman
2020-04-12 20:37                                 ` Bhupesh SHARMA
2020-04-13  2:37                                 ` Baoquan He
2020-04-13 13:15                                   ` Eric W. Biederman
2020-04-13 23:01                                     ` Andrew Morton
2020-04-14  6:13                                       ` Eric W. Biederman
2020-04-14  6:40                                     ` Baoquan He
2020-04-14  6:51                                       ` Baoquan He
2020-04-14  8:00                                       ` David Hildenbrand
2020-04-14  9:22                                         ` Baoquan He
2020-04-14  9:37                                           ` David Hildenbrand
2020-04-14 14:39                                             ` Baoquan He
2020-04-14 14:49                                               ` David Hildenbrand
2020-04-15  2:35                                                 ` Baoquan He
2020-04-16 13:31                                                   ` David Hildenbrand
2020-04-16 14:02                                                     ` Baoquan He
2020-04-16 14:09                                                       ` David Hildenbrand
2020-04-16 14:36                                                         ` Baoquan He
2020-04-16 14:47                                                           ` David Hildenbrand
2020-04-21 13:29                                                             ` David Hildenbrand
2020-04-21 13:57                                                               ` David Hildenbrand
2020-04-21 13:59                                                               ` Eric W. Biederman
2020-04-21 14:30                                                                 ` David Hildenbrand
2020-04-22  9:17                                                               ` Baoquan He
2020-04-22  9:24                                                                 ` David Hildenbrand
2020-04-22  9:57                                                                   ` Baoquan He
2020-04-22 10:05                                                                     ` David Hildenbrand
2020-04-22 10:36                                                                       ` Baoquan He
2020-04-14  9:16                                     ` Dave Young
2020-04-14  9:38                                       ` Dave Young
2020-04-14  7:05                       ` David Hildenbrand
2020-04-14 16:55                         ` James Morse
2020-04-14 17:41                           ` David Hildenbrand
2020-04-15 20:33   ` Eric W. Biederman
2020-04-22 12:28     ` James Morse
2020-04-22 15:25       ` Eric W. Biederman
2020-04-22 16:40         ` David Hildenbrand
2020-04-23 16:29           ` Eric W. Biederman
2020-04-24  7:39             ` David Hildenbrand
2020-04-24  7:41               ` David Hildenbrand
2020-05-01 16:55           ` James Morse
2020-03-26 18:07 ` [PATCH 2/3] mm/memory_hotplug: Allow arch override of non boot memory resource names James Morse
2020-03-27  9:59   ` David Hildenbrand
2020-03-27 15:39     ` James Morse
2020-03-30 13:23       ` David Hildenbrand
2020-03-30 17:17         ` James Morse
2020-04-02  5:49   ` Dave Young
2020-04-02  6:12     ` piliu
2020-04-14 17:21       ` James Morse
2020-04-15 20:36   ` Eric W. Biederman
2020-04-22 12:14     ` James Morse
2020-05-09  0:45   ` Andrew Morton
2020-05-11  8:35     ` David Hildenbrand
2020-03-26 18:07 ` [PATCH 3/3] arm64: memory: Give hotplug memory a different resource name James Morse
2020-03-30 19:01   ` David Hildenbrand
2020-04-15 20:37   ` Eric W. Biederman
2020-04-22 12:14     ` James Morse
2020-03-27  2:11 ` [PATCH 0/3] kexec/memory_hotplug: Prevent removal and accidental use Baoquan He
2020-03-27 15:40   ` James Morse
2020-03-27  9:27 ` David Hildenbrand
2020-03-27 15:42   ` James Morse
2020-03-30 13:18     ` David Hildenbrand
2020-03-30 13:55 ` Baoquan He
2020-03-30 17:17   ` James Morse
2020-03-31  3:46     ` Dave Young
2020-04-14 17:31       ` James Morse
2020-03-31  3:38 ` Dave Young
2020-04-15 20:29 ` Eric W. Biederman
2020-04-22 12:14   ` James Morse
2020-04-22 13:04     ` Eric W. Biederman
2020-04-22 15:40       ` James Morse

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=34274b02-60ba-eb78-eacd-6dc1146ed3cd@arm.com \
    --to=james.morse@arm.com \
    --cc=akpm@linux-foundation.org \
    --cc=anshuman.khandual@arm.com \
    --cc=bhsharma@redhat.com \
    --cc=catalin.marinas@arm.com \
    --cc=david@redhat.com \
    --cc=ebiederm@xmission.com \
    --cc=kexec@lists.infradead.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-mm@kvack.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).