linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: James Morse <james.morse@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Bhupesh Sharma <bhsharma@redhat.com>,
	kexec@lists.infradead.org, linux-mm@kvack.org,
	Eric Biederman <ebiederm@xmission.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Will Deacon <will@kernel.org>,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH 1/3] kexec: Prevent removal of memory in use by a loaded kexec image
Date: Mon, 30 Mar 2020 15:13:28 +0200	[thread overview]
Message-ID: <80e4d1d7-f493-3f66-f700-86f18002d692@redhat.com> (raw)
In-Reply-To: <34274b02-60ba-eb78-eacd-6dc1146ed3cd@arm.com>

> Adding a sentence about the way kexec load works may help, the first paragraph
> would read:
> 
> | Kexec allows user-space to specify the address that the kexec image should be
> | loaded to. Because this memory may be in use, an image loaded for kexec is not
> | stored in place, instead its segments are scattered through memory, and are
> | re-assembled when needed. In the meantime, the target memory may have been
> | removed.
> 
> Do you think thats clearer?

Yes, very much. Maybe add, that the target is described by user space
during kexec_load() and that user space - right now - parses /proc/iomem
to find applicable system memory.

> [...]
> 
>>> Load kexec:
>>> | root@vm:/sys/devices/system/memory# kexec -l /root/bzImage --reuse-cmdline
>>>
>>
>> I assume this will trigger
>>
>> kexec_load -> do_kexec_load -> kimage_load_segment ->
>> kimage_load_normal_segment -> kimage_alloc_page -> kimage_alloc_pages
>>
>> Which will just allocate a bunch of pages and mark them reserved.
>>
>> Now, AFAIKs, all allocations will be unmovable. So none of the kexec
>> segment allocations will actually end up on your DIMM (as it is onlined
>> online_movable).
>>
>> So, the loaded image (with its segments) from user won't be problematic
>> and not get placed on your DIMM.
>>
>>
>> Now, the problematic part is (via man kexec_load) "mem and memsz specify
>> a physical address range that is the target of the copy."
>>
>> So the place where the image will be "assembled" at when doing the
>> reboot. Understood :)
> 
> Yup.
> 
> [...]
> 
>> I wonder if we should instead make the "kexec -e" fail. It tries to
>> touch random system memory.
> 
> Heh, isn't touching random system memory what kexec does?!

Having a racy user interface that can trigger kernel crashes feels very
wrong. We should limit the impact.

> 
> Its all described to user-space as 'System RAM'. Teaching it to probe
> /sys/devices/memory/... would require a user-space change.

I think we should really rename hotplugged memory on all architectures.

Especially also relevant for virtio-mem/hyper-v balloon, where some
pieces of (hotplugged )memory blocks are partially unavailable and
should not be touched - accessing them results in unpredictable behavior
(e.g., crashes or discarded writes).

[...]

>> Will probably need some thought. But it will actually also bail out when
>> user space passes wrong physical memory addresses, instead of
>> triple-faulting silently.
> 
> With this change, the reboot(LINUX_REBOOT_CMD_KEXEC), call would fail. This
> thing doesn't usually return, so we're likely to trigger error-handling that has
> never run before.
> 
> (Last time I debugged one of these, it turned out kexec had taken the network
> interfaces down, meaning the nfsroot was no longer accessible)
> 
> How can user-space know whether kexec is going to succeed, or fail like this?
> Any loaded kexec kernel could secretly be in this broken state.
> 
> Can user-space know what caused this to become unreliable? (without reading the
> kernel source)
> 
> 
> Given kexec can be unloaded by user-space, I think its better to prevent us
> getting into the broken state, preferably giving the hint that kexec us using
> that memory. The user can 'kexec -u', then retry removing the memory.
> 
> I think forbidding the memory-offline is simpler for user-space to deal with.

I thought about this over the weekend, and I don't think it's the right
approach.

1. It's racy. If memory is getting offlined/unplugged just while user
space is about to trigger the kexec_load(), you end up with the very
same triple-fault.

2. It's semantically wrong. kexec does not need online memory ("managed
by the buddy"), but still you disallow offlining memory.


I would really much rather want to see user-space choosing boot memory
(e.g., renaming hotplugged memory on all architectures), and checking
during "kexec -e" if the selected memory is actually "there", before
trying to write to it.

-- 
Thanks,

David / dhildenb


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2020-03-30 13:13 UTC|newest]

Thread overview: 92+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-26 18:07 [PATCH 0/3] kexec/memory_hotplug: Prevent removal and accidental use James Morse
2020-03-26 18:07 ` [PATCH 1/3] kexec: Prevent removal of memory in use by a loaded kexec image James Morse
2020-03-27  0:43   ` Anshuman Khandual
2020-03-27  2:54     ` Baoquan He
2020-03-27 15:46     ` James Morse
2020-03-27  2:34   ` Baoquan He
2020-03-27  9:30   ` David Hildenbrand
2020-03-27 16:56     ` James Morse
2020-03-27 17:06       ` David Hildenbrand
2020-03-27 18:07         ` James Morse
2020-03-27 18:52           ` David Hildenbrand
2020-03-30 13:00             ` James Morse
2020-03-30 13:13               ` David Hildenbrand [this message]
2020-03-30 17:17                 ` James Morse
2020-03-30 18:14                   ` David Hildenbrand
2020-04-10 19:10                     ` Andrew Morton
2020-04-11  3:44                       ` Baoquan He
2020-04-11  9:30                         ` Russell King - ARM Linux admin
2020-04-11  9:58                           ` David Hildenbrand
2020-04-12  5:35                           ` Baoquan He
2020-04-12  8:08                             ` Russell King - ARM Linux admin
2020-04-12 19:52                               ` Eric W. Biederman
2020-04-12 20:37                                 ` Bhupesh SHARMA
2020-04-13  2:37                                 ` Baoquan He
2020-04-13 13:15                                   ` Eric W. Biederman
2020-04-13 23:01                                     ` Andrew Morton
2020-04-14  6:13                                       ` Eric W. Biederman
2020-04-14  6:40                                     ` Baoquan He
2020-04-14  6:51                                       ` Baoquan He
2020-04-14  8:00                                       ` David Hildenbrand
2020-04-14  9:22                                         ` Baoquan He
2020-04-14  9:37                                           ` David Hildenbrand
2020-04-14 14:39                                             ` Baoquan He
2020-04-14 14:49                                               ` David Hildenbrand
2020-04-15  2:35                                                 ` Baoquan He
2020-04-16 13:31                                                   ` David Hildenbrand
2020-04-16 14:02                                                     ` Baoquan He
2020-04-16 14:09                                                       ` David Hildenbrand
2020-04-16 14:36                                                         ` Baoquan He
2020-04-16 14:47                                                           ` David Hildenbrand
2020-04-21 13:29                                                             ` David Hildenbrand
2020-04-21 13:57                                                               ` David Hildenbrand
2020-04-21 13:59                                                               ` Eric W. Biederman
2020-04-21 14:30                                                                 ` David Hildenbrand
2020-04-22  9:17                                                               ` Baoquan He
2020-04-22  9:24                                                                 ` David Hildenbrand
2020-04-22  9:57                                                                   ` Baoquan He
2020-04-22 10:05                                                                     ` David Hildenbrand
2020-04-22 10:36                                                                       ` Baoquan He
2020-04-14  9:16                                     ` Dave Young
2020-04-14  9:38                                       ` Dave Young
2020-04-14  7:05                       ` David Hildenbrand
2020-04-14 16:55                         ` James Morse
2020-04-14 17:41                           ` David Hildenbrand
2020-04-15 20:33   ` Eric W. Biederman
2020-04-22 12:28     ` James Morse
2020-04-22 15:25       ` Eric W. Biederman
2020-04-22 16:40         ` David Hildenbrand
2020-04-23 16:29           ` Eric W. Biederman
2020-04-24  7:39             ` David Hildenbrand
2020-04-24  7:41               ` David Hildenbrand
2020-05-01 16:55           ` James Morse
2020-03-26 18:07 ` [PATCH 2/3] mm/memory_hotplug: Allow arch override of non boot memory resource names James Morse
2020-03-27  9:59   ` David Hildenbrand
2020-03-27 15:39     ` James Morse
2020-03-30 13:23       ` David Hildenbrand
2020-03-30 17:17         ` James Morse
2020-04-02  5:49   ` Dave Young
2020-04-02  6:12     ` piliu
2020-04-14 17:21       ` James Morse
2020-04-15 20:36   ` Eric W. Biederman
2020-04-22 12:14     ` James Morse
2020-05-09  0:45   ` Andrew Morton
2020-05-11  8:35     ` David Hildenbrand
2020-03-26 18:07 ` [PATCH 3/3] arm64: memory: Give hotplug memory a different resource name James Morse
2020-03-30 19:01   ` David Hildenbrand
2020-04-15 20:37   ` Eric W. Biederman
2020-04-22 12:14     ` James Morse
2020-03-27  2:11 ` [PATCH 0/3] kexec/memory_hotplug: Prevent removal and accidental use Baoquan He
2020-03-27 15:40   ` James Morse
2020-03-27  9:27 ` David Hildenbrand
2020-03-27 15:42   ` James Morse
2020-03-30 13:18     ` David Hildenbrand
2020-03-30 13:55 ` Baoquan He
2020-03-30 17:17   ` James Morse
2020-03-31  3:46     ` Dave Young
2020-04-14 17:31       ` James Morse
2020-03-31  3:38 ` Dave Young
2020-04-15 20:29 ` Eric W. Biederman
2020-04-22 12:14   ` James Morse
2020-04-22 13:04     ` Eric W. Biederman
2020-04-22 15:40       ` James Morse

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=80e4d1d7-f493-3f66-f700-86f18002d692@redhat.com \
    --to=david@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=anshuman.khandual@arm.com \
    --cc=bhsharma@redhat.com \
    --cc=catalin.marinas@arm.com \
    --cc=ebiederm@xmission.com \
    --cc=james.morse@arm.com \
    --cc=kexec@lists.infradead.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-mm@kvack.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).