Date: Wed, 22 Apr 2020 17:17:18 +0800
From: Baoquan He
To: David Hildenbrand
Cc: Andrew Morton, "Eric W. Biederman", Russell King - ARM Linux admin,
 Anshuman Khandual, Catalin Marinas, Bhupesh Sharma,
 kexec@lists.infradead.org, linux-mm@kvack.org, James Morse, Will Deacon,
 linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org,
 piliu@redhat.com
Subject: Re: [PATCH 1/3] kexec: Prevent removal of memory in use by a loaded kexec image
Message-ID: <20200422091718.GT4247@MiWiFi-R3L-srv>
References: <20200414143912.GE4247@MiWiFi-R3L-srv>
 <0085f460-b0c7-b25f-36a7-fa3bafaab6fe@redhat.com>
 <20200415023524.GG4247@MiWiFi-R3L-srv>
 <18cf6afd-c651-25c7-aca3-3ca3c0e07547@redhat.com>
 <20200416140247.GA12723@MiWiFi-R3L-srv>
 <4e1546eb-4416-dc6d-d549-62d1cecccbc8@redhat.com>
 <20200416143634.GH4247@MiWiFi-R3L-srv>
 <2525cc9c-3566-6275-105b-7f4af8f980bc@redhat.com>
 <9a4eb1d7-33bf-8707-9c0c-1ca657c3e502@redhat.com>
In-Reply-To: <9a4eb1d7-33bf-8707-9c0c-1ca657c3e502@redhat.com>

On 04/21/20 at 03:29pm, David Hildenbrand wrote:
> >> ACPI SRAT is embedded into efi, need read out the rsdp pointer. If we don't
> >> pass the efi, it won't get the SRAT table correctly, if I remember
> >> correctly. Yeah, I remember kvm guest can get memory hotplugged with
> >> ACPI only, this won't happen on bare metal though. Need check carefully.
> >> I have been using kvm guest with uefi firmware recently.
> >
> > Yeah, I can imagine that bare metal is different. kvm only uses ACPI.
> >
> > I'm also asking because of virtio-mem. Memory added via virtio-mem is
> > not part of any efi tables or whatsoever.
> > So I assume the kexec kernel
> > will not detect it automatically (good!), instead load the virtio-mem
> > driver and let it add memory back to the system.
> >
> > I should probably play with kexec and virtio-mem once I have some spare
> > cycles ... to find out what's broken and needs to be addressed :)
>
> FWIW, I just gave virtio-mem and kexec/kdump a try.
>
> a) kdump seems to work. Memory added by virtio-mem is getting dumped.
> The kexec kernel only uses memory in the crash region. The virtio-mem
> driver properly bails out due to is_kdump_kernel().

Right, kdump is not impacted by later added memory.

>
> b) "kexec -s -l" seems to work fine. For now, the kernel does not seem
> to get placed on virtio-mem memory (pure luck due to the left-to-right
> search). Memory added by virtio-mem is not getting added to the e820
> map. Once the virtio-mem driver comes back up in the kexec kernel, the
> right memory is readded.

kexec_file_load behaves just as you tested. It doesn't collect later
added memory into e820 because it uses e820_table_kexec directly to pass
the e820 map to the kexec-ed kernel. However, e820_table_kexec is only
updated during the boot stage. I tried hot adding a DIMM after boot: the
kexec-ed kernel doesn't have it in e820 during bootup, but it's
recognized and added during ACPI scanning. I think we should update
e820_table_kexec when memory is hot added/removed, at least for DIMMs.
Not sure if DLPAR, virtio-mem and balloon memory need to be added into
e820_table_kexec too, or whether that is the expected behaviour. But
whatever we do, it won't impact kexec file loading, because of the
bottom-up search strategy. Adding them into e820_table_kexec would just
make it consistent with a cold reboot, which recognizes them and gets
them into e820 during bootup.

>
> c) "kexec -c -l" does not work properly. All memory added by virtio-mem
> is added to the e820 map, which is wrong. Memory that should not be
> touched will be touched by the kexec kernel. I assume kexec-tools just
> goes ahead and adds anything it can find in /proc/iomem (or
> /sys/firmware/memmap/) to the e820 map of the new kernel.
>
> Due to c), I assume all hotplugged memory (e.g., ACPI DIMMs) is
> similarly added to the e820 map and, therefore, won't be able to be
> onlined MOVABLE easily.

Yes, kexec_load reads memory regions from /sys/firmware/memmap/ or
/proc/iomem. Making it right seems a little harder. We can export them
to /proc/iomem or /sys/firmware/memmap/ marked as 'hotplug', but which
zone they belong to is not easy to tell.

We are proactively testing kexec_file_load widely on x86_64, s390 and
arm64 by adding test cases to CKI.

>
>
> At least for virtio-mem, I would either have to
> a) Not support "kexec -c -l". A viable option if we would be planning on
> not supporting it either way in the long term. I could block this
> in-kernel somehow eventually.
>
> b) Teach kexec-tools to leave virtio-mem added memory alone. E.g., by
> indicating it in /proc/iomem in a special way ("System RAM
> (hotplugged)"/"System RAM (virtio-mem)").
>
> Baoquan, any opinion on that?
>
> --
> Thanks,
>
> David / dhildenb
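
To make option b) above a bit more concrete, here is a minimal sketch of
how a user-space loader could filter /proc/iomem-style text before
building the memory map for the new kernel. This is not kexec-tools
code. The two labels are only the ones proposed in this thread and are
not claimed to be an existing kernel interface, and the function name
and the sample addresses are made up for illustration.

```python
import re

# Hypothetical labels, taken from the proposal in this thread. They are an
# assumption, not a statement about what any kernel actually exports.
SKIP_LABELS = {"System RAM (hotplugged)", "System RAM (virtio-mem)"}

# Top-level /proc/iomem lines look like "00100000-bffeffff : System RAM".
_LINE = re.compile(r"^([0-9a-fA-F]+)-([0-9a-fA-F]+) : (.+)$")


def boot_ram_ranges(iomem_text):
    """Return (start, end) pairs for top-level plain "System RAM" entries.

    Entries carrying one of SKIP_LABELS are left out, so they would not end
    up in the memory map handed to the new kernel.
    """
    ranges = []
    for line in iomem_text.splitlines():
        if line.startswith(" "):
            continue  # indented lines are child resources (e.g. "Kernel code")
        m = _LINE.match(line.rstrip())
        if not m:
            continue
        start, end, name = int(m.group(1), 16), int(m.group(2), 16), m.group(3)
        if name in SKIP_LABELS:
            continue  # leave specially marked memory alone
        if name == "System RAM":
            ranges.append((start, end))
    return ranges


# Invented sample input, for illustration only.
SAMPLE = """\
00001000-0009fbff : System RAM
00100000-bffeffff : System RAM
  01000000-01c031d0 : Kernel code
100000000-13fffffff : System RAM
140000000-17fffffff : System RAM (virtio-mem)
"""

if __name__ == "__main__":
    for start, end in boot_ram_ranges(SAMPLE):
        print(f"{start:#x}-{end:#x}")
```

With the sample above, the range labelled "System RAM (virtio-mem)" is
dropped and the three plain "System RAM" ranges are kept. Note that an
exact match on "System RAM" would already skip specially labelled
entries, so whether an unmodified tool skips them depends on how it
matches resource names.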