Date: Wed, 22 Apr 2020 17:57:33 +0800
From: Baoquan He
To: David Hildenbrand
Cc: Andrew Morton, "Eric W. Biederman", Russell King - ARM Linux admin,
    Anshuman Khandual, Catalin Marinas, Bhupesh Sharma,
    kexec@lists.infradead.org, linux-mm@kvack.org, James Morse, Will Deacon,
    linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org,
    piliu@redhat.com
Subject: Re: [PATCH 1/3] kexec: Prevent removal of memory in use by a loaded kexec image
Message-ID: <20200422095733.GU4247@MiWiFi-R3L-srv>

On 04/22/20 at 11:24am, David Hildenbrand wrote:
> On 22.04.20 11:17, Baoquan He wrote:
> > On 04/21/20 at 03:29pm, David Hildenbrand wrote:
> >>>> ACPI SRAT is embedded into efi, need read out the rsdp pointer. If we don't
> >>>> pass the efi, it won't get the SRAT table correctly, if I remember
> >>>> correctly. Yeah, I remember kvm guest can get memory hotplugged with
> >>>> ACPI only, this won't happen on bare metal though. Need check carefully.
> >>>> I have been using kvm guest with uefi firmware recently.
> >>>
> >>> Yeah, I can imagine that bare metal is different. kvm only uses ACPI.
> >>>
> >>> I'm also asking because of virtio-mem.
> >>> Memory added via virtio-mem is
> >>> not part of any efi tables or whatsoever. So I assume the kexec kernel
> >>> will not detect it automatically (good!), instead load the virtio-mem
> >>> driver and let it add memory back to the system.
> >>>
> >>> I should probably play with kexec and virtio-mem once I have some spare
> >>> cycles ... to find out what's broken and needs to be addressed :)
> >>
> >> FWIW, I just gave virtio-mem and kexec/kdump a try.
> >>
> >> a) kdump seems to work. Memory added by virtio-mem is getting dumped.
> >> The kexec kernel only uses memory in the crash region. The virtio-mem
> >> driver properly bails out due to is_kdump_kernel().
> >
> > Right, kdump is not impacted by later added memory.
> >
> >>
> >> b) "kexec -s -l" seems to work fine. For now, the kernel does not seem
> >> to get placed on virtio-mem memory (pure luck due to the left-to-right
> >> search). Memory added by virtio-mem is not getting added to the e820
> >> map. Once the virtio-mem driver comes back up in the kexec kernel, the
> >> right memory is readded.
> >
> > kexec_file_load just behaves as you tested. It doesn't collect later
> > added memory to e820 because it uses e820_table_kexec directly to pass
> > e820 to kexec-ed kernel. However, this e820_table_kexec is only updated
> > during boot stage. I tried hot adding DIMM after boot, kexec-ed kernel
> > doesn't have it in e820 during bootup, but it's recognized and added
> > when ACPI scanning. I think we should update e820_table_kexec when hot
> > add/remove memory, at least for DIMM. Not sure if DLPAR, virtio-mem,
> > balloon will need be added into e820_table_kexec too, and if this is
> > expected behaviour.
> >
> > But whatever we do, it won't impact the kexec file_loading, because of
> > the searching strategy bottom up. Just adding them into e820_table_kexec
> > will make it consistent with cold reboot which recognizes them and gets
> > them into e820 during bootup.
>
> Yeah, I think whatever a cold-booted kernel will see is what kexec-ed
> kernel should see. Not more, not less.
>
> Regarding virtio-mem: Not in e820 on cold-boot.
> Regarding DIMMs: DIMMs under KVM will never show up in the e820 map
> IIRC. I think on real HW it can be different.

Yeah, DIMMs under KVM won't show up in the e820 map. However, this is not a
feature of QEMU/KVM but a defect of it. I once asked Igor, who is a developer
of QEMU/KVM guest in this area, why we don't make the kvm guest recognize
hotpluggable DIMMs and add them into the e820 map. He said he had tried to do
that, but it would corrupt the guest on HyperV, so he had to revert the commit
in qemu.

So I think we can either leave it as is for now for both real HW and kvm, or
update e820_table_kexec to include added DIMMs for both real HW and KVM. I
hope one day the KVM devs will find a way to conquer the defect on HyperV and
make the e820 map consistent with bare metal. After all, a kvm guest is trying
to imitate real HW for the most part.

Anyway, I will think about updating e820_table_kexec and see if we can do
something about it.
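
To make the bottom-up argument above concrete, here is a minimal user-space
sketch. It is a toy model only: toy_table, toy_table_add(), toy_table_remove()
and toy_find_bottom_up() are hypothetical names made up for illustration and
are not the kernel's e820 or kexec API. It also assumes that hot-added memory
sits at higher addresses than boot memory. Under those assumptions, adding a
hot-added range to the table does not change where a bottom-up search places
an image, as long as the image still fits in lower memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: hypothetical names, not the kernel's e820 API. */
#define TOY_MAX_ENTRIES 16

struct toy_range {
	uint64_t start;
	uint64_t size;
};

struct toy_table {
	struct toy_range entries[TOY_MAX_ENTRIES];
	size_t nr;
};

/* Add a RAM range, keeping entries sorted by start address.
 * Returns 0 on success, -1 if the table is full. */
static int toy_table_add(struct toy_table *t, uint64_t start, uint64_t size)
{
	size_t i;

	assert(size > 0);
	if (t->nr >= TOY_MAX_ENTRIES)
		return -1;

	/* Shift higher entries up by one slot to make room. */
	i = t->nr;
	while (i > 0 && t->entries[i - 1].start > start) {
		t->entries[i] = t->entries[i - 1];
		i--;
	}
	t->entries[i].start = start;
	t->entries[i].size = size;
	t->nr++;
	return 0;
}

/* Drop a range on hot-remove (exact match only).
 * Returns 0 on success, -1 if no such range exists. */
static int toy_table_remove(struct toy_table *t, uint64_t start, uint64_t size)
{
	size_t i, j;

	for (i = 0; i < t->nr; i++) {
		if (t->entries[i].start == start && t->entries[i].size == size) {
			for (j = i + 1; j < t->nr; j++)
				t->entries[j - 1] = t->entries[j];
			t->nr--;
			return 0;
		}
	}
	return -1;
}

/* Bottom-up search: return the start of the lowest range that can hold
 * len bytes, or UINT64_MAX if none can. */
static uint64_t toy_find_bottom_up(const struct toy_table *t, uint64_t len)
{
	size_t i;

	for (i = 0; i < t->nr; i++) {
		if (t->entries[i].size >= len)
			return t->entries[i].start;
	}
	return UINT64_MAX;
}
```

With 1 GiB of boot memory at 1 MiB and a 2 GiB range hot-added at 4 GiB, a
32 MiB request is placed at 1 MiB both before and after the hot-added range is
put into the table. Only a request that no longer fits in boot memory ends up
in the hot-added range.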