From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 21 Mar 2024 17:23:20 +0800
From: Dave Young
To: x86@kernel.org
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, "H. Peter Anvin", linux-kernel@vger.kernel.org, kexec@lists.infradead.org, Baoquan He, Eric Biederman, "Huang, Kai"
Subject: [PATCH V2] x86/kexec: do not update E820 kexec table for setup_data
X-Mailing-List: linux-kernel@vger.kernel.org

crashkernel reservation failed on a Thinkpad t440s laptop recently.
Actually the memblock reservation succeeded, but later insert_resource()
failed.

Test steps:
kexec load -> /* make sure to add a crashkernel param, e.g. crashkernel=160M */
kexec reboot ->
dmesg | grep "crashkernel reserved";
the crashkernel memory range below is reserved successfully:
0x00000000d0000000 - 0x00000000da000000
But there is no such "Crash kernel" region in /proc/iomem.

The background story is as follows:

Currently the E820 code reserves setup_data regions for both the current
kernel and the kexec kernel, and it inserts them into the resources list.
Before the kexec kernel boots, nobody passes the old setup_data to it;
kexec only passes fresh SETUP_EFI and SETUP_IMA if needed. Thus the old
setup_data memory is not used at all.

Because the old kernel also updates the kexec e820 table, the kexec kernel
sees those regions as E820_TYPE_RESERVED_KERN, and later the old
setup_data regions are inserted into the resources list in the kexec
kernel by e820__reserve_resources().

Note that because no setup_data is passed in for those old regions, they
are not early reserved (by early_reserve_memory()), so the crashkernel
memblock reservation treats them as usable memory and can reserve a
crashkernel region which overlaps the old setup_data regions.

As in the bug I noticed here, the kdump insert_resource() then fails
because e820__reserve_resources() has already added the overlapping
chunks to /proc/iomem.

Finally, looking at the code, the old setup_data regions are not used at
all, as no setup_data is passed in by the kexec boot loader. Although
something like SETUP_PCI could be needed, kexec should pass that info as
new setup_data so that the kexec kernel can take care of it. This should
be handled in separate patches if needed.

Thus drop the useless buggy code here.

Signed-off-by: Dave Young
---
V2: changelog grammar fixes [suggestions from Huang Kai]
 arch/x86/kernel/e820.c | 16 +---------------
 1 file changed, 1 insertion(+), 15 deletions(-)

Index: linux/arch/x86/kernel/e820.c
===================================================================
--- linux.orig/arch/x86/kernel/e820.c
+++ linux/arch/x86/kernel/e820.c
@@ -1015,16 +1015,6 @@ void __init e820__reserve_setup_data(voi
 		pa_next = data->next;
 
 		e820__range_update(pa_data, sizeof(*data)+data->len, E820_TYPE_RAM, E820_TYPE_RESERVED_KERN);
-
-		/*
-		 * SETUP_EFI and SETUP_IMA are supplied by kexec and do not need
-		 * to be reserved.
-		 */
-		if (data->type != SETUP_EFI && data->type != SETUP_IMA)
-			e820__range_update_kexec(pa_data,
-						 sizeof(*data) + data->len,
-						 E820_TYPE_RAM, E820_TYPE_RESERVED_KERN);
-
 		if (data->type == SETUP_INDIRECT) {
 			len += data->len;
 			early_memunmap(data, sizeof(*data));
@@ -1036,12 +1026,9 @@ void __init e820__reserve_setup_data(voi
 
 			indirect = (struct setup_indirect *)data->data;
 
-			if (indirect->type != SETUP_INDIRECT) {
+			if (indirect->type != SETUP_INDIRECT)
 				e820__range_update(indirect->addr, indirect->len,
 						   E820_TYPE_RAM, E820_TYPE_RESERVED_KERN);
-				e820__range_update_kexec(indirect->addr, indirect->len,
-							 E820_TYPE_RAM, E820_TYPE_RESERVED_KERN);
-			}
 		}
 
 		pa_data = pa_next;
@@ -1049,7 +1036,6 @@ void __init e820__reserve_setup_data(voi
 	}
 
 	e820__update_table(e820_table);
-	e820__update_table(e820_table_kexec);
 
 	pr_info("extended physical RAM map:\n");
 	e820__print_table("reserve setup_data");
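
Illustration (not part of the patch): the failure described in the
changelog can be sketched with a toy userspace model. This is only a
sketch under stated assumptions. struct region, toy_insert() and
demo_conflict() are hypothetical names, the stale setup_data address is
invented for the example, and the "reject partial overlap" rule is a
simplification of what the kernel's tree-based insert_resource() does.
The crashkernel range is the one from the report, written with an
inclusive end.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for struct resource: inclusive [start, end] plus a label. */
struct region {
	uint64_t start;
	uint64_t end;
	const char *name;
};

#define MAX_REGIONS 8

static struct region table[MAX_REGIONS];
static int nr_regions;

/* True when a and b share at least one byte. */
static bool overlaps(const struct region *a, const struct region *b)
{
	return a->start <= b->end && b->start <= a->end;
}

/* True when inner lies entirely within outer. */
static bool contains(const struct region *outer, const struct region *inner)
{
	return outer->start <= inner->start && inner->end <= outer->end;
}

/*
 * Simplified insert (assumption, not the kernel algorithm): reject a new
 * region that partially overlaps an existing one, i.e. they overlap but
 * neither contains the other. Returns 0 on success, -1 on conflict or
 * when the table is full.
 */
static int toy_insert(struct region new)
{
	for (int i = 0; i < nr_regions; i++) {
		const struct region *old = &table[i];

		if (overlaps(old, &new) && !contains(old, &new) &&
		    !contains(&new, old))
			return -1;
	}
	if (nr_regions == MAX_REGIONS)
		return -1;
	table[nr_regions++] = new;
	return 0;
}

/*
 * Replays the ordering from the changelog: the stale range is registered
 * first, then the crash kernel range is inserted. Returns the result of
 * the second insert.
 */
static int demo_conflict(void)
{
	/* Hypothetical stale setup_data range seen via the kexec e820 table. */
	struct region stale = { 0xd9fff000ULL, 0xda000fffULL,
				"Reserved (old setup_data)" };
	/* 160M crashkernel range from the report, inclusive end. */
	struct region crash = { 0xd0000000ULL, 0xd9ffffffULL, "Crash kernel" };

	/* The stale range goes in first and succeeds on an empty table. */
	assert(toy_insert(stale) == 0);

	/* The crash kernel range straddles the stale one and is rejected. */
	return toy_insert(crash);
}
```

In this model demo_conflict() returns -1, which mirrors a memblock
reservation that succeeded while the later resource insert did not. With
the patch applied the stale range would never reach the kexec kernel's
table, so the first insert would not happen and the crash kernel range
would go in cleanly.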