From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sourabh Jain
To: linuxppc-dev@ozlabs.org, mpe@ellerman.id.au
Subject: [RFC v4 PATCH 0/5] In kernel handling of CPU hotplug events for crash kernel
Date: Mon, 11 Apr 2022 14:13:52 +0530
Message-Id: <20220411084357.157308-1-sourabhjain@linux.ibm.com>
X-Mailer: git-send-email 2.35.1
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-Id: Linux on PowerPC Developers Mail List
Cc: eric.devolder@oracle.com,
 bhe@redhat.com, mahesh@linux.vnet.ibm.com, kexec@lists.infradead.org,
 ldufour@linux.ibm.com, hbathini@linux.ibm.com

This patch series implements, on PowerPC, the crash hotplug handler
introduced by the patch series at https://lkml.org/lkml/2022/3/3/674.

The Problem:
============
After hotplug/DLPAR events the capture kernel holds stale information about
the system. Collecting a dump with a stale capture kernel might end in a
dump capture failure or an inaccurate dump.

Existing solution:
==================
The existing solution for keeping the capture kernel up to date is to observe
hotplug events via a udev rule and trigger a full capture kernel reload after
each hotplug event.

Shortcomings:
------------------------------------------------
- Leaves a window in which a kernel crash might not lead to a successful dump
  collection.
- Reloading all kexec components for each hotplug event is inefficient. Since
  only one or two kexec components need to be updated for a hotplug event,
  reloading all of them is redundant.
- udev rules are prone to races if hotplug events are frequent.

More about the issues with the existing solution is posted here:
 - https://lkml.org/lkml/2020/12/14/532
 - https://lists.ozlabs.org/pipermail/linuxppc-dev/2022-February/240254.html

Proposed Solution:
==================
Instead of reloading all kexec segments on a hotplug event, this patch series
focuses on updating only the relevant kexec segment. Once the kexec segments
are loaded in the kernel reserved area, an arch-specific hotplug handler
updates the relevant kexec segment based on the hotplug event type.

As mentioned above, this patch series implements a PowerPC crash hotplug
handler for CPUs. The crash hotplug handler for memory is on our TODO list.
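To illustrate the idea of updating only the relevant segment, here is a
minimal user-space sketch. It is not code from this series: the event enum,
the segment kinds, the `generation` counter and `handle_hotplug()` are all
hypothetical names made up for the example. It only shows the shape of the
dispatch: CPU events map to the FDT segment, memory events are left
unhandled, and the search stops at the first matching segment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event types, for illustration only. */
enum hp_event { HP_ADD_CPU, HP_REMOVE_CPU, HP_ADD_MEMORY, HP_REMOVE_MEMORY };

/* Hypothetical labels for the loaded kexec segments. */
enum seg_kind { SEG_KERNEL, SEG_INITRD, SEG_FDT, SEG_ELFCOREHDR };

struct segment {
	enum seg_kind kind;
	unsigned int generation;	/* bumped each time the segment is rewritten */
};

/*
 * Update only the segment affected by the event.
 * Returns the index of the updated segment, or -1 if nothing was updated.
 */
static int handle_hotplug(struct segment *segs, int nr, enum hp_event ev)
{
	enum seg_kind target;
	int i;

	assert(segs != NULL);

	switch (ev) {
	case HP_ADD_CPU:
	case HP_REMOVE_CPU:
		target = SEG_FDT;	/* CPU nodes live in the FDT */
		break;
	default:
		return -1;		/* memory hotplug: not handled in this sketch */
	}

	for (i = 0; i < nr; i++) {
		if (segs[i].kind == target) {
			segs[i].generation++;	/* stands in for the real in-place update */
			return i;		/* stop once the segment is found */
		}
	}
	return -1;
}
```

With four segments loaded in the order kernel, initrd, FDT, elfcorehdr, a CPU
add event touches only index 2 and leaves the other segments as they were.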
A couple of minor changes are required to realize the benefit of the patch
series:

- Disable the udev rule:

  Comment out the line below in the kdump udev rule file:

  RHEL: /usr/lib/udev/rules.d/98-kexec.rules
  # SUBSYSTEM=="cpu", ACTION=="online", GOTO="kdump_reload_cpu"

- The kexec tool needs to be updated with the patch below for the kexec_load
  system call to work (not needed if the -s option is used during kexec
  panic load):

---
diff --git a/kexec/arch/ppc64/kexec-elf-ppc64.c b/kexec/arch/ppc64/kexec-elf-ppc64.c
index 695b8b0..1dc6490 100644
--- a/kexec/arch/ppc64/kexec-elf-ppc64.c
+++ b/kexec/arch/ppc64/kexec-elf-ppc64.c
@@ -45,6 +45,29 @@ uint64_t initrd_base, initrd_size;
 unsigned char reuse_initrd = 0;
 const char *ramdisk;
 
+#define MAX_CORE 256
+#define PER_CORE_NODE_SIZE 1500
+
+/**
+ * get_crash_fdt_mem_sz() - calculate mem size for crash kernel FDT
+ * @fdt: pointer to crash kernel FDT
+ *
+ * Calculate the buffer space needed to add more CPU nodes in FDT after
+ * capture kernel load due to hot add events.
+ *
+ * Some assumptions are made to calculate the additional buffer size needed
+ * to accommodate future hot add CPUs to the crash FDT. The maximum core count
+ * in the system would not go beyond MAX_CORE and memory needed to store per core
+ * data in FDT is PER_CORE_NODE_SIZE.
+ *
+ * Certainly MAX_CORE count can be replaced with possible core count and
+ * PER_CORE_NODE_SIZE to some standard value instead of randomly observed
+ * core size value on Power9 LPAR.
+ */
+static unsigned int get_crash_fdt_mem_sz(void *fdt) {
+	return fdt_totalsize(fdt) + (PER_CORE_NODE_SIZE * MAX_CORE);
+}
+
 int elf_ppc64_probe(const char *buf, off_t len)
 {
 	struct mem_ehdr ehdr;
@@ -179,6 +202,7 @@ int elf_ppc64_load(int argc, char **argv, const char *buf, off_t len,
 	uint64_t max_addr, hole_addr;
 	char *seg_buf = NULL;
 	off_t seg_size = 0;
+	unsigned int mem_sz = 0;
 	struct mem_phdr *phdr;
 	size_t size;
 #ifdef NEED_RESERVE_DTB
@@ -329,7 +353,13 @@ int elf_ppc64_load(int argc, char **argv, const char *buf, off_t len,
 	if (result < 0)
 		return result;
 
-	my_dt_offset = add_buffer(info, seg_buf, seg_size, seg_size,
+	if (info->kexec_flags & KEXEC_ON_CRASH) {
+		mem_sz = get_crash_fdt_mem_sz((void *)seg_buf);
+		fdt_set_totalsize(seg_buf, mem_sz);
+		info->fdt_index = info->nr_segments;
+	}
+
+	my_dt_offset = add_buffer(info, seg_buf, seg_size, mem_sz,
 				0, 0, max_addr, -1);
 
 #ifdef NEED_RESERVE_DTB
diff --git a/kexec/kexec.c b/kexec/kexec.c
index f63b36b..846b1a8 100644
--- a/kexec/kexec.c
+++ b/kexec/kexec.c
@@ -672,6 +672,9 @@ static void update_purgatory(struct kexec_info *info)
 		if (info->segment[i].mem == (void *)info->rhdr.rel_addr) {
 			continue;
 		}
+		if (info->fdt_index == i)
+			continue;
+
 		sha256_update(&ctx, info->segment[i].buf,
 			      info->segment[i].bufsz);
 		nullsz = info->segment[i].memsz - info->segment[i].bufsz;
diff --git a/kexec/kexec.h b/kexec/kexec.h
index 595dd68..0906a1b 100644
--- a/kexec/kexec.h
+++ b/kexec/kexec.h
@@ -169,6 +169,10 @@ struct kexec_info {
 	int command_line_len;
 
 	int skip_checks;
+	// Given that we might need to update multiple kexec segments
+	// then having array to keep indexes of all hotplug kexec segments
+	// will be helpful.
+	unsigned int fdt_index;
 };
 
 struct arch_map_entry {
---

---
Changelog:

v1 -> v2:
  - Use the generic hotplug handler introduced by
    https://lkml.org/lkml/2022/2/9/1406, a significant change from v1.

v2 -> v3:
  - Move fdt_index and fdt_index_vaild variables to kimage_arch struct.
  - Rebase patches on top of https://lkml.org/lkml/2022/3/3/674 [v5]
  - Fixed warning reported by the checkpatch script

v3 -> v4:
  - Update the logic to find the additional space needed for hot-added CPUs
    post kexec load. Refer to the "[RFC v4 PATCH 4/5] powerpc/crash hp: add
    crash hotplug support for kexec_file_load" patch to know more about the
    change.
  - Fix a couple of typos.
  - Replace pr_err with pr_info_once to warn the user about memory hotplug
    support.
  - In the crash hotplug handler, exit the for loop once the FDT segment is
    found.
---

Sourabh Jain (5):
  powerpc/kexec: make update_cpus_node non-static
  powerpc/crash hp: introduce a new config option CRASH_HOTPLUG
  powrepc/crash hp: update kimage_arch struct
  powerpc/crash hp: add crash hotplug support for kexec_file_load
  powerpc/crash hp: add crash hotplug support for kexec_load

 arch/powerpc/Kconfig              |  11 +++
 arch/powerpc/include/asm/kexec.h  |   3 +
 arch/powerpc/kexec/core_64.c      | 154 ++++++++++++++++++++++++++++++
 arch/powerpc/kexec/elf_64.c       |  74 ++++++++++++++
 arch/powerpc/kexec/file_load_64.c |  87 -----------------
 5 files changed, 242 insertions(+), 87 deletions(-)

-- 
2.35.1
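The FDT sizing used by the kexec-tools change in this cover letter can be
sketched as a small stand-alone function. This is a rough illustration, not
the posted code: `fdt_segment_memsz()` and its parameters are hypothetical,
the core count is passed in rather than fixed at MAX_CORE (the patch comment
suggests the possible core count could be used), and falling back to the
plain FDT size for a non-crash load is an assumption of this sketch.
PER_CORE_NODE_SIZE keeps the value from the patch, which was observed on a
Power9 LPAR.

```c
#include <assert.h>
#include <stddef.h>

/* Per-core FDT node size taken from the patch (observed on a Power9 LPAR). */
#define PER_CORE_NODE_SIZE 1500u

/*
 * Hypothetical helper: memory size to reserve for the FDT segment.
 *
 * fdt_size  - current total size of the FDT in bytes
 * max_cores - upper bound on cores that may be hot-added later
 * on_crash  - non-zero when loading a crash (kdump) kernel
 *
 * For a crash load, reserve headroom for future CPU nodes. For a regular
 * kexec load (an assumption of this sketch), reserve only the FDT itself.
 */
static size_t fdt_segment_memsz(size_t fdt_size, unsigned int max_cores,
				int on_crash)
{
	assert(fdt_size > 0);

	if (!on_crash)
		return fdt_size;

	return fdt_size + (size_t)PER_CORE_NODE_SIZE * max_cores;
}
```

For example, a 4096-byte FDT with room for 256 cores needs
4096 + 1500 * 256 = 388096 bytes for a crash load.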