From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
    aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-4.0 required=3.0 tests=BAYES_00,MAILING_LIST_MULTI,
    SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=no autolearn_force=no
    version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
    by smtp.lore.kernel.org (Postfix) with ESMTP id A592FC4338F
    for ; Wed, 11 Aug 2021 08:50:07 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
    by mail.kernel.org (Postfix) with ESMTP id 8604D603E7
    for ; Wed, 11 Aug 2021 08:50:07 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S236335AbhHKIu3 (ORCPT ); Wed, 11 Aug 2021 04:50:29 -0400
Received: from mail.kernel.org ([198.145.29.99]:51644 "EHLO mail.kernel.org"
    rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
    id S236061AbhHKIu2 (ORCPT ); Wed, 11 Aug 2021 04:50:28 -0400
Received: from disco-boy.misterjones.org (disco-boy.misterjones.org [51.254.78.96])
    (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
    (No client certificate requested)
    by mail.kernel.org (Postfix) with ESMTPSA id B6CF260F41;
    Wed, 11 Aug 2021 08:50:04 +0000 (UTC)
Received: from sofa.misterjones.org ([185.219.108.64] helo=why.misterjones.org)
    by disco-boy.misterjones.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
    (Exim 4.94.2)
    (envelope-from )
    id 1mDjw2-004Gnd-N2; Wed, 11 Aug 2021 09:50:02 +0100
Date: Wed, 11 Aug 2021 09:50:02 +0100
Message-ID: <87y2989xhh.wl-maz@kernel.org>
From: Marc Zyngier
To: Valentin Schneider
Cc: linux-kernel@vger.kernel.org, linux-rt-users@vger.kernel.org,
    Will Deacon, Mark Rutland, Thomas Gleixner,
    Sebastian Andrzej Siewior, Mel Gorman, Ard Biesheuvel
Subject: Re: [SPLAT 2/3] irqchip/gic-v3-its: Sleeping spinlocks down gic_reserve_range()
In-Reply-To: <20210810134127.1394269-3-valentin.schneider@arm.com>
References: <20210810134127.1394269-1-valentin.schneider@arm.com>
    <20210810134127.1394269-3-valentin.schneider@arm.com>
User-Agent: Wanderlust/2.15.9 (Almost Unreal) SEMI-EPG/1.14.7 (Harue)
    FLIM-LB/1.14.9 (=?UTF-8?B?R29qxY0=?=) APEL-LB/10.8 EasyPG/1.0.0 Emacs/27.1
    (x86_64-pc-linux-gnu) MULE/6.0 (HANACHIRUSATO)
MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue")
Content-Type: text/plain; charset=US-ASCII
X-SA-Exim-Connect-IP: 185.219.108.64
X-SA-Exim-Rcpt-To: valentin.schneider@arm.com, linux-kernel@vger.kernel.org,
    linux-rt-users@vger.kernel.org, will@kernel.org, mark.rutland@arm.com,
    tglx@linutronix.de, bigeasy@linutronix.de, mgorman@techsingularity.net,
    ardb@kernel.org
X-SA-Exim-Mail-From: maz@kernel.org
X-SA-Exim-Scanned: No (on disco-boy.misterjones.org); SAEximRunCond expanded to false
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

[+ Ard]

On Tue, 10 Aug 2021 14:41:26 +0100,
Valentin Schneider wrote:
>
> [ 0.134518] BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:35
> [ 0.134520] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 0, name: swapper/1
> [ 0.134522] 1 lock held by swapper/1/0:
> [ 0.134523] #0: ffff008f3624f728 ((lock).lock){+.+.}-{2:2}, at: get_page_from_freelist (mm/page_alloc.c:3673 mm/page_alloc.c:3704 mm/page_alloc.c:4166)
> [ 0.134533] irq event stamp: 0
> [ 0.134534] hardirqs last enabled at (0): 0x0
> [ 0.134538] hardirqs last disabled at (0): copy_process (./include/linux/lockdep.h:195 ./include/linux/lockdep.h:202 ./include/linux/lockdep.h:208 ./include/linux/seqlock.h:78 kernel/fork.c:2084)
> [ 0.134542] softirqs last enabled at (0): copy_process (./include/linux/lockdep.h:195 ./include/linux/lockdep.h:202 ./include/linux/lockdep.h:208 ./include/linux/seqlock.h:78 kernel/fork.c:2084)
> [ 0.134545] softirqs last disabled at (0): 0x0
> [ 0.134547] Preemption disabled at:
> [ 0.134547] rt_mutex_slowunlock (kernel/locking/rtmutex.c:1223)
> [ 0.134552] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.14.0-rc4-rt6-torture+ #56
> [ 0.134555] Call trace:
> [ 0.134556] dump_backtrace (arch/arm64/kernel/stacktrace.c:151)
> [ 0.134558] show_stack (arch/arm64/kernel/stacktrace.c:217)
> [ 0.134559] dump_stack_lvl (lib/dump_stack.c:106)
> [ 0.134563] dump_stack (lib/dump_stack.c:113)
> [ 0.134565] ___might_sleep (kernel/sched/core.c:9306)
> [ 0.134567] rt_spin_lock (kernel/locking/rtmutex.c:1641 (discriminator 4) kernel/locking/spinlock_rt.c:30 (discriminator 4) kernel/locking/spinlock_rt.c:36 (discriminator 4) kernel/locking/spinlock_rt.c:44 (discriminator 4))
> [ 0.134569] get_page_from_freelist (mm/page_alloc.c:3673 mm/page_alloc.c:3704 mm/page_alloc.c:4166)
> [ 0.134571] __alloc_pages (mm/page_alloc.c:5391)
> [ 0.134573] alloc_page_interleave (mm/mempolicy.c:2119)
> [ 0.134576] alloc_pages (mm/mempolicy.c:2249)
> [ 0.134577] new_slab (mm/slub.c:1740 mm/slub.c:1877 mm/slub.c:1940)
> [ 0.134580] ___slab_alloc (mm/slub.c:2951)
> [ 0.134582] __slab_alloc.isra.0 (mm/slub.c:3038)
> [ 0.134584] kmem_cache_alloc_trace (mm/slub.c:3129 mm/slub.c:3171 mm/slub.c:3188)
> [ 0.134587] efi_mem_reserve_iomem (drivers/firmware/efi/efi.c:905)
> [ 0.134590] efi_mem_reserve_persistent (drivers/firmware/efi/efi.c:952)
> [ 0.134593] its_cpu_init (drivers/irqchip/irq-gic-v3-its.c:3074 drivers/irqchip/irq-gic-v3-its.c:5196)
> [ 0.134596] gic_starting_cpu (drivers/irqchip/irq-gic.c:798)
> [ 0.134599] cpuhp_invoke_callback (kernel/cpu.c:180)
> [ 0.134601] cpuhp_invoke_callback_range (kernel/cpu.c:656)
> [ 0.134603] notify_cpu_starting (kernel/cpu.c:1270)
> [ 0.134605] secondary_start_kernel (arch/arm64/kernel/smp.c:243)
> [ 0.134608] __secondary_switched (arch/arm64/kernel/head.S:661)

The issue is that although the redistributor tables have been allocated
ahead of time (outside of any cpuhp callback), they cannot be programmed
into the RDs until the corresponding CPUs have been brought up (the
registers may not be accessible).
For the same reason, we don't know whether we can free them (because
there is already a table programmed there) or have to reserve them with
an efi_mem_reserve_persistent() call.

efi_mem_reserve_iomem() uses GFP_ATOMIC for its allocation, but this
is not sufficient for RT anymore.

We could postpone the reservation of the memory to a later point (it
is only useful for kexec), but it isn't clear where that point is. The
CPU is not quite up yet, and we can't easily IPI the boot CPU to do
the reserve call.

    M.

-- 
Without deviation from the norm, progress is not possible.
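
To make the "postpone the reservation" option above more concrete, here is a
rough userspace model of the idea, not kernel code and not a proposed patch.
The CPU-starting path only records the range in a preallocated per-CPU slot
(no allocation, no lock), and some later, sleepable context drains the slots
and makes the call that may allocate. All names here are hypothetical:
`pending_rsv`, `record_reservation()`, `drain_reservations()`, and
`fake_reserve_persistent()`, which merely stands in for
efi_mem_reserve_persistent(). It deliberately does not answer the open
question of where that later point would live in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 4 /* hypothetical CPU count for this model */

static_assert(NR_CPUS > 0, "model needs at least one CPU slot");

/* One preallocated slot per CPU, so the atomic path never allocates. */
struct pending_rsv {
    uint64_t addr;
    uint64_t size;
    atomic_bool valid;
};

static struct pending_rsv pending[NR_CPUS];
static int reserved_count; /* number of stand-in reserve calls made */

/* Models the CPU-starting path: no allocation, no sleeping lock. */
static bool record_reservation(int cpu, uint64_t addr, uint64_t size)
{
    if (cpu < 0 || cpu >= NR_CPUS)
        return false;

    pending[cpu].addr = addr;
    pending[cpu].size = size;
    /* Release: publish addr/size before the slot is marked valid. */
    atomic_store_explicit(&pending[cpu].valid, true, memory_order_release);
    return true;
}

/* Hypothetical stand-in for efi_mem_reserve_persistent(); in the real
 * kernel this is the call that may allocate and therefore sleep on RT. */
static int fake_reserve_persistent(uint64_t addr, uint64_t size)
{
    (void)addr;
    (void)size;
    reserved_count++;
    return 0;
}

/* Models the later, sleepable context: claim each valid slot exactly
 * once and perform the deferred reservation for it. */
static int drain_reservations(void)
{
    int done = 0;

    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        /* Acquire side pairs with the release in record_reservation(). */
        if (!atomic_exchange_explicit(&pending[cpu].valid, false,
                                      memory_order_acq_rel))
            continue;
        if (fake_reserve_persistent(pending[cpu].addr,
                                    pending[cpu].size) == 0)
            done++;
    }
    return done;
}
```

In this model, a call such as record_reservation(1, 0x80000000, 0x10000) from
the "starting" path touches only the slot for CPU 1, and a later
drain_reservations() performs the single deferred reserve call. The per-CPU
slot is what keeps the atomic path free of both allocation and locking.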