From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756927AbcBJLxr (ORCPT ); Wed, 10 Feb 2016 06:53:47 -0500 Received: from foss.arm.com ([217.140.101.70]:46536 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753631AbcBJLxq (ORCPT ); Wed, 10 Feb 2016 06:53:46 -0500 Message-ID: <56BB247F.6040202@arm.com> Date: Wed, 10 Feb 2016 11:52:31 +0000 From: James Morse User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Icedove/31.6.0 MIME-Version: 1.0 To: Will Deacon , Yang Shi CC: catalin.marinas@arm.com, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linaro-kernel@lists.linaro.org Subject: Re: [PATCH] arm64: use raw_smp_processor_id in stack backtrace dump References: <1455053182-31404-1-git-send-email-yang.shi@linaro.org> <20160210102939.GD1052@arm.com> In-Reply-To: <20160210102939.GD1052@arm.com> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 10/02/16 10:29, Will Deacon wrote: > On Tue, Feb 09, 2016 at 01:26:22PM -0800, Yang Shi wrote: >> dump_backtrace may be called in kthread context, which is not bound to a single >> cpu, i.e. khungtaskd, then calling smp_processor_id may trigger the below bug >> report: > > If we're preemptible here, it means that our irq_stack_ptr is potentially > bogus. Whilst this isn't an issue for kthreads, it does feel like we > could make this slightly more robust in the face of potential frame > corruption. Maybe just zero the IRQ stack pointer if we're in preemptible > context? Switching between stacks is only valid if we are tracing ourselves while on the irq_stack, we should probably prevent it for other tasks too. Something like (untested): --------------------- if (tsk == current && in_atomic()) irq_stack_ptr = IRQ_STACK_PTR(smp_processor_id()); else irq_stack_ptr = 0; --------------------- This would work when we trace ourselves while on the irq_stack, but break* tracing a running task on a remote cpu (khungtaskd doesn't do this). The same fix would apply to unwind_frame(), we have 'tsk' in both functions. Thoughts? James * If this were to ever happen, we would fail to switch to the original stack if tracing a remote irq stack, and print the wrong exception args. I'm not aware of any users of this, (panic() sends an IPI). From mboxrd@z Thu Jan 1 00:00:00 1970 From: james.morse@arm.com (James Morse) Date: Wed, 10 Feb 2016 11:52:31 +0000 Subject: [PATCH] arm64: use raw_smp_processor_id in stack backtrace dump In-Reply-To: <20160210102939.GD1052@arm.com> References: <1455053182-31404-1-git-send-email-yang.shi@linaro.org> <20160210102939.GD1052@arm.com> Message-ID: <56BB247F.6040202@arm.com> To: linux-arm-kernel@lists.infradead.org List-Id: linux-arm-kernel.lists.infradead.org On 10/02/16 10:29, Will Deacon wrote: > On Tue, Feb 09, 2016 at 01:26:22PM -0800, Yang Shi wrote: >> dump_backtrace may be called in kthread context, which is not bound to a single >> cpu, i.e. khungtaskd, then calling smp_processor_id may trigger the below bug >> report: > > If we're preemptible here, it means that our irq_stack_ptr is potentially > bogus. Whilst this isn't an issue for kthreads, it does feel like we > could make this slightly more robust in the face of potential frame > corruption. Maybe just zero the IRQ stack pointer if we're in preemptible > context? Switching between stacks is only valid if we are tracing ourselves while on the irq_stack, we should probably prevent it for other tasks too. Something like (untested): --------------------- if (tsk == current && in_atomic()) irq_stack_ptr = IRQ_STACK_PTR(smp_processor_id()); else irq_stack_ptr = 0; --------------------- This would work when we trace ourselves while on the irq_stack, but break* tracing a running task on a remote cpu (khungtaskd doesn't do this). The same fix would apply to unwind_frame(), we have 'tsk' in both functions. Thoughts? James * If this were to ever happen, we would fail to switch to the original stack if tracing a remote irq stack, and print the wrong exception args. I'm not aware of any users of this, (panic() sends an IPI).