From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1756927AbcBJLxr (ORCPT <rfc822;w@1wt.eu>);
	Wed, 10 Feb 2016 06:53:47 -0500
Received: from foss.arm.com ([217.140.101.70]:46536 "EHLO foss.arm.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1753631AbcBJLxq (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
	Wed, 10 Feb 2016 06:53:46 -0500
Message-ID: <56BB247F.6040202@arm.com>
Date: Wed, 10 Feb 2016 11:52:31 +0000
From: James Morse <james.morse@arm.com>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Icedove/31.6.0
MIME-Version: 1.0
To: Will Deacon <will.deacon@arm.com>, Yang Shi <yang.shi@linaro.org>
CC: catalin.marinas@arm.com, linux-kernel@vger.kernel.org,
        linux-arm-kernel@lists.infradead.org, linaro-kernel@lists.linaro.org
Subject: Re: [PATCH] arm64: use raw_smp_processor_id in stack backtrace dump
References: <1455053182-31404-1-git-send-email-yang.shi@linaro.org> <20160210102939.GD1052@arm.com>
In-Reply-To: <20160210102939.GD1052@arm.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 10/02/16 10:29, Will Deacon wrote:
> On Tue, Feb 09, 2016 at 01:26:22PM -0800, Yang Shi wrote:
>> dump_backtrace may be called in kthread context, which is not bound to a single
>> cpu, i.e. khungtaskd, then calling smp_processor_id may trigger the below bug
>> report:
> 
> If we're preemptible here, it means that our irq_stack_ptr is potentially
> bogus. Whilst this isn't an issue for kthreads, it does feel like we
> could make this slightly more robust in the face of potential frame
> corruption. Maybe just zero the IRQ stack pointer if we're in preemptible
> context?

Switching between stacks is only valid if we are tracing ourselves while on the
irq_stack, we should probably prevent it for other tasks too.

Something like (untested):
---------------------
if (tsk == current && in_atomic())
	irq_stack_ptr = IRQ_STACK_PTR(smp_processor_id());
else
	irq_stack_ptr = 0;
---------------------

This would work when we trace ourselves while on the irq_stack, but break*
tracing a running task on a remote cpu (khungtaskd doesn't do this).

The same fix would apply to unwind_frame(), we have 'tsk' in both functions.

Thoughts?


James


* If this were to ever happen, we would fail to switch to the original stack if
  tracing a remote irq stack, and print the wrong exception args. I'm not aware
  of any users of this, (panic() sends an IPI).

From mboxrd@z Thu Jan  1 00:00:00 1970
From: james.morse@arm.com (James Morse)
Date: Wed, 10 Feb 2016 11:52:31 +0000
Subject: [PATCH] arm64: use raw_smp_processor_id in stack backtrace dump
In-Reply-To: <20160210102939.GD1052@arm.com>
References: <1455053182-31404-1-git-send-email-yang.shi@linaro.org>
 <20160210102939.GD1052@arm.com>
Message-ID: <56BB247F.6040202@arm.com>
To: linux-arm-kernel@lists.infradead.org
List-Id: linux-arm-kernel.lists.infradead.org

On 10/02/16 10:29, Will Deacon wrote:
> On Tue, Feb 09, 2016 at 01:26:22PM -0800, Yang Shi wrote:
>> dump_backtrace may be called in kthread context, which is not bound to a single
>> cpu, i.e. khungtaskd, then calling smp_processor_id may trigger the below bug
>> report:
> 
> If we're preemptible here, it means that our irq_stack_ptr is potentially
> bogus. Whilst this isn't an issue for kthreads, it does feel like we
> could make this slightly more robust in the face of potential frame
> corruption. Maybe just zero the IRQ stack pointer if we're in preemptible
> context?

Switching between stacks is only valid if we are tracing ourselves while on the
irq_stack, we should probably prevent it for other tasks too.

Something like (untested):
---------------------
if (tsk == current && in_atomic())
	irq_stack_ptr = IRQ_STACK_PTR(smp_processor_id());
else
	irq_stack_ptr = 0;
---------------------

This would work when we trace ourselves while on the irq_stack, but break*
tracing a running task on a remote cpu (khungtaskd doesn't do this).

The same fix would apply to unwind_frame(), we have 'tsk' in both functions.

Thoughts?


James


* If this were to ever happen, we would fail to switch to the original stack if
  tracing a remote irq stack, and print the wrong exception args. I'm not aware
  of any users of this, (panic() sends an IPI).