All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mark Rutland <mark.rutland@arm.com>
To: Will Deacon <will.deacon@arm.com>
Cc: Linus Walleij <linus.walleij@linaro.org>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	rt@linuxtronix.de,
	Russell King - ARM Linux <linux@armlinux.org.uk>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	sboyd@codeaurora.org,
	"linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH 15/20] ARM/hw_breakpoint: Convert to hotplug state machine
Date: Thu, 5 Jan 2017 15:57:24 +0000	[thread overview]
Message-ID: <20170105155724.GC25868@leverpostej> (raw)
In-Reply-To: <20170104143205.GG18193@arm.com>

On Wed, Jan 04, 2017 at 02:32:06PM +0000, Will Deacon wrote:
> On Wed, Jan 04, 2017 at 01:56:44PM +0000, Mark Rutland wrote:
> > On Tue, Jan 03, 2017 at 09:33:36AM +0000, Mark Rutland wrote:
> > > On Mon, Jan 02, 2017 at 09:15:29PM +0100, Linus Walleij wrote:
> > > > On Mon, Jan 2, 2017 at 4:00 PM, Russell King - ARM Linux
> > > > <linux@armlinux.org.uk> wrote:
> > > > > On Mon, Jan 02, 2017 at 03:34:32PM +0100, Linus Walleij wrote:
> > > > >> in the first line of arch_hw_breakpoint_init() in
> > > > >> arch/arm/kernel/hw_breakpoint.c
> > > > >>
> > > > >> I suspect that is not an accepable solution ...
> > > > >>
> > > > >> It hangs at PC is at write_wb_reg+0x20c/0x330
> > > > >> Which is c03101dc, and looks like this in objdump -d:
> > > > >>
> > > > >> c031020c:       ee001eba        mcr     14, 0, r1, cr0, cr10, {5}
> > > > >> c0310210:       eaffffb3        b       c03100e4 <write_wb_reg+0x114>
> > > > >
> > > > > ... and this is several instructions after the address you mention above.
> > > > > Presumably c03101dc is accessing a higher numbered register?
> > > > 
> > > > Ah sorry. It looks like this:
> > > > 
> > > > c03101dc:       ee001ed0        mcr     14, 0, r1, cr0, cr0, {6}
> > > > c03101e0:       eaffffbf        b       c03100e4 <write_wb_reg+0x114>
> > > > c03101e4:       ee001ebf        mcr     14, 0, r1, cr0, cr15, {5}
> > > > c03101e8:       eaffffbd        b       c03100e4 <write_wb_reg+0x114>
> > > > c03101ec:       ee001ebe        mcr     14, 0, r1, cr0, cr14, {5}
> > > > c03101f0:       eaffffbb        b       c03100e4 <write_wb_reg+0x114>
> > > > c03101f4:       ee001ebd        mcr     14, 0, r1, cr0, cr13, {5}
> > > > c03101f8:       eaffffb9        b       c03100e4 <write_wb_reg+0x114>
> > > 
> > > FWIW, I was tracking an issue in this area before the holiday.
> > > 
> > > It looked like DBGPRSR.SPD is set unexpectedly over the default idle
> > > path (i.e. WFI), causing the (otherwise valid) register accesses above
> > > to be handled as undefined.
> > > 
> > > I haven't looked at the patch in detail, but I guess that it allows idle
> > > to occur between reset_ctrl_regs() and arch_hw_breakpoint_init().
> > 
> > I've just reproduced this locally on my dragonboard APQ8060.
> > 
> > It looks like the write_wb_reg() call that's exploding is from
> > get_max_wp_len(), which we call immediately after registering the
> > dbg_reset_online callback. Clearing DBGPRSR.SPD before the write_wb_reg() hides
> > the problem, so I suspect we're seeing the issue I mentioned above -- it just
> > so happens that we go idle in a new place.
> 
> When you say "go idle", are we just executing a WFI, 

>From my prior experiments, just executing a WFI as we happen to do in
the default cpu_v7_do_idle. I tried passing cpuidle.off=1, but that
didn't help. NOPing the WFI in cpu_v7_do_idle did mask the issue, from
what I recall.

> or is the power controller coming into play and we're actually
> powering down the non-debug logic?

As far as I can see, that isn't happening. We don't save/restore any
other CPU state in the default idle path, and the kernel is otherwise
happy.

> In the case of the latter, the PM notifier should clear SPD in
> reset_ctrl_regs, so this sounds like a hardware bug where the SPD bit is
> set unconditionally on WFI.
> 
> In that case, this code has always been dodgy -- what happens if you try
> to use hardware breakpoints in GDB in the face of WFI-based idle?

The kernel blows up similiarly to Linus's original report when the
kernel tries to program the breakpoint registers.

I also believe this has always been dodgy.

> > The below hack allows boot to continue, but is not a real fix. I'm not
> > immediately sure what to do.
> 
> If it's never worked, I suggest we blacklist the MIDR until somebody from
> Qualcomm can help us further.

I'll see about putting a patch together.

Thanks,
Mark.

> 
> Will

WARNING: multiple messages have this Message-ID (diff)
From: mark.rutland@arm.com (Mark Rutland)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH 15/20] ARM/hw_breakpoint: Convert to hotplug state machine
Date: Thu, 5 Jan 2017 15:57:24 +0000	[thread overview]
Message-ID: <20170105155724.GC25868@leverpostej> (raw)
In-Reply-To: <20170104143205.GG18193@arm.com>

On Wed, Jan 04, 2017 at 02:32:06PM +0000, Will Deacon wrote:
> On Wed, Jan 04, 2017 at 01:56:44PM +0000, Mark Rutland wrote:
> > On Tue, Jan 03, 2017 at 09:33:36AM +0000, Mark Rutland wrote:
> > > On Mon, Jan 02, 2017 at 09:15:29PM +0100, Linus Walleij wrote:
> > > > On Mon, Jan 2, 2017 at 4:00 PM, Russell King - ARM Linux
> > > > <linux@armlinux.org.uk> wrote:
> > > > > On Mon, Jan 02, 2017 at 03:34:32PM +0100, Linus Walleij wrote:
> > > > >> in the first line of arch_hw_breakpoint_init() in
> > > > >> arch/arm/kernel/hw_breakpoint.c
> > > > >>
> > > > >> I suspect that is not an accepable solution ...
> > > > >>
> > > > >> It hangs at PC is at write_wb_reg+0x20c/0x330
> > > > >> Which is c03101dc, and looks like this in objdump -d:
> > > > >>
> > > > >> c031020c:       ee001eba        mcr     14, 0, r1, cr0, cr10, {5}
> > > > >> c0310210:       eaffffb3        b       c03100e4 <write_wb_reg+0x114>
> > > > >
> > > > > ... and this is several instructions after the address you mention above.
> > > > > Presumably c03101dc is accessing a higher numbered register?
> > > > 
> > > > Ah sorry. It looks like this:
> > > > 
> > > > c03101dc:       ee001ed0        mcr     14, 0, r1, cr0, cr0, {6}
> > > > c03101e0:       eaffffbf        b       c03100e4 <write_wb_reg+0x114>
> > > > c03101e4:       ee001ebf        mcr     14, 0, r1, cr0, cr15, {5}
> > > > c03101e8:       eaffffbd        b       c03100e4 <write_wb_reg+0x114>
> > > > c03101ec:       ee001ebe        mcr     14, 0, r1, cr0, cr14, {5}
> > > > c03101f0:       eaffffbb        b       c03100e4 <write_wb_reg+0x114>
> > > > c03101f4:       ee001ebd        mcr     14, 0, r1, cr0, cr13, {5}
> > > > c03101f8:       eaffffb9        b       c03100e4 <write_wb_reg+0x114>
> > > 
> > > FWIW, I was tracking an issue in this area before the holiday.
> > > 
> > > It looked like DBGPRSR.SPD is set unexpectedly over the default idle
> > > path (i.e. WFI), causing the (otherwise valid) register accesses above
> > > to be handled as undefined.
> > > 
> > > I haven't looked at the patch in detail, but I guess that it allows idle
> > > to occur between reset_ctrl_regs() and arch_hw_breakpoint_init().
> > 
> > I've just reproduced this locally on my dragonboard APQ8060.
> > 
> > It looks like the write_wb_reg() call that's exploding is from
> > get_max_wp_len(), which we call immediately after registering the
> > dbg_reset_online callback. Clearing DBGPRSR.SPD before the write_wb_reg() hides
> > the problem, so I suspect we're seeing the issue I mentioned above -- it just
> > so happens that we go idle in a new place.
> 
> When you say "go idle", are we just executing a WFI, 

>From my prior experiments, just executing a WFI as we happen to do in
the default cpu_v7_do_idle. I tried passing cpuidle.off=1, but that
didn't help. NOPing the WFI in cpu_v7_do_idle did mask the issue, from
what I recall.

> or is the power controller coming into play and we're actually
> powering down the non-debug logic?

As far as I can see, that isn't happening. We don't save/restore any
other CPU state in the default idle path, and the kernel is otherwise
happy.

> In the case of the latter, the PM notifier should clear SPD in
> reset_ctrl_regs, so this sounds like a hardware bug where the SPD bit is
> set unconditionally on WFI.
> 
> In that case, this code has always been dodgy -- what happens if you try
> to use hardware breakpoints in GDB in the face of WFI-based idle?

The kernel blows up similiarly to Linus's original report when the
kernel tries to program the breakpoint registers.

I also believe this has always been dodgy.

> > The below hack allows boot to continue, but is not a real fix. I'm not
> > immediately sure what to do.
> 
> If it's never worked, I suggest we blacklist the MIDR until somebody from
> Qualcomm can help us further.

I'll see about putting a patch together.

Thanks,
Mark.

> 
> Will

  reply	other threads:[~2017-01-05 15:58 UTC|newest]

Thread overview: 120+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-17 18:35 cpu hotplug: convert more drivers (batch #5) Sebastian Andrzej Siewior
2016-11-17 18:35 ` [PATCH 01/20] x86/mce/therm_throt: Convert to hotplug state machine Sebastian Andrzej Siewior
2016-11-21 15:46   ` [tip:smp/hotplug] " tip-bot for Sebastian Andrzej Siewior
2016-11-22 22:39   ` tip-bot for Sebastian Andrzej Siewior
2016-11-17 18:35 ` [PATCH 02/20] x86/cpuid: " Sebastian Andrzej Siewior
2016-11-21 15:47   ` [tip:smp/hotplug] " tip-bot for Sebastian Andrzej Siewior
2016-11-22 22:40   ` tip-bot for Sebastian Andrzej Siewior
2016-11-17 18:35 ` [PATCH 03/20] x86/msr: " Sebastian Andrzej Siewior
2016-11-21 15:48   ` [tip:smp/hotplug] " tip-bot for Sebastian Andrzej Siewior
2016-11-22 22:41   ` tip-bot for Sebastian Andrzej Siewior
2016-11-17 18:35 ` [PATCH 04/20] hwmon/coretemp: " Sebastian Andrzej Siewior
2016-11-20 22:30   ` [04/20] " Guenter Roeck
2016-11-21 22:35     ` Sebastian Andrzej Siewior
2016-11-21 15:56   ` [tip:smp/hotplug] " tip-bot for Sebastian Andrzej Siewior
2016-11-21 21:32     ` Guenter Roeck
2016-11-21 21:53       ` Thomas Gleixner
2016-11-17 18:35 ` [PATCH 05/20] hwmon/via-cputemp: Remove pointless CPU check on each CPU Sebastian Andrzej Siewior
2016-11-19 17:23   ` [05/20] " Guenter Roeck
2016-11-19 22:53     ` Sebastian Andrzej Siewior
2016-11-20  3:53       ` Guenter Roeck
2016-11-20 20:34         ` Sebastian Andrzej Siewior
2016-11-17 18:35 ` [PATCH 06/20] hwmon/via-cputemp: Convert to hotplug state machine Sebastian Andrzej Siewior
2016-11-18 15:09   ` [PATCH 06/20 v2] " Sebastian Andrzej Siewior
2016-11-23 15:29   ` [PATCH 06/20] " Guenter Roeck
2016-12-09 11:53   ` Thomas Gleixner
2016-12-09 18:17     ` Guenter Roeck
2016-12-09 18:27       ` Thomas Gleixner
2016-11-17 18:35 ` [PATCH 07/20] pci/xgene-msi: " Sebastian Andrzej Siewior
2016-11-17 18:35   ` Sebastian Andrzej Siewior
2016-11-17 18:35   ` Sebastian Andrzej Siewior
2016-11-21 15:48   ` [tip:smp/hotplug] PCI/xgene-msi: " tip-bot for Sebastian Andrzej Siewior
2016-11-22 22:42   ` tip-bot for Sebastian Andrzej Siewior
2016-11-17 18:35 ` [PATCH 08/20] powercap/intel_rapl: Add missing domain data update on hotplug Sebastian Andrzej Siewior
2016-11-21 15:49   ` [tip:smp/hotplug] " tip-bot for Thomas Gleixner
2016-11-17 18:35 ` [PATCH 09/20] powercap/intel rapl: Convert to hotplug state machine Sebastian Andrzej Siewior
2016-11-21 15:49   ` [tip:smp/hotplug] " tip-bot for Sebastian Andrzej Siewior
2016-11-17 18:35 ` [PATCH 10/20] powercap/intel_rapl: Cleanup duplicated init code Sebastian Andrzej Siewior
2016-11-21 15:50   ` [tip:smp/hotplug] " tip-bot for Thomas Gleixner
2016-11-17 18:35 ` [PATCH 11/20] watchdog/octeon: Convert to hotplug state machine Sebastian Andrzej Siewior
2016-11-21 15:50   ` [tip:smp/hotplug] " tip-bot for Sebastian Andrzej Siewior
2016-11-22 22:43   ` tip-bot for Sebastian Andrzej Siewior
2016-11-24 16:10   ` [11/20] " Guenter Roeck
2016-11-17 18:35 ` [PATCH 12/20] net/iucv: " Sebastian Andrzej Siewior
2016-11-21 15:51   ` [tip:smp/hotplug] " tip-bot for Sebastian Andrzej Siewior
2016-11-22 22:43   ` tip-bot for Sebastian Andrzej Siewior
2016-11-23 18:04   ` [PATCH 12/20] " Ursula Braun
2016-11-24  9:10     ` Sebastian Andrzej Siewior
2016-11-24 14:14       ` [PATCH 12/20 v2] " Sebastian Andrzej Siewior
2016-11-24 16:10         ` [PATCH] net/iucv: use explicit clean up labels in iucv_init() Sebastian Andrzej Siewior
2016-11-24 19:57           ` [tip:smp/hotplug] net/iucv: Use " tip-bot for Sebastian Andrzej Siewior
2016-11-28 16:24           ` [PATCH] net/iucv: use " David Miller
2016-11-28 16:31             ` Thomas Gleixner
2016-11-28 16:37           ` [tip:smp/hotplug] net/iucv: Use " tip-bot for Sebastian Andrzej Siewior
2016-11-17 18:35 ` [PATCH 13/20] sched/nohz: Convert to hotplug state machine Sebastian Andrzej Siewior
2016-11-21 15:51   ` [tip:smp/hotplug] " tip-bot for Sebastian Andrzej Siewior
2016-11-22 22:44   ` tip-bot for Sebastian Andrzej Siewior
2016-11-17 18:35 ` [PATCH 14/20] arm/bL_switcher: " Sebastian Andrzej Siewior
2016-11-17 18:35   ` Sebastian Andrzej Siewior
2016-11-21 15:52   ` [tip:smp/hotplug] " tip-bot for Sebastian Andrzej Siewior
2016-11-22 22:44   ` tip-bot for Sebastian Andrzej Siewior
2016-11-17 18:35 ` [PATCH 15/20] ARM/hw_breakpoint: " Sebastian Andrzej Siewior
2016-11-17 18:35   ` Sebastian Andrzej Siewior
2016-11-18 12:04   ` Will Deacon
2016-11-18 12:04     ` Will Deacon
2016-11-18 13:11     ` Thomas Gleixner
2016-11-18 13:11       ` Thomas Gleixner
2016-11-18 13:29       ` Will Deacon
2016-11-18 13:29         ` Will Deacon
2016-11-18 13:42         ` Thomas Gleixner
2016-11-18 13:42           ` Thomas Gleixner
2016-11-18 13:48           ` Will Deacon
2016-11-18 13:48             ` Will Deacon
2016-11-18 13:59             ` Thomas Gleixner
2016-11-18 13:59               ` Thomas Gleixner
2016-11-18 14:11               ` Will Deacon
2016-11-18 14:11                 ` Will Deacon
2016-11-21 15:52   ` [tip:smp/hotplug] " tip-bot for Sebastian Andrzej Siewior
2016-11-22 22:45   ` tip-bot for Sebastian Andrzej Siewior
2017-01-02 14:15   ` [PATCH 15/20] " Linus Walleij
2017-01-02 14:15     ` Linus Walleij
2017-01-02 14:34     ` Linus Walleij
2017-01-02 14:34       ` Linus Walleij
2017-01-02 15:00       ` Russell King - ARM Linux
2017-01-02 15:00         ` Russell King - ARM Linux
2017-01-02 20:15         ` Linus Walleij
2017-01-02 20:15           ` Linus Walleij
2017-01-03  9:33           ` Mark Rutland
2017-01-03  9:33             ` Mark Rutland
2017-01-04 11:27             ` Sebastian Andrzej Siewior
2017-01-04 11:27               ` Sebastian Andrzej Siewior
2017-01-04 13:56             ` Mark Rutland
2017-01-04 13:56               ` Mark Rutland
2017-01-04 14:32               ` Will Deacon
2017-01-04 14:32                 ` Will Deacon
2017-01-05 15:57                 ` Mark Rutland [this message]
2017-01-05 15:57                   ` Mark Rutland
2017-01-05 15:26               ` Linus Walleij
2017-01-05 15:26                 ` Linus Walleij
2017-01-05 17:14                 ` Mark Rutland
2017-01-05 17:14                   ` Mark Rutland
2016-11-17 18:35 ` [PATCH 16/20] powerpc/sysfs: " Sebastian Andrzej Siewior
2016-11-17 18:35   ` Sebastian Andrzej Siewior
2016-11-21 15:53   ` [tip:smp/hotplug] " tip-bot for Sebastian Andrzej Siewior
2016-11-22 22:45   ` tip-bot for Sebastian Andrzej Siewior
2016-11-17 18:35 ` [PATCH 17/20] sparc/sysfs: " Sebastian Andrzej Siewior
2016-11-17 18:35   ` Sebastian Andrzej Siewior
2016-11-17 18:39   ` David Miller
2016-11-17 18:39     ` David Miller
2016-11-21 15:54   ` [tip:smp/hotplug] " tip-bot for Sebastian Andrzej Siewior
2016-11-22 22:46   ` tip-bot for Sebastian Andrzej Siewior
2016-11-17 18:35 ` [PATCH 18/20] x86/oprofile/nmi: Remove superfluous smp_function_call_single() Sebastian Andrzej Siewior
2016-11-21 15:54   ` [tip:smp/hotplug] " tip-bot for Anna-Maria Gleixner
2016-11-22 22:46   ` tip-bot for Anna-Maria Gleixner
2016-11-17 18:35 ` [PATCH 19/20] x86/oprofile/nmi: Convert to hotplug state machine Sebastian Andrzej Siewior
2016-11-21 15:55   ` [tip:smp/hotplug] " tip-bot for Sebastian Andrzej Siewior
2016-11-22 22:47   ` tip-bot for Sebastian Andrzej Siewior
2016-11-17 18:35 ` [PATCH 20/20] x86/pci/amd-bus: " Sebastian Andrzej Siewior
2016-11-17 18:35   ` Sebastian Andrzej Siewior
2016-11-21 15:55   ` [tip:smp/hotplug] " tip-bot for Sebastian Andrzej Siewior
2016-11-22 22:47   ` tip-bot for Sebastian Andrzej Siewior

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170105155724.GC25868@leverpostej \
    --to=mark.rutland@arm.com \
    --cc=bigeasy@linutronix.de \
    --cc=linus.walleij@linaro.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@armlinux.org.uk \
    --cc=rt@linuxtronix.de \
    --cc=sboyd@codeaurora.org \
    --cc=tglx@linutronix.de \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.