linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [Patch 0/1] PPC64-HWBKPT: Hardware Breakpoint interfaces - ver XIV
@ 2010-03-08 18:12 K.Prasad
  2010-03-08 18:14 ` [Patch 1/1] PPC64-HWBKPT: Implement hw-breakpoints for PPC64 K.Prasad
  0 siblings, 1 reply; 26+ messages in thread
From: K.Prasad @ 2010-03-08 18:12 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Michael Neuling, Benjamin Herrenschmidt, shaggy,
	Frederic Weisbecker, David Gibson, paulus, Alan Stern,
	Roland McGrath

Hi Ben,
        Please find the version XIV of the patch that ports the
hw-breakpoint interfaces (in kernel/hw_breakpoint.c) to PPC64. The new
patch includes the changes as listed below.

Kindly include this patch as a part of powerpc -next tree to enable wider
testing and to be eventually pushed upstream.

Changelog - ver XIV
--------------------
(Version XIII: linuxppc-dev message-id: 20100215055605.GB3670@in.ibm.com)

- Removed the 'name' field from 'struct arch_hw_breakpoint'.
- All callback invocations through bp->overflow_handler() are replaced
  with perf_bp_event().
- Removed the check for pre-existing single-stepping mode in
  hw_breakpoint_handler() as this check is unreliable while in kernel-space.
  Side effect of this change is the non-triggering of hw-breakpoints while
  single-stepping kernel through KGDB or Xmon.
- Minor code-cleanups and addition of comments in hw_breakpoint_handler() and
  single_step_dabr_instruction().
- Re-based to commit 25cf84cf377c0aae5dbcf937ea89bc7893db5176 of
  linux-2.6

Thanks,
K.Prasad

Changelog - ver XIII
--------------------
(Version XII: linuxppc-dev ref: 20100121084640.GA3252@in.ibm.com)

- Fixed a bug for user-space breakpoints (triggered following the
  failure of a breakpoint request) causing data exception (due to access
  of NULL/spurious address).
- Re-based on commit 724e6d3fe8003c3f60bf404bf22e4e331327c596 of
  linux-2.6
 
Changelog - ver XII
--------------------
(Version XI: linuxppc-dev ref: 20100119091234.GA9971@in.ibm.com)

- Unset MSR_SE only if kernel was not previously in single-step mode.
- Pre-emption is now enabled before returning from the hw-breakpoint exception
  handler.
- Variables to track the source of single-step exception (breakpoint from kernel,
  user-space vs single-stepping due to other requests) are added.
- Extraneous hw-breakpoint exceptions (due to memory accesses lying outside
  monitored symbol length) is now done for both kernel and user-space
  (previously only user-space).
- single_step_dabr_instruction() now returns NOTIFY_DONE if kernel was in
  single-step mode even before the hw-breakpoint. This enables other users of
  single-step mode to be notified of the exception.
- User-space instructions are not emulated from kernel-space, they are instead
  single-stepped.
  
Changelog - ver XI
------------------
(Version X: linuxppc-dev ref: 20091211160144.GA23156@in.ibm.com)
- Conditionally unset MSR_SE in the single-step handler
- Added comments to explain the duration and need for pre-emption
disable following hw-breakpoint exception.

Changelog - ver X
------------------
- Re-write the PPC64 patches for the new implementation of hw-breakpoints that
  uses the perf-layer.
- Rebased to commit 7622fc234190a37d4e9fe3ed944a2b61a63fca03 of -tip.

Changelog - ver IX
-------------------
- Invocation of user-defined callback will be 'trigger-after-execute' (except
  for ptrace).
- Creation of a new global per-CPU breakpoint structure to help invocation of
  user-defined callback from single-step handler.
(Changes made now)
- Validation before registration will fail only if the address does not match
  the kernel symbol's (if specified) resolved address
  (through kallsyms_lookup_name()).
- 'symbolsize' value is expected to within the range contained by the symbol's
  starting address and the end of a double-word boundary (8 Bytes).
- PPC64's arch-dependant code is now aware of 'cpumask' in 'struct hw_breakpoint'
  and can accomodate requests for a subset of CPUs in the system.
- Introduced arch_disable_hw_breakpoint() required for
  <enable><disable>_hw_breakpoint() APIs.

Changelog - ver VIII
-------------------
- Reverting changes to allow one-shot breakpoints only for ptrace requests.
- Minor changes in sanity checking in arch_validate_hwbkpt_settings().
- put_cpu_no_resched() is no longer available. Converted to put_cpu().

Changelog - ver VII
-------------------
- Allow the one-shot behaviour for exception handlers to be defined by the user.
  A new 'is_one_shot' flag is added to 'struct arch_hw_breakpoint'.

Changelog - ver VI
------------------
The task of identifying 'genuine' breakpoint exceptions from those caused by
'out-of-range' accesses turned out to be more tricky than originally thought.
Some changes to this effect were made in version IV of this patchset, but they
were not sufficient for user-space. Basically the breakpoint address received
through ptrace is always aligned to 8-bytes since ptrace receives an encoded
'data' (consisting of address | translation_enable | bkpt_type), and the size of
the symbol is not known. However for kernel-space addresses, the symbol-size can
be determined using kallsyms_lookup_size_offset() and this is used to check if
DAR (in the exception context) is
'bkpt_address <= DAR <= (bkpt_address + symbol_size)', failing which we conclude
it as a stray exception.

The following changes are made to enable check:
- Addition of a symbolsize field in 'struct arch_hw_breakpoint' field.
- Store the size of the 'watched' kernel symbol into 'symbolsize' field in
  arch_store_info(0 routine.
- Verify if the above described condition is true when is_one_shot is FALSE in
  hw_breakpoint_handler().

Changelog - ver V
------------------
- Breakpoint requests from ptrace (for user-space) are designed to be one-shot
in PPC64. The patch contains changes to retain this behaviour by returning early
in hw_breakpoint_handler() [without re-initialising DABR] and unregistering the
user-space request in ptrace_triggered(). It is safe to make a
unregister_user_hw_breakpoint() call from the breakpoint exception context
[through ptrace_triggered()] without giving rise to circular locking-dependancy.
This is because there can be no kernel code running on the CPU (which received
the exception) with the same spinlock held.

- Minor change in 'type' member of 'struct arch_hw_breakpoint' from u8 to 'int'.

Changelog - ver IV
------------------
- While DABR register requires double-word (8 bytes) aligned addresses, i.e.
the breakpoint is active over a range of 8 bytes, PPC64 allows byte-level
addressability. This may lead to stray exceptions which have to be ignored in
hw_breakpoint_handler(), when DAR != (Breakpoint request address). However DABR
will be populated with the requested breakpoint address aligned to the previous
double-word address. The code is now modified to store user-requested address
in 'bp->info.address' but update the DABR with a double-word aligned address.

- Please note that the Data Breakpoint facility in Xmon is broken as of 2.6.29
and the same has not been integrated into this facility as described in Ver I.

Changelog - ver III
------------------
- Patches are based on commit 08f16e060bf54bdc34f800ed8b5362cdeda75d8b of -tip
tree.

- The declarations in arch/powerpc/include/asm/hw_breakpoint.h are done only if
CONFIG_PPC64 is defined. This eliminates the need to conditionally include this
header file.

- load_debug_registers() is done in start_secondary() i.e. during CPU
initialisation.

- arch_check_va_<> routines in hw_breakpoint.c are now replaced with a much
simpler is_kernel_addr() check in arch_validate_hwbkpt_settings()

- Return code of hw_breakpoint_handler() when triggered due to Lazy debug
register switching is now changed to NOTIFY_STOP.

- The ptrace code no longer sets the TIF_DEBUG task flag as it is proposed to
be done in register_user_hw_breakpoint() routine.

- hw_breakpoint_handler() is now modified to use hbp_kernel_pos value to
  determine if the trigger was a user/kernel space address. The DAR register
  value is checked with the address stored in 'struct hw_breakpoint' to avoid
  handling of exceptions that belong to kprobe/Xmon.


Changelog - ver II
------------------
- Split the monolithic patch into six logical patches

- Changed the signature of arch_check_va_in_<user><kernel>space functions. They
  are now marked static.

- HB_NUM is now called as HBP_NUM (to preserve a consistent short-name
  convention)

- Introduced hw_breakpoint_disable() and changes to kexec code to disable
  breakpoints before a reboot.
  
- Minor changes in ptrace code to use macro-defined constants instead of
  numbers.

- Introduced a new constant definition INSTRUCTION_LEN in reg.h

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Patch 1/1] PPC64-HWBKPT: Implement hw-breakpoints for PPC64
  2010-03-08 18:12 [Patch 0/1] PPC64-HWBKPT: Hardware Breakpoint interfaces - ver XIV K.Prasad
@ 2010-03-08 18:14 ` K.Prasad
  2010-03-12  6:19   ` Benjamin Herrenschmidt
  2010-03-23  5:33   ` Paul Mackerras
  0 siblings, 2 replies; 26+ messages in thread
From: K.Prasad @ 2010-03-08 18:14 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Michael Neuling, Benjamin Herrenschmidt, shaggy,
	Frederic Weisbecker, David Gibson, paulus, Alan Stern,
	Roland McGrath

Implement perf-events based hw-breakpoint interfaces for PPC64 processors.
These interfaces help arbitrate requests from various users and schedules
them as appropriate.

Signed-off-by: K.Prasad <prasad@linux.vnet.ibm.com>
---
 arch/powerpc/Kconfig                     |    1 
 arch/powerpc/include/asm/hw_breakpoint.h |   54 ++++
 arch/powerpc/include/asm/processor.h     |    6 
 arch/powerpc/include/asm/reg.h           |    1 
 arch/powerpc/kernel/Makefile             |    2 
 arch/powerpc/kernel/hw_breakpoint.c      |  339 +++++++++++++++++++++++++++++++
 arch/powerpc/kernel/process.c            |    5 
 arch/powerpc/kernel/ptrace.c             |   79 +++++++
 arch/powerpc/mm/fault.c                  |   14 -
 9 files changed, 492 insertions(+), 9 deletions(-)

Index: linux-2.6.ppc64_test/arch/powerpc/include/asm/hw_breakpoint.h
===================================================================
--- /dev/null
+++ linux-2.6.ppc64_test/arch/powerpc/include/asm/hw_breakpoint.h
@@ -0,0 +1,54 @@
+#ifndef	_PPC64_HW_BREAKPOINT_H
+#define	_PPC64_HW_BREAKPOINT_H
+
+#ifdef	__KERNEL__
+#define	__ARCH_HW_BREAKPOINT_H
+#ifdef CONFIG_PPC64
+
+struct arch_hw_breakpoint {
+	u8		len; /* length of the target symbol */
+	int		type;
+	unsigned long	address;
+};
+
+#include <linux/kdebug.h>
+#include <asm/reg.h>
+#include <asm/system.h>
+
+/* Total number of available HW breakpoint registers */
+#define HBP_NUM 1
+
+struct perf_event;
+struct pmu;
+struct perf_sample_data;
+
+#define HW_BREAKPOINT_ALIGN 0x7
+/* Maximum permissible length of any HW Breakpoint */
+#define HW_BREAKPOINT_LEN 0x8
+
+extern int arch_validate_hwbkpt_settings(struct perf_event *bp,
+						struct task_struct *tsk);
+extern int hw_breakpoint_exceptions_notify(struct notifier_block *unused,
+						unsigned long val, void *data);
+int arch_install_hw_breakpoint(struct perf_event *bp);
+void arch_uninstall_hw_breakpoint(struct perf_event *bp);
+void hw_breakpoint_pmu_read(struct perf_event *bp);
+void hw_breakpoint_pmu_unthrottle(struct perf_event *bp);
+extern void flush_ptrace_hw_breakpoint(struct task_struct *tsk);
+
+extern struct pmu perf_ops_bp;
+extern void ptrace_triggered(struct perf_event *bp, int nmi,
+			struct perf_sample_data *data, struct pt_regs *regs);
+static inline void hw_breakpoint_disable(void)
+{
+	set_dabr(0);
+}
+
+#else
+static inline void hw_breakpoint_disable(void)
+{
+	/* Function is defined only on PPC64 for now */
+}
+#endif	/* CONFIG_PPC64 */
+#endif	/* __KERNEL__ */
+#endif	/* _PPC64_HW_BREAKPOINT_H */
Index: linux-2.6.ppc64_test/arch/powerpc/kernel/hw_breakpoint.c
===================================================================
--- /dev/null
+++ linux-2.6.ppc64_test/arch/powerpc/kernel/hw_breakpoint.c
@@ -0,0 +1,339 @@
+/*
+ * HW_breakpoint: a unified kernel/user-space hardware breakpoint facility,
+ * using the CPU's debug registers. Derived from
+ * "arch/x86/kernel/hw_breakpoint.c"
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright 2009 IBM Corporation
+ * Author: K.Prasad <prasad@linux.vnet.ibm.com>
+ *
+ */
+
+#include <linux/hw_breakpoint.h>
+#include <linux/notifier.h>
+#include <linux/kprobes.h>
+#include <linux/percpu.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/sched.h>
+#include <linux/init.h>
+#include <linux/smp.h>
+
+#include <asm/hw_breakpoint.h>
+#include <asm/processor.h>
+#include <asm/sstep.h>
+
+/*
+ * Store the 'bp' that caused the hw-breakpoint exception just before we
+ * single-step. Used to distinguish a single-step exception (due to a previous
+ * hw-breakpoint exception) from a normal one
+ */
+static DEFINE_PER_CPU(struct perf_event *, last_hit_bp);
+
+/*
+ * Stores the breakpoints currently in use on each breakpoint address
+ * register for each cpus
+ */
+static DEFINE_PER_CPU(struct perf_event *, bp_per_reg);
+
+/*
+ * Install a perf counter breakpoint.
+ *
+ * We seek a free debug address register and use it for this
+ * breakpoint.
+ *
+ * Atomic: we hold the counter->ctx->lock and we only handle variables
+ * and registers local to this cpu.
+ */
+int arch_install_hw_breakpoint(struct perf_event *bp)
+{
+	struct arch_hw_breakpoint *info = counter_arch_bp(bp);
+	struct perf_event **slot = &__get_cpu_var(bp_per_reg);
+
+	if (!*slot)
+		*slot = bp;
+	else {
+		WARN_ONCE(1, "Can't find any breakpoint slot");
+		return -EBUSY;
+	}
+
+	set_dabr(info->address | info->type | DABR_TRANSLATION);
+	return 0;
+}
+
+/*
+ * Uninstall the breakpoint contained in the given counter.
+ *
+ * First we search the debug address register it uses and then we disable
+ * it.
+ *
+ * Atomic: we hold the counter->ctx->lock and we only handle variables
+ * and registers local to this cpu.
+ */
+void arch_uninstall_hw_breakpoint(struct perf_event *bp)
+{
+	struct perf_event **slot = &__get_cpu_var(bp_per_reg);
+
+	if (*slot == bp)
+		*slot = NULL;
+	else {
+		WARN_ONCE(1, "Can't find the breakpoint slot");
+		return;
+	}
+	set_dabr(0);
+}
+
+/*
+ * Validate the arch-specific HW Breakpoint register settings
+ */
+int arch_validate_hwbkpt_settings(struct perf_event *bp,
+						struct task_struct *tsk)
+{
+	int is_kernel, ret = -EINVAL;
+	struct arch_hw_breakpoint *info = counter_arch_bp(bp);
+
+	if (!bp)
+		return ret;
+
+	switch (bp->attr.bp_type) {
+	case HW_BREAKPOINT_R:
+		info->type = DABR_DATA_READ;
+		break;
+	case HW_BREAKPOINT_W:
+		info->type = DABR_DATA_WRITE;
+		break;
+	case HW_BREAKPOINT_R | HW_BREAKPOINT_W:
+		info->type = (DABR_DATA_READ | DABR_DATA_WRITE);
+		break;
+	default:
+		return ret;
+	}
+	/* TODO: Check for a valid triggered function */
+	/* if (!bp->triggered)
+		return -EINVAL; */
+
+	is_kernel = is_kernel_addr(bp->attr.bp_addr);
+	if ((tsk && is_kernel) || (!tsk && !is_kernel))
+		return -EINVAL;
+
+	info->address = bp->attr.bp_addr;
+	info->len = bp->attr.bp_len;
+
+	/*
+	 * Since breakpoint length can be a maximum of HW_BREAKPOINT_LEN(8)
+	 * and breakpoint addresses are aligned to nearest double-word
+	 * HW_BREAKPOINT_ALIGN by rounding off to the lower address, the
+	 * 'symbolsize' should satisfy the check below.
+	 */
+	if (info->len >
+	    (HW_BREAKPOINT_LEN - (info->address & HW_BREAKPOINT_ALIGN)))
+		return -EINVAL;
+	return 0;
+}
+
+/*
+ * Handle debug exception notifications.
+ */
+int __kprobes hw_breakpoint_handler(struct die_args *args)
+{
+	int rc = NOTIFY_STOP;
+	struct perf_event *bp;
+	struct pt_regs *regs = args->regs;
+	unsigned long dar = regs->dar;
+	int cpu, is_kernel, stepped = 1;
+	struct arch_hw_breakpoint *info;
+
+	/* Disable breakpoints during exception handling */
+	set_dabr(0);
+	cpu = get_cpu();
+	/*
+	 * The counter may be concurrently released but that can only
+	 * occur from a call_rcu() path. We can then safely fetch
+	 * the breakpoint, use its callback, touch its counter
+	 * while we are in an rcu_read_lock() path.
+	 */
+	rcu_read_lock();
+
+	bp = per_cpu(bp_per_reg, cpu);
+	if (!bp)
+		goto out;
+	info = counter_arch_bp(bp);
+	is_kernel = is_kernel_addr(bp->attr.bp_addr);
+
+	/*
+	 * Verify if dar lies within the address range occupied by the symbol
+	 * being watched to filter extraneous exceptions.
+	 */
+	if (!((bp->attr.bp_addr <= dar) &&
+	    (dar <= (bp->attr.bp_addr + bp->attr.bp_len))))
+		/*
+		 * This exception is triggered not because of a memory access on
+		 * the monitored variable but in the double-word address range
+		 * in which it is contained. We will consume this exception,
+		 * considering it as 'noise'.
+		 */
+		goto restore_bp;
+
+	/*
+	 * Return early after invoking user-callback function without restoring
+	 * DABR if the breakpoint is from ptrace which always operates in
+	 * one-shot mode
+	 */
+	if (bp->overflow_handler == ptrace_triggered) {
+		perf_bp_event(bp, regs);
+		rc = NOTIFY_DONE;
+		goto out;
+	}
+
+	/*
+	 * Do not emulate user-space instructions from kernel-space,
+	 * instead single-step them.
+	 */
+	if (!is_kernel) {
+		current->thread.last_hit_ubp = bp;
+		regs->msr |= MSR_SE;
+		goto out;
+	}
+
+	stepped = emulate_step(regs, regs->nip);
+	/* emulate_step() could not execute it, single-step them */
+	if (stepped == 0) {
+		regs->msr |= MSR_SE;
+		per_cpu(last_hit_bp, cpu) = bp;
+		goto out;
+	}
+	/*
+	 * As a policy, the callback is invoked in a 'trigger-after-execute'
+	 * fashion
+	 */
+	perf_bp_event(bp, regs);
+
+restore_bp:
+	set_dabr(info->address | info->type | DABR_TRANSLATION);
+out:
+	rcu_read_unlock();
+	put_cpu();
+	return rc;
+}
+
+/*
+ * Handle single-step exceptions following a DABR hit.
+ */
+int __kprobes single_step_dabr_instruction(struct die_args *args)
+{
+	struct pt_regs *regs = args->regs;
+	int cpu = get_cpu();
+	int ret = NOTIFY_DONE;
+	siginfo_t info;
+	struct perf_event *bp = NULL, *kernel_bp, *user_bp;
+	struct arch_hw_breakpoint *bp_info;
+
+	/*
+	 * Identify the cause of single-stepping and find the corresponding
+	 * breakpoint structure
+	 */
+	user_bp = current->thread.last_hit_ubp;
+	kernel_bp = per_cpu(last_hit_bp, cpu);
+	if (user_bp) {
+		bp = user_bp;
+		current->thread.last_hit_ubp = NULL;
+	} else if (kernel_bp) {
+		bp = kernel_bp;
+		per_cpu(last_hit_bp, cpu) = NULL;
+	}
+
+	/*
+	 * Check if we are single-stepping as a result of a
+	 * previous HW Breakpoint exception
+	 */
+	if (!bp)
+		goto out;
+
+	bp_info = counter_arch_bp(bp);
+
+	/*
+	 * We shall invoke the user-defined callback function in the single
+	 * stepping handler to confirm to 'trigger-after-execute' semantics
+	 */
+	perf_bp_event(bp, regs);
+
+	/*
+	 * Do not disable MSR_SE if the process was already in
+	 * single-stepping mode. We cannot reliable detect single-step mode
+	 * for kernel-space breakpoints, so this cannot work along with other
+	 * debuggers (like KGDB, xmon) which may be single-stepping kernel code.
+	 */
+	if (!(user_bp && test_thread_flag(TIF_SINGLESTEP)))
+		regs->msr &= ~MSR_SE;
+
+	/* Deliver signal to user-space */
+	if (user_bp) {
+		info.si_signo = SIGTRAP;
+		info.si_errno = 0;
+		info.si_code = TRAP_HWBKPT;
+		info.si_addr = (void __user *)bp_info->address;
+		force_sig_info(SIGTRAP, &info, current);
+	}
+
+	set_dabr(bp_info->address | bp_info->type | DABR_TRANSLATION);
+	ret = NOTIFY_STOP;
+out:
+	put_cpu();
+	return ret;
+}
+
+/*
+ * Handle debug exception notifications.
+ */
+int __kprobes hw_breakpoint_exceptions_notify(
+		struct notifier_block *unused, unsigned long val, void *data)
+{
+	int ret = NOTIFY_DONE;
+
+	switch (val) {
+	case DIE_DABR_MATCH:
+		ret = hw_breakpoint_handler(data);
+		break;
+	case DIE_SSTEP:
+		ret = single_step_dabr_instruction(data);
+		break;
+	}
+
+	return ret;
+}
+
+/*
+ * Release the user breakpoints used by ptrace
+ */
+void flush_ptrace_hw_breakpoint(struct task_struct *tsk)
+{
+	struct thread_struct *t = &tsk->thread;
+
+	unregister_hw_breakpoint(t->ptrace_bps[0]);
+	t->ptrace_bps[0] = NULL;
+}
+
+void hw_breakpoint_pmu_read(struct perf_event *bp)
+{
+	/* TODO */
+}
+
+void hw_breakpoint_pmu_unthrottle(struct perf_event *bp)
+{
+	/* TODO */
+}
+
+
Index: linux-2.6.ppc64_test/arch/powerpc/Kconfig
===================================================================
--- linux-2.6.ppc64_test.orig/arch/powerpc/Kconfig
+++ linux-2.6.ppc64_test/arch/powerpc/Kconfig
@@ -140,6 +140,7 @@ config PPC
 	select HAVE_SYSCALL_WRAPPERS if PPC64
 	select GENERIC_ATOMIC64 if PPC32
 	select HAVE_PERF_EVENTS
+	select HAVE_HW_BREAKPOINT if PPC64
 
 config EARLY_PRINTK
 	bool
Index: linux-2.6.ppc64_test/arch/powerpc/kernel/Makefile
===================================================================
--- linux-2.6.ppc64_test.orig/arch/powerpc/kernel/Makefile
+++ linux-2.6.ppc64_test/arch/powerpc/kernel/Makefile
@@ -33,7 +33,7 @@ obj-y				:= cputable.o ptrace.o syscalls
 obj-y				+= vdso32/
 obj-$(CONFIG_PPC64)		+= setup_64.o sys_ppc32.o \
 				   signal_64.o ptrace32.o \
-				   paca.o nvram_64.o firmware.o
+				   paca.o nvram_64.o firmware.o hw_breakpoint.o
 obj-$(CONFIG_PPC_BOOK3S_64)	+= cpu_setup_ppc970.o cpu_setup_pa6t.o
 obj64-$(CONFIG_RELOCATABLE)	+= reloc_64.o
 obj-$(CONFIG_PPC_BOOK3E_64)	+= exceptions-64e.o
Index: linux-2.6.ppc64_test/arch/powerpc/include/asm/reg.h
===================================================================
--- linux-2.6.ppc64_test.orig/arch/powerpc/include/asm/reg.h
+++ linux-2.6.ppc64_test/arch/powerpc/include/asm/reg.h
@@ -180,6 +180,7 @@
 #define   CTRL_TE	0x00c00000	/* thread enable */
 #define   CTRL_RUNLATCH	0x1
 #define SPRN_DABR	0x3F5	/* Data Address Breakpoint Register */
+#define   HBP_NUM	1	/* Number of physical HW breakpoint registers */
 #define   DABR_TRANSLATION	(1UL << 2)
 #define   DABR_DATA_WRITE	(1UL << 1)
 #define   DABR_DATA_READ	(1UL << 0)
Index: linux-2.6.ppc64_test/arch/powerpc/mm/fault.c
===================================================================
--- linux-2.6.ppc64_test.orig/arch/powerpc/mm/fault.c
+++ linux-2.6.ppc64_test/arch/powerpc/mm/fault.c
@@ -137,6 +137,12 @@ int __kprobes do_page_fault(struct pt_re
 		error_code &= 0x48200000;
 	else
 		is_write = error_code & DSISR_ISSTORE;
+
+	if (error_code & DSISR_DABRMATCH) {
+		/* DABR match */
+		do_dabr(regs, address, error_code);
+		return 0;
+	}
 #else
 	is_write = error_code & ESR_DST;
 #endif /* CONFIG_4xx || CONFIG_BOOKE */
@@ -151,14 +157,6 @@ int __kprobes do_page_fault(struct pt_re
 	if (!user_mode(regs) && (address >= TASK_SIZE))
 		return SIGSEGV;
 
-#if !(defined(CONFIG_4xx) || defined(CONFIG_BOOKE))
-  	if (error_code & DSISR_DABRMATCH) {
-		/* DABR match */
-		do_dabr(regs, address, error_code);
-		return 0;
-	}
-#endif /* !(CONFIG_4xx || CONFIG_BOOKE)*/
-
 	if (in_atomic() || mm == NULL) {
 		if (!user_mode(regs))
 			return SIGSEGV;
Index: linux-2.6.ppc64_test/arch/powerpc/include/asm/processor.h
===================================================================
--- linux-2.6.ppc64_test.orig/arch/powerpc/include/asm/processor.h
+++ linux-2.6.ppc64_test/arch/powerpc/include/asm/processor.h
@@ -209,6 +209,12 @@ struct thread_struct {
 #ifdef CONFIG_PPC64
 	unsigned long	start_tb;	/* Start purr when proc switched in */
 	unsigned long	accum_tb;	/* Total accumilated purr for process */
+	struct perf_event *ptrace_bps[HBP_NUM];
+	/*
+	 * Point to the hw-breakpoint last. Helps safe pre-emption and
+	 * hw-breakpoint re-enablement.
+	 */
+	struct perf_event *last_hit_ubp;
 #endif
 	unsigned long	dabr;		/* Data address breakpoint register */
 #ifdef CONFIG_ALTIVEC
Index: linux-2.6.ppc64_test/arch/powerpc/kernel/ptrace.c
===================================================================
--- linux-2.6.ppc64_test.orig/arch/powerpc/kernel/ptrace.c
+++ linux-2.6.ppc64_test/arch/powerpc/kernel/ptrace.c
@@ -32,6 +32,8 @@
 #ifdef CONFIG_PPC32
 #include <linux/module.h>
 #endif
+#include <linux/hw_breakpoint.h>
+#include <linux/perf_event.h>
 
 #include <asm/uaccess.h>
 #include <asm/page.h>
@@ -763,9 +765,32 @@ void user_disable_single_step(struct tas
 	clear_tsk_thread_flag(task, TIF_SINGLESTEP);
 }
 
+void ptrace_triggered(struct perf_event *bp, int nmi,
+		      struct perf_sample_data *data, struct pt_regs *regs)
+{
+	struct perf_event_attr attr;
+
+	/*
+	 * Disable the breakpoint request here since ptrace has defined a
+	 * one-shot behaviour for breakpoint exceptions in PPC64.
+	 * The SIGTRAP signal is generated automatically for us in do_dabr().
+	 * We don't have to do anything about that here
+	 */
+	attr = bp->attr;
+	attr.disabled = true;
+	modify_user_hw_breakpoint(bp, &attr);
+}
+
 int ptrace_set_debugreg(struct task_struct *task, unsigned long addr,
 			       unsigned long data)
 {
+#ifdef CONFIG_PPC64
+	int ret;
+	struct thread_struct *thread = &(task->thread);
+	struct perf_event *bp;
+	struct perf_event_attr attr;
+#endif /* CONFIG_PPC64 */
+
 	/* For ppc64 we support one DABR and no IABR's at the moment (ppc64).
 	 *  For embedded processors we support one DAC and no IAC's at the
 	 *  moment.
@@ -793,6 +818,60 @@ int ptrace_set_debugreg(struct task_stru
 	/* Ensure breakpoint translation bit is set */
 	if (data && !(data & DABR_TRANSLATION))
 		return -EIO;
+#ifdef CONFIG_PPC64
+	bp = thread->ptrace_bps[0];
+	if (data == 0) {
+		if (bp) {
+			unregister_hw_breakpoint(bp);
+			thread->ptrace_bps[0] = NULL;
+		}
+		return 0;
+	}
+	if (bp) {
+		attr = bp->attr;
+		attr.bp_addr = data & ~HW_BREAKPOINT_ALIGN;
+
+		switch (data & (DABR_DATA_WRITE | DABR_DATA_READ)) {
+		case DABR_DATA_READ:
+			attr.bp_type = HW_BREAKPOINT_R;
+			break;
+		case DABR_DATA_WRITE:
+			attr.bp_type = HW_BREAKPOINT_W;
+			break;
+		case (DABR_DATA_WRITE | DABR_DATA_READ):
+			attr.bp_type = HW_BREAKPOINT_R | HW_BREAKPOINT_W;
+			break;
+		}
+		ret =  modify_user_hw_breakpoint(bp, &attr);
+		if (ret)
+			return ret;
+		thread->ptrace_bps[0] = bp;
+		thread->dabr = data;
+		return 0;
+	}
+
+	/* Create a new breakpoint request if one doesn't exist already */
+	hw_breakpoint_init(&attr);
+	attr.bp_addr = data & ~HW_BREAKPOINT_ALIGN;
+	switch (data & (DABR_DATA_WRITE | DABR_DATA_READ)) {
+	case DABR_DATA_READ:
+		attr.bp_type = HW_BREAKPOINT_R;
+		break;
+	case DABR_DATA_WRITE:
+		attr.bp_type = HW_BREAKPOINT_W;
+		break;
+	case (DABR_DATA_WRITE | DABR_DATA_READ):
+		attr.bp_type = HW_BREAKPOINT_R | HW_BREAKPOINT_W;
+		break;
+	}
+	thread->ptrace_bps[0] = bp = register_user_hw_breakpoint(&attr,
+							ptrace_triggered, task);
+	if (IS_ERR(bp)) {
+		thread->ptrace_bps[0] = NULL;
+		return PTR_ERR(bp);
+	}
+
+#endif /* CONFIG_PPC64 */
 
 	/* Move contents to the DABR register */
 	task->thread.dabr = data;
Index: linux-2.6.ppc64_test/arch/powerpc/kernel/process.c
===================================================================
--- linux-2.6.ppc64_test.orig/arch/powerpc/kernel/process.c
+++ linux-2.6.ppc64_test/arch/powerpc/kernel/process.c
@@ -48,6 +48,7 @@
 #include <asm/machdep.h>
 #include <asm/time.h>
 #include <asm/syscalls.h>
+#include <asm/hw_breakpoint.h>
 #ifdef CONFIG_PPC64
 #include <asm/firmware.h>
 #endif
@@ -459,8 +460,11 @@ struct task_struct *__switch_to(struct t
 #ifdef CONFIG_PPC_ADV_DEBUG_REGS
 	switch_booke_debug_regs(&new->thread);
 #else
+/* For PPC64, we use the hw-breakpoint interfaces that would schedule DABR */
+#ifndef CONFIG_PPC64
 	if (unlikely(__get_cpu_var(current_dabr) != new->thread.dabr))
 		set_dabr(new->thread.dabr);
+#endif /* CONFIG_PPC64 */
 #endif
 
 
@@ -479,6 +483,7 @@ struct task_struct *__switch_to(struct t
 		old_thread->accum_tb += (current_tb - start_tb);
 		new_thread->start_tb = current_tb;
 	}
+	flush_ptrace_hw_breakpoint(current);
 #endif
 
 	local_irq_save(flags);

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch 1/1] PPC64-HWBKPT: Implement hw-breakpoints for PPC64
  2010-03-08 18:14 ` [Patch 1/1] PPC64-HWBKPT: Implement hw-breakpoints for PPC64 K.Prasad
@ 2010-03-12  6:19   ` Benjamin Herrenschmidt
  2010-03-15  6:29     ` K.Prasad
  2010-03-23  5:33   ` Paul Mackerras
  1 sibling, 1 reply; 26+ messages in thread
From: Benjamin Herrenschmidt @ 2010-03-12  6:19 UTC (permalink / raw)
  To: prasad
  Cc: Michael Neuling, shaggy, Frederic Weisbecker, David Gibson,
	linuxppc-dev, paulus, Alan Stern, Roland McGrath


> Index: linux-2.6.ppc64_test/arch/powerpc/include/asm/hw_breakpoint.h
> ===================================================================
> --- /dev/null
> +++ linux-2.6.ppc64_test/arch/powerpc/include/asm/hw_breakpoint.h
> @@ -0,0 +1,54 @@
> +#ifndef	_PPC64_HW_BREAKPOINT_H
> +#define	_PPC64_HW_BREAKPOINT_H
> +
> +#ifdef	__KERNEL__
> +#define	__ARCH_HW_BREAKPOINT_H
> +#ifdef CONFIG_PPC64
> +
> +struct arch_hw_breakpoint {
> +	u8		len; /* length of the target symbol */

I don't understand the usage of the word "symbol" above, can you
explain ?

> +	int		type;
> +	unsigned long	address;
> +};
> +
> +#include <linux/kdebug.h>
> +#include <asm/reg.h>
> +#include <asm/system.h>
> +
> +/* Total number of available HW breakpoint registers */
> +#define HBP_NUM 1
> +
> +struct perf_event;
> +struct pmu;
> +struct perf_sample_data;
> +
> +#define HW_BREAKPOINT_ALIGN 0x7
> +/* Maximum permissible length of any HW Breakpoint */
> +#define HW_BREAKPOINT_LEN 0x8

That's a lot of server-only hard wired assumptions... I suppose the
DABR emulation of BookE will catch but do you intend to provide
proper BookE support at some stage ?

> +static inline void hw_breakpoint_disable(void)
> +{
> +	set_dabr(0);
> +}

How much of these set_dabr() I see here are going to interact with
ptrace ? Is there some exclusion going on between ptrace and perf
event use of the DABR or none at all ? Or are you replacing the ptrace
bits ?

> +/*
> + * Install a perf counter breakpoint.
> + *
> + * We seek a free debug address register and use it for this
> + * breakpoint.
> + *
> + * Atomic: we hold the counter->ctx->lock and we only handle variables
> + * and registers local to this cpu.
> + */
> +int arch_install_hw_breakpoint(struct perf_event *bp)
> +{
> +	struct arch_hw_breakpoint *info = counter_arch_bp(bp);
> +	struct perf_event **slot = &__get_cpu_var(bp_per_reg);
> +
> +	if (!*slot)
> +		*slot = bp;
> +	else {
> +		WARN_ONCE(1, "Can't find any breakpoint slot");
> +		return -EBUSY;
> +	}
> +
> +	set_dabr(info->address | info->type | DABR_TRANSLATION);
> +	return 0;
> +}

Under which circumstances will the upper layer call that more than
once ? If it's a legit thing to do, then the WARN_ONCE() is a heavy
hammer here. I wouldn't even printk.... or only pr_debug() if it's
really worth it.

Or is that something that should just not happen ?

I would also use this coding style which is more compact and avoids
the horrible (!*slot) :

	/* Check if the slot is busy */
	if (*slot)
		return -EBUSY;
	set_dabr(...);

> +/*
> + * Uninstall the breakpoint contained in the given counter.
> + *
> + * First we search the debug address register it uses and then we disable
> + * it.
> + *
> + * Atomic: we hold the counter->ctx->lock and we only handle variables
> + * and registers local to this cpu.
> + */
> +void arch_uninstall_hw_breakpoint(struct perf_event *bp)
> +{
> +	struct perf_event **slot = &__get_cpu_var(bp_per_reg);
> +
> +	if (*slot == bp)
> +		*slot = NULL;
> +	else {
> +		WARN_ONCE(1, "Can't find the breakpoint slot");
> +		return;
> +	}
> +	set_dabr(0);
> +}

Similar coding style issues... That one might be worth the warning
as I suppose the core should -really- not try to uninstall a bp that
hasn't been installed in the first place.

> +/*
> + * Validate the arch-specific HW Breakpoint register settings
> + */
> +int arch_validate_hwbkpt_settings(struct perf_event *bp,
> +						struct task_struct *tsk)
> +{
> +	int is_kernel, ret = -EINVAL;
> +	struct arch_hw_breakpoint *info = counter_arch_bp(bp);
> +
> +	if (!bp)
> +		return ret;
> +
> +	switch (bp->attr.bp_type) {
> +	case HW_BREAKPOINT_R:
> +		info->type = DABR_DATA_READ;
> +		break;
> +	case HW_BREAKPOINT_W:
> +		info->type = DABR_DATA_WRITE;
> +		break;
> +	case HW_BREAKPOINT_R | HW_BREAKPOINT_W:
> +		info->type = (DABR_DATA_READ | DABR_DATA_WRITE);
> +		break;
> +	default:
> +		return ret;
> +	}

I'm not -too- fan of the above, I suppose I would have written it a bit
differently using if's but that's not a big deal... however:

> +	/* TODO: Check for a valid triggered function */
> +	/* if (!bp->triggered)
> +		return -EINVAL; */

What is that ? Is the patch incomplete ? Don't leave commented out code
in there. If you think there's a worthwhile improvement, then add a
comment with maybe a bit more explanations, and make it clear that the
patch is still useful without the code, but don't just leave commented
out code like that without a good reason. A good reason would be some
optional debug stuff for example, but then an ifdef is preferrable to
comments.

> +	is_kernel = is_kernel_addr(bp->attr.bp_addr);
> +	if ((tsk && is_kernel) || (!tsk && !is_kernel))
> +		return -EINVAL;
> +
> +	info->address = bp->attr.bp_addr;
> +	info->len = bp->attr.bp_len;
> +
> +	/*
> +	 * Since breakpoint length can be a maximum of HW_BREAKPOINT_LEN(8)
> +	 * and breakpoint addresses are aligned to nearest double-word
> +	 * HW_BREAKPOINT_ALIGN by rounding off to the lower address, the
> +	 * 'symbolsize' should satisfy the check below.
> +	 */
> +	if (info->len >
> +	    (HW_BREAKPOINT_LEN - (info->address & HW_BREAKPOINT_ALIGN)))
> +		return -EINVAL;
> +	return 0;
> +}
> +
> +/*
> + * Handle debug exception notifications.
> + */
> +int __kprobes hw_breakpoint_handler(struct die_args *args)
> +{
> +	int rc = NOTIFY_STOP;
> +	struct perf_event *bp;
> +	struct pt_regs *regs = args->regs;
> +	unsigned long dar = regs->dar;
> +	int cpu, is_kernel, stepped = 1;
> +	struct arch_hw_breakpoint *info;
> +
> +	/* Disable breakpoints during exception handling */
> +	set_dabr(0);
> +	cpu = get_cpu();

So there's something a bit weird here. set_dabr() will clear the DABR on
the local CPU, and you do that before you disable preempt. So you may
have preempted and be on another CPU, is that allright ? IE. Are you
dealing with that original CPU still having the DABR active and you now
clearing a different one ?
 
> +	/*
> +	 * The counter may be concurrently released but that can only
> +	 * occur from a call_rcu() path. We can then safely fetch
> +	 * the breakpoint, use its callback, touch its counter
> +	 * while we are in an rcu_read_lock() path.
> +	 */
> +	rcu_read_lock();
> +
> +	bp = per_cpu(bp_per_reg, cpu);
> +	if (!bp)
> +		goto out;

So this is the bp_per_reg of a different CPU if you had migrated
earlier. So you -did- hit the BP on, let's say CPU 0, but since you are
now on CPU 1 you won't handle it ? Weird...

> +	info = counter_arch_bp(bp);
> +	is_kernel = is_kernel_addr(bp->attr.bp_addr);
> +
> +	/*
> +	 * Verify if dar lies within the address range occupied by the symbol
> +	 * being watched to filter extraneous exceptions.
> +	 */
> +	if (!((bp->attr.bp_addr <= dar) &&
> +	    (dar <= (bp->attr.bp_addr + bp->attr.bp_len))))
> +		/*
> +		 * This exception is triggered not because of a memory access on
> +		 * the monitored variable but in the double-word address range
> +		 * in which it is contained. We will consume this exception,
> +		 * considering it as 'noise'.
> +		 */
> +		goto restore_bp;
> +
> +	/*
> +	 * Return early after invoking user-callback function without restoring
> +	 * DABR if the breakpoint is from ptrace which always operates in
> +	 * one-shot mode
> +	 */
> +	if (bp->overflow_handler == ptrace_triggered) {
> +		perf_bp_event(bp, regs);
> +		rc = NOTIFY_DONE;
> +		goto out;
> +	}
> +
> +	/*
> +	 * Do not emulate user-space instructions from kernel-space,
> +	 * instead single-step them.
> +	 */
> +	if (!is_kernel) {
> +		current->thread.last_hit_ubp = bp;
> +		regs->msr |= MSR_SE;
> +		goto out;
> +	}

So what is this ? When you hit a bp, you switch to single step ? Out of
curiosity, why ?

> +	stepped = emulate_step(regs, regs->nip);
> +	/* emulate_step() could not execute it, single-step them */
> +	if (stepped == 0) {
> +		regs->msr |= MSR_SE;
> +		per_cpu(last_hit_bp, cpu) = bp;
> +		goto out;
> +	}
> +	/*
> +	 * As a policy, the callback is invoked in a 'trigger-after-execute'
> +	 * fashion
> +	 */
> +	perf_bp_event(bp, regs);
> +
> +restore_bp:
> +	set_dabr(info->address | info->type | DABR_TRANSLATION);

So in my preempt case, you hit the DABR on CPU 0, migrated to CPU 1
before you get into this function, and now you are modifying CPU 1
DABR... 

I think we need to change the asm so that you are called with interrupts
off from handle_page_fault() or so.

Basicallym in do_hash_page, make the andis 0xa450 go out of line, check
for DSISR_DABRMATCH specifically, and in this case go to an entirely
different function than handle_page_fault->do_page_fault(), something
like handle_dabr_fault->do_dabr() which uses DISABLE_INTS instead
of ENABLE_INTS :-)

We also need the same fix in 32-bit I suppose.

Note while looking at it that it looks like we have a similar issue with
program checks. We fixed it on 32-bit but not on 64-bit. We should keep
interrupts masked basically when going progranm_check_exception(). It
will unmask them if/when needed.

> +out:
> +	rcu_read_unlock();
> +	put_cpu();
> +	return rc;
> +}
> +
> +/*
> + * Handle single-step exceptions following a DABR hit.
> + */
> +int __kprobes single_step_dabr_instruction(struct die_args *args)
> +{
> +	struct pt_regs *regs = args->regs;
> +	int cpu = get_cpu();
> +	int ret = NOTIFY_DONE;
> +	siginfo_t info;
> +	struct perf_event *bp = NULL, *kernel_bp, *user_bp;
> +	struct arch_hw_breakpoint *bp_info;
> +
> +	/*
> +	 * Identify the cause of single-stepping and find the corresponding
> +	 * breakpoint structure
> +	 */
> +	user_bp = current->thread.last_hit_ubp;
> +	kernel_bp = per_cpu(last_hit_bp, cpu);
> +	if (user_bp) {
> +		bp = user_bp;
> +		current->thread.last_hit_ubp = NULL;
> +	} else if (kernel_bp) {
> +		bp = kernel_bp;
> +		per_cpu(last_hit_bp, cpu) = NULL;
> +	}

Hopefully you don't have this problem here, so you probably don't need
get/put_cpu() but that won't hurt, since single_step should hopefully
always have interrupts off.

> +	/*
> +	 * Check if we are single-stepping as a result of a
> +	 * previous HW Breakpoint exception
> +	 */
> +	if (!bp)
> +		goto out;
> +
> +	bp_info = counter_arch_bp(bp);
> +
> +	/*
> +	 * We shall invoke the user-defined callback function in the single
> +	 * stepping handler to confirm to 'trigger-after-execute' semantics
> +	 */
> +	perf_bp_event(bp, regs);
> +
> +	/*
> +	 * Do not disable MSR_SE if the process was already in
> +	 * single-stepping mode. We cannot reliable detect single-step mode
> +	 * for kernel-space breakpoints, so this cannot work along with other
> +	 * debuggers (like KGDB, xmon) which may be single-stepping kernel code.
> +	 */
> +	if (!(user_bp && test_thread_flag(TIF_SINGLESTEP)))
> +		regs->msr &= ~MSR_SE;
> +
> +	/* Deliver signal to user-space */
> +	if (user_bp) {
> +		info.si_signo = SIGTRAP;
> +		info.si_errno = 0;
> +		info.si_code = TRAP_HWBKPT;
> +		info.si_addr = (void __user *)bp_info->address;
> +		force_sig_info(SIGTRAP, &info, current);
> +	}
> +
> +	set_dabr(bp_info->address | bp_info->type | DABR_TRANSLATION);
> +	ret = NOTIFY_STOP;
> +out:
> +	put_cpu();
> +	return ret;
> +}
> +
> +/*
> + * Handle debug exception notifications.
> + */
> +int __kprobes hw_breakpoint_exceptions_notify(
> +		struct notifier_block *unused, unsigned long val, void *data)
> +{
> +	int ret = NOTIFY_DONE;
> +
> +	switch (val) {
> +	case DIE_DABR_MATCH:
> +		ret = hw_breakpoint_handler(data);
> +		break;
> +	case DIE_SSTEP:
> +		ret = single_step_dabr_instruction(data);
> +		break;
> +	}
> +
> +	return ret;
> +}
> +
> +/*
> + * Release the user breakpoints used by ptrace
> + */
> +void flush_ptrace_hw_breakpoint(struct task_struct *tsk)
> +{
> +	struct thread_struct *t = &tsk->thread;
> +
> +	unregister_hw_breakpoint(t->ptrace_bps[0]);
> +	t->ptrace_bps[0] = NULL;
> +}

Ok, so I see that you call that on context switch. But where do you
re-install the breakpoint for the "new" process ?

See below...

> +void hw_breakpoint_pmu_read(struct perf_event *bp)
> +{
> +	/* TODO */
> +}
> +
> +void hw_breakpoint_pmu_unthrottle(struct perf_event *bp)
> +{
> +	/* TODO */
> +}
> +
> +
> Index: linux-2.6.ppc64_test/arch/powerpc/Kconfig
> ===================================================================
> --- linux-2.6.ppc64_test.orig/arch/powerpc/Kconfig
> +++ linux-2.6.ppc64_test/arch/powerpc/Kconfig
> @@ -140,6 +140,7 @@ config PPC
>  	select HAVE_SYSCALL_WRAPPERS if PPC64
>  	select GENERIC_ATOMIC64 if PPC32
>  	select HAVE_PERF_EVENTS
> +	select HAVE_HW_BREAKPOINT if PPC64

Why 64-bit only ? ppc32 has DABR too. In fact BookE also provides DABR
emulation.

Also, all your PPC64 stuff are going to show up on BookE 64-bit which
might not be what you wanted...

>  config EARLY_PRINTK
>  	bool
> Index: linux-2.6.ppc64_test/arch/powerpc/kernel/Makefile
> ===================================================================
> --- linux-2.6.ppc64_test.orig/arch/powerpc/kernel/Makefile
> +++ linux-2.6.ppc64_test/arch/powerpc/kernel/Makefile
> @@ -33,7 +33,7 @@ obj-y				:= cputable.o ptrace.o syscalls
>  obj-y				+= vdso32/
>  obj-$(CONFIG_PPC64)		+= setup_64.o sys_ppc32.o \
>  				   signal_64.o ptrace32.o \
> -				   paca.o nvram_64.o firmware.o
> +				   paca.o nvram_64.o firmware.o hw_breakpoint.o
>  obj-$(CONFIG_PPC_BOOK3S_64)	+= cpu_setup_ppc970.o cpu_setup_pa6t.o
>  obj64-$(CONFIG_RELOCATABLE)	+= reloc_64.o
>  obj-$(CONFIG_PPC_BOOK3E_64)	+= exceptions-64e.o
> Index: linux-2.6.ppc64_test/arch/powerpc/include/asm/reg.h
> ===================================================================
> --- linux-2.6.ppc64_test.orig/arch/powerpc/include/asm/reg.h
> +++ linux-2.6.ppc64_test/arch/powerpc/include/asm/reg.h
> @@ -180,6 +180,7 @@
>  #define   CTRL_TE	0x00c00000	/* thread enable */
>  #define   CTRL_RUNLATCH	0x1
>  #define SPRN_DABR	0x3F5	/* Data Address Breakpoint Register */
> +#define   HBP_NUM	1	/* Number of physical HW breakpoint registers */

The above is not quite right. First you already define that in
hw_breakpoint.h. Then, this is too short an identifier for such a
generic file. Finally, it should not be in reg.h since it can vary
from processor to processor. If you want to do things properly, then
add some kind of info about the debug capabilities to cputable.

Please sync with Shaggy so it makes sense on BookE as well.

>  #define   DABR_TRANSLATION	(1UL << 2)
>  #define   DABR_DATA_WRITE	(1UL << 1)
>  #define   DABR_DATA_READ	(1UL << 0)
> Index: linux-2.6.ppc64_test/arch/powerpc/mm/fault.c
> ===================================================================
> --- linux-2.6.ppc64_test.orig/arch/powerpc/mm/fault.c
> +++ linux-2.6.ppc64_test/arch/powerpc/mm/fault.c
> @@ -137,6 +137,12 @@ int __kprobes do_page_fault(struct pt_re
>  		error_code &= 0x48200000;
>  	else
>  		is_write = error_code & DSISR_ISSTORE;
> +
> +	if (error_code & DSISR_DABRMATCH) {
> +		/* DABR match */
> +		do_dabr(regs, address, error_code);
> +		return 0;
> +	}

Now that's interesting. I have the feeling that the moving up of this
might actually be a bug fix :-) But it's still wrong due to interrupts
being enabled as I explained earlier. We probably want to make it a
different path out of head_*.S

>  #else
>  	is_write = error_code & ESR_DST;
>  #endif /* CONFIG_4xx || CONFIG_BOOKE */
> @@ -151,14 +157,6 @@ int __kprobes do_page_fault(struct pt_re
>  	if (!user_mode(regs) && (address >= TASK_SIZE))
>  		return SIGSEGV;
>  
> -#if !(defined(CONFIG_4xx) || defined(CONFIG_BOOKE))
> -  	if (error_code & DSISR_DABRMATCH) {
> -		/* DABR match */
> -		do_dabr(regs, address, error_code);
> -		return 0;
> -	}
> -#endif /* !(CONFIG_4xx || CONFIG_BOOKE)*/
> -
>  	if (in_atomic() || mm == NULL) {
>  		if (!user_mode(regs))
>  			return SIGSEGV;
> Index: linux-2.6.ppc64_test/arch/powerpc/include/asm/processor.h
> ===================================================================
> --- linux-2.6.ppc64_test.orig/arch/powerpc/include/asm/processor.h
> +++ linux-2.6.ppc64_test/arch/powerpc/include/asm/processor.h
> @@ -209,6 +209,12 @@ struct thread_struct {
>  #ifdef CONFIG_PPC64
>  	unsigned long	start_tb;	/* Start purr when proc switched in */
>  	unsigned long	accum_tb;	/* Total accumilated purr for process */
> +	struct perf_event *ptrace_bps[HBP_NUM];

So you should probably call that MAX_HW_BREAKPOINTS and reflect the fact
that it can be bigger. Or you have a pointer to some optional ptrace
BP structure that handle what is needed, and can be allocated lazily
by ptrace only when needed rather than always carrying this around in
the thread_struct.

> +	/*
> +	 * Point to the hw-breakpoint last. Helps safe pre-emption and
> +	 * hw-breakpoint re-enablement.
> +	 */
> +	struct perf_event *last_hit_ubp;

The comment doesn't make much sense. Preemption doesn't seem quite right
to me unless I missed something and the comment is either too much or
not enough to understand what this is for.

>  #endif
>  	unsigned long	dabr;		/* Data address breakpoint register */
>  #ifdef CONFIG_ALTIVEC
> Index: linux-2.6.ppc64_test/arch/powerpc/kernel/ptrace.c
> ===================================================================
> --- linux-2.6.ppc64_test.orig/arch/powerpc/kernel/ptrace.c
> +++ linux-2.6.ppc64_test/arch/powerpc/kernel/ptrace.c
> @@ -32,6 +32,8 @@
>  #ifdef CONFIG_PPC32
>  #include <linux/module.h>
>  #endif
> +#include <linux/hw_breakpoint.h>
> +#include <linux/perf_event.h>
>  
>  #include <asm/uaccess.h>
>  #include <asm/page.h>
> @@ -763,9 +765,32 @@ void user_disable_single_step(struct tas
>  	clear_tsk_thread_flag(task, TIF_SINGLESTEP);
>  }
>  
> +void ptrace_triggered(struct perf_event *bp, int nmi,
> +		      struct perf_sample_data *data, struct pt_regs *regs)
> +{
> +	struct perf_event_attr attr;
> +
> +	/*
> +	 * Disable the breakpoint request here since ptrace has defined a
> +	 * one-shot behaviour for breakpoint exceptions in PPC64.
> +	 * The SIGTRAP signal is generated automatically for us in do_dabr().
> +	 * We don't have to do anything about that here
> +	 */
> +	attr = bp->attr;
> +	attr.disabled = true;
> +	modify_user_hw_breakpoint(bp, &attr);
> +}
> +
>  int ptrace_set_debugreg(struct task_struct *task, unsigned long addr,
>  			       unsigned long data)
>  {
> +#ifdef CONFIG_PPC64
> +	int ret;
> +	struct thread_struct *thread = &(task->thread);
> +	struct perf_event *bp;
> +	struct perf_event_attr attr;
> +#endif /* CONFIG_PPC64 */
> +
>  	/* For ppc64 we support one DABR and no IABR's at the moment (ppc64).
>  	 *  For embedded processors we support one DAC and no IAC's at the
>  	 *  moment.
> @@ -793,6 +818,60 @@ int ptrace_set_debugreg(struct task_stru
>  	/* Ensure breakpoint translation bit is set */
>  	if (data && !(data & DABR_TRANSLATION))
>  		return -EIO;
> +#ifdef CONFIG_PPC64
> +	bp = thread->ptrace_bps[0];
> +	if (data == 0) {
> +		if (bp) {
> +			unregister_hw_breakpoint(bp);
> +			thread->ptrace_bps[0] = NULL;
> +		}
> +		return 0;
> +	}
> +	if (bp) {
> +		attr = bp->attr;
> +		attr.bp_addr = data & ~HW_BREAKPOINT_ALIGN;
> +
> +		switch (data & (DABR_DATA_WRITE | DABR_DATA_READ)) {
> +		case DABR_DATA_READ:
> +			attr.bp_type = HW_BREAKPOINT_R;
> +			break;
> +		case DABR_DATA_WRITE:
> +			attr.bp_type = HW_BREAKPOINT_W;
> +			break;
> +		case (DABR_DATA_WRITE | DABR_DATA_READ):
> +			attr.bp_type = HW_BREAKPOINT_R | HW_BREAKPOINT_W;
> +			break;
> +		}
> +		ret =  modify_user_hw_breakpoint(bp, &attr);
> +		if (ret)
> +			return ret;
> +		thread->ptrace_bps[0] = bp;
> +		thread->dabr = data;
> +		return 0;
> +	}
> +
> +	/* Create a new breakpoint request if one doesn't exist already */
> +	hw_breakpoint_init(&attr);
> +	attr.bp_addr = data & ~HW_BREAKPOINT_ALIGN;
> +	switch (data & (DABR_DATA_WRITE | DABR_DATA_READ)) {
> +	case DABR_DATA_READ:
> +		attr.bp_type = HW_BREAKPOINT_R;
> +		break;
> +	case DABR_DATA_WRITE:
> +		attr.bp_type = HW_BREAKPOINT_W;
> +		break;
> +	case (DABR_DATA_WRITE | DABR_DATA_READ):
> +		attr.bp_type = HW_BREAKPOINT_R | HW_BREAKPOINT_W;
> +		break;
> +	}
> +	thread->ptrace_bps[0] = bp = register_user_hw_breakpoint(&attr,
> +							ptrace_triggered, task);
> +	if (IS_ERR(bp)) {
> +		thread->ptrace_bps[0] = NULL;
> +		return PTR_ERR(bp);
> +	}
> +
> +#endif /* CONFIG_PPC64 */
>  
>  	/* Move contents to the DABR register */
>  	task->thread.dabr = data;
> Index: linux-2.6.ppc64_test/arch/powerpc/kernel/process.c
> ===================================================================
> --- linux-2.6.ppc64_test.orig/arch/powerpc/kernel/process.c
> +++ linux-2.6.ppc64_test/arch/powerpc/kernel/process.c
> @@ -48,6 +48,7 @@
>  #include <asm/machdep.h>
>  #include <asm/time.h>
>  #include <asm/syscalls.h>
> +#include <asm/hw_breakpoint.h>
>  #ifdef CONFIG_PPC64
>  #include <asm/firmware.h>
>  #endif
> @@ -459,8 +460,11 @@ struct task_struct *__switch_to(struct t
>  #ifdef CONFIG_PPC_ADV_DEBUG_REGS
>  	switch_booke_debug_regs(&new->thread);
>  #else
> +/* For PPC64, we use the hw-breakpoint interfaces that would schedule DABR */
> +#ifndef CONFIG_PPC64
>  	if (unlikely(__get_cpu_var(current_dabr) != new->thread.dabr))
>  		set_dabr(new->thread.dabr);
> +#endif /* CONFIG_PPC64 */
>  #endif
>  
> 
> @@ -479,6 +483,7 @@ struct task_struct *__switch_to(struct t
>  		old_thread->accum_tb += (current_tb - start_tb);
>  		new_thread->start_tb = current_tb;
>  	}
> +	flush_ptrace_hw_breakpoint(current);
>  #endif
>  
>  	local_irq_save(flags);
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch 1/1] PPC64-HWBKPT: Implement hw-breakpoints for PPC64
  2010-03-12  6:19   ` Benjamin Herrenschmidt
@ 2010-03-15  6:29     ` K.Prasad
  2010-04-07  8:03       ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 26+ messages in thread
From: K.Prasad @ 2010-03-15  6:29 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Michael Neuling, shaggy, Frederic Weisbecker, David Gibson,
	linuxppc-dev, paulus, Alan Stern, Roland McGrath

On Fri, Mar 12, 2010 at 05:19:36PM +1100, Benjamin Herrenschmidt wrote:
> 
> > Index: linux-2.6.ppc64_test/arch/powerpc/include/asm/hw_breakpoint.h
> > ===================================================================
> > --- /dev/null
> > +++ linux-2.6.ppc64_test/arch/powerpc/include/asm/hw_breakpoint.h
> > @@ -0,0 +1,54 @@
> > +#ifndef	_PPC64_HW_BREAKPOINT_H
> > +#define	_PPC64_HW_BREAKPOINT_H
> > +
> > +#ifdef	__KERNEL__
> > +#define	__ARCH_HW_BREAKPOINT_H
> > +#ifdef CONFIG_PPC64
> > +
> > +struct arch_hw_breakpoint {
> > +	u8		len; /* length of the target symbol */
> 
> I don't understand the usage of the word "symbol" above, can you
> explain ?
>

The word "symbol", here, carries a meaning similar to what is derived from
its usage in the word "kernel data symbols" - although it is
used to store lengths for both kernel and user-space breakpoints.

Since the desired length of the breakpoint is typically determined by the
size of the "symbol" (variable) being monitored (not exceeding 8-Bytes),
the comment was such. I shall change it to a more descriptive one, such as
"length of the target kernel/user data symbol" if you prefer that.
 
> > +	int		type;
> > +	unsigned long	address;
> > +};
> > +
> > +#include <linux/kdebug.h>
> > +#include <asm/reg.h>
> > +#include <asm/system.h>
> > +
> > +/* Total number of available HW breakpoint registers */
> > +#define HBP_NUM 1
> > +
> > +struct perf_event;
> > +struct pmu;
> > +struct perf_sample_data;
> > +
> > +#define HW_BREAKPOINT_ALIGN 0x7
> > +/* Maximum permissible length of any HW Breakpoint */
> > +#define HW_BREAKPOINT_LEN 0x8
> 
> That's a lot of server-only hard wired assumptions... I suppose the
> DABR emulation of BookE will catch but do you intend to provide
> proper BookE support at some stage ?
> 

Yes, I intend to extend support for Book-E sometime soon during which
the above code would have to be either a) enclosed inside #ifdef PPC64
or b) a new header file for BookE debug register definitions are used
(guess Shaggy's code would have many of them done already).

> > +static inline void hw_breakpoint_disable(void)
> > +{
> > +	set_dabr(0);
> > +}
> 
> How much of these set_dabr() I see here are going to interact with
> ptrace ? Is there some exclusion going on between ptrace and perf
> event use of the DABR or none at all ? Or are you replacing the ptrace
> bits ?
> 

This set_dabr() in hw_breakpoint_disable() is to be called only once during
machine_kexec() [which I realise is missing in the patch...will add that]
and will not conflict with ptrace.

The other set_dabr() in arch_<un>install_hw_breakpoint() will perform the
actual debug register write, while the ptrace_<get><set>_debugreg() requests
are routed through them (via the hw-breakpoint interfaces for arbitration as
shown below) and are designed to not conflict with each other.

ptrace_set_debugreg()-->register_user_hw_breakpoint()
...
(sched)-->perf_event_task_sched_<in><out>()--->arch_<un>install_hw_breakpoint()

In short, an existing ptrace request will not be overwritten/replaced to
accommodate a  new kernel/user-space request (which will fail because of DABR
unavailability).

> > +/*
> > + * Install a perf counter breakpoint.
> > + *
> > + * We seek a free debug address register and use it for this
> > + * breakpoint.
> > + *
> > + * Atomic: we hold the counter->ctx->lock and we only handle variables
> > + * and registers local to this cpu.
> > + */
> > +int arch_install_hw_breakpoint(struct perf_event *bp)
> > +{
> > +	struct arch_hw_breakpoint *info = counter_arch_bp(bp);
> > +	struct perf_event **slot = &__get_cpu_var(bp_per_reg);
> > +
> > +	if (!*slot)
> > +		*slot = bp;
> > +	else {
> > +		WARN_ONCE(1, "Can't find any breakpoint slot");
> > +		return -EBUSY;
> > +	}
> > +
> > +	set_dabr(info->address | info->type | DABR_TRANSLATION);
> > +	return 0;
> > +}
> 
> Under which circumstances will the upper layer call that more than
> once ? If it's a legit thing to do, then the WARN_ONCE() is a heavy
> hammer here. I wouldn't even printk.... or only pr_debug() if it's
> really worth it.
> 
> Or is that something that should just not happen ?
> 

I don't really see when this can happen in PPC64 with one DABR. The slot
reservation mechanism wouldn't allow this to happen and the code is here
since it got inherited from x86.

I shall remove the check and hence the WARN_ONCE.

> I would also use this coding style which is more compact and avoids
> the horrible (!*slot) :
> 
> 	/* Check if the slot is busy */
> 	if (*slot)
> 		return -EBUSY;
> 	set_dabr(...);
> 
> > +/*
> > + * Uninstall the breakpoint contained in the given counter.
> > + *
> > + * First we search the debug address register it uses and then we disable
> > + * it.
> > + *
> > + * Atomic: we hold the counter->ctx->lock and we only handle variables
> > + * and registers local to this cpu.
> > + */
> > +void arch_uninstall_hw_breakpoint(struct perf_event *bp)
> > +{
> > +	struct perf_event **slot = &__get_cpu_var(bp_per_reg);
> > +
> > +	if (*slot == bp)
> > +		*slot = NULL;
> > +	else {
> > +		WARN_ONCE(1, "Can't find the breakpoint slot");
> > +		return;
> > +	}
> > +	set_dabr(0);
> > +}
> 
> Similar coding style issues... That one might be worth the warning
> as I suppose the core should -really- not try to uninstall a bp that
> hasn't been installed in the first place.
>

Okay..will change the code style.
 
> > +/*
> > + * Validate the arch-specific HW Breakpoint register settings
> > + */
> > +int arch_validate_hwbkpt_settings(struct perf_event *bp,
> > +						struct task_struct *tsk)
> > +{
> > +	int is_kernel, ret = -EINVAL;
> > +	struct arch_hw_breakpoint *info = counter_arch_bp(bp);
> > +
> > +	if (!bp)
> > +		return ret;
> > +
> > +	switch (bp->attr.bp_type) {
> > +	case HW_BREAKPOINT_R:
> > +		info->type = DABR_DATA_READ;
> > +		break;
> > +	case HW_BREAKPOINT_W:
> > +		info->type = DABR_DATA_WRITE;
> > +		break;
> > +	case HW_BREAKPOINT_R | HW_BREAKPOINT_W:
> > +		info->type = (DABR_DATA_READ | DABR_DATA_WRITE);
> > +		break;
> > +	default:
> > +		return ret;
> > +	}
> 
> I'm not -too- fan of the above, I suppose I would have written it a bit
> differently using if's but that's not a big deal... however:
> 
> > +	/* TODO: Check for a valid triggered function */
> > +	/* if (!bp->triggered)
> > +		return -EINVAL; */
> 
> What is that ? Is the patch incomplete ? Don't leave commented out code
> in there. If you think there's a worthwhile improvement, then add a
> comment with maybe a bit more explanations, and make it clear that the
> patch is still useful without the code, but don't just leave commented
> out code like that without a good reason. A good reason would be some
> optional debug stuff for example, but then an ifdef is preferrable to
> comments.
> 

Thanks for pointing that out....the TODO label here is obsolete..it will
be removed.

> > +	is_kernel = is_kernel_addr(bp->attr.bp_addr);
> > +	if ((tsk && is_kernel) || (!tsk && !is_kernel))
> > +		return -EINVAL;
> > +
> > +	info->address = bp->attr.bp_addr;
> > +	info->len = bp->attr.bp_len;
> > +
> > +	/*
> > +	 * Since breakpoint length can be a maximum of HW_BREAKPOINT_LEN(8)
> > +	 * and breakpoint addresses are aligned to nearest double-word
> > +	 * HW_BREAKPOINT_ALIGN by rounding off to the lower address, the
> > +	 * 'symbolsize' should satisfy the check below.
> > +	 */
> > +	if (info->len >
> > +	    (HW_BREAKPOINT_LEN - (info->address & HW_BREAKPOINT_ALIGN)))
> > +		return -EINVAL;
> > +	return 0;
> > +}
> > +
> > +/*
> > + * Handle debug exception notifications.
> > + */
> > +int __kprobes hw_breakpoint_handler(struct die_args *args)
> > +{
> > +	int rc = NOTIFY_STOP;
> > +	struct perf_event *bp;
> > +	struct pt_regs *regs = args->regs;
> > +	unsigned long dar = regs->dar;
> > +	int cpu, is_kernel, stepped = 1;
> > +	struct arch_hw_breakpoint *info;
> > +
> > +	/* Disable breakpoints during exception handling */
> > +	set_dabr(0);
> > +	cpu = get_cpu();
> 
> So there's something a bit weird here. set_dabr() will clear the DABR on
> the local CPU, and you do that before you disable preempt. So you may
> have preempted and be on another CPU, is that allright ? IE. Are you
> dealing with that original CPU still having the DABR active and you now
> clearing a different one ?
>

First, the code here is written with the assumption that pre-emption is
disabled during do_page_fault() and hence hw-breakpoint exception handling.
If it is not the case (as you've pointed out below) the exception handling
is bound to go wrong. I will fix it by patching do_hash_page in
exceptions-64s.S as suggested by you.

Since the comment here (and some of those below) point out the problems that
may arise when pre-emption occurs (which is not supposed to happen),
they are all valid but will disappear when do_hash_page is fixed.
 
> > +	/*
> > +	 * The counter may be concurrently released but that can only
> > +	 * occur from a call_rcu() path. We can then safely fetch
> > +	 * the breakpoint, use its callback, touch its counter
> > +	 * while we are in an rcu_read_lock() path.
> > +	 */
> > +	rcu_read_lock();
> > +
> > +	bp = per_cpu(bp_per_reg, cpu);
> > +	if (!bp)
> > +		goto out;
> 
> So this is the bp_per_reg of a different CPU if you had migrated
> earlier. So you -did- hit the BP on, let's say CPU 0, but since you are
> now on CPU 1 you won't handle it ? Weird...
> 

Correct...I will fix as stated above.

> > +	info = counter_arch_bp(bp);
> > +	is_kernel = is_kernel_addr(bp->attr.bp_addr);
> > +
> > +	/*
> > +	 * Verify if dar lies within the address range occupied by the symbol
> > +	 * being watched to filter extraneous exceptions.
> > +	 */
> > +	if (!((bp->attr.bp_addr <= dar) &&
> > +	    (dar <= (bp->attr.bp_addr + bp->attr.bp_len))))
> > +		/*
> > +		 * This exception is triggered not because of a memory access on
> > +		 * the monitored variable but in the double-word address range
> > +		 * in which it is contained. We will consume this exception,
> > +		 * considering it as 'noise'.
> > +		 */
> > +		goto restore_bp;
> > +
> > +	/*
> > +	 * Return early after invoking user-callback function without restoring
> > +	 * DABR if the breakpoint is from ptrace which always operates in
> > +	 * one-shot mode
> > +	 */
> > +	if (bp->overflow_handler == ptrace_triggered) {
> > +		perf_bp_event(bp, regs);
> > +		rc = NOTIFY_DONE;
> > +		goto out;
> > +	}
> > +
> > +	/*
> > +	 * Do not emulate user-space instructions from kernel-space,
> > +	 * instead single-step them.
> > +	 */
> > +	if (!is_kernel) {
> > +		current->thread.last_hit_ubp = bp;
> > +		regs->msr |= MSR_SE;
> > +		goto out;
> > +	}
> 
> So what is this ? When you hit a bp, you switch to single step ? Out of
> curiosity, why ?
> 

For a user-space breakpoint request (which is not from ptrace syscall),
the behaviour is thus:
- It operates in 'continuous-trigger' mode (unlike ptrace requests
  which are 'one-shot'.
- The causative instruction is not emulated in kernel-space using
  emulate_step() but is rather executed in user-space.
- The SIGTRAP signal is sent to user-space 'after' execution of the
  causative instruction.

Since breakpoint exceptions in ppc64 operate in 'trigger-before-execute'
fashion, they have to be disabled upon occurrence and then re-enabled
after single-stepping the causative instruction (inside the single-step
handler) to avoid being hit by the breakpoint infinitely. I'm not sure
if I'm missing something obviously wrong that you're trying to point me
to (in the comment above).

> > +	stepped = emulate_step(regs, regs->nip);
> > +	/* emulate_step() could not execute it, single-step them */
> > +	if (stepped == 0) {
> > +		regs->msr |= MSR_SE;
> > +		per_cpu(last_hit_bp, cpu) = bp;
> > +		goto out;
> > +	}
> > +	/*
> > +	 * As a policy, the callback is invoked in a 'trigger-after-execute'
> > +	 * fashion
> > +	 */
> > +	perf_bp_event(bp, regs);
> > +
> > +restore_bp:
> > +	set_dabr(info->address | info->type | DABR_TRANSLATION);
> 
> So in my preempt case, you hit the DABR on CPU 0, migrated to CPU 1
> before you get into this function, and now you are modifying CPU 1
> DABR... 
> 
> I think we need to change the asm so that you are called with interrupts
> off from handle_page_fault() or so.
> 
> Basicallym in do_hash_page, make the andis 0xa450 go out of line, check
> for DSISR_DABRMATCH specifically, and in this case go to an entirely
> different function than handle_page_fault->do_page_fault(), something
> like handle_dabr_fault->do_dabr() which uses DISABLE_INTS instead
> of ENABLE_INTS :-)
> 

Sure, will fix this (and other comments) in the subsequent patch that I send.


> We also need the same fix in 32-bit I suppose.
> 
> Note while looking at it that it looks like we have a similar issue with
> program checks. We fixed it on 32-bit but not on 64-bit. We should keep
> interrupts masked basically when going progranm_check_exception(). It
> will unmask them if/when needed.
> 
> > +out:
> > +	rcu_read_unlock();
> > +	put_cpu();
> > +	return rc;
> > +}
> > +
> > +/*
> > + * Handle single-step exceptions following a DABR hit.
> > + */
> > +int __kprobes single_step_dabr_instruction(struct die_args *args)
> > +{
> > +	struct pt_regs *regs = args->regs;
> > +	int cpu = get_cpu();
> > +	int ret = NOTIFY_DONE;
> > +	siginfo_t info;
> > +	struct perf_event *bp = NULL, *kernel_bp, *user_bp;
> > +	struct arch_hw_breakpoint *bp_info;
> > +
> > +	/*
> > +	 * Identify the cause of single-stepping and find the corresponding
> > +	 * breakpoint structure
> > +	 */
> > +	user_bp = current->thread.last_hit_ubp;
> > +	kernel_bp = per_cpu(last_hit_bp, cpu);
> > +	if (user_bp) {
> > +		bp = user_bp;
> > +		current->thread.last_hit_ubp = NULL;
> > +	} else if (kernel_bp) {
> > +		bp = kernel_bp;
> > +		per_cpu(last_hit_bp, cpu) = NULL;
> > +	}
> 
> Hopefully you don't have this problem here, so you probably don't need
> get/put_cpu() but that won't hurt, since single_step should hopefully
> always have interrupts off.
> 

Sure, I could have used smp_processor_id() instead....will use that.

> > +	/*
> > +	 * Check if we are single-stepping as a result of a
> > +	 * previous HW Breakpoint exception
> > +	 */
> > +	if (!bp)
> > +		goto out;
> > +
> > +	bp_info = counter_arch_bp(bp);
> > +
> > +	/*
> > +	 * We shall invoke the user-defined callback function in the single
> > +	 * stepping handler to confirm to 'trigger-after-execute' semantics
> > +	 */
> > +	perf_bp_event(bp, regs);
> > +
> > +	/*
> > +	 * Do not disable MSR_SE if the process was already in
> > +	 * single-stepping mode. We cannot reliable detect single-step mode
> > +	 * for kernel-space breakpoints, so this cannot work along with other
> > +	 * debuggers (like KGDB, xmon) which may be single-stepping kernel code.
> > +	 */
> > +	if (!(user_bp && test_thread_flag(TIF_SINGLESTEP)))
> > +		regs->msr &= ~MSR_SE;
> > +
> > +	/* Deliver signal to user-space */
> > +	if (user_bp) {
> > +		info.si_signo = SIGTRAP;
> > +		info.si_errno = 0;
> > +		info.si_code = TRAP_HWBKPT;
> > +		info.si_addr = (void __user *)bp_info->address;
> > +		force_sig_info(SIGTRAP, &info, current);
> > +	}
> > +
> > +	set_dabr(bp_info->address | bp_info->type | DABR_TRANSLATION);
> > +	ret = NOTIFY_STOP;
> > +out:
> > +	put_cpu();
> > +	return ret;
> > +}
> > +
> > +/*
> > + * Handle debug exception notifications.
> > + */
> > +int __kprobes hw_breakpoint_exceptions_notify(
> > +		struct notifier_block *unused, unsigned long val, void *data)
> > +{
> > +	int ret = NOTIFY_DONE;
> > +
> > +	switch (val) {
> > +	case DIE_DABR_MATCH:
> > +		ret = hw_breakpoint_handler(data);
> > +		break;
> > +	case DIE_SSTEP:
> > +		ret = single_step_dabr_instruction(data);
> > +		break;
> > +	}
> > +
> > +	return ret;
> > +}
> > +
> > +/*
> > + * Release the user breakpoints used by ptrace
> > + */
> > +void flush_ptrace_hw_breakpoint(struct task_struct *tsk)
> > +{
> > +	struct thread_struct *t = &tsk->thread;
> > +
> > +	unregister_hw_breakpoint(t->ptrace_bps[0]);
> > +	t->ptrace_bps[0] = NULL;
> > +}
> 
> Ok, so I see that you call that on context switch. But where do you
> re-install the breakpoint for the "new" process ?
> 
> See below...
> 

This is being unconditionally invoked inside __switch_to() which is
wrong. In addition, it also leads to a lockdep warning which I will fix
in my subsequent patch.

> > +void hw_breakpoint_pmu_read(struct perf_event *bp)
> > +{
> > +	/* TODO */
> > +}
> > +
> > +void hw_breakpoint_pmu_unthrottle(struct perf_event *bp)
> > +{
> > +	/* TODO */
> > +}
> > +
> > +
> > Index: linux-2.6.ppc64_test/arch/powerpc/Kconfig
> > ===================================================================
> > --- linux-2.6.ppc64_test.orig/arch/powerpc/Kconfig
> > +++ linux-2.6.ppc64_test/arch/powerpc/Kconfig
> > @@ -140,6 +140,7 @@ config PPC
> >  	select HAVE_SYSCALL_WRAPPERS if PPC64
> >  	select GENERIC_ATOMIC64 if PPC32
> >  	select HAVE_PERF_EVENTS
> > +	select HAVE_HW_BREAKPOINT if PPC64
> 
> Why 64-bit only ? ppc32 has DABR too. In fact BookE also provides DABR
> emulation.
>

HAVE_HW_BREAKPOINT would be too generic, so I guess this can be changed
to HAVE_HW_BREAKPOINT_DABR (or something similar?). I think you're
referring to BookE's ability to individually use DAC registers are two
debug registers - which cannot be handled by this patch and can be done
during BookE implementation.

> Also, all your PPC64 stuff are going to show up on BookE 64-bit which
> might not be what you wanted...
>

True...I thought PPC64 would refer to the server class of processors
(which has just one DABR)...is there a config flag to refer to such
processors? Or should this patch create one?
 
> >  config EARLY_PRINTK
> >  	bool
> > Index: linux-2.6.ppc64_test/arch/powerpc/kernel/Makefile
> > ===================================================================
> > --- linux-2.6.ppc64_test.orig/arch/powerpc/kernel/Makefile
> > +++ linux-2.6.ppc64_test/arch/powerpc/kernel/Makefile
> > @@ -33,7 +33,7 @@ obj-y				:= cputable.o ptrace.o syscalls
> >  obj-y				+= vdso32/
> >  obj-$(CONFIG_PPC64)		+= setup_64.o sys_ppc32.o \
> >  				   signal_64.o ptrace32.o \
> > -				   paca.o nvram_64.o firmware.o
> > +				   paca.o nvram_64.o firmware.o hw_breakpoint.o
> >  obj-$(CONFIG_PPC_BOOK3S_64)	+= cpu_setup_ppc970.o cpu_setup_pa6t.o
> >  obj64-$(CONFIG_RELOCATABLE)	+= reloc_64.o
> >  obj-$(CONFIG_PPC_BOOK3E_64)	+= exceptions-64e.o
> > Index: linux-2.6.ppc64_test/arch/powerpc/include/asm/reg.h
> > ===================================================================
> > --- linux-2.6.ppc64_test.orig/arch/powerpc/include/asm/reg.h
> > +++ linux-2.6.ppc64_test/arch/powerpc/include/asm/reg.h
> > @@ -180,6 +180,7 @@
> >  #define   CTRL_TE	0x00c00000	/* thread enable */
> >  #define   CTRL_RUNLATCH	0x1
> >  #define SPRN_DABR	0x3F5	/* Data Address Breakpoint Register */
> > +#define   HBP_NUM	1	/* Number of physical HW breakpoint registers */
> 
> The above is not quite right. First you already define that in
> hw_breakpoint.h. Then, this is too short an identifier for such a
> generic file. Finally, it should not be in reg.h since it can vary
> from processor to processor. If you want to do things properly, then
> add some kind of info about the debug capabilities to cputable.
> 
> Please sync with Shaggy so it makes sense on BookE as well.
> 

I think cputable.h would be a fine place to define this...I will move
HBP_NUM there. However, renaming it would be difficult since the generic
code in kernel/hw_breakpoint.c (and references in x86) uses this
constant.

> >  #define   DABR_TRANSLATION	(1UL << 2)
> >  #define   DABR_DATA_WRITE	(1UL << 1)
> >  #define   DABR_DATA_READ	(1UL << 0)
> > Index: linux-2.6.ppc64_test/arch/powerpc/mm/fault.c
> > ===================================================================
> > --- linux-2.6.ppc64_test.orig/arch/powerpc/mm/fault.c
> > +++ linux-2.6.ppc64_test/arch/powerpc/mm/fault.c
> > @@ -137,6 +137,12 @@ int __kprobes do_page_fault(struct pt_re
> >  		error_code &= 0x48200000;
> >  	else
> >  		is_write = error_code & DSISR_ISSTORE;
> > +
> > +	if (error_code & DSISR_DABRMATCH) {
> > +		/* DABR match */
> > +		do_dabr(regs, address, error_code);
> > +		return 0;
> > +	}
> 
> Now that's interesting. I have the feeling that the moving up of this
> might actually be a bug fix :-) But it's still wrong due to interrupts
> being enabled as I explained earlier. We probably want to make it a
> different path out of head_*.S
> 

Agreed, this change would not be required after patching do_hash_page in
exceptions-64s.S.

> >  #else
> >  	is_write = error_code & ESR_DST;
> >  #endif /* CONFIG_4xx || CONFIG_BOOKE */
> > @@ -151,14 +157,6 @@ int __kprobes do_page_fault(struct pt_re
> >  	if (!user_mode(regs) && (address >= TASK_SIZE))
> >  		return SIGSEGV;
> >  
> > -#if !(defined(CONFIG_4xx) || defined(CONFIG_BOOKE))
> > -  	if (error_code & DSISR_DABRMATCH) {
> > -		/* DABR match */
> > -		do_dabr(regs, address, error_code);
> > -		return 0;
> > -	}
> > -#endif /* !(CONFIG_4xx || CONFIG_BOOKE)*/
> > -
> >  	if (in_atomic() || mm == NULL) {
> >  		if (!user_mode(regs))
> >  			return SIGSEGV;
> > Index: linux-2.6.ppc64_test/arch/powerpc/include/asm/processor.h
> > ===================================================================
> > --- linux-2.6.ppc64_test.orig/arch/powerpc/include/asm/processor.h
> > +++ linux-2.6.ppc64_test/arch/powerpc/include/asm/processor.h
> > @@ -209,6 +209,12 @@ struct thread_struct {
> >  #ifdef CONFIG_PPC64
> >  	unsigned long	start_tb;	/* Start purr when proc switched in */
> >  	unsigned long	accum_tb;	/* Total accumilated purr for process */
> > +	struct perf_event *ptrace_bps[HBP_NUM];
> 
> So you should probably call that MAX_HW_BREAKPOINTS and reflect the fact
> that it can be bigger. Or you have a pointer to some optional ptrace
> BP structure that handle what is needed, and can be allocated lazily
> by ptrace only when needed rather than always carrying this around in
> the thread_struct.
> 

Since ptrace's request for debug registers are thread-specific, they are
stored in 'struct thread_struct' (and is analogous to its implementation
in x86). In fact previous attempts to store such values outside
thread_struct were discouraged by the LKML community (refer Ingo's mail
on LKML here: 20090310143543.GE3850@elte.hu) citing potential locking
issues when stored outside.

> > +	/*
> > +	 * Point to the hw-breakpoint last. Helps safe pre-emption and
> > +	 * hw-breakpoint re-enablement.
> > +	 */
> > +	struct perf_event *last_hit_ubp;
> 
> The comment doesn't make much sense. Preemption doesn't seem quite right
> to me unless I missed something and the comment is either too much or
> not enough to understand what this is for.
>

The single-step exception may arise due to two reasons - (a) a legitimate
user (like kgdb in kernel-, or ptrace in user-space) decides to
single-step for debugging purposes or (b) single-stepping enabled by
hw_breakpoint_handler() to restore the debug register values.

'last_hit_ubp' along with 'per_cpu(last_hit_bp)' are used to distinguish
single-step exceptions that are triggered because of (b).

These data structures will not be required if pre-emption between
hw_breakpoint_handler() and single-step exception is disabled, since it
will be straight-forward to distinguish the source of exception i.e. (a)
from (b). In such a case, with pre-emption disabled on the CPU that hit
the breakpoint, single-step exceptions following a hw_breakpoint_handler()
will always be the result of (b) and vice-versa.

I will make the comment more descriptive as below:

/*
 * Helps identify source of single-step exception and subsequent
 * hw-breakpoint enablement
 */

Thank you for the detailed review - they have helped unearth certain
important issues with the patch.

As stated before, I will send a subsequent version of the patch with the
fixes as agreed above.

Thanks,
K.Prasad

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch 1/1] PPC64-HWBKPT: Implement hw-breakpoints for PPC64
  2010-03-08 18:14 ` [Patch 1/1] PPC64-HWBKPT: Implement hw-breakpoints for PPC64 K.Prasad
  2010-03-12  6:19   ` Benjamin Herrenschmidt
@ 2010-03-23  5:33   ` Paul Mackerras
  2010-03-23  7:28     ` K.Prasad
  1 sibling, 1 reply; 26+ messages in thread
From: Paul Mackerras @ 2010-03-23  5:33 UTC (permalink / raw)
  To: K.Prasad
  Cc: Michael Neuling, Benjamin Herrenschmidt, shaggy,
	Frederic Weisbecker, David Gibson, linuxppc-dev, Alan Stern,
	Roland McGrath

On Mon, Mar 08, 2010 at 11:44:48PM +0530, K.Prasad wrote:

> @@ -479,6 +483,7 @@ struct task_struct *__switch_to(struct t
>  		old_thread->accum_tb += (current_tb - start_tb);
>  		new_thread->start_tb = current_tb;
>  	}
> +	flush_ptrace_hw_breakpoint(current);
>  #endif
>  
>  	local_irq_save(flags);

This line should be in flush_thread(), not __switch_to().  In fact it
may not be necessary at all given that flush_ptrace_hw_breakpoint()
gets called in do_exit().

Paul.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch 1/1] PPC64-HWBKPT: Implement hw-breakpoints for PPC64
  2010-03-23  5:33   ` Paul Mackerras
@ 2010-03-23  7:28     ` K.Prasad
  0 siblings, 0 replies; 26+ messages in thread
From: K.Prasad @ 2010-03-23  7:28 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Michael Neuling, Benjamin Herrenschmidt, shaggy,
	Frederic Weisbecker, David Gibson, linuxppc-dev, Alan Stern,
	Roland McGrath

On Tue, Mar 23, 2010 at 04:33:01PM +1100, Paul Mackerras wrote:
> On Mon, Mar 08, 2010 at 11:44:48PM +0530, K.Prasad wrote:
> 
> > @@ -479,6 +483,7 @@ struct task_struct *__switch_to(struct t
> >  		old_thread->accum_tb += (current_tb - start_tb);
> >  		new_thread->start_tb = current_tb;
> >  	}
> > +	flush_ptrace_hw_breakpoint(current);
> >  #endif
> >  
> >  	local_irq_save(flags);
> 
> This line should be in flush_thread(), not __switch_to().  In fact it
> may not be necessary at all given that flush_ptrace_hw_breakpoint()
> gets called in do_exit().
> 
> Paul.

Yes, I did realise it. The unintended movement of
flush_ptrace_hw_breakpoint() from flush_thread() is a result of a
patching error (while forward porting).

A fix for this issue along with those pointed out by BenH will be a part
of the next version of the patch, that I intend to send very soon.

Thanks for reviewing them.

-- K.Prasad

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch 1/1] PPC64-HWBKPT: Implement hw-breakpoints for PPC64
  2010-03-15  6:29     ` K.Prasad
@ 2010-04-07  8:03       ` Benjamin Herrenschmidt
  2010-04-14  3:53         ` K.Prasad
  0 siblings, 1 reply; 26+ messages in thread
From: Benjamin Herrenschmidt @ 2010-04-07  8:03 UTC (permalink / raw)
  To: prasad
  Cc: Michael Neuling, shaggy, Frederic Weisbecker, David Gibson,
	linuxppc-dev, paulus, Alan Stern, Roland McGrath

Ok so too many problems with your last patch, I didn't have time to fix
them all, so it's not going into -next this week.

Please, test with a variety of defconfigs (iseries, cell, g5 for
example), and especially with CONFIG_PERF_EVENTS not set. There are
issues in the generic header for that (though I'm told some people are
working on a fix).

Basically, we need something like CONFIG_HW_BREAKPOINTS that is set
(maybe optionally, maybe not) with CONFIG_PERF_EVENTS and
CONFIG_HAVE_HW_BREAKPOINT. Then you can use that to properly fix the
ifdef'ing in include/linux/hw_breakpoints.h, and fix the various other
cases in powerpc code where you are testing for the wrong thing.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch 1/1] PPC64-HWBKPT: Implement hw-breakpoints for PPC64
  2010-04-07  8:03       ` Benjamin Herrenschmidt
@ 2010-04-14  3:53         ` K.Prasad
  0 siblings, 0 replies; 26+ messages in thread
From: K.Prasad @ 2010-04-14  3:53 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Michael Neuling, shaggy, Frederic Weisbecker, David Gibson,
	linuxppc-dev, paulus, Alan Stern, Roland McGrath

On Wed, Apr 07, 2010 at 06:03:31PM +1000, Benjamin Herrenschmidt wrote:
> Ok so too many problems with your last patch, I didn't have time to fix
> them all, so it's not going into -next this week.
> 
> Please, test with a variety of defconfigs (iseries, cell, g5 for
> example), and especially with CONFIG_PERF_EVENTS not set. There are
> issues in the generic header for that (though I'm told some people are
> working on a fix).
> 
> Basically, we need something like CONFIG_HW_BREAKPOINTS that is set
> (maybe optionally, maybe not) with CONFIG_PERF_EVENTS and
> CONFIG_HAVE_HW_BREAKPOINT. Then you can use that to properly fix the
> ifdef'ing in include/linux/hw_breakpoints.h, and fix the various other
> cases in powerpc code where you are testing for the wrong thing.
> 
> Cheers,
> Ben.
>

Hi Ben,
	I've sent a new patch (linuxppc-dev message-id ref:
20100414034340.GA6571@in.ibm.com) that builds against the defconfigs for
various architectures pointed out by you (I did see quite a few errors
that aren't induced by the patch).

The source tree is buildable even without CONFIG_PERF_EVENTS, and is
limited to scope using CONFIG_HAVE_HW_BREAKPOINT. At this stage, I
didnot find a need for a seperate CONFIG_HW_BREAKPOINTS though.

Let me know what you think.

Thanks,
K.Prasad

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch 1/1] PPC64-HWBKPT: Implement hw-breakpoints for PPC64
  2010-02-26  1:58       ` Frederic Weisbecker
  2010-03-08 23:57         ` David Gibson
@ 2010-03-09  2:14         ` K.Prasad
  1 sibling, 0 replies; 26+ messages in thread
From: K.Prasad @ 2010-03-09  2:14 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Michael Neuling, Benjamin Herrenschmidt, shaggy,
	Mahesh Salgaonkar, Will Deacon, David Gibson, linuxppc-dev,
	Alan Stern, paulus, Roland McGrath

On Fri, Feb 26, 2010 at 02:58:12AM +0100, Frederic Weisbecker wrote:
> On Mon, Feb 22, 2010 at 06:47:46PM +0530, K.Prasad wrote:
> > The 'name' field here is actually a legacy inherited from x86 code. It
> > is part of x86's arch-specific hw-breakpoint structure since:
> > - inspired by the kprobe implementation which accepts symbol name as
> >   input.
> > - kallsyms_lookup_name() was 'unexported' and a module could not resolve
> >   symbol names externally, so the core-infrastructure had to provide
> >   such facilities.
> > - I wasn't sure about the discussions behind 'unexporting' of
> >   kallsyms_lookup_name(), so did not venture to export them again (which
> >   you rightfully did :-)
> > 
> > Having said that, I'll be glad to remove this field (along with that in
> > x86),
> 
> Cool, I'll integrate the x86 name field removal to the .24 series
> 

I've removed the .name field in the latest version of the patch sent
(linuxppc-dev message-id: 20100308181232.GA3406@in.ibm.com).

> > > > +void ptrace_triggered(struct perf_event *bp, int nmi,
> > > > +		      struct perf_sample_data *data, struct pt_regs *regs)
> > > > +{
> > > > +	struct perf_event_attr attr;
> > > > +
> > > > +	/*
> > > > +	 * Disable the breakpoint request here since ptrace has defined a
> > > > +	 * one-shot behaviour for breakpoint exceptions in PPC64.
> > > > +	 * The SIGTRAP signal is generated automatically for us in do_dabr().
> > > > +	 * We don't have to do anything about that here
> > > > +	 */
> > > 
> > > 
> > > Oh, why does ptrace use a one-shot behaviour in ppc? Breakpoints
> > > only trigger once?
> > > 
> > 
> > Yes, ptrace breakpoints on PPC64 are designed to trigger once and this
> > patch retains that behaviour. It is very convenient to use one-shot
> > behaviour on archs where exceptions are triggered-before-execute.
> 
> Ah, Why?
>

Because we don't have to create the single_step_dabr_instruction()
function :-)

With one-shot behaviour, the hw_breakpoint_handler() doesn't have to
worry about entering into an infinite-loop (since exceptions are
triggered before instruction execution, and if breakpoints are still
active every attempt to execute the causative instruction raises the
breakpoint exception). To circumvent infinite looping, we temporarily
disable the breakpoints, enable single-stepping (to single-step over
the causative instruction) and re-enable them inside single-step 
exception handler. For the one-shot type breakpoint, the pain of
re-registration is left to the end-user.

I've wondered why PowerPC was designed to be 'trigger-before-execute'
when handling the exception can grow complex as this. Perhaps it is
meant to give absolute control over the data value (stored in the target
address) before the instruction gets to see it. For instance,
breakpoints can be used to change the data to a 'desirable' value before
actually being 'seen' by instructions.

Thanks,
K.Prasad

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch 1/1] PPC64-HWBKPT: Implement hw-breakpoints for PPC64
  2010-02-26  1:58       ` Frederic Weisbecker
@ 2010-03-08 23:57         ` David Gibson
  2010-03-09  2:14         ` K.Prasad
  1 sibling, 0 replies; 26+ messages in thread
From: David Gibson @ 2010-03-08 23:57 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Michael Neuling, Benjamin Herrenschmidt, shaggy,
	Mahesh Salgaonkar, Will Deacon, linuxppc-dev, Alan Stern, paulus,
	K.Prasad, Roland McGrath

On Fri, Feb 26, 2010 at 02:58:12AM +0100, Frederic Weisbecker wrote:
> On Mon, Feb 22, 2010 at 06:47:46PM +0530, K.Prasad wrote:
[snip]
> > > Oh, why does ptrace use a one-shot behaviour in ppc? Breakpoints
> > > only trigger once?
> > > 
> > 
> > Yes, ptrace breakpoints on PPC64 are designed to trigger once and this
> > patch retains that behaviour. It is very convenient to use one-shot
> > behaviour on archs where exceptions are triggered-before-execute.
> 
> Ah, Why?

Because otherwise you have jump through some tricky hoops so that when
gdb (or whatever) resumes after the breakpoint, you don't immediately
retrap in the same place.  I gather x86 has hardware assistance to do
this, but powerpc doesn't.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch 1/1] PPC64-HWBKPT: Implement hw-breakpoints for PPC64
  2010-02-23 10:57       ` K.Prasad
@ 2010-02-26 17:52         ` Frederic Weisbecker
  0 siblings, 0 replies; 26+ messages in thread
From: Frederic Weisbecker @ 2010-02-26 17:52 UTC (permalink / raw)
  To: K.Prasad
  Cc: Michael Neuling, Benjamin Herrenschmidt, shaggy,
	Mahesh Salgaonkar, Will Deacon, David Gibson, linuxppc-dev,
	Alan Stern, paulus, Roland McGrath

On Tue, Feb 23, 2010 at 04:27:15PM +0530, K.Prasad wrote:
> On Mon, Feb 22, 2010 at 06:47:46PM +0530, K.Prasad wrote:
> > On Sun, Feb 21, 2010 at 02:01:37AM +0100, Frederic Weisbecker wrote:
> > > On Mon, Feb 15, 2010 at 11:29:14AM +0530, K.Prasad wrote:
> [snipped]
> > > Also, do you think addr/len/type is enough to abstract out
> > > any ppc breakpoints?
> > > 
> > > This looks enough to me to express range breakpoints and
> > > simple breakpoints. But what about value comparison?
> > > (And still, there may be other trickier implementations
> > > I don't know in ppc).
> > > 
> > 
> > The above implementation is for PPC64 architecture that supports only
> > 'simple' breakpoints of fixed length (no range breakpoints, no value
> > comparison). More on that below.
> >
> 
> Looks like I forgot the 'more on that below' part :-)....here are some
> thoughts...
> 
> Architectures like PPC Book-E have support for a variety of
> sophisticated debug features and our generic framework, in its present
> form, cannot easily port itself to these processors. In order to extend
> the framework for PPC Book-E, I intend the following to begin with:
> 
> - Implement support for data breakpoints through DAC registers with all
>   the 'bells and whistles'...support for instruction breakpoints through
>   IAC can come in later (without precluding its use through ptrace).
> 
> - Embed the flags/variables to store DVC, masked address mode, etc. in
>   'struct arch_hw_breakpoint', which will be populated by the user of
>   register_breakpoint interface.



Agreed.



> 
> Apart from the above extensions to the framework, changes in the generic
> code would be required as described in an earlier LKML mail (ref:
> message-id: 20091127190705.GB18408@in.ibm.com)....relevant contents
> pasted below:
> 
> "I think the register_<> interfaces can become wrappers around functions
> that do the following:
> 
> - arch_validate(): Validate request by invoking an arch-dependant
>   routine. Proceed if returned valid.
> - arch-specific debugreg availability: Do something like
>   if (arch_hw_breakpoint_availabile())
>         bp = perf_event_create_kernel_counter();



This is already what does register_hw_break....(), it fails
if a slot is not available:

perf_event_create_kernel_counter -> perf_bp_init() -> reserve_bp_slot()

Having a:

if (arch_hw_breakpoint_availabile())
         bp = perf_event_create_kernel_counter();

would be racy.



> 
>   perf_event_create_kernel_counter()--->arch_install_hw_breakpoint();
> 
> This way, all book-keeping related work (no. of pinned/flexible/per-cpu)
> will be moved to arch-specific files (will be helpful for PPC Book-E
> implementation having two types of debug registers). Every new
> architecture that intends to port to the new hw-breakpoint
> implementation must define their arch_validate(),
> arch_hw_breakpoint_available() and an arch_install_hw_breakpoint(),
> while the hw-breakpoint code will be flexible enough to extend itself to
> each of these archs."
> 
> Let me know what you think of the above.



We certainly need the slot reservation in arch (a part of it at least).
But we also need a kind of new interface for arch predefined attributes,
instead of generic attributes.

Probably we need a kind of perf_event_create_kernel_counter() that
can accept either a perf_event_attr (for perf syscall or ftrace)
and an arch structure that can be passed to the breakpoint API,
so that we don't need the generic translation.


> 
> Thanks,
> K.Prasad
> 

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch 1/1] PPC64-HWBKPT: Implement hw-breakpoints for PPC64
  2010-02-22 13:17     ` K.Prasad
  2010-02-23 10:57       ` K.Prasad
@ 2010-02-26  1:58       ` Frederic Weisbecker
  2010-03-08 23:57         ` David Gibson
  2010-03-09  2:14         ` K.Prasad
  1 sibling, 2 replies; 26+ messages in thread
From: Frederic Weisbecker @ 2010-02-26  1:58 UTC (permalink / raw)
  To: K.Prasad
  Cc: Michael Neuling, Benjamin Herrenschmidt, shaggy,
	Mahesh Salgaonkar, Will Deacon, David Gibson, linuxppc-dev,
	Alan Stern, paulus, Roland McGrath

On Mon, Feb 22, 2010 at 06:47:46PM +0530, K.Prasad wrote:
> The 'name' field here is actually a legacy inherited from x86 code. It
> is part of x86's arch-specific hw-breakpoint structure since:
> - inspired by the kprobe implementation which accepts symbol name as
>   input.
> - kallsyms_lookup_name() was 'unexported' and a module could not resolve
>   symbol names externally, so the core-infrastructure had to provide
>   such facilities.
> - I wasn't sure about the discussions behind 'unexporting' of
>   kallsyms_lookup_name(), so did not venture to export them again (which
>   you rightfully did :-)
> 
> Having said that, I'll be glad to remove this field (along with that in
> x86),



Cool, I'll integrate the x86 name field removal to the .24 series



> provided we know that there's a way for the user to resolve symbol
> names on its own i.e. routines like kallsyms_lookup_name() will remain
> exported.


Yeah, I guess it's fine to keep kallsyms_lookup_name() exported.


 
> > Also, do you think addr/len/type is enough to abstract out
> > any ppc breakpoints?
> > 
> > This looks enough to me to express range breakpoints and
> > simple breakpoints. But what about value comparison?
> > (And still, there may be other trickier implementations
> > I don't know in ppc).
> > 
> 
> The above implementation is for PPC64 architecture that supports only
> 'simple' breakpoints of fixed length (no range breakpoints, no value
> comparison). More on that below.


Ok. I was just a bit confused in the middle of the several PPC breakpoint
implementations :)



> > > +	/*
> > > +	 * As a policy, the callback is invoked in a 'trigger-after-execute'
> > > +	 * fashion
> > > +	 */
> > > +	(bp->overflow_handler)(bp, 0, NULL, regs);
> > 
> > 
> > Why are you calling this explicitly instead of using the perf_bp_event()
> > thing? This looks like it won't work with perf as the event won't
> > be recorded by perf.
> > 
> 
> Yes, should have invoked perf_bp_event() for perf to work well (on a
> side note, it makes me wonder at the amount of 'extra' code that each
> breakpoint exception would execute if it were not called through perf
> sys-call...well, the costs of integrating with a generic infrastructure!)


It has the benefit of not adding extra checks in the breakpoint handler,
like checking the callback. Every breakpoint is treated the same way, which
makes the code more simple.


 
> > > +void ptrace_triggered(struct perf_event *bp, int nmi,
> > > +		      struct perf_sample_data *data, struct pt_regs *regs)
> > > +{
> > > +	struct perf_event_attr attr;
> > > +
> > > +	/*
> > > +	 * Disable the breakpoint request here since ptrace has defined a
> > > +	 * one-shot behaviour for breakpoint exceptions in PPC64.
> > > +	 * The SIGTRAP signal is generated automatically for us in do_dabr().
> > > +	 * We don't have to do anything about that here
> > > +	 */
> > 
> > 
> > Oh, why does ptrace use a one-shot behaviour in ppc? Breakpoints
> > only trigger once?
> > 
> 
> Yes, ptrace breakpoints on PPC64 are designed to trigger once and this
> patch retains that behaviour. It is very convenient to use one-shot
> behaviour on archs where exceptions are triggered-before-execute.


Ah, Why?



> > This looks fine for basic breakpoints. And this can probably be
> > improved to integrate ranges.
> > 
> > But I think we did something wrong with the generic breakpoint
> > interface. We are translating the arch values to generic
> > attributes. Then this all will be translated back to arch
> > values.
> > 
> > Having generic attributes is necessary for any perf event
> > use from userspace. But it looks like a waste for ptrace
> > that already gives us arch values. And the problem
> > is the same for x86.
> > 
> > So I think we should implement a register_ptrace_breakpoint()
> > that doesn't take perf_event_attr but specific arch informations,
> > so that we don't need to pass through a generic conversion, which:
> > 
> 
> I agree that the layers of conversion from generic to arch-specific
> breakpoint constants is wasteful.
> Can't the arch_bp_generic_fields() function be moved to
> arch/x86/kernel/ptrace.c instead of a new interface?


I'll answer in your subsequent mail :)


 
> > - is wasteful
> > - won't be able to express 100% of any arch capabilities. We
> >   can certainly express most arch breakpoints features through
> >   the generic interface, but not all of them (given how tricky
> >   the data value comparison features can be)
> > 
> > I will rework that during the next cycle.
> > 
> > Thanks.
> >
> 
> Thank you for the comments. I will re-send a new version of the patch
> with the perf_bp_event() substitution.


Thanks.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch 1/1] PPC64-HWBKPT: Implement hw-breakpoints for PPC64
  2010-02-22 13:17     ` K.Prasad
@ 2010-02-23 10:57       ` K.Prasad
  2010-02-26 17:52         ` Frederic Weisbecker
  2010-02-26  1:58       ` Frederic Weisbecker
  1 sibling, 1 reply; 26+ messages in thread
From: K.Prasad @ 2010-02-23 10:57 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Michael Neuling, Benjamin Herrenschmidt, shaggy,
	Mahesh Salgaonkar, Will Deacon, David Gibson, linuxppc-dev,
	Alan Stern, paulus, Roland McGrath

On Mon, Feb 22, 2010 at 06:47:46PM +0530, K.Prasad wrote:
> On Sun, Feb 21, 2010 at 02:01:37AM +0100, Frederic Weisbecker wrote:
> > On Mon, Feb 15, 2010 at 11:29:14AM +0530, K.Prasad wrote:
[snipped]
> > Also, do you think addr/len/type is enough to abstract out
> > any ppc breakpoints?
> > 
> > This looks enough to me to express range breakpoints and
> > simple breakpoints. But what about value comparison?
> > (And still, there may be other trickier implementations
> > I don't know in ppc).
> > 
> 
> The above implementation is for PPC64 architecture that supports only
> 'simple' breakpoints of fixed length (no range breakpoints, no value
> comparison). More on that below.
>

Looks like I forgot the 'more on that below' part :-)....here are some
thoughts...

Architectures like PPC Book-E have support for a variety of
sophisticated debug features and our generic framework, in its present
form, cannot easily port itself to these processors. In order to extend
the framework for PPC Book-E, I intend the following to begin with:

- Implement support for data breakpoints through DAC registers with all
  the 'bells and whistles'...support for instruction breakpoints through
  IAC can come in later (without precluding its use through ptrace).

- Embed the flags/variables to store DVC, masked address mode, etc. in
  'struct arch_hw_breakpoint', which will be populated by the user of
  register_breakpoint interface.

Apart from the above extensions to the framework, changes in the generic
code would be required as described in an earlier LKML mail (ref:
message-id: 20091127190705.GB18408@in.ibm.com)....relevant contents
pasted below:

"I think the register_<> interfaces can become wrappers around functions
that do the following:

- arch_validate(): Validate request by invoking an arch-dependant
  routine. Proceed if returned valid.
- arch-specific debugreg availability: Do something like
  if (arch_hw_breakpoint_availabile())
        bp = perf_event_create_kernel_counter();

  perf_event_create_kernel_counter()--->arch_install_hw_breakpoint();

This way, all book-keeping related work (no. of pinned/flexible/per-cpu)
will be moved to arch-specific files (will be helpful for PPC Book-E
implementation having two types of debug registers). Every new
architecture that intends to port to the new hw-breakpoint
implementation must define their arch_validate(),
arch_hw_breakpoint_available() and an arch_install_hw_breakpoint(),
while the hw-breakpoint code will be flexible enough to extend itself to
each of these archs."

Let me know what you think of the above.

Thanks,
K.Prasad

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch 1/1] PPC64-HWBKPT: Implement hw-breakpoints for PPC64
  2010-02-21  1:01   ` Frederic Weisbecker
@ 2010-02-22 13:17     ` K.Prasad
  2010-02-23 10:57       ` K.Prasad
  2010-02-26  1:58       ` Frederic Weisbecker
  0 siblings, 2 replies; 26+ messages in thread
From: K.Prasad @ 2010-02-22 13:17 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Michael Neuling, Benjamin Herrenschmidt, shaggy,
	Mahesh Salgaonkar, Will Deacon, David Gibson, linuxppc-dev,
	Alan Stern, paulus, Roland McGrath

On Sun, Feb 21, 2010 at 02:01:37AM +0100, Frederic Weisbecker wrote:
> On Mon, Feb 15, 2010 at 11:29:14AM +0530, K.Prasad wrote:
> > +struct arch_hw_breakpoint {
> > +	u8		len; /* length of the target symbol */
> > +	int		type;
> > +	char		*name; /* Contains name of the symbol to set bkpt */
> > +	unsigned long	address;
> > +};
> 
> 
> 
> 
> I don't think it's a good idea to integrate the name of
> the target. This is something that should be done in a higher
> level, not in an arch backend.
> We don't even need to store it anywhere as we can resolve
> back an address easily. Symbol awareness is not something
> the hardware breakpoint should care about, neither in the
> arch nor the generic level.
> 

The 'name' field here is actually a legacy inherited from x86 code. It
is part of x86's arch-specific hw-breakpoint structure since:
- inspired by the kprobe implementation which accepts symbol name as
  input.
- kallsyms_lookup_name() was 'unexported' and a module could not resolve
  symbol names externally, so the core-infrastructure had to provide
  such facilities.
- I wasn't sure about the discussions behind 'unexporting' of
  kallsyms_lookup_name(), so did not venture to export them again (which
  you rightfully did :-)

Having said that, I'll be glad to remove this field (along with that in
x86), provided we know that there's a way for the user to resolve symbol
names on its own i.e. routines like kallsyms_lookup_name() will remain
exported.

> Also, do you think addr/len/type is enough to abstract out
> any ppc breakpoints?
> 
> This looks enough to me to express range breakpoints and
> simple breakpoints. But what about value comparison?
> (And still, there may be other trickier implementations
> I don't know in ppc).
> 

The above implementation is for PPC64 architecture that supports only
'simple' breakpoints of fixed length (no range breakpoints, no value
comparison). More on that below.
 
> > +
> > +#include <linux/kdebug.h>
> > +#include <asm/reg.h>
> > +#include <asm/system.h>
> > +
> > +/* Total number of available HW breakpoint registers */
> > +#define HBP_NUM 1
> 
> 
> Looking at the G2 PowerPc implementation, DABR and DABR2 can either
> express two different watchpoints or one range watchpoint.
> 
> There are also IABR and IABR2 for instruction breakpoints that
> follow the same above scheme. I'm not sure we can abstract that
> using a constant max linear number of resources.
> 
> 

As stated above, the patch implements breakpoints for PPC64 processors
only (http://www.power.org/resources/downloads/PowerISA_V2.06_PUBLIC.pdf),
which does not support advanced breakpoint features (which its ppc
Book-E counterpart has).
 
> > +static inline void hw_breakpoint_disable(void)
> > +{
> > +	set_dabr(0);
> > +}
> 
> 
> So, this is only about data breakpoints?
> 
> 

Yes, newer PPC64 processors do not support IABR.

> > +	/*
> > +	 * As a policy, the callback is invoked in a 'trigger-after-execute'
> > +	 * fashion
> > +	 */
> > +	(bp->overflow_handler)(bp, 0, NULL, regs);
> 
> 
> Why are you calling this explicitly instead of using the perf_bp_event()
> thing? This looks like it won't work with perf as the event won't
> be recorded by perf.
> 

Yes, should have invoked perf_bp_event() for perf to work well (on a
side note, it makes me wonder at the amount of 'extra' code that each
breakpoint exception would execute if it were not called through perf
sys-call...well, the costs of integrating with a generic infrastructure!)

> > +void ptrace_triggered(struct perf_event *bp, int nmi,
> > +		      struct perf_sample_data *data, struct pt_regs *regs)
> > +{
> > +	struct perf_event_attr attr;
> > +
> > +	/*
> > +	 * Disable the breakpoint request here since ptrace has defined a
> > +	 * one-shot behaviour for breakpoint exceptions in PPC64.
> > +	 * The SIGTRAP signal is generated automatically for us in do_dabr().
> > +	 * We don't have to do anything about that here
> > +	 */
> 
> 
> Oh, why does ptrace use a one-shot behaviour in ppc? Breakpoints
> only trigger once?
> 

Yes, ptrace breakpoints on PPC64 are designed to trigger once and this
patch retains that behaviour. It is very convenient to use one-shot
behaviour on archs where exceptions are triggered-before-execute.

> > +	if (bp) {
> > +		attr = bp->attr;
> > +		attr.bp_addr = data & ~HW_BREAKPOINT_ALIGN;
> > +
> > +		switch (data & (DABR_DATA_WRITE | DABR_DATA_READ)) {
> > +		case DABR_DATA_READ:
> > +			attr.bp_type = HW_BREAKPOINT_R;
> > +			break;
> > +		case DABR_DATA_WRITE:
> > +			attr.bp_type = HW_BREAKPOINT_W;
> > +			break;
> > +		case (DABR_DATA_WRITE | DABR_DATA_READ):
> > +			attr.bp_type = HW_BREAKPOINT_R | HW_BREAKPOINT_W;
> > +			break;
> > +		}
> > +		ret =  modify_user_hw_breakpoint(bp, &attr);
> > +		if (ret)
> > +			return ret;
> > +		thread->ptrace_bps[0] = bp;
> > +		thread->dabr = data;
> > +		return 0;
> > +	}
> > +
> > +	/* Create a new breakpoint request if one doesn't exist already */
> > +	hw_breakpoint_init(&attr);
> > +	attr.bp_addr = data & ~HW_BREAKPOINT_ALIGN;
> > +	switch (data & (DABR_DATA_WRITE | DABR_DATA_READ)) {
> > +	case DABR_DATA_READ:
> > +		attr.bp_type = HW_BREAKPOINT_R;
> > +		break;
> > +	case DABR_DATA_WRITE:
> > +		attr.bp_type = HW_BREAKPOINT_W;
> > +		break;
> > +	case (DABR_DATA_WRITE | DABR_DATA_READ):
> > +		attr.bp_type = HW_BREAKPOINT_R | HW_BREAKPOINT_W;
> > +		break;
> > +	}
> > +	thread->ptrace_bps[0] = bp = register_user_hw_breakpoint(&attr,
> > +							ptrace_triggered, task);
> > +	if (IS_ERR(bp)) {
> > +		thread->ptrace_bps[0] = NULL;
> > +		return PTR_ERR(bp);
> > +	}
> > +
> > +#endif /* CONFIG_PPC64 */
> 
> 
> 
> This looks fine for basic breakpoints. And this can probably be
> improved to integrate ranges.
> 
> But I think we did something wrong with the generic breakpoint
> interface. We are translating the arch values to generic
> attributes. Then this all will be translated back to arch
> values.
> 
> Having generic attributes is necessary for any perf event
> use from userspace. But it looks like a waste for ptrace
> that already gives us arch values. And the problem
> is the same for x86.
> 
> So I think we should implement a register_ptrace_breakpoint()
> that doesn't take perf_event_attr but specific arch informations,
> so that we don't need to pass through a generic conversion, which:
> 

I agree that the layers of conversion from generic to arch-specific
breakpoint constants is wasteful.
Can't the arch_bp_generic_fields() function be moved to
arch/x86/kernel/ptrace.c instead of a new interface?

> - is wasteful
> - won't be able to express 100% of any arch capabilities. We
>   can certainly express most arch breakpoints features through
>   the generic interface, but not all of them (given how tricky
>   the data value comparison features can be)
> 
> I will rework that during the next cycle.
> 
> Thanks.
>

Thank you for the comments. I will re-send a new version of the patch
with the perf_bp_event() substitution.

-- K.Prasad

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch 1/1] PPC64-HWBKPT: Implement hw-breakpoints for PPC64
  2010-02-15  5:59 ` [Patch 1/1] PPC64-HWBKPT: Implement hw-breakpoints for PPC64 K.Prasad
@ 2010-02-21  1:01   ` Frederic Weisbecker
  2010-02-22 13:17     ` K.Prasad
  0 siblings, 1 reply; 26+ messages in thread
From: Frederic Weisbecker @ 2010-02-21  1:01 UTC (permalink / raw)
  To: K.Prasad
  Cc: Michael Neuling, Benjamin Herrenschmidt, shaggy,
	Mahesh Salgaonkar, Will Deacon, David Gibson, linuxppc-dev,
	Alan Stern, paulus, Roland McGrath

On Mon, Feb 15, 2010 at 11:29:14AM +0530, K.Prasad wrote:
> +struct arch_hw_breakpoint {
> +	u8		len; /* length of the target symbol */
> +	int		type;
> +	char		*name; /* Contains name of the symbol to set bkpt */
> +	unsigned long	address;
> +};




I don't think it's a good idea to integrate the name of
the target. This is something that should be done in a higher
level, not in an arch backend.
We don't even need to store it anywhere as we can resolve
back an address easily. Symbol awareness is not something
the hardware breakpoint should care about, neither in the
arch nor the generic level.

Also, do you think addr/len/type is enough to abstract out
any ppc breakpoints?

This looks enough to me to express range breakpoints and
simple breakpoints. But what about value comparison?
(And still, there may be other trickier implementations
I don't know in ppc).


> +
> +#include <linux/kdebug.h>
> +#include <asm/reg.h>
> +#include <asm/system.h>
> +
> +/* Total number of available HW breakpoint registers */
> +#define HBP_NUM 1


Looking at the G2 PowerPc implementation, DABR and DABR2 can either
express two different watchpoints or one range watchpoint.

There are also IABR and IABR2 for instruction breakpoints that
follow the same above scheme. I'm not sure we can abstract that
using a constant max linear number of resources.



> +static inline void hw_breakpoint_disable(void)
> +{
> +	set_dabr(0);
> +}



So, this is only about data breakpoints?



> +	/*
> +	 * As a policy, the callback is invoked in a 'trigger-after-execute'
> +	 * fashion
> +	 */
> +	(bp->overflow_handler)(bp, 0, NULL, regs);



Why are you calling this explicitly instead of using the perf_bp_event()
thing? This looks like it won't work with perf as the event won't
be recorded by perf.


> +void ptrace_triggered(struct perf_event *bp, int nmi,
> +		      struct perf_sample_data *data, struct pt_regs *regs)
> +{
> +	struct perf_event_attr attr;
> +
> +	/*
> +	 * Disable the breakpoint request here since ptrace has defined a
> +	 * one-shot behaviour for breakpoint exceptions in PPC64.
> +	 * The SIGTRAP signal is generated automatically for us in do_dabr().
> +	 * We don't have to do anything about that here
> +	 */



Oh, why does ptrace use a one-shot behaviour in ppc? Breakpoints
only trigger once?



> +	if (bp) {
> +		attr = bp->attr;
> +		attr.bp_addr = data & ~HW_BREAKPOINT_ALIGN;
> +
> +		switch (data & (DABR_DATA_WRITE | DABR_DATA_READ)) {
> +		case DABR_DATA_READ:
> +			attr.bp_type = HW_BREAKPOINT_R;
> +			break;
> +		case DABR_DATA_WRITE:
> +			attr.bp_type = HW_BREAKPOINT_W;
> +			break;
> +		case (DABR_DATA_WRITE | DABR_DATA_READ):
> +			attr.bp_type = HW_BREAKPOINT_R | HW_BREAKPOINT_W;
> +			break;
> +		}
> +		ret =  modify_user_hw_breakpoint(bp, &attr);
> +		if (ret)
> +			return ret;
> +		thread->ptrace_bps[0] = bp;
> +		thread->dabr = data;
> +		return 0;
> +	}
> +
> +	/* Create a new breakpoint request if one doesn't exist already */
> +	hw_breakpoint_init(&attr);
> +	attr.bp_addr = data & ~HW_BREAKPOINT_ALIGN;
> +	switch (data & (DABR_DATA_WRITE | DABR_DATA_READ)) {
> +	case DABR_DATA_READ:
> +		attr.bp_type = HW_BREAKPOINT_R;
> +		break;
> +	case DABR_DATA_WRITE:
> +		attr.bp_type = HW_BREAKPOINT_W;
> +		break;
> +	case (DABR_DATA_WRITE | DABR_DATA_READ):
> +		attr.bp_type = HW_BREAKPOINT_R | HW_BREAKPOINT_W;
> +		break;
> +	}
> +	thread->ptrace_bps[0] = bp = register_user_hw_breakpoint(&attr,
> +							ptrace_triggered, task);
> +	if (IS_ERR(bp)) {
> +		thread->ptrace_bps[0] = NULL;
> +		return PTR_ERR(bp);
> +	}
> +
> +#endif /* CONFIG_PPC64 */



This looks fine for basic breakpoints. And this can probably be
improved to integrate ranges.

But I think we did something wrong with the generic breakpoint
interface. We are translating the arch values to generic
attributes. Then this all will be translated back to arch
values.

Having generic attributes is necessary for any perf event
use from userspace. But it looks like a waste for ptrace
that already gives us arch values. And the problem
is the same for x86.

So I think we should implement a register_ptrace_breakpoint()
that doesn't take perf_event_attr but specific arch informations,
so that we don't need to pass through a generic conversion, which:

- is wasteful
- won't be able to express 100% of any arch capabilities. We
  can certainly express most arch breakpoints features through
  the generic interface, but not all of them (given how tricky
  the data value comparison features can be)

I will rework that during the next cycle.

Thanks.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Patch 1/1] PPC64-HWBKPT: Implement hw-breakpoints for PPC64
  2010-02-15  5:56 [Patch 0/1] PPC64-HWBKPT: Hardware Breakpoint interfaces - ver XIII K.Prasad
@ 2010-02-15  5:59 ` K.Prasad
  2010-02-21  1:01   ` Frederic Weisbecker
  0 siblings, 1 reply; 26+ messages in thread
From: K.Prasad @ 2010-02-15  5:59 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Michael Neuling, Benjamin Herrenschmidt, shaggy,
	Frederic Weisbecker, David Gibson, Alan Stern, paulus,
	Roland McGrath

Implement perf-events based hw-breakpoint interfaces for PPC64 processors.
These interfaces help arbitrate requests from various users and schedules
them as appropriate.

Signed-off-by: K.Prasad <prasad@linux.vnet.ibm.com>
---
 arch/powerpc/Kconfig                     |    1 
 arch/powerpc/include/asm/hw_breakpoint.h |   55 ++++
 arch/powerpc/include/asm/processor.h     |    6 
 arch/powerpc/include/asm/reg.h           |    1 
 arch/powerpc/kernel/Makefile             |    2 
 arch/powerpc/kernel/hw_breakpoint.c      |  363 +++++++++++++++++++++++++++++++
 arch/powerpc/kernel/process.c            |    5 
 arch/powerpc/kernel/ptrace.c             |   77 ++++++
 arch/powerpc/mm/fault.c                  |   14 -
 9 files changed, 515 insertions(+), 9 deletions(-)

Index: linux-2.6.ppc64_test/arch/powerpc/include/asm/hw_breakpoint.h
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.ppc64_test/arch/powerpc/include/asm/hw_breakpoint.h	2010-02-15 02:10:43.000000000 +0530
@@ -0,0 +1,55 @@
+#ifndef	_PPC64_HW_BREAKPOINT_H
+#define	_PPC64_HW_BREAKPOINT_H
+
+#ifdef	__KERNEL__
+#define	__ARCH_HW_BREAKPOINT_H
+#ifdef CONFIG_PPC64
+
+struct arch_hw_breakpoint {
+	u8		len; /* length of the target symbol */
+	int		type;
+	char		*name; /* Contains name of the symbol to set bkpt */
+	unsigned long	address;
+};
+
+#include <linux/kdebug.h>
+#include <asm/reg.h>
+#include <asm/system.h>
+
+/* Total number of available HW breakpoint registers */
+#define HBP_NUM 1
+
+struct perf_event;
+struct pmu;
+struct perf_sample_data;
+
+#define HW_BREAKPOINT_ALIGN 0x7
+/* Maximum permissible length of any HW Breakpoint */
+#define HW_BREAKPOINT_LEN 0x8
+
+extern int arch_validate_hwbkpt_settings(struct perf_event *bp,
+						struct task_struct *tsk);
+extern int hw_breakpoint_exceptions_notify(struct notifier_block *unused,
+						unsigned long val, void *data);
+int arch_install_hw_breakpoint(struct perf_event *bp);
+void arch_uninstall_hw_breakpoint(struct perf_event *bp);
+void hw_breakpoint_pmu_read(struct perf_event *bp);
+void hw_breakpoint_pmu_unthrottle(struct perf_event *bp);
+extern void flush_ptrace_hw_breakpoint(struct task_struct *tsk);
+
+extern struct pmu perf_ops_bp;
+extern void ptrace_triggered(struct perf_event *bp, int nmi,
+			struct perf_sample_data *data, struct pt_regs *regs);
+static inline void hw_breakpoint_disable(void)
+{
+	set_dabr(0);
+}
+
+#else
+static inline void hw_breakpoint_disable(void)
+{
+	/* Function is defined only on PPC64 for now */
+}
+#endif	/* CONFIG_PPC64 */
+#endif	/* __KERNEL__ */
+#endif	/* _PPC64_HW_BREAKPOINT_H */
Index: linux-2.6.ppc64_test/arch/powerpc/kernel/hw_breakpoint.c
===================================================================
--- /dev/null	1970-01-01 00:00:00.000000000 +0000
+++ linux-2.6.ppc64_test/arch/powerpc/kernel/hw_breakpoint.c	2010-02-15 02:10:43.000000000 +0530
@@ -0,0 +1,363 @@
+/*
+ * HW_breakpoint: a unified kernel/user-space hardware breakpoint facility,
+ * using the CPU's debug registers. Derived from
+ * "arch/x86/kernel/hw_breakpoint.c"
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright 2009 IBM Corporation
+ * Author: K.Prasad <prasad@linux.vnet.ibm.com>
+ *
+ */
+
+#include <linux/hw_breakpoint.h>
+#include <linux/notifier.h>
+#include <linux/kprobes.h>
+#include <linux/percpu.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/sched.h>
+#include <linux/init.h>
+#include <linux/smp.h>
+
+#include <asm/hw_breakpoint.h>
+#include <asm/processor.h>
+#include <asm/sstep.h>
+
+/*
+ * Store the 'bp' that caused the hw-breakpoint exception just before we
+ * single-step. Used to distinguish a single-step exception (due to a previous
+ * hw-breakpoint exception) from a normal one
+ */
+static DEFINE_PER_CPU(struct perf_event *, last_hit_bp);
+
+/*
+ * Flag to denote if the kernel was already single-stepping. Used to
+ * conditionally unset the MSR_SE flag in the single-step exception
+ * following the breakpoint exception.
+ */
+static DEFINE_PER_CPU(bool, is_cpu_singlestep);
+
+/*
+ * Stores the breakpoints currently in use on each breakpoint address
+ * register for each cpus
+ */
+static DEFINE_PER_CPU(struct perf_event *, bp_per_reg);
+
+/*
+ * Install a perf counter breakpoint.
+ *
+ * We seek a free debug address register and use it for this
+ * breakpoint.
+ *
+ * Atomic: we hold the counter->ctx->lock and we only handle variables
+ * and registers local to this cpu.
+ */
+int arch_install_hw_breakpoint(struct perf_event *bp)
+{
+	struct arch_hw_breakpoint *info = counter_arch_bp(bp);
+	struct perf_event **slot = &__get_cpu_var(bp_per_reg);
+
+	if (!*slot)
+		*slot = bp;
+	else {
+		WARN_ONCE(1, "Can't find any breakpoint slot");
+		return -EBUSY;
+	}
+
+	set_dabr(info->address | info->type | DABR_TRANSLATION);
+	return 0;
+}
+
+/*
+ * Uninstall the breakpoint contained in the given counter.
+ *
+ * First we search the debug address register it uses and then we disable
+ * it.
+ *
+ * Atomic: we hold the counter->ctx->lock and we only handle variables
+ * and registers local to this cpu.
+ */
+void arch_uninstall_hw_breakpoint(struct perf_event *bp)
+{
+	struct perf_event **slot = &__get_cpu_var(bp_per_reg);
+
+	if (*slot == bp)
+		*slot = NULL;
+	else {
+		WARN_ONCE(1, "Can't find the breakpoint slot");
+		return;
+	}
+	set_dabr(0);
+}
+
+/*
+ * Validate the arch-specific HW Breakpoint register settings
+ */
+int arch_validate_hwbkpt_settings(struct perf_event *bp,
+						struct task_struct *tsk)
+{
+	int is_kernel, ret = -EINVAL;
+	struct arch_hw_breakpoint *info = counter_arch_bp(bp);
+
+	if (!bp)
+		return ret;
+
+	switch (bp->attr.bp_type) {
+	case HW_BREAKPOINT_R:
+		info->type = DABR_DATA_READ;
+		break;
+	case HW_BREAKPOINT_W:
+		info->type = DABR_DATA_WRITE;
+		break;
+	case HW_BREAKPOINT_R | HW_BREAKPOINT_W:
+		info->type = (DABR_DATA_READ | DABR_DATA_WRITE);
+		break;
+	default:
+		return ret;
+	}
+	/* TODO: Check for a valid triggered function */
+	/* if (!bp->triggered)
+		return -EINVAL; */
+
+	is_kernel = is_kernel_addr(bp->attr.bp_addr);
+	if ((tsk && is_kernel) || (!tsk && !is_kernel))
+		return -EINVAL;
+
+	info->address = bp->attr.bp_addr;
+	info->len = bp->attr.bp_len;
+
+	/*
+	 * Since breakpoint length can be a maximum of HW_BREAKPOINT_LEN(8)
+	 * and breakpoint addresses are aligned to nearest double-word
+	 * HW_BREAKPOINT_ALIGN by rounding off to the lower address, the
+	 * 'symbolsize' should satisfy the check below.
+	 */
+	if (info->len >
+	    (HW_BREAKPOINT_LEN - (info->address & HW_BREAKPOINT_ALIGN)))
+		return -EINVAL;
+	return 0;
+}
+
+/*
+ * Handle debug exception notifications.
+ */
+int __kprobes hw_breakpoint_handler(struct die_args *args)
+{
+	int rc = NOTIFY_STOP;
+	struct perf_event *bp;
+	struct pt_regs *regs = args->regs;
+	unsigned long dar = regs->dar;
+	int cpu, is_kernel, stepped = 1;
+	struct arch_hw_breakpoint *info;
+
+	/* Disable breakpoints during exception handling */
+	set_dabr(0);
+	cpu = get_cpu();
+	/*
+	 * The counter may be concurrently released but that can only
+	 * occur from a call_rcu() path. We can then safely fetch
+	 * the breakpoint, use its callback, touch its counter
+	 * while we are in an rcu_read_lock() path.
+	 */
+	rcu_read_lock();
+
+	bp = per_cpu(bp_per_reg, cpu);
+	if (!bp)
+		goto out;
+	info = counter_arch_bp(bp);
+	is_kernel = is_kernel_addr(bp->attr.bp_addr);
+
+	/*
+	 * Verify if dar lies within the address range occupied by the symbol
+	 * being watched to filter extraneous exceptions.
+	 */
+	    if (!((bp->attr.bp_addr <= dar) &&
+	    (dar <= (bp->attr.bp_addr + bp->attr.bp_len))))
+		/*
+		 * This exception is triggered not because of a memory access on
+		 * the monitored variable but in the double-word address range
+		 * in which it is contained. We will consume this exception,
+		 * considering it as 'noise'.
+		 */
+		goto out;
+
+	/*
+	 * Return early after invoking user-callback function without restoring
+	 * DABR if the breakpoint is from ptrace which always operates in
+	 * one-shot mode
+	 */
+	if (bp->overflow_handler == ptrace_triggered) {
+		(bp->overflow_handler)(bp, 0, NULL, regs);
+		rc = NOTIFY_DONE;
+		goto out;
+	}
+
+	/*
+	 * Do not emulate user-space instructions from kernel-space,
+	 * instead single-step them.
+	 */
+	if (!is_kernel) {
+		current->thread.last_hit_ubp = bp;
+		goto out;
+	}
+
+	/*
+	 * Emulating single-step over causative instruction if already
+	 * in MSR_SE mode will harm other users of kernel single-stepping
+	 * (like an in-kernel debugger), single-step over them.
+	 */
+
+	if (regs->msr & MSR_SE) {
+		per_cpu(is_cpu_singlestep, cpu) = true;
+		per_cpu(last_hit_bp, cpu) = bp;
+		goto out;
+	}
+
+	stepped = emulate_step(regs, regs->nip);
+	 /* emulate_step() could not execute it, single-step them */
+	if (stepped == 0) {
+		regs->msr |= MSR_SE;
+		if (is_kernel)
+			per_cpu(last_hit_bp, cpu) = bp;
+		else
+			current->thread.last_hit_ubp = bp;
+		per_cpu(last_hit_bp, cpu) = bp;
+		goto out;
+	}
+	/*
+	 * As a policy, the callback is invoked in a 'trigger-after-execute'
+	 * fashion
+	 */
+	(bp->overflow_handler)(bp, 0, NULL, regs);
+	set_dabr(info->address | info->type | DABR_TRANSLATION);
+	per_cpu(is_cpu_singlestep, cpu) = false;
+out:
+	rcu_read_unlock();
+	put_cpu();
+	return rc;
+}
+
+/*
+ * Handle single-step exceptions following a DABR hit.
+ */
+int __kprobes single_step_dabr_instruction(struct die_args *args)
+{
+	struct pt_regs *regs = args->regs;
+	int cpu = get_cpu();
+	int ret = NOTIFY_DONE;
+	siginfo_t info;
+	struct perf_event *bp = NULL, *kernel_bp, *user_bp;
+	struct arch_hw_breakpoint *bp_info;
+
+	/*
+	 * Identify the cause of single-stepping and find the corresponding
+	 * breakpoint structure
+	 */
+	kernel_bp = per_cpu(last_hit_bp, cpu);
+	user_bp = current->thread.last_hit_ubp;
+	if (kernel_bp) {
+		bp = kernel_bp;
+		per_cpu(last_hit_bp, cpu) = NULL;
+	} else if (user_bp) {
+		bp = user_bp;
+		current->thread.last_hit_ubp = NULL;
+	}
+
+	/*
+	 * Check if we are single-stepping as a result of a
+	 * previous HW Breakpoint exception
+	 */
+	if (!bp)
+		goto out;
+
+	bp_info = counter_arch_bp(bp);
+
+	/*
+	 * We shall invoke the user-defined callback function in the single
+	 * stepping handler to confirm to 'trigger-after-execute' semantics
+	 */
+	(bp->overflow_handler)(bp, 0, NULL, regs);
+
+	if (!(per_cpu(is_cpu_singlestep, cpu) ||
+	      test_thread_flag(TIF_SINGLESTEP)))
+		regs->msr &= ~MSR_SE;
+
+	/* Deliver signal to user-space */
+	if (!is_kernel_addr(bp->attr.bp_addr)) {
+		info.si_signo = SIGTRAP;
+		info.si_errno = 0;
+		info.si_code = TRAP_HWBKPT;
+		info.si_addr = (void __user *)bp_info->address;
+		force_sig_info(SIGTRAP, &info, current);
+	}
+
+	set_dabr(bp_info->address | bp_info->type | DABR_TRANSLATION);
+	/*
+	 * The kernel was in single-step mode before hw-breakpoint
+	 * exception, allow them to process the single-step exception further.
+	 */
+	if (per_cpu(is_cpu_singlestep, cpu)) {
+		per_cpu(is_cpu_singlestep, cpu) = false;
+		ret = NOTIFY_DONE;
+	} else
+		ret = NOTIFY_STOP;
+out:
+	put_cpu();
+	return ret;
+}
+
+/*
+ * Handle debug exception notifications.
+ */
+int __kprobes hw_breakpoint_exceptions_notify(
+		struct notifier_block *unused, unsigned long val, void *data)
+{
+	int ret = NOTIFY_DONE;
+
+	switch (val) {
+	case DIE_DABR_MATCH:
+		ret = hw_breakpoint_handler(data);
+		break;
+	case DIE_SSTEP:
+		ret = single_step_dabr_instruction(data);
+		break;
+	}
+
+	return ret;
+}
+
+/*
+ * Release the user breakpoints used by ptrace
+ */
+void flush_ptrace_hw_breakpoint(struct task_struct *tsk)
+{
+	struct thread_struct *t = &tsk->thread;
+
+	unregister_hw_breakpoint(t->ptrace_bps[0]);
+	t->ptrace_bps[0] = NULL;
+}
+
+void hw_breakpoint_pmu_read(struct perf_event *bp)
+{
+	/* TODO */
+}
+
+void hw_breakpoint_pmu_unthrottle(struct perf_event *bp)
+{
+	/* TODO */
+}
+
+
Index: linux-2.6.ppc64_test/arch/powerpc/Kconfig
===================================================================
--- linux-2.6.ppc64_test.orig/arch/powerpc/Kconfig	2010-02-03 07:37:58.000000000 +0530
+++ linux-2.6.ppc64_test/arch/powerpc/Kconfig	2010-02-15 02:10:43.000000000 +0530
@@ -140,6 +140,7 @@
 	select HAVE_SYSCALL_WRAPPERS if PPC64
 	select GENERIC_ATOMIC64 if PPC32
 	select HAVE_PERF_EVENTS
+	select HAVE_HW_BREAKPOINT if PPC64
 
 config EARLY_PRINTK
 	bool
Index: linux-2.6.ppc64_test/arch/powerpc/kernel/Makefile
===================================================================
--- linux-2.6.ppc64_test.orig/arch/powerpc/kernel/Makefile	2010-02-03 07:37:59.000000000 +0530
+++ linux-2.6.ppc64_test/arch/powerpc/kernel/Makefile	2010-02-15 02:10:43.000000000 +0530
@@ -33,7 +33,7 @@
 obj-y				+= vdso32/
 obj-$(CONFIG_PPC64)		+= setup_64.o sys_ppc32.o \
 				   signal_64.o ptrace32.o \
-				   paca.o nvram_64.o firmware.o
+				   paca.o nvram_64.o firmware.o hw_breakpoint.o
 obj-$(CONFIG_PPC_BOOK3S_64)	+= cpu_setup_ppc970.o cpu_setup_pa6t.o
 obj64-$(CONFIG_RELOCATABLE)	+= reloc_64.o
 obj-$(CONFIG_PPC_BOOK3E_64)	+= exceptions-64e.o
Index: linux-2.6.ppc64_test/arch/powerpc/include/asm/reg.h
===================================================================
--- linux-2.6.ppc64_test.orig/arch/powerpc/include/asm/reg.h	2010-02-03 07:37:59.000000000 +0530
+++ linux-2.6.ppc64_test/arch/powerpc/include/asm/reg.h	2010-02-15 02:10:43.000000000 +0530
@@ -180,6 +180,7 @@
 #define   CTRL_TE	0x00c00000	/* thread enable */
 #define   CTRL_RUNLATCH	0x1
 #define SPRN_DABR	0x3F5	/* Data Address Breakpoint Register */
+#define   HBP_NUM	1	/* Number of physical HW breakpoint registers */
 #define   DABR_TRANSLATION	(1UL << 2)
 #define   DABR_DATA_WRITE	(1UL << 1)
 #define   DABR_DATA_READ	(1UL << 0)
Index: linux-2.6.ppc64_test/arch/powerpc/mm/fault.c
===================================================================
--- linux-2.6.ppc64_test.orig/arch/powerpc/mm/fault.c	2010-02-03 07:37:59.000000000 +0530
+++ linux-2.6.ppc64_test/arch/powerpc/mm/fault.c	2010-02-15 02:10:43.000000000 +0530
@@ -137,6 +137,12 @@
 		error_code &= 0x48200000;
 	else
 		is_write = error_code & DSISR_ISSTORE;
+
+	if (error_code & DSISR_DABRMATCH) {
+		/* DABR match */
+		do_dabr(regs, address, error_code);
+		return 0;
+	}
 #else
 	is_write = error_code & ESR_DST;
 #endif /* CONFIG_4xx || CONFIG_BOOKE */
@@ -151,14 +157,6 @@
 	if (!user_mode(regs) && (address >= TASK_SIZE))
 		return SIGSEGV;
 
-#if !(defined(CONFIG_4xx) || defined(CONFIG_BOOKE))
-  	if (error_code & DSISR_DABRMATCH) {
-		/* DABR match */
-		do_dabr(regs, address, error_code);
-		return 0;
-	}
-#endif /* !(CONFIG_4xx || CONFIG_BOOKE)*/
-
 	if (in_atomic() || mm == NULL) {
 		if (!user_mode(regs))
 			return SIGSEGV;
Index: linux-2.6.ppc64_test/arch/powerpc/include/asm/processor.h
===================================================================
--- linux-2.6.ppc64_test.orig/arch/powerpc/include/asm/processor.h	2010-02-03 07:37:59.000000000 +0530
+++ linux-2.6.ppc64_test/arch/powerpc/include/asm/processor.h	2010-02-15 02:10:43.000000000 +0530
@@ -177,6 +177,12 @@
 #ifdef CONFIG_PPC64
 	unsigned long	start_tb;	/* Start purr when proc switched in */
 	unsigned long	accum_tb;	/* Total accumilated purr for process */
+	struct perf_event *ptrace_bps[HBP_NUM];
+	/*
+	 * Point to the hw-breakpoint last. Helps safe pre-emption and
+	 * hw-breakpoint re-enablement.
+	 */
+	struct perf_event *last_hit_ubp;
 #endif
 	unsigned long	dabr;		/* Data address breakpoint register */
 #ifdef CONFIG_ALTIVEC
Index: linux-2.6.ppc64_test/arch/powerpc/kernel/ptrace.c
===================================================================
--- linux-2.6.ppc64_test.orig/arch/powerpc/kernel/ptrace.c	2010-02-03 07:37:59.000000000 +0530
+++ linux-2.6.ppc64_test/arch/powerpc/kernel/ptrace.c	2010-02-15 02:11:24.000000000 +0530
@@ -32,6 +32,8 @@
 #ifdef CONFIG_PPC32
 #include <linux/module.h>
 #endif
+#include <linux/hw_breakpoint.h>
+#include <linux/perf_event.h>
 
 #include <asm/uaccess.h>
 #include <asm/page.h>
@@ -755,9 +757,32 @@
 	clear_tsk_thread_flag(task, TIF_SINGLESTEP);
 }
 
+void ptrace_triggered(struct perf_event *bp, int nmi,
+		      struct perf_sample_data *data, struct pt_regs *regs)
+{
+	struct perf_event_attr attr;
+
+	/*
+	 * Disable the breakpoint request here since ptrace has defined a
+	 * one-shot behaviour for breakpoint exceptions in PPC64.
+	 * The SIGTRAP signal is generated automatically for us in do_dabr().
+	 * We don't have to do anything about that here
+	 */
+	attr = bp->attr;
+	attr.disabled = true;
+	modify_user_hw_breakpoint(bp, &attr);
+}
+
 int ptrace_set_debugreg(struct task_struct *task, unsigned long addr,
 			       unsigned long data)
 {
+#ifdef CONFIG_PPC64
+	int ret;
+	struct thread_struct *thread = &(task->thread);
+	struct perf_event *bp;
+	struct perf_event_attr attr;
+#endif /* CONFIG_PPC64 */
+
 	/* For ppc64 we support one DABR and no IABR's at the moment (ppc64).
 	 *  For embedded processors we support one DAC and no IAC's at the
 	 *  moment.
@@ -786,6 +811,60 @@
 	/* Ensure breakpoint translation bit is set */
 	if (data && !(data & DABR_TRANSLATION))
 		return -EIO;
+#ifdef CONFIG_PPC64
+	bp = thread->ptrace_bps[0];
+	if (data == 0) {
+		if (bp) {
+			unregister_hw_breakpoint(bp);
+			thread->ptrace_bps[0] = NULL;
+		}
+		return 0;
+	}
+	if (bp) {
+		attr = bp->attr;
+		attr.bp_addr = data & ~HW_BREAKPOINT_ALIGN;
+
+		switch (data & (DABR_DATA_WRITE | DABR_DATA_READ)) {
+		case DABR_DATA_READ:
+			attr.bp_type = HW_BREAKPOINT_R;
+			break;
+		case DABR_DATA_WRITE:
+			attr.bp_type = HW_BREAKPOINT_W;
+			break;
+		case (DABR_DATA_WRITE | DABR_DATA_READ):
+			attr.bp_type = HW_BREAKPOINT_R | HW_BREAKPOINT_W;
+			break;
+		}
+		ret =  modify_user_hw_breakpoint(bp, &attr);
+		if (ret)
+			return ret;
+		thread->ptrace_bps[0] = bp;
+		thread->dabr = data;
+		return 0;
+	}
+
+	/* Create a new breakpoint request if one doesn't exist already */
+	hw_breakpoint_init(&attr);
+	attr.bp_addr = data & ~HW_BREAKPOINT_ALIGN;
+	switch (data & (DABR_DATA_WRITE | DABR_DATA_READ)) {
+	case DABR_DATA_READ:
+		attr.bp_type = HW_BREAKPOINT_R;
+		break;
+	case DABR_DATA_WRITE:
+		attr.bp_type = HW_BREAKPOINT_W;
+		break;
+	case (DABR_DATA_WRITE | DABR_DATA_READ):
+		attr.bp_type = HW_BREAKPOINT_R | HW_BREAKPOINT_W;
+		break;
+	}
+	thread->ptrace_bps[0] = bp = register_user_hw_breakpoint(&attr,
+							ptrace_triggered, task);
+	if (IS_ERR(bp)) {
+		thread->ptrace_bps[0] = NULL;
+		return PTR_ERR(bp);
+	}
+
+#endif /* CONFIG_PPC64 */
 
 	/* Move contents to the DABR register */
 	task->thread.dabr = data;
Index: linux-2.6.ppc64_test/arch/powerpc/kernel/process.c
===================================================================
--- linux-2.6.ppc64_test.orig/arch/powerpc/kernel/process.c	2010-02-03 07:37:59.000000000 +0530
+++ linux-2.6.ppc64_test/arch/powerpc/kernel/process.c	2010-02-15 02:10:43.000000000 +0530
@@ -48,6 +48,7 @@
 #include <asm/machdep.h>
 #include <asm/time.h>
 #include <asm/syscalls.h>
+#include <asm/hw_breakpoint.h>
 #ifdef CONFIG_PPC64
 #include <asm/firmware.h>
 #endif
@@ -376,8 +377,11 @@
 	if (new->thread.dabr)
 		set_dabr(new->thread.dabr);
 #else
+/* For PPC64, we use the hw-breakpoint interfaces that would schedule DABR */
+#ifndef CONFIG_PPC64
 	if (unlikely(__get_cpu_var(current_dabr) != new->thread.dabr))
 		set_dabr(new->thread.dabr);
+#endif /* CONFIG_PPC64 */
 #endif
 
 
@@ -396,6 +400,7 @@
 		old_thread->accum_tb += (current_tb - start_tb);
 		new_thread->start_tb = current_tb;
 	}
+	flush_ptrace_hw_breakpoint(current);
 #endif
 
 	local_irq_save(flags);

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch 1/1] PPC64-HWBKPT: Implement hw-breakpoints for PPC64
  2010-01-19 10:03           ` Roland McGrath
@ 2010-01-22  7:14             ` K.Prasad
  0 siblings, 0 replies; 26+ messages in thread
From: K.Prasad @ 2010-01-22  7:14 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Michael Neuling, Benjamin Herrenschmidt, Frederic Weisbecker,
	David Gibson, linuxppc-dev, Alan Stern, paulus

On Tue, Jan 19, 2010 at 02:03:35AM -0800, Roland McGrath wrote:
> > It is also not clear to me if disabling pre-emption for the user-space
> > (albeit for a very tiny time-window) is incorrect and if their side-effects
> > are known. If otherwise, I think we should choose to operate in pre-empt
> > safe mode and avoid all costs associated when done without it.
> 
> I never really gave much consideration to returning to user mode with
> preemption disabled.  It would not really have occurred to me that was
> even possible.  I can't say it seems to me like it could ever be a very
> good idea.  I find it hard even to start listing the cans of worms that
> might be opened by that.  Perhaps the powerpc maintainers have a clearer
> picture of it than I do.
> 
> What does it mean when there is something that prevents it from returning
> to user mode?  i.e., TIF_SIGPENDING or TIF_NEED_RESCHED, or whatever.  It
> could do a lot in the kernel before it gets back to user mode.  What if in
> there somewhere it blocks voluntarily?
> 
> Similarly, what does it mean if you get to user mode but the single-stepped
> instruction is a load/store that gets a page fault?  What if it blocks in
> the page fault handler?
> 
> For that matter, what about a page fault for the kernel-mode case?
> 
> Perhaps I'm imagining gremlins where there aren't any, but I just cannot
> really get my head around this "disable preemption while running some
> unknown instruction that normally runs with preemption enabled" thing--let
> alone "disable preemption while returning to user mode".
> 
> 
> Thanks,
> Roland

I posted a patch which re-enables pre-emption after a hw-breakpoint is
processed (linuxppc-dev ref: 20100121084640.GA3252@in.ibm.com). It does
lead to clumsiness (due to the new variables to track states, prior
breakpoints, etc.) but with the reasons you pointed out, it is much
better than having uncertain/incorrect code.

Thanks for your comments.
-- K.Prasad

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Patch 1/1] PPC64-HWBKPT: Implement hw-breakpoints for PPC64
  2010-01-21  8:46 [Patch 0/1] PPC64-HWBKPT: Hardware Breakpoint interfaces - ver XII K.Prasad
@ 2010-01-21  8:49 ` K.Prasad
  0 siblings, 0 replies; 26+ messages in thread
From: K.Prasad @ 2010-01-21  8:49 UTC (permalink / raw)
  To: linuxppc-dev, David Gibson, Roland McGrath
  Cc: Michael Neuling, Benjamin Herrenschmidt, shaggy,
	Frederic Weisbecker, Alan Stern, paulus

Implement perf-events based hw-breakpoint interfaces for PPC64 processors.
These interfaces help arbitrate requests from various users and schedules
them as appropriate.

Signed-off-by: K.Prasad <prasad@linux.vnet.ibm.com>
---
 arch/powerpc/Kconfig                     |    1 
 arch/powerpc/include/asm/hw_breakpoint.h |   55 ++++
 arch/powerpc/include/asm/processor.h     |    6 
 arch/powerpc/include/asm/reg.h           |    1 
 arch/powerpc/kernel/Makefile             |    2 
 arch/powerpc/kernel/hw_breakpoint.c      |  363 +++++++++++++++++++++++++++++++
 arch/powerpc/kernel/process.c            |    5 
 arch/powerpc/kernel/ptrace.c             |   77 ++++++
 arch/powerpc/mm/fault.c                  |   14 -
 9 files changed, 515 insertions(+), 9 deletions(-)

Index: linux-2.6-tip.new_ppc64_perf/arch/powerpc/include/asm/hw_breakpoint.h
===================================================================
--- /dev/null
+++ linux-2.6-tip.new_ppc64_perf/arch/powerpc/include/asm/hw_breakpoint.h
@@ -0,0 +1,55 @@
+#ifndef	_PPC64_HW_BREAKPOINT_H
+#define	_PPC64_HW_BREAKPOINT_H
+
+#ifdef	__KERNEL__
+#define	__ARCH_HW_BREAKPOINT_H
+#ifdef CONFIG_PPC64
+
+struct arch_hw_breakpoint {
+	u8		len; /* length of the target symbol */
+	int		type;
+	char		*name; /* Contains name of the symbol to set bkpt */
+	unsigned long	address;
+};
+
+#include <linux/kdebug.h>
+#include <asm/reg.h>
+#include <asm/system.h>
+
+/* Total number of available HW breakpoint registers */
+#define HBP_NUM 1
+
+struct perf_event;
+struct pmu;
+struct perf_sample_data;
+
+#define HW_BREAKPOINT_ALIGN 0x7
+/* Maximum permissible length of any HW Breakpoint */
+#define HW_BREAKPOINT_LEN 0x8
+
+extern int arch_validate_hwbkpt_settings(struct perf_event *bp,
+						struct task_struct *tsk);
+extern int hw_breakpoint_exceptions_notify(struct notifier_block *unused,
+						unsigned long val, void *data);
+int arch_install_hw_breakpoint(struct perf_event *bp);
+void arch_uninstall_hw_breakpoint(struct perf_event *bp);
+void hw_breakpoint_pmu_read(struct perf_event *bp);
+void hw_breakpoint_pmu_unthrottle(struct perf_event *bp);
+extern void flush_ptrace_hw_breakpoint(struct task_struct *tsk);
+
+extern struct pmu perf_ops_bp;
+extern void ptrace_triggered(struct perf_event *bp, int nmi,
+			struct perf_sample_data *data, struct pt_regs *regs);
+static inline void hw_breakpoint_disable(void)
+{
+	set_dabr(0);
+}
+
+#else
+static inline void hw_breakpoint_disable(void)
+{
+	/* Function is defined only on PPC64 for now */
+}
+#endif	/* CONFIG_PPC64 */
+#endif	/* __KERNEL__ */
+#endif	/* _PPC64_HW_BREAKPOINT_H */
Index: linux-2.6-tip.new_ppc64_perf/arch/powerpc/kernel/hw_breakpoint.c
===================================================================
--- /dev/null
+++ linux-2.6-tip.new_ppc64_perf/arch/powerpc/kernel/hw_breakpoint.c
@@ -0,0 +1,363 @@
+/*
+ * HW_breakpoint: a unified kernel/user-space hardware breakpoint facility,
+ * using the CPU's debug registers. Derived from
+ * "arch/x86/kernel/hw_breakpoint.c"
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright 2009 IBM Corporation
+ * Author: K.Prasad <prasad@linux.vnet.ibm.com>
+ *
+ */
+
+#include <linux/hw_breakpoint.h>
+#include <linux/notifier.h>
+#include <linux/kprobes.h>
+#include <linux/percpu.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/sched.h>
+#include <linux/init.h>
+#include <linux/smp.h>
+
+#include <asm/hw_breakpoint.h>
+#include <asm/processor.h>
+#include <asm/sstep.h>
+
+/*
+ * Store the 'bp' that caused the hw-breakpoint exception just before we
+ * single-step. Used to distinguish a single-step exception (due to a previous
+ * hw-breakpoint exception) from a normal one
+ */
+static DEFINE_PER_CPU(struct perf_event *, last_hit_bp);
+
+/*
+ * Flag to denote if the kernel was already single-stepping. Used to
+ * conditionally unset the MSR_SE flag in the single-step exception
+ * following the breakpoint exception.
+ */
+static DEFINE_PER_CPU(bool, is_cpu_singlestep);
+
+/*
+ * Stores the breakpoints currently in use on each breakpoint address
+ * register for each cpus
+ */
+static DEFINE_PER_CPU(struct perf_event *, bp_per_reg);
+
+/*
+ * Install a perf counter breakpoint.
+ *
+ * We seek a free debug address register and use it for this
+ * breakpoint.
+ *
+ * Atomic: we hold the counter->ctx->lock and we only handle variables
+ * and registers local to this cpu.
+ */
+int arch_install_hw_breakpoint(struct perf_event *bp)
+{
+	struct arch_hw_breakpoint *info = counter_arch_bp(bp);
+	struct perf_event **slot = &__get_cpu_var(bp_per_reg);
+
+	if (!*slot)
+		*slot = bp;
+	else {
+		WARN_ONCE(1, "Can't find any breakpoint slot");
+		return -EBUSY;
+	}
+
+	set_dabr(info->address | info->type | DABR_TRANSLATION);
+	return 0;
+}
+
+/*
+ * Uninstall the breakpoint contained in the given counter.
+ *
+ * First we search the debug address register it uses and then we disable
+ * it.
+ *
+ * Atomic: we hold the counter->ctx->lock and we only handle variables
+ * and registers local to this cpu.
+ */
+void arch_uninstall_hw_breakpoint(struct perf_event *bp)
+{
+	struct perf_event **slot = &__get_cpu_var(bp_per_reg);
+
+	if (*slot == bp)
+		*slot = NULL;
+	else {
+		WARN_ONCE(1, "Can't find the breakpoint slot");
+		return;
+	}
+	set_dabr(0);
+}
+
+/*
+ * Validate the arch-specific HW Breakpoint register settings
+ */
+int arch_validate_hwbkpt_settings(struct perf_event *bp,
+						struct task_struct *tsk)
+{
+	int is_kernel, ret = -EINVAL;
+	struct arch_hw_breakpoint *info = counter_arch_bp(bp);
+
+	if (!bp)
+		return ret;
+
+	switch (bp->attr.bp_type) {
+	case HW_BREAKPOINT_R:
+		info->type = DABR_DATA_READ;
+		break;
+	case HW_BREAKPOINT_W:
+		info->type = DABR_DATA_WRITE;
+		break;
+	case HW_BREAKPOINT_R | HW_BREAKPOINT_W:
+		info->type = (DABR_DATA_READ | DABR_DATA_WRITE);
+		break;
+	default:
+		return ret;
+	}
+	/* TODO: Check for a valid triggered function */
+	/* if (!bp->triggered)
+		return -EINVAL; */
+
+	is_kernel = is_kernel_addr(bp->attr.bp_addr);
+	if ((tsk && is_kernel) || (!tsk && !is_kernel))
+		return -EINVAL;
+
+	info->address = bp->attr.bp_addr;
+	info->len = bp->attr.bp_len;
+
+	/*
+	 * Since breakpoint length can be a maximum of HW_BREAKPOINT_LEN(8)
+	 * and breakpoint addresses are aligned to nearest double-word
+	 * HW_BREAKPOINT_ALIGN by rounding off to the lower address, the
+	 * 'symbolsize' should satisfy the check below.
+	 */
+	if (info->len >
+	    (HW_BREAKPOINT_LEN - (info->address & HW_BREAKPOINT_ALIGN)))
+		return -EINVAL;
+	return 0;
+}
+
+/*
+ * Handle debug exception notifications.
+ */
+int __kprobes hw_breakpoint_handler(struct die_args *args)
+{
+	int rc = NOTIFY_STOP;
+	struct perf_event *bp;
+	struct pt_regs *regs = args->regs;
+	unsigned long dar = regs->dar;
+	int cpu, is_kernel, stepped = 1;
+	struct arch_hw_breakpoint *info;
+
+	/* Disable breakpoints during exception handling */
+	set_dabr(0);
+	cpu = get_cpu();
+	/*
+	 * The counter may be concurrently released but that can only
+	 * occur from a call_rcu() path. We can then safely fetch
+	 * the breakpoint, use its callback, touch its counter
+	 * while we are in an rcu_read_lock() path.
+	 */
+	rcu_read_lock();
+
+	bp = per_cpu(bp_per_reg, cpu);
+	if (!bp)
+		goto out;
+	info = counter_arch_bp(bp);
+	is_kernel = is_kernel_addr(bp->attr.bp_addr);
+
+	/*
+	 * Verify if dar lies within the address range occupied by the symbol
+	 * being watched to filter extraneous exceptions.
+	 */
+	    if (!((bp->attr.bp_addr <= dar) &&
+	    (dar <= (bp->attr.bp_addr + bp->attr.bp_len))))
+		/*
+		 * This exception is triggered not because of a memory access on
+		 * the monitored variable but in the double-word address range
+		 * in which it is contained. We will consume this exception,
+		 * considering it as 'noise'.
+		 */
+		goto out;
+
+	/*
+	 * Return early after invoking user-callback function without restoring
+	 * DABR if the breakpoint is from ptrace which always operates in
+	 * one-shot mode
+	 */
+	if (bp->overflow_handler == ptrace_triggered) {
+		(bp->overflow_handler)(bp, 0, NULL, regs);
+		rc = NOTIFY_DONE;
+		goto out;
+	}
+
+	/*
+	 * Do not emulate user-space instructions from kernel-space,
+	 * instead single-step them.
+	 */
+	if (!is_kernel) {
+		current->thread.last_hit_ubp = bp;
+		goto out;
+	}
+
+	/*
+	 * Emulating single-step over causative instruction if already
+	 * in MSR_SE mode will harm other users of kernel single-stepping
+	 * (like an in-kernel debugger), single-step over them.
+	 */
+
+	if (regs->msr & MSR_SE) {
+		per_cpu(is_cpu_singlestep, cpu) = true;
+		per_cpu(last_hit_bp, cpu) = bp;
+		goto out;
+	}
+
+	stepped = emulate_step(regs, regs->nip);
+	 /* emulate_step() could not execute it, single-step them */
+	if (stepped == 0) {
+		regs->msr |= MSR_SE;
+		if (is_kernel)
+			per_cpu(last_hit_bp, cpu) = bp;
+		else
+			current->thread.last_hit_ubp = bp;
+		per_cpu(last_hit_bp, cpu) = bp;
+		goto out;
+	}
+	/*
+	 * As a policy, the callback is invoked in a 'trigger-after-execute'
+	 * fashion
+	 */
+	(bp->overflow_handler)(bp, 0, NULL, regs);
+	set_dabr(info->address | info->type | DABR_TRANSLATION);
+	per_cpu(is_cpu_singlestep, cpu) = false;
+out:
+	rcu_read_unlock();
+	put_cpu();
+	return rc;
+}
+
+/*
+ * Handle single-step exceptions following a DABR hit.
+ */
+int __kprobes single_step_dabr_instruction(struct die_args *args)
+{
+	struct pt_regs *regs = args->regs;
+	int cpu = get_cpu();
+	int ret = NOTIFY_DONE;
+	siginfo_t info;
+	struct perf_event *bp = NULL, *kernel_bp, *user_bp;
+	struct arch_hw_breakpoint *bp_info;
+
+	/*
+	 * Identify the cause of single-stepping and find the corresponding
+	 * breakpoint structure
+	 */
+	kernel_bp = per_cpu(last_hit_bp, cpu);
+	user_bp = current->thread.last_hit_ubp;
+	if (kernel_bp) {
+		bp = kernel_bp;
+		per_cpu(last_hit_bp, cpu) = NULL;
+	} else if (user_bp) {
+		bp = user_bp;
+		current->thread.last_hit_ubp = NULL;
+	}
+
+	/*
+	 * Check if we are single-stepping as a result of a
+	 * previous HW Breakpoint exception
+	 */
+	if (!bp)
+		goto out;
+
+	bp_info = counter_arch_bp(bp);
+
+	/*
+	 * We shall invoke the user-defined callback function in the single
+	 * stepping handler to confirm to 'trigger-after-execute' semantics
+	 */
+	(bp->overflow_handler)(bp, 0, NULL, regs);
+
+	if (!(per_cpu(is_cpu_singlestep, cpu) ||
+	      test_thread_flag(TIF_SINGLESTEP)))
+		regs->msr &= ~MSR_SE;
+
+	/* Deliver signal to user-space */
+	if (!is_kernel_addr(bp->attr.bp_addr)) {
+		info.si_signo = SIGTRAP;
+		info.si_errno = 0;
+		info.si_code = TRAP_HWBKPT;
+		info.si_addr = (void __user *)bp_info->address;
+		force_sig_info(SIGTRAP, &info, current);
+	}
+
+	set_dabr(bp_info->address | bp_info->type | DABR_TRANSLATION);
+	/*
+	 * The kernel was in single-step mode before hw-breakpoint
+	 * exception, allow them to process the single-step exception further.
+	 */
+	if (per_cpu(is_cpu_singlestep, cpu)) {
+		per_cpu(is_cpu_singlestep, cpu) = false;
+		ret = NOTIFY_DONE;
+	} else
+		ret = NOTIFY_STOP;
+out:
+	put_cpu();
+	return ret;
+}
+
+/*
+ * Handle debug exception notifications.
+ */
+int __kprobes hw_breakpoint_exceptions_notify(
+		struct notifier_block *unused, unsigned long val, void *data)
+{
+	int ret = NOTIFY_DONE;
+
+	switch (val) {
+	case DIE_DABR_MATCH:
+		ret = hw_breakpoint_handler(data);
+		break;
+	case DIE_SSTEP:
+		ret = single_step_dabr_instruction(data);
+		break;
+	}
+
+	return ret;
+}
+
+/*
+ * Release the user breakpoints used by ptrace
+ */
+void flush_ptrace_hw_breakpoint(struct task_struct *tsk)
+{
+	struct thread_struct *t = &tsk->thread;
+
+	unregister_hw_breakpoint(t->ptrace_bps[0]);
+	t->ptrace_bps[0] = NULL;
+}
+
+void hw_breakpoint_pmu_read(struct perf_event *bp)
+{
+	/* TODO */
+}
+
+void hw_breakpoint_pmu_unthrottle(struct perf_event *bp)
+{
+	/* TODO */
+}
+
+
Index: linux-2.6-tip.new_ppc64_perf/arch/powerpc/Kconfig
===================================================================
--- linux-2.6-tip.new_ppc64_perf.orig/arch/powerpc/Kconfig
+++ linux-2.6-tip.new_ppc64_perf/arch/powerpc/Kconfig
@@ -130,6 +130,7 @@ config PPC
 	select HAVE_SYSCALL_WRAPPERS if PPC64
 	select GENERIC_ATOMIC64 if PPC32
 	select HAVE_PERF_EVENTS
+	select HAVE_HW_BREAKPOINT if PPC64
 
 config EARLY_PRINTK
 	bool
Index: linux-2.6-tip.new_ppc64_perf/arch/powerpc/kernel/Makefile
===================================================================
--- linux-2.6-tip.new_ppc64_perf.orig/arch/powerpc/kernel/Makefile
+++ linux-2.6-tip.new_ppc64_perf/arch/powerpc/kernel/Makefile
@@ -33,7 +33,7 @@ obj-y				:= cputable.o ptrace.o syscalls
 obj-y				+= vdso32/
 obj-$(CONFIG_PPC64)		+= setup_64.o sys_ppc32.o \
 				   signal_64.o ptrace32.o \
-				   paca.o nvram_64.o firmware.o
+				   paca.o nvram_64.o firmware.o hw_breakpoint.o
 obj-$(CONFIG_PPC_BOOK3S_64)	+= cpu_setup_ppc970.o cpu_setup_pa6t.o
 obj64-$(CONFIG_RELOCATABLE)	+= reloc_64.o
 obj-$(CONFIG_PPC_BOOK3E_64)	+= exceptions-64e.o
Index: linux-2.6-tip.new_ppc64_perf/arch/powerpc/include/asm/reg.h
===================================================================
--- linux-2.6-tip.new_ppc64_perf.orig/arch/powerpc/include/asm/reg.h
+++ linux-2.6-tip.new_ppc64_perf/arch/powerpc/include/asm/reg.h
@@ -180,6 +180,7 @@
 #define   CTRL_TE	0x00c00000	/* thread enable */
 #define   CTRL_RUNLATCH	0x1
 #define SPRN_DABR	0x3F5	/* Data Address Breakpoint Register */
+#define   HBP_NUM	1	/* Number of physical HW breakpoint registers */
 #define   DABR_TRANSLATION	(1UL << 2)
 #define   DABR_DATA_WRITE	(1UL << 1)
 #define   DABR_DATA_READ	(1UL << 0)
Index: linux-2.6-tip.new_ppc64_perf/arch/powerpc/mm/fault.c
===================================================================
--- linux-2.6-tip.new_ppc64_perf.orig/arch/powerpc/mm/fault.c
+++ linux-2.6-tip.new_ppc64_perf/arch/powerpc/mm/fault.c
@@ -137,6 +137,12 @@ int __kprobes do_page_fault(struct pt_re
 		error_code &= 0x48200000;
 	else
 		is_write = error_code & DSISR_ISSTORE;
+
+	if (error_code & DSISR_DABRMATCH) {
+		/* DABR match */
+		do_dabr(regs, address, error_code);
+		return 0;
+	}
 #else
 	is_write = error_code & ESR_DST;
 #endif /* CONFIG_4xx || CONFIG_BOOKE */
@@ -151,14 +157,6 @@ int __kprobes do_page_fault(struct pt_re
 	if (!user_mode(regs) && (address >= TASK_SIZE))
 		return SIGSEGV;
 
-#if !(defined(CONFIG_4xx) || defined(CONFIG_BOOKE))
-  	if (error_code & DSISR_DABRMATCH) {
-		/* DABR match */
-		do_dabr(regs, address, error_code);
-		return 0;
-	}
-#endif /* !(CONFIG_4xx || CONFIG_BOOKE)*/
-
 	if (in_atomic() || mm == NULL) {
 		if (!user_mode(regs))
 			return SIGSEGV;
Index: linux-2.6-tip.new_ppc64_perf/arch/powerpc/include/asm/processor.h
===================================================================
--- linux-2.6-tip.new_ppc64_perf.orig/arch/powerpc/include/asm/processor.h
+++ linux-2.6-tip.new_ppc64_perf/arch/powerpc/include/asm/processor.h
@@ -177,6 +177,12 @@ struct thread_struct {
 #ifdef CONFIG_PPC64
 	unsigned long	start_tb;	/* Start purr when proc switched in */
 	unsigned long	accum_tb;	/* Total accumilated purr for process */
+	struct perf_event *ptrace_bps[HBP_NUM];
+	/*
+	 * Point to the hw-breakpoint last. Helps safe pre-emption and
+	 * hw-breakpoint re-enablement.
+	 */
+	struct perf_event *last_hit_ubp;
 #endif
 	unsigned long	dabr;		/* Data address breakpoint register */
 #ifdef CONFIG_ALTIVEC
Index: linux-2.6-tip.new_ppc64_perf/arch/powerpc/kernel/ptrace.c
===================================================================
--- linux-2.6-tip.new_ppc64_perf.orig/arch/powerpc/kernel/ptrace.c
+++ linux-2.6-tip.new_ppc64_perf/arch/powerpc/kernel/ptrace.c
@@ -32,6 +32,8 @@
 #ifdef CONFIG_PPC32
 #include <linux/module.h>
 #endif
+#include <linux/hw_breakpoint.h>
+#include <linux/perf_event.h>
 
 #include <asm/uaccess.h>
 #include <asm/page.h>
@@ -755,9 +757,32 @@ void user_disable_single_step(struct tas
 	clear_tsk_thread_flag(task, TIF_SINGLESTEP);
 }
 
+void ptrace_triggered(struct perf_event *bp, int nmi,
+		      struct perf_sample_data *data, struct pt_regs *regs)
+{
+	struct perf_event_attr attr;
+
+	/*
+	 * Disable the breakpoint request here since ptrace has defined a
+	 * one-shot behaviour for breakpoint exceptions in PPC64.
+	 * The SIGTRAP signal is generated automatically for us in do_dabr().
+	 * We don't have to do anything about that here
+	 */
+	attr = bp->attr;
+	attr.disabled = true;
+	modify_user_hw_breakpoint(bp, &attr);
+}
+
 int ptrace_set_debugreg(struct task_struct *task, unsigned long addr,
 			       unsigned long data)
 {
+#ifdef CONFIG_PPC64
+	int ret;
+	struct thread_struct *thread = &(task->thread);
+	struct perf_event *bp;
+	struct perf_event_attr attr;
+#endif /* CONFIG_PPC64 */
+
 	/* For ppc64 we support one DABR and no IABR's at the moment (ppc64).
 	 *  For embedded processors we support one DAC and no IAC's at the
 	 *  moment.
@@ -786,6 +811,58 @@ int ptrace_set_debugreg(struct task_stru
 	/* Ensure breakpoint translation bit is set */
 	if (data && !(data & DABR_TRANSLATION))
 		return -EIO;
+#ifdef CONFIG_PPC64
+	bp = thread->ptrace_bps[0];
+	if (data == 0) {
+		if (bp) {
+			unregister_hw_breakpoint(bp);
+			thread->ptrace_bps[0] = NULL;
+		}
+		return 0;
+	}
+	if (bp) {
+		attr = bp->attr;
+		attr.bp_addr = data & ~HW_BREAKPOINT_ALIGN;
+
+		switch (data & (DABR_DATA_WRITE | DABR_DATA_READ)) {
+		case DABR_DATA_READ:
+			attr.bp_type = HW_BREAKPOINT_R;
+			break;
+		case DABR_DATA_WRITE:
+			attr.bp_type = HW_BREAKPOINT_W;
+			break;
+		case (DABR_DATA_WRITE | DABR_DATA_READ):
+			attr.bp_type = HW_BREAKPOINT_R | HW_BREAKPOINT_W;
+			break;
+		}
+		ret =  modify_user_hw_breakpoint(bp, &attr);
+		if (ret)
+			return ret;
+		thread->ptrace_bps[0] = bp;
+		thread->dabr = data;
+		return 0;
+	}
+
+	/* Create a new breakpoint request if one doesn't exist already */
+	hw_breakpoint_init(&attr);
+	attr.bp_addr = data & ~HW_BREAKPOINT_ALIGN;
+	switch (data & (DABR_DATA_WRITE | DABR_DATA_READ)) {
+	case DABR_DATA_READ:
+		attr.bp_type = HW_BREAKPOINT_R;
+		break;
+	case DABR_DATA_WRITE:
+		attr.bp_type = HW_BREAKPOINT_W;
+		break;
+	case (DABR_DATA_WRITE | DABR_DATA_READ):
+		attr.bp_type = HW_BREAKPOINT_R | HW_BREAKPOINT_W;
+		break;
+	}
+	thread->ptrace_bps[0] = bp = register_user_hw_breakpoint(&attr,
+							ptrace_triggered, task);
+	if (IS_ERR(bp))
+		return PTR_ERR(bp);
+
+#endif /* CONFIG_PPC64 */
 
 	/* Move contents to the DABR register */
 	task->thread.dabr = data;
Index: linux-2.6-tip.new_ppc64_perf/arch/powerpc/kernel/process.c
===================================================================
--- linux-2.6-tip.new_ppc64_perf.orig/arch/powerpc/kernel/process.c
+++ linux-2.6-tip.new_ppc64_perf/arch/powerpc/kernel/process.c
@@ -48,6 +48,7 @@
 #include <asm/machdep.h>
 #include <asm/time.h>
 #include <asm/syscalls.h>
+#include <asm/hw_breakpoint.h>
 #ifdef CONFIG_PPC64
 #include <asm/firmware.h>
 #endif
@@ -376,8 +377,11 @@ struct task_struct *__switch_to(struct t
 	if (new->thread.dabr)
 		set_dabr(new->thread.dabr);
 #else
+/* For PPC64, we use the hw-breakpoint interfaces that would schedule DABR */
+#ifndef CONFIG_PPC64
 	if (unlikely(__get_cpu_var(current_dabr) != new->thread.dabr))
 		set_dabr(new->thread.dabr);
+#endif /* CONFIG_PPC64 */
 #endif
 
 
@@ -564,6 +568,7 @@ void flush_thread(void)
 		else
 			set_ti_thread_flag(t, TIF_32BIT);
 	}
+	flush_ptrace_hw_breakpoint(current);
 #endif
 
 	discard_lazy_cpu_state();

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch 1/1] PPC64-HWBKPT: Implement hw-breakpoints for PPC64
  2010-01-19  9:40         ` K.Prasad
@ 2010-01-19 10:03           ` Roland McGrath
  2010-01-22  7:14             ` K.Prasad
  0 siblings, 1 reply; 26+ messages in thread
From: Roland McGrath @ 2010-01-19 10:03 UTC (permalink / raw)
  To: prasad
  Cc: Michael Neuling, Benjamin Herrenschmidt, Frederic Weisbecker,
	David Gibson, linuxppc-dev, Alan Stern, paulus

> It is also not clear to me if disabling pre-emption for the user-space
> (albeit for a very tiny time-window) is incorrect and if their side-effects
> are known. If otherwise, I think we should choose to operate in pre-empt
> safe mode and avoid all costs associated when done without it.

I never really gave much consideration to returning to user mode with
preemption disabled.  It would not really have occurred to me that was
even possible.  I can't say it seems to me like it could ever be a very
good idea.  I find it hard even to start listing the cans of worms that
might be opened by that.  Perhaps the powerpc maintainers have a clearer
picture of it than I do.

What does it mean when there is something that prevents it from returning
to user mode?  i.e., TIF_SIGPENDING or TIF_NEED_RESCHED, or whatever.  It
could do a lot in the kernel before it gets back to user mode.  What if in
there somewhere it blocks voluntarily?

Similarly, what does it mean if you get to user mode but the single-stepped
instruction is a load/store that gets a page fault?  What if it blocks in
the page fault handler?

For that matter, what about a page fault for the kernel-mode case?

Perhaps I'm imagining gremlins where there aren't any, but I just cannot
really get my head around this "disable preemption while running some
unknown instruction that normally runs with preemption enabled" thing--let
alone "disable preemption while returning to user mode".


Thanks,
Roland

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch 1/1] PPC64-HWBKPT: Implement hw-breakpoints for PPC64
  2009-12-17 19:03       ` K.Prasad
@ 2010-01-19  9:40         ` K.Prasad
  2010-01-19 10:03           ` Roland McGrath
  0 siblings, 1 reply; 26+ messages in thread
From: K.Prasad @ 2010-01-19  9:40 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Michael Neuling, Benjamin Herrenschmidt, Frederic Weisbecker,
	David Gibson, linuxppc-dev, Alan Stern, paulus

On Fri, Dec 18, 2009 at 12:33:20AM +0530, K.Prasad wrote:
> On Mon, Dec 14, 2009 at 11:26:26AM -0800, Roland McGrath wrote:
<snipped>
> > What remains less than clear is how preemption relates.  For any per-thread
> > hw_breakpoint, there is no high-level reason to care one way or the other.
> > The thread, its HW breakpoints, its register state including state of
> > stepping, are all part of per-thread state and no reason to do any less (or
> > more) preemption than normally happens.
> > 
> 
> I get that reasoning now...I'd been unduly worried about pre-emption
> and hence the introduction of pre-emption disabled state.
> 
> But of course, in the existing design, the per-cpu variables would be
> affected because if pre-emption was to occur. I'll see how that can be
> factored in, while retaining the abstraction provided by the interfaces.
> 
<snipped>
> 
> As stated above, I was worried about a pre-emption happening between a
> return from breakpoint exception handler and the execution of the
> causative instruction....but as I learn, it seems fine now. It is just that
> the kernel code needs to be tweaked keeping this in mind.
> 
> Thanks,
> K.Prasad
> 

Hi Roland,
	The cost of allowing pre-emption (such as storing the per-cpu
variables into per-thread structures for user-space breakpoints and
offsetting the effect of a new process between the hw-breakpoint handler
and single-step handler) appears to far out-weigh the present approach
(where pre-emption is disabled between two user-space instructions).

It is also not clear to me if disabling pre-emption for the user-space
(albeit for a very tiny time-window) is incorrect and if their side-effects
are known. If otherwise, I think we should choose to operate in pre-empt
safe mode and avoid all costs associated when done without it.

I'm eager to know what you think.

Thanks,
K.Prasad

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Patch 1/1] PPC64-HWBKPT: Implement hw-breakpoints for PPC64
  2010-01-19  9:12 [Patch 0/1] PPC64-HWBKPT: Hardware Breakpoint interfaces - ver XI K.Prasad
@ 2010-01-19  9:14 ` K.Prasad
  0 siblings, 0 replies; 26+ messages in thread
From: K.Prasad @ 2010-01-19  9:14 UTC (permalink / raw)
  To: David Gibson, linuxppc-dev, Roland McGrath
  Cc: Michael Neuling, Benjamin Herrenschmidt, shaggy,
	Frederic Weisbecker, Alan Stern, paulus

Implement perf-events based hw-breakpoint interfaces for PPC64 processors.
These interfaces help arbitrate requests from various users and schedules
them as appropriate.

Signed-off-by: K.Prasad <prasad@linux.vnet.ibm.com>
---
 arch/powerpc/Kconfig                     |    1 
 arch/powerpc/include/asm/hw_breakpoint.h |   55 +++++
 arch/powerpc/include/asm/processor.h     |    1 
 arch/powerpc/include/asm/reg.h           |    1 
 arch/powerpc/kernel/Makefile             |    2 
 arch/powerpc/kernel/hw_breakpoint.c      |  332 +++++++++++++++++++++++++++++++
 arch/powerpc/kernel/process.c            |    5 
 arch/powerpc/kernel/ptrace.c             |   77 +++++++
 arch/powerpc/mm/fault.c                  |   14 -
 9 files changed, 479 insertions(+), 9 deletions(-)

Index: linux-2.6-tip.new_ppc64_perf/arch/powerpc/include/asm/hw_breakpoint.h
===================================================================
--- /dev/null
+++ linux-2.6-tip.new_ppc64_perf/arch/powerpc/include/asm/hw_breakpoint.h
@@ -0,0 +1,55 @@
+#ifndef	_PPC64_HW_BREAKPOINT_H
+#define	_PPC64_HW_BREAKPOINT_H
+
+#ifdef	__KERNEL__
+#define	__ARCH_HW_BREAKPOINT_H
+#ifdef CONFIG_PPC64
+
+struct arch_hw_breakpoint {
+	u8		len; /* length of the target symbol */
+	int		type;
+	char		*name; /* Contains name of the symbol to set bkpt */
+	unsigned long	address;
+};
+
+#include <linux/kdebug.h>
+#include <asm/reg.h>
+#include <asm/system.h>
+
+/* Total number of available HW breakpoint registers */
+#define HBP_NUM 1
+
+struct perf_event;
+struct pmu;
+struct perf_sample_data;
+
+#define HW_BREAKPOINT_ALIGN 0x7
+/* Maximum permissible length of any HW Breakpoint */
+#define HW_BREAKPOINT_LEN 0x8
+
+extern int arch_validate_hwbkpt_settings(struct perf_event *bp,
+						struct task_struct *tsk);
+extern int hw_breakpoint_exceptions_notify(struct notifier_block *unused,
+						unsigned long val, void *data);
+int arch_install_hw_breakpoint(struct perf_event *bp);
+void arch_uninstall_hw_breakpoint(struct perf_event *bp);
+void hw_breakpoint_pmu_read(struct perf_event *bp);
+void hw_breakpoint_pmu_unthrottle(struct perf_event *bp);
+extern void flush_ptrace_hw_breakpoint(struct task_struct *tsk);
+
+extern struct pmu perf_ops_bp;
+extern void ptrace_triggered(struct perf_event *bp, int nmi,
+			struct perf_sample_data *data, struct pt_regs *regs);
+static inline void hw_breakpoint_disable(void)
+{
+	set_dabr(0);
+}
+
+#else
+static inline void hw_breakpoint_disable(void)
+{
+	/* Function is defined only on PPC64 for now */
+}
+#endif	/* CONFIG_PPC64 */
+#endif	/* __KERNEL__ */
+#endif	/* _PPC64_HW_BREAKPOINT_H */
Index: linux-2.6-tip.new_ppc64_perf/arch/powerpc/kernel/hw_breakpoint.c
===================================================================
--- /dev/null
+++ linux-2.6-tip.new_ppc64_perf/arch/powerpc/kernel/hw_breakpoint.c
@@ -0,0 +1,332 @@
+/*
+ * HW_breakpoint: a unified kernel/user-space hardware breakpoint facility,
+ * using the CPU's debug registers. Derived from
+ * "arch/x86/kernel/hw_breakpoint.c"
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright 2009 IBM Corporation
+ * Author: K.Prasad <prasad@linux.vnet.ibm.com>
+ *
+ */
+
+#include <linux/hw_breakpoint.h>
+#include <linux/notifier.h>
+#include <linux/kprobes.h>
+#include <linux/percpu.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/sched.h>
+#include <linux/init.h>
+#include <linux/smp.h>
+
+#include <asm/hw_breakpoint.h>
+#include <asm/processor.h>
+#include <asm/sstep.h>
+
+/*
+ * Store the 'bp' that caused the hw-breakpoint exception just before we
+ * single-step. Used to distinguish a single-step exception (due to a previous
+ * hw-breakpoint exception) from a normal one
+ */
+static DEFINE_PER_CPU(struct perf_event *, last_hit_bp);
+
+/*
+ * Stores the breakpoints currently in use on each breakpoint address
+ * register for each cpus
+ */
+static DEFINE_PER_CPU(struct perf_event *, bp_per_reg);
+
+/*
+ * Install a perf counter breakpoint.
+ *
+ * We seek a free debug address register and use it for this
+ * breakpoint.
+ *
+ * Atomic: we hold the counter->ctx->lock and we only handle variables
+ * and registers local to this cpu.
+ */
+int arch_install_hw_breakpoint(struct perf_event *bp)
+{
+	struct arch_hw_breakpoint *info = counter_arch_bp(bp);
+	struct perf_event **slot = &__get_cpu_var(bp_per_reg);
+
+	if (!*slot)
+		*slot = bp;
+	else {
+		WARN_ONCE(1, "Can't find any breakpoint slot");
+		return -EBUSY;
+	}
+
+	set_dabr(info->address | info->type | DABR_TRANSLATION);
+	return 0;
+}
+
+/*
+ * Uninstall the breakpoint contained in the given counter.
+ *
+ * First we search the debug address register it uses and then we disable
+ * it.
+ *
+ * Atomic: we hold the counter->ctx->lock and we only handle variables
+ * and registers local to this cpu.
+ */
+void arch_uninstall_hw_breakpoint(struct perf_event *bp)
+{
+	struct perf_event **slot = &__get_cpu_var(bp_per_reg);
+
+	if (*slot == bp)
+		*slot = NULL;
+	else {
+		WARN_ONCE(1, "Can't find the breakpoint slot");
+		return;
+	}
+	set_dabr(0);
+}
+
+/*
+ * Validate the arch-specific HW Breakpoint register settings
+ */
+int arch_validate_hwbkpt_settings(struct perf_event *bp,
+						struct task_struct *tsk)
+{
+	int is_kernel, ret = -EINVAL;
+	struct arch_hw_breakpoint *info = counter_arch_bp(bp);
+
+	if (!bp)
+		return ret;
+
+	switch (bp->attr.bp_type) {
+	case HW_BREAKPOINT_R:
+		info->type = DABR_DATA_READ;
+		break;
+	case HW_BREAKPOINT_W:
+		info->type = DABR_DATA_WRITE;
+		break;
+	case HW_BREAKPOINT_R | HW_BREAKPOINT_W:
+		info->type = (DABR_DATA_READ | DABR_DATA_WRITE);
+		break;
+	default:
+		return ret;
+	}
+	/* TODO: Check for a valid triggered function */
+	/* if (!bp->triggered)
+		return -EINVAL; */
+
+	is_kernel = is_kernel_addr(bp->attr.bp_addr);
+	if ((tsk && is_kernel) || (!tsk && !is_kernel))
+		return -EINVAL;
+
+	info->address = bp->attr.bp_addr;
+	info->len = bp->attr.bp_len;
+
+	/*
+	 * Since breakpoint length can be a maximum of HW_BREAKPOINT_LEN(8)
+	 * and breakpoint addresses are aligned to nearest double-word
+	 * HW_BREAKPOINT_ALIGN by rounding off to the lower address, the
+	 * 'symbolsize' should satisfy the check below.
+	 */
+	if (info->len >
+	    (HW_BREAKPOINT_LEN - (info->address & HW_BREAKPOINT_ALIGN)))
+		return -EINVAL;
+	return 0;
+}
+
+/*
+ * Handle debug exception notifications.
+ */
+int __kprobes hw_breakpoint_handler(struct die_args *args)
+{
+	int rc = NOTIFY_STOP;
+	struct perf_event *bp;
+	struct pt_regs *regs = args->regs;
+	unsigned long dar = regs->dar;
+	int cpu, is_kernel, stepped = 1;
+	struct arch_hw_breakpoint *info;
+
+	/* Disable breakpoints during exception handling */
+	set_dabr(0);
+	cpu = get_cpu();
+	/*
+	 * The counter may be concurrently released but that can only
+	 * occur from a call_rcu() path. We can then safely fetch
+	 * the breakpoint, use its callback, touch its counter
+	 * while we are in an rcu_read_lock() path.
+	 */
+	rcu_read_lock();
+
+	bp = per_cpu(bp_per_reg, cpu);
+	if (!bp)
+		goto out;
+	info = counter_arch_bp(bp);
+	is_kernel = is_kernel_addr(bp->attr.bp_addr);
+
+	/* Verify if dar lies within the address range occupied by the symbol
+	 * being watched. Since we cannot get the symbol size for
+	 * user-space requests we skip this check in that case
+	 */
+	if (is_kernel &&
+	    !((bp->attr.bp_addr <= dar) &&
+	    (dar <= (bp->attr.bp_addr + bp->attr.bp_len))))
+		/*
+		 * This exception is triggered not because of a memory access on
+		 * the monitored variable but in the double-word address range
+		 * in which it is contained. We will consume this exception,
+		 * considering it as 'noise'.
+		 */
+		goto out;
+
+	/*
+	 * Return early after invoking user-callback function without restoring
+	 * DABR if the breakpoint is from ptrace which always operates in
+	 * one-shot mode
+	 */
+	if (bp->overflow_handler == ptrace_triggered) {
+		(bp->overflow_handler)(bp, 0, NULL, regs);
+		rc = NOTIFY_DONE;
+		goto out;
+	}
+
+	stepped = emulate_step(regs, regs->nip);
+	/*
+	 * Single-step the causative instruction manually if
+	 * emulate_step() could not execute it
+	 */
+	if (stepped == 0) {
+		regs->msr |= MSR_SE;
+		per_cpu(last_hit_bp, cpu) = bp;
+		goto out;
+	}
+	/*
+	 * As a policy, the callback is invoked in a 'trigger-after-execute'
+	 * fashion
+	 */
+	(bp->overflow_handler)(bp, 0, NULL, regs);
+	set_dabr(info->address | info->type | DABR_TRANSLATION);
+
+out:
+	rcu_read_unlock();
+	/*
+	 * Enable pre-emption only if single-stepping is finished i.e.
+	 * Pre-emption is disabled for a small time-window extending from
+	 * the completion of instruction preceding the causative instruction
+	 * and the single-step exception handler that immediately follows the
+	 * completion of the causative instruction. The hardware breakpoint
+	 * exception is sandwiched between the two.
+	 */
+	if (stepped) {
+		per_cpu(last_hit_bp, cpu) = 0;
+		put_cpu();
+	}
+	return rc;
+}
+
+/*
+ * Handle single-step exceptions following a DABR hit.
+ */
+int __kprobes single_step_dabr_instruction(struct die_args *args)
+{
+	struct pt_regs *regs = args->regs;
+	int cpu = get_cpu();
+	int ret = NOTIFY_DONE;
+	siginfo_t info;
+	struct perf_event *bp = per_cpu(last_hit_bp, cpu);
+	struct arch_hw_breakpoint *bp_info;
+
+	/*
+	 * Check if we are single-stepping as a result of a
+	 * previous HW Breakpoint exception
+	 */
+	if (!bp)
+		goto out;
+
+	bp_info = counter_arch_bp(bp);
+	if (!test_thread_flag(TIF_SINGLESTEP))
+		regs->msr &= ~MSR_SE;
+	/*
+	 * We shall invoke the user-defined callback function in the single
+	 * stepping handler to confirm to 'trigger-after-execute' semantics
+	 */
+	(bp->overflow_handler)(bp, 0, NULL, regs);
+
+	/* Deliver signal to user-space */
+	if (!is_kernel_addr(bp->attr.bp_addr)) {
+		info.si_signo = SIGTRAP;
+		info.si_errno = 0;
+		info.si_code = TRAP_HWBKPT;
+		info.si_addr = (void __user *)bp_info->address;
+		force_sig_info(SIGTRAP, &info, current);
+	}
+
+	set_dabr(bp_info->address | bp_info->type | DABR_TRANSLATION);
+	per_cpu(last_hit_bp, cpu) = NULL;
+	ret = NOTIFY_STOP;
+	/*
+	 * If single-stepped after hw_breakpoint_handler(), pre-emption is
+	 * already disabled.
+	 */
+	put_cpu();
+
+out:
+	/*
+	 * A put_cpu() call is required to complement the get_cpu()
+	 * call used initially
+	 */
+	put_cpu();
+	return ret;
+}
+
+/*
+ * Handle debug exception notifications.
+ */
+int __kprobes hw_breakpoint_exceptions_notify(
+		struct notifier_block *unused, unsigned long val, void *data)
+{
+	int ret = NOTIFY_DONE;
+
+	switch (val) {
+	case DIE_DABR_MATCH:
+		ret = hw_breakpoint_handler(data);
+		break;
+	case DIE_SSTEP:
+		ret = single_step_dabr_instruction(data);
+		break;
+	}
+
+	return ret;
+}
+
+/*
+ * Release the user breakpoints used by ptrace
+ */
+void flush_ptrace_hw_breakpoint(struct task_struct *tsk)
+{
+	struct thread_struct *t = &tsk->thread;
+
+	unregister_hw_breakpoint(t->ptrace_bps[0]);
+	t->ptrace_bps[0] = NULL;
+}
+
+void hw_breakpoint_pmu_read(struct perf_event *bp)
+{
+	/* TODO */
+}
+
+void hw_breakpoint_pmu_unthrottle(struct perf_event *bp)
+{
+	/* TODO */
+}
+
+
Index: linux-2.6-tip.new_ppc64_perf/arch/powerpc/Kconfig
===================================================================
--- linux-2.6-tip.new_ppc64_perf.orig/arch/powerpc/Kconfig
+++ linux-2.6-tip.new_ppc64_perf/arch/powerpc/Kconfig
@@ -130,6 +130,7 @@ config PPC
 	select HAVE_SYSCALL_WRAPPERS if PPC64
 	select GENERIC_ATOMIC64 if PPC32
 	select HAVE_PERF_EVENTS
+	select HAVE_HW_BREAKPOINT if PPC64
 
 config EARLY_PRINTK
 	bool
Index: linux-2.6-tip.new_ppc64_perf/arch/powerpc/kernel/Makefile
===================================================================
--- linux-2.6-tip.new_ppc64_perf.orig/arch/powerpc/kernel/Makefile
+++ linux-2.6-tip.new_ppc64_perf/arch/powerpc/kernel/Makefile
@@ -33,7 +33,7 @@ obj-y				:= cputable.o ptrace.o syscalls
 obj-y				+= vdso32/
 obj-$(CONFIG_PPC64)		+= setup_64.o sys_ppc32.o \
 				   signal_64.o ptrace32.o \
-				   paca.o nvram_64.o firmware.o
+				   paca.o nvram_64.o firmware.o hw_breakpoint.o
 obj-$(CONFIG_PPC_BOOK3S_64)	+= cpu_setup_ppc970.o cpu_setup_pa6t.o
 obj64-$(CONFIG_RELOCATABLE)	+= reloc_64.o
 obj-$(CONFIG_PPC_BOOK3E_64)	+= exceptions-64e.o
Index: linux-2.6-tip.new_ppc64_perf/arch/powerpc/include/asm/reg.h
===================================================================
--- linux-2.6-tip.new_ppc64_perf.orig/arch/powerpc/include/asm/reg.h
+++ linux-2.6-tip.new_ppc64_perf/arch/powerpc/include/asm/reg.h
@@ -180,6 +180,7 @@
 #define   CTRL_TE	0x00c00000	/* thread enable */
 #define   CTRL_RUNLATCH	0x1
 #define SPRN_DABR	0x3F5	/* Data Address Breakpoint Register */
+#define   HBP_NUM	1	/* Number of physical HW breakpoint registers */
 #define   DABR_TRANSLATION	(1UL << 2)
 #define   DABR_DATA_WRITE	(1UL << 1)
 #define   DABR_DATA_READ	(1UL << 0)
Index: linux-2.6-tip.new_ppc64_perf/arch/powerpc/mm/fault.c
===================================================================
--- linux-2.6-tip.new_ppc64_perf.orig/arch/powerpc/mm/fault.c
+++ linux-2.6-tip.new_ppc64_perf/arch/powerpc/mm/fault.c
@@ -137,6 +137,12 @@ int __kprobes do_page_fault(struct pt_re
 		error_code &= 0x48200000;
 	else
 		is_write = error_code & DSISR_ISSTORE;
+
+	if (error_code & DSISR_DABRMATCH) {
+		/* DABR match */
+		do_dabr(regs, address, error_code);
+		return 0;
+	}
 #else
 	is_write = error_code & ESR_DST;
 #endif /* CONFIG_4xx || CONFIG_BOOKE */
@@ -151,14 +157,6 @@ int __kprobes do_page_fault(struct pt_re
 	if (!user_mode(regs) && (address >= TASK_SIZE))
 		return SIGSEGV;
 
-#if !(defined(CONFIG_4xx) || defined(CONFIG_BOOKE))
-  	if (error_code & DSISR_DABRMATCH) {
-		/* DABR match */
-		do_dabr(regs, address, error_code);
-		return 0;
-	}
-#endif /* !(CONFIG_4xx || CONFIG_BOOKE)*/
-
 	if (in_atomic() || mm == NULL) {
 		if (!user_mode(regs))
 			return SIGSEGV;
Index: linux-2.6-tip.new_ppc64_perf/arch/powerpc/include/asm/processor.h
===================================================================
--- linux-2.6-tip.new_ppc64_perf.orig/arch/powerpc/include/asm/processor.h
+++ linux-2.6-tip.new_ppc64_perf/arch/powerpc/include/asm/processor.h
@@ -177,6 +177,7 @@ struct thread_struct {
 #ifdef CONFIG_PPC64
 	unsigned long	start_tb;	/* Start purr when proc switched in */
 	unsigned long	accum_tb;	/* Total accumilated purr for process */
+	struct perf_event *ptrace_bps[HBP_NUM];
 #endif
 	unsigned long	dabr;		/* Data address breakpoint register */
 #ifdef CONFIG_ALTIVEC
Index: linux-2.6-tip.new_ppc64_perf/arch/powerpc/kernel/ptrace.c
===================================================================
--- linux-2.6-tip.new_ppc64_perf.orig/arch/powerpc/kernel/ptrace.c
+++ linux-2.6-tip.new_ppc64_perf/arch/powerpc/kernel/ptrace.c
@@ -32,6 +32,8 @@
 #ifdef CONFIG_PPC32
 #include <linux/module.h>
 #endif
+#include <linux/hw_breakpoint.h>
+#include <linux/perf_event.h>
 
 #include <asm/uaccess.h>
 #include <asm/page.h>
@@ -755,9 +757,32 @@ void user_disable_single_step(struct tas
 	clear_tsk_thread_flag(task, TIF_SINGLESTEP);
 }
 
+void ptrace_triggered(struct perf_event *bp, int nmi,
+		      struct perf_sample_data *data, struct pt_regs *regs)
+{
+	struct perf_event_attr attr;
+
+	/*
+	 * Disable the breakpoint request here since ptrace has defined a
+	 * one-shot behaviour for breakpoint exceptions in PPC64.
+	 * The SIGTRAP signal is generated automatically for us in do_dabr().
+	 * We don't have to do anything about that here
+	 */
+	attr = bp->attr;
+	attr.disabled = true;
+	modify_user_hw_breakpoint(bp, &attr);
+}
+
 int ptrace_set_debugreg(struct task_struct *task, unsigned long addr,
 			       unsigned long data)
 {
+#ifdef CONFIG_PPC64
+	int ret;
+	struct thread_struct *thread = &(task->thread);
+	struct perf_event *bp;
+	struct perf_event_attr attr;
+#endif /* CONFIG_PPC64 */
+
 	/* For ppc64 we support one DABR and no IABR's at the moment (ppc64).
 	 *  For embedded processors we support one DAC and no IAC's at the
 	 *  moment.
@@ -786,6 +811,58 @@ int ptrace_set_debugreg(struct task_stru
 	/* Ensure breakpoint translation bit is set */
 	if (data && !(data & DABR_TRANSLATION))
 		return -EIO;
+#ifdef CONFIG_PPC64
+	bp = thread->ptrace_bps[0];
+	if (data == 0) {
+		if (bp) {
+			unregister_hw_breakpoint(bp);
+			thread->ptrace_bps[0] = NULL;
+		}
+		return 0;
+	}
+	if (bp) {
+		attr = bp->attr;
+		attr.bp_addr = data & ~HW_BREAKPOINT_ALIGN;
+
+		switch (data & (DABR_DATA_WRITE | DABR_DATA_READ)) {
+		case DABR_DATA_READ:
+			attr.bp_type = HW_BREAKPOINT_R;
+			break;
+		case DABR_DATA_WRITE:
+			attr.bp_type = HW_BREAKPOINT_W;
+			break;
+		case (DABR_DATA_WRITE | DABR_DATA_READ):
+			attr.bp_type = HW_BREAKPOINT_R | HW_BREAKPOINT_W;
+			break;
+		}
+		ret =  modify_user_hw_breakpoint(bp, &attr);
+		if (ret)
+			return ret;
+		thread->ptrace_bps[0] = bp;
+		thread->dabr = data;
+		return 0;
+	}
+
+	/* Create a new breakpoint request if one doesn't exist already */
+	hw_breakpoint_init(&attr);
+	attr.bp_addr = data & ~HW_BREAKPOINT_ALIGN;
+	switch (data & (DABR_DATA_WRITE | DABR_DATA_READ)) {
+	case DABR_DATA_READ:
+		attr.bp_type = HW_BREAKPOINT_R;
+		break;
+	case DABR_DATA_WRITE:
+		attr.bp_type = HW_BREAKPOINT_W;
+		break;
+	case (DABR_DATA_WRITE | DABR_DATA_READ):
+		attr.bp_type = HW_BREAKPOINT_R | HW_BREAKPOINT_W;
+		break;
+	}
+	thread->ptrace_bps[0] = bp = register_user_hw_breakpoint(&attr,
+							ptrace_triggered, task);
+	if (IS_ERR(bp))
+		return PTR_ERR(bp);
+
+#endif /* CONFIG_PPC64 */
 
 	/* Move contents to the DABR register */
 	task->thread.dabr = data;
Index: linux-2.6-tip.new_ppc64_perf/arch/powerpc/kernel/process.c
===================================================================
--- linux-2.6-tip.new_ppc64_perf.orig/arch/powerpc/kernel/process.c
+++ linux-2.6-tip.new_ppc64_perf/arch/powerpc/kernel/process.c
@@ -48,6 +48,7 @@
 #include <asm/machdep.h>
 #include <asm/time.h>
 #include <asm/syscalls.h>
+#include <asm/hw_breakpoint.h>
 #ifdef CONFIG_PPC64
 #include <asm/firmware.h>
 #endif
@@ -376,8 +377,11 @@ struct task_struct *__switch_to(struct t
 	if (new->thread.dabr)
 		set_dabr(new->thread.dabr);
 #else
+/* For PPC64, we use the hw-breakpoint interfaces that would schedule DABR */
+#ifndef CONFIG_PPC64
 	if (unlikely(__get_cpu_var(current_dabr) != new->thread.dabr))
 		set_dabr(new->thread.dabr);
+#endif /* CONFIG_PPC64 */
 #endif
 
 
@@ -564,6 +568,7 @@ void flush_thread(void)
 		else
 			set_ti_thread_flag(t, TIF_32BIT);
 	}
+	flush_ptrace_hw_breakpoint(current);
 #endif
 
 	discard_lazy_cpu_state();

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch 1/1] PPC64-HWBKPT: Implement hw-breakpoints for PPC64
  2009-12-14 19:26     ` Roland McGrath
@ 2009-12-17 19:03       ` K.Prasad
  2010-01-19  9:40         ` K.Prasad
  0 siblings, 1 reply; 26+ messages in thread
From: K.Prasad @ 2009-12-17 19:03 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Michael Neuling, Benjamin Herrenschmidt, Frederic Weisbecker,
	David Gibson, linuxppc-dev, Alan Stern, paulus

On Mon, Dec 14, 2009 at 11:26:26AM -0800, Roland McGrath wrote:
<snipped>
> I understand the reason for using stepping.  (I have advised in the past
> that I thought this magical implicit step logic was too hairy to roll in
> under the covers and that a low-level facility expressing the different
> hardware semantics to a kernel API would be OK.  I do agree with the
> motivation of cross-arch uniformity of the semantics.  I don't object to
> making it magically right--I just expressed general skepticism/fear about
> getting that right so that I didn't want to try writing that magic.  Now
> I'm just responding about the particular details I've noticed about that
> can of worms.  It's certainly great if you can resolve all that.  But I'll
> note that I am still by no means confident that the details I have raised
> cover all the worms in that can.)
> 
> What remains less than clear is how preemption relates.  For any per-thread
> hw_breakpoint, there is no high-level reason to care one way or the other.
> The thread, its HW breakpoints, its register state including state of
> stepping, are all part of per-thread state and no reason to do any less (or
> more) preemption than normally happens.
> 

I get that reasoning now...I'd been unduly worried about pre-emption
and hence the introduction of pre-emption disabled state.

But of course, in the existing design, the per-cpu variables would be
affected because if pre-emption was to occur. I'll see how that can be
factored in, while retaining the abstraction provided by the interfaces.

> > Disabling pre-emption is necessary to ensure that hw-breakpoints are
> > enabled immediately after the causative instruction has finished
> > execution (the control flow may go astray if pre-emption occurs between
> > i1 and i2).
> 
> I don't understand what "go astray" means here.  The only thing I can think
> of is the effect on any per-cpu variables you are using in hw_breakpoint
> implementation.
> 

As stated above, I was worried about a pre-emption happening between a
return from breakpoint exception handler and the execution of the
causative instruction....but as I learn, it seems fine now. It is just that
the kernel code needs to be tweaked keeping this in mind.

Thanks,
K.Prasad

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch 1/1] PPC64-HWBKPT: Implement hw-breakpoints for PPC64
  2009-12-14 18:03   ` K.Prasad
@ 2009-12-14 19:26     ` Roland McGrath
  2009-12-17 19:03       ` K.Prasad
  0 siblings, 1 reply; 26+ messages in thread
From: Roland McGrath @ 2009-12-14 19:26 UTC (permalink / raw)
  To: prasad
  Cc: Michael Neuling, Benjamin Herrenschmidt, Frederic Weisbecker,
	David Gibson, linuxppc-dev, Alan Stern, paulus

> Yes, it does unset MSR_SE bit in single_step_dabr_instruction()
> irrespective of whether it was previously enabled through
> user_enable_single_step(). This could be mitigated with the use of a
> separate flag which can be used to conditionally unset MSR_SE, however
> given further concerns about pre-emption (as expressed by you below),
> I'm afraid of substantial revamp of the user-space semantics.

There is already TIF_SINGLESTEP set by user_enable_single_step.  So for
that aspect, it is probably relatively straightforward to cover that
interaction.  The code has to be pretty exact and will merit some comments
about subtleties, but I suspect the actual new code required will be just a
tiny amount.

> Kprobes has been tested to work simultaneously with hw-breakpoints. KGDB
> has not been ported yet to use the hw-breakpoint interfaces (KGDB had
> issues in it, that prevented it from being tested during our
> development...though its maintainer has begun showing interest
> recently).
> 
> Xmon was (and I believe is still) in a state where data breakpoints did
> not work. It needs to be ported too, to benefit from the hw-breakpoint
> interfaces.

That is not really what I meant at all.  That is good stuff to work out.
But I just meant the interactions with kprobes/kgdb's use of single-stepping,
the direct analogy to the user_enable_single_step issue.

> I must admit that the issue of pre-emption [...]

I understand the reason for using stepping.  (I have advised in the past
that I thought this magical implicit step logic was too hairy to roll in
under the covers and that a low-level facility expressing the different
hardware semantics to a kernel API would be OK.  I do agree with the
motivation of cross-arch uniformity of the semantics.  I don't object to
making it magically right--I just expressed general skepticism/fear about
getting that right so that I didn't want to try writing that magic.  Now
I'm just responding about the particular details I've noticed about that
can of worms.  It's certainly great if you can resolve all that.  But I'll
note that I am still by no means confident that the details I have raised
cover all the worms in that can.)

What remains less than clear is how preemption relates.  For any per-thread
hw_breakpoint, there is no high-level reason to care one way or the other.
The thread, its HW breakpoints, its register state including state of
stepping, are all part of per-thread state and no reason to do any less (or
more) preemption than normally happens.

> Disabling pre-emption is necessary to ensure that hw-breakpoints are
> enabled immediately after the causative instruction has finished
> execution (the control flow may go astray if pre-emption occurs between
> i1 and i2).

I don't understand what "go astray" means here.  The only thing I can think
of is the effect on any per-cpu variables you are using in hw_breakpoint
implementation.


Thanks,
Roland

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch 1/1] PPC64-HWBKPT: Implement hw-breakpoints for PPC64
  2009-12-14  0:56 ` Roland McGrath
@ 2009-12-14 18:03   ` K.Prasad
  2009-12-14 19:26     ` Roland McGrath
  0 siblings, 1 reply; 26+ messages in thread
From: K.Prasad @ 2009-12-14 18:03 UTC (permalink / raw)
  To: Roland McGrath
  Cc: Michael Neuling, Benjamin Herrenschmidt, Frederic Weisbecker,
	David Gibson, linuxppc-dev, Alan Stern, paulus

On Sun, Dec 13, 2009 at 04:56:48PM -0800, Roland McGrath wrote:
> I can't see anything you've done to keep this use of MSR_SE in the
> user-mode register state from interfering with user_enable_single_step().
> It looks to me like you'd swallow the normal step indications.
> 

Yes, it does unset MSR_SE bit in single_step_dabr_instruction()
irrespective of whether it was previously enabled through
user_enable_single_step(). This could be mitigated with the use of a
separate flag which can be used to conditionally unset MSR_SE, however
given further concerns about pre-emption (as expressed by you below),
I'm afraid of substantial revamp of the user-space semantics.

> Likewise I'm not very clear on the interaction with kprobes, kgdb,
> or whatnot for kernel-mode cases.  But I'll leave those concerns to
> others, since I know more about the user-mode situations.
>

Kprobes has been tested to work simultaneously with hw-breakpoints. KGDB
has not been ported yet to use the hw-breakpoint interfaces (KGDB had
issues in it, that prevented it from being tested during our
development...though its maintainer has begun showing interest
recently).

Xmon was (and I believe is still) in a state where data breakpoints did
not work. It needs to be ported too, to benefit from the hw-breakpoint
interfaces.
 
> Back to the user-mode case, is it really reasonable to disable
> preemption in hw_breakpoint_handler and leave it so across returning
> to user mode?  (Is that even possible?  I thought user mode was
> always preemptible.)  That is done very casually with little comment
> in hw_breakpoint_handler and single_step_dabr_instruction, but it
> seems like an extremely deep and magical thing that merits more
> explanation.  I guess the need for it has to do with the per_cpu
> variable you're using, but the whole situation is not very clear on
> first reading.  Even for kernel mode, what does this mean when the
> stepped instruction does a page fault?
> 

I must admit that the issue of pre-emption should have been given more
thought. Suppose there's a stream of user-space instructions "i1, i2, i3,
.....i<n>" and if 'i2' instruction can cause a hw-breakpoint exception,
then there exists a small window between i1 and i2 where pre-emption is
disabled (while a schedule operation could have taken place otherwise).

Disabling pre-emption is necessary to ensure that hw-breakpoints are
enabled immediately after the causative instruction has finished
execution (the control flow may go astray if pre-emption occurs between
i1 and i2). The root cause of this behaviour is a combination of
'trigger-before-execute' behaviour (for data-exceptions) and a desire
for 'continuous' exceptions (as opposed to one-shot behaviour seen in
ptrace). The per-cpu variable 'last_hit_bp' just helps identify a
single-step exception resulting from a hbp_handler vs other sources.

Resorting to one-shot behaviour (which is the easiest workaround
available) will break the desired uniformity in behaviour for
hw-breakpoint interfaces - say every register_user_<> interface must be
accompanied by a unregister_<> interface, etc. Post perf-events'
integration, ensuring a one-shot behaviour might also have its own
bunch of undesirable consequences (such as circular locks), that must be
overcome.

Unless I see a way to re-instate the breakpoints (surviving a
pre-emption), I will send out a new patch that resorts to a one-shot
behaviour for user-space (kernel-space is fine though).

Thank you for the insightful comments!

K.Prasad

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [Patch 1/1] PPC64-HWBKPT: Implement hw-breakpoints for PPC64
  2009-12-11 16:04 K.Prasad
@ 2009-12-14  0:56 ` Roland McGrath
  2009-12-14 18:03   ` K.Prasad
  0 siblings, 1 reply; 26+ messages in thread
From: Roland McGrath @ 2009-12-14  0:56 UTC (permalink / raw)
  To: prasad
  Cc: Michael Neuling, Benjamin Herrenschmidt, Frederic Weisbecker,
	David Gibson, linuxppc-dev, Alan Stern, paulus

I can't see anything you've done to keep this use of MSR_SE in the
user-mode register state from interfering with user_enable_single_step().
It looks to me like you'd swallow the normal step indications.

Likewise I'm not very clear on the interaction with kprobes, kgdb,
or whatnot for kernel-mode cases.  But I'll leave those concerns to
others, since I know more about the user-mode situations.

Back to the user-mode case, is it really reasonable to disable
preemption in hw_breakpoint_handler and leave it so across returning
to user mode?  (Is that even possible?  I thought user mode was
always preemptible.)  That is done very casually with little comment
in hw_breakpoint_handler and single_step_dabr_instruction, but it
seems like an extremely deep and magical thing that merits more
explanation.  I guess the need for it has to do with the per_cpu
variable you're using, but the whole situation is not very clear on
first reading.  Even for kernel mode, what does this mean when the
stepped instruction does a page fault?


Thanks,
Roland

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [Patch 1/1] PPC64-HWBKPT: Implement hw-breakpoints for PPC64
@ 2009-12-11 16:04 K.Prasad
  2009-12-14  0:56 ` Roland McGrath
  0 siblings, 1 reply; 26+ messages in thread
From: K.Prasad @ 2009-12-11 16:04 UTC (permalink / raw)
  To: David Gibson, linuxppc-dev
  Cc: Michael Neuling, Benjamin Herrenschmidt, Frederic Weisbecker,
	Alan Stern, paulus, Roland McGrath

Implement perf-events based hw-breakpoint interfaces for PPC64 processors.
These interfaces help arbitrate requests from various users and schedules
them as appropriate.

Signed-off-by: K.Prasad <prasad@linux.vnet.ibm.com>
---
 arch/powerpc/Kconfig                     |    1 
 arch/powerpc/include/asm/hw_breakpoint.h |   55 +++++
 arch/powerpc/include/asm/processor.h     |    1 
 arch/powerpc/include/asm/reg.h           |    1 
 arch/powerpc/kernel/Makefile             |    2 
 arch/powerpc/kernel/hw_breakpoint.c      |  325 +++++++++++++++++++++++++++++++
 arch/powerpc/kernel/process.c            |    5 
 arch/powerpc/kernel/ptrace.c             |   77 +++++++
 arch/powerpc/mm/fault.c                  |   14 -
 9 files changed, 472 insertions(+), 9 deletions(-)

Index: linux-2.6-tip.new_ppc64_perf/arch/powerpc/include/asm/hw_breakpoint.h
===================================================================
--- /dev/null
+++ linux-2.6-tip.new_ppc64_perf/arch/powerpc/include/asm/hw_breakpoint.h
@@ -0,0 +1,55 @@
+#ifndef	_PPC64_HW_BREAKPOINT_H
+#define	_PPC64_HW_BREAKPOINT_H
+
+#ifdef	__KERNEL__
+#define	__ARCH_HW_BREAKPOINT_H
+#ifdef CONFIG_PPC64
+
+struct arch_hw_breakpoint {
+	u8		len; /* length of the target symbol */
+	int		type;
+	char		*name; /* Contains name of the symbol to set bkpt */
+	unsigned long	address;
+};
+
+#include <linux/kdebug.h>
+#include <asm/reg.h>
+#include <asm/system.h>
+
+/* Total number of available HW breakpoint registers */
+#define HBP_NUM 1
+
+struct perf_event;
+struct pmu;
+struct perf_sample_data;
+
+#define HW_BREAKPOINT_ALIGN 0x7
+/* Maximum permissible length of any HW Breakpoint */
+#define HW_BREAKPOINT_LEN 0x8
+
+extern int arch_validate_hwbkpt_settings(struct perf_event *bp,
+						struct task_struct *tsk);
+extern int hw_breakpoint_exceptions_notify(struct notifier_block *unused,
+						unsigned long val, void *data);
+int arch_install_hw_breakpoint(struct perf_event *bp);
+void arch_uninstall_hw_breakpoint(struct perf_event *bp);
+void hw_breakpoint_pmu_read(struct perf_event *bp);
+void hw_breakpoint_pmu_unthrottle(struct perf_event *bp);
+extern void flush_ptrace_hw_breakpoint(struct task_struct *tsk);
+
+extern struct pmu perf_ops_bp;
+extern void ptrace_triggered(struct perf_event *bp, int nmi,
+			struct perf_sample_data *data, struct pt_regs *regs);
+static inline void hw_breakpoint_disable(void)
+{
+	set_dabr(0);
+}
+
+#else
+static inline void hw_breakpoint_disable(void)
+{
+	/* Function is defined only on PPC64 for now */
+}
+#endif	/* CONFIG_PPC64 */
+#endif	/* __KERNEL__ */
+#endif	/* _PPC64_HW_BREAKPOINT_H */
Index: linux-2.6-tip.new_ppc64_perf/arch/powerpc/kernel/hw_breakpoint.c
===================================================================
--- /dev/null
+++ linux-2.6-tip.new_ppc64_perf/arch/powerpc/kernel/hw_breakpoint.c
@@ -0,0 +1,325 @@
+/*
+ * HW_breakpoint: a unified kernel/user-space hardware breakpoint facility,
+ * using the CPU's debug registers. Derived from
+ * "arch/x86/kernel/hw_breakpoint.c"
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright 2009 IBM Corporation
+ * Author: K.Prasad <prasad@linux.vnet.ibm.com>
+ *
+ */
+
+#include <linux/hw_breakpoint.h>
+#include <linux/notifier.h>
+#include <linux/kprobes.h>
+#include <linux/percpu.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/sched.h>
+#include <linux/init.h>
+#include <linux/smp.h>
+
+#include <asm/hw_breakpoint.h>
+#include <asm/processor.h>
+#include <asm/sstep.h>
+
+/*
+ * Store the 'bp' that caused the hw-breakpoint exception just before we
+ * single-step. Used to distinguish a single-step exception (due to a previous
+ * hw-breakpoint exception) from a normal one
+ */
+static DEFINE_PER_CPU(struct perf_event *, last_hit_bp);
+
+/*
+ * Stores the breakpoints currently in use on each breakpoint address
+ * register for each cpus
+ */
+static DEFINE_PER_CPU(struct perf_event *, bp_per_reg);
+
+/*
+ * Install a perf counter breakpoint.
+ *
+ * We seek a free debug address register and use it for this
+ * breakpoint.
+ *
+ * Atomic: we hold the counter->ctx->lock and we only handle variables
+ * and registers local to this cpu.
+ */
+int arch_install_hw_breakpoint(struct perf_event *bp)
+{
+	struct arch_hw_breakpoint *info = counter_arch_bp(bp);
+	struct perf_event **slot = &__get_cpu_var(bp_per_reg);
+
+	if (!*slot)
+		*slot = bp;
+	else {
+		WARN_ONCE(1, "Can't find any breakpoint slot");
+		return -EBUSY;
+	}
+
+	set_dabr(info->address | info->type | DABR_TRANSLATION);
+	return 0;
+}
+
+/*
+ * Uninstall the breakpoint contained in the given counter.
+ *
+ * First we search the debug address register it uses and then we disable
+ * it.
+ *
+ * Atomic: we hold the counter->ctx->lock and we only handle variables
+ * and registers local to this cpu.
+ */
+void arch_uninstall_hw_breakpoint(struct perf_event *bp)
+{
+	struct perf_event **slot = &__get_cpu_var(bp_per_reg);
+
+	if (*slot == bp)
+		*slot = NULL;
+	else {
+		WARN_ONCE(1, "Can't find the breakpoint slot");
+		return;
+	}
+	set_dabr(0);
+}
+
+/*
+ * Validate the arch-specific HW Breakpoint register settings
+ */
+int arch_validate_hwbkpt_settings(struct perf_event *bp,
+						struct task_struct *tsk)
+{
+	int is_kernel, ret = -EINVAL;
+	struct arch_hw_breakpoint *info = counter_arch_bp(bp);
+
+	if (!bp)
+		return ret;
+
+	switch (bp->attr.bp_type) {
+	case HW_BREAKPOINT_R:
+		info->type = DABR_DATA_READ;
+		break;
+	case HW_BREAKPOINT_W:
+		info->type = DABR_DATA_WRITE;
+		break;
+	case HW_BREAKPOINT_R | HW_BREAKPOINT_W:
+		info->type = (DABR_DATA_READ | DABR_DATA_WRITE);
+		break;
+	default:
+		return ret;
+	}
+	/* TODO: Check for a valid triggered function */
+	/* if (!bp->triggered)
+		return -EINVAL; */
+
+	is_kernel = is_kernel_addr(bp->attr.bp_addr);
+	if ((tsk && is_kernel) || (!tsk && !is_kernel))
+		return -EINVAL;
+
+	info->address = bp->attr.bp_addr;
+	info->len = bp->attr.bp_len;
+
+	/*
+	 * Since breakpoint length can be a maximum of HW_BREAKPOINT_LEN(8)
+	 * and breakpoint addresses are aligned to nearest double-word
+	 * HW_BREAKPOINT_ALIGN by rounding off to the lower address, the
+	 * 'symbolsize' should satisfy the check below.
+	 */
+	if (info->len >
+	    (HW_BREAKPOINT_LEN - (info->address & HW_BREAKPOINT_ALIGN)))
+		return -EINVAL;
+	return 0;
+}
+
+/*
+ * Handle debug exception notifications.
+ */
+int __kprobes hw_breakpoint_handler(struct die_args *args)
+{
+	int rc = NOTIFY_STOP;
+	struct perf_event *bp;
+	struct pt_regs *regs = args->regs;
+	unsigned long dar = regs->dar;
+	int cpu, is_kernel, stepped = 1;
+	struct arch_hw_breakpoint *info;
+
+	/* Disable breakpoints during exception handling */
+	set_dabr(0);
+	cpu = get_cpu();
+	/*
+	 * The counter may be concurrently released but that can only
+	 * occur from a call_rcu() path. We can then safely fetch
+	 * the breakpoint, use its callback, touch its counter
+	 * while we are in an rcu_read_lock() path.
+	 */
+	rcu_read_lock();
+
+	bp = per_cpu(bp_per_reg, cpu);
+	if (!bp)
+		goto out;
+	info = counter_arch_bp(bp);
+	is_kernel = is_kernel_addr(bp->attr.bp_addr);
+
+	/* Verify if dar lies within the address range occupied by the symbol
+	 * being watched. Since we cannot get the symbol size for
+	 * user-space requests we skip this check in that case
+	 */
+	if (is_kernel &&
+	    !((bp->attr.bp_addr <= dar) &&
+	    (dar <= (bp->attr.bp_addr + bp->attr.bp_len))))
+		/*
+		 * This exception is triggered not because of a memory access on
+		 * the monitored variable but in the double-word address range
+		 * in which it is contained. We will consume this exception,
+		 * considering it as 'noise'.
+		 */
+		goto out;
+
+	/*
+	 * Return early after invoking user-callback function without restoring
+	 * DABR if the breakpoint is from ptrace which always operates in
+	 * one-shot mode
+	 */
+	if (bp->overflow_handler == ptrace_triggered) {
+		(bp->overflow_handler)(bp, 0, NULL, regs);
+		rc = NOTIFY_DONE;
+		goto out;
+	}
+
+	stepped = emulate_step(regs, regs->nip);
+	/*
+	 * Single-step the causative instruction manually if
+	 * emulate_step() could not execute it
+	 */
+	if (stepped == 0) {
+		regs->msr |= MSR_SE;
+		per_cpu(last_hit_bp, cpu) = bp;
+		goto out;
+	}
+	/*
+	 * As a policy, the callback is invoked in a 'trigger-after-execute'
+	 * fashion
+	 */
+	(bp->overflow_handler)(bp, 0, NULL, regs);
+	set_dabr(info->address | info->type | DABR_TRANSLATION);
+
+out:
+	rcu_read_unlock();
+	/* Enable pre-emption only if single-stepping is finished */
+	if (stepped) {
+		per_cpu(last_hit_bp, cpu) = 0;
+		put_cpu();
+	}
+	return rc;
+}
+
+/*
+ * Handle single-step exceptions following a DABR hit.
+ */
+int __kprobes single_step_dabr_instruction(struct die_args *args)
+{
+	struct pt_regs *regs = args->regs;
+	int cpu = get_cpu();
+	int ret = NOTIFY_DONE;
+	siginfo_t info;
+	struct perf_event *bp = per_cpu(last_hit_bp, cpu);
+	struct arch_hw_breakpoint *bp_info;
+
+	/*
+	 * Check if we are single-stepping as a result of a
+	 * previous HW Breakpoint exception
+	 */
+	if (!bp)
+		goto out;
+
+	bp_info = counter_arch_bp(bp);
+	regs->msr &= ~MSR_SE;
+	/*
+	 * We shall invoke the user-defined callback function in the single
+	 * stepping handler to confirm to 'trigger-after-execute' semantics
+	 */
+	(bp->overflow_handler)(bp, 0, NULL, regs);
+
+	/* Deliver signal to user-space */
+	if (!is_kernel_addr(bp->attr.bp_addr)) {
+		info.si_signo = SIGTRAP;
+		info.si_errno = 0;
+		info.si_code = TRAP_HWBKPT;
+		info.si_addr = (void __user *)bp_info->address;
+		force_sig_info(SIGTRAP, &info, current);
+	}
+
+	set_dabr(bp_info->address | bp_info->type | DABR_TRANSLATION);
+	per_cpu(last_hit_bp, cpu) = NULL;
+	ret = NOTIFY_STOP;
+	/*
+	 * If single-stepped after hw_breakpoint_handler(), pre-emption is
+	 * already disabled.
+	 */
+	put_cpu();
+
+out:
+	/*
+	 * A put_cpu() call is required to complement the get_cpu()
+	 * call used initially
+	 */
+	put_cpu();
+	return ret;
+}
+
+/*
+ * Handle debug exception notifications.
+ */
+int __kprobes hw_breakpoint_exceptions_notify(
+		struct notifier_block *unused, unsigned long val, void *data)
+{
+	int ret = NOTIFY_DONE;
+
+	switch (val) {
+	case DIE_DABR_MATCH:
+		ret = hw_breakpoint_handler(data);
+		break;
+	case DIE_SSTEP:
+		ret = single_step_dabr_instruction(data);
+		break;
+	}
+
+	return ret;
+}
+
+/*
+ * Release the user breakpoints used by ptrace
+ */
+void flush_ptrace_hw_breakpoint(struct task_struct *tsk)
+{
+	struct thread_struct *t = &tsk->thread;
+
+	unregister_hw_breakpoint(t->ptrace_bps[0]);
+	t->ptrace_bps[0] = NULL;
+}
+
+
+void hw_breakpoint_pmu_read(struct perf_event *bp)
+{
+	/* TODO */
+}
+
+void hw_breakpoint_pmu_unthrottle(struct perf_event *bp)
+{
+	/* TODO */
+}
+
+
Index: linux-2.6-tip.new_ppc64_perf/arch/powerpc/Kconfig
===================================================================
--- linux-2.6-tip.new_ppc64_perf.orig/arch/powerpc/Kconfig
+++ linux-2.6-tip.new_ppc64_perf/arch/powerpc/Kconfig
@@ -130,6 +130,7 @@ config PPC
 	select HAVE_SYSCALL_WRAPPERS if PPC64
 	select GENERIC_ATOMIC64 if PPC32
 	select HAVE_PERF_EVENTS
+	select HAVE_HW_BREAKPOINT if PPC64
 
 config EARLY_PRINTK
 	bool
Index: linux-2.6-tip.new_ppc64_perf/arch/powerpc/kernel/Makefile
===================================================================
--- linux-2.6-tip.new_ppc64_perf.orig/arch/powerpc/kernel/Makefile
+++ linux-2.6-tip.new_ppc64_perf/arch/powerpc/kernel/Makefile
@@ -33,7 +33,7 @@ obj-y				:= cputable.o ptrace.o syscalls
 obj-y				+= vdso32/
 obj-$(CONFIG_PPC64)		+= setup_64.o sys_ppc32.o \
 				   signal_64.o ptrace32.o \
-				   paca.o nvram_64.o firmware.o
+				   paca.o nvram_64.o firmware.o hw_breakpoint.o
 obj-$(CONFIG_PPC_BOOK3S_64)	+= cpu_setup_ppc970.o cpu_setup_pa6t.o
 obj64-$(CONFIG_RELOCATABLE)	+= reloc_64.o
 obj-$(CONFIG_PPC_BOOK3E_64)	+= exceptions-64e.o
Index: linux-2.6-tip.new_ppc64_perf/arch/powerpc/include/asm/reg.h
===================================================================
--- linux-2.6-tip.new_ppc64_perf.orig/arch/powerpc/include/asm/reg.h
+++ linux-2.6-tip.new_ppc64_perf/arch/powerpc/include/asm/reg.h
@@ -180,6 +180,7 @@
 #define   CTRL_TE	0x00c00000	/* thread enable */
 #define   CTRL_RUNLATCH	0x1
 #define SPRN_DABR	0x3F5	/* Data Address Breakpoint Register */
+#define   HBP_NUM	1	/* Number of physical HW breakpoint registers */
 #define   DABR_TRANSLATION	(1UL << 2)
 #define   DABR_DATA_WRITE	(1UL << 1)
 #define   DABR_DATA_READ	(1UL << 0)
Index: linux-2.6-tip.new_ppc64_perf/arch/powerpc/mm/fault.c
===================================================================
--- linux-2.6-tip.new_ppc64_perf.orig/arch/powerpc/mm/fault.c
+++ linux-2.6-tip.new_ppc64_perf/arch/powerpc/mm/fault.c
@@ -137,6 +137,12 @@ int __kprobes do_page_fault(struct pt_re
 		error_code &= 0x48200000;
 	else
 		is_write = error_code & DSISR_ISSTORE;
+
+	if (error_code & DSISR_DABRMATCH) {
+		/* DABR match */
+		do_dabr(regs, address, error_code);
+		return 0;
+	}
 #else
 	is_write = error_code & ESR_DST;
 #endif /* CONFIG_4xx || CONFIG_BOOKE */
@@ -151,14 +157,6 @@ int __kprobes do_page_fault(struct pt_re
 	if (!user_mode(regs) && (address >= TASK_SIZE))
 		return SIGSEGV;
 
-#if !(defined(CONFIG_4xx) || defined(CONFIG_BOOKE))
-  	if (error_code & DSISR_DABRMATCH) {
-		/* DABR match */
-		do_dabr(regs, address, error_code);
-		return 0;
-	}
-#endif /* !(CONFIG_4xx || CONFIG_BOOKE)*/
-
 	if (in_atomic() || mm == NULL) {
 		if (!user_mode(regs))
 			return SIGSEGV;
Index: linux-2.6-tip.new_ppc64_perf/arch/powerpc/include/asm/processor.h
===================================================================
--- linux-2.6-tip.new_ppc64_perf.orig/arch/powerpc/include/asm/processor.h
+++ linux-2.6-tip.new_ppc64_perf/arch/powerpc/include/asm/processor.h
@@ -177,6 +177,7 @@ struct thread_struct {
 #ifdef CONFIG_PPC64
 	unsigned long	start_tb;	/* Start purr when proc switched in */
 	unsigned long	accum_tb;	/* Total accumilated purr for process */
+	struct perf_event *ptrace_bps[HBP_NUM];
 #endif
 	unsigned long	dabr;		/* Data address breakpoint register */
 #ifdef CONFIG_ALTIVEC
Index: linux-2.6-tip.new_ppc64_perf/arch/powerpc/kernel/ptrace.c
===================================================================
--- linux-2.6-tip.new_ppc64_perf.orig/arch/powerpc/kernel/ptrace.c
+++ linux-2.6-tip.new_ppc64_perf/arch/powerpc/kernel/ptrace.c
@@ -32,6 +32,8 @@
 #ifdef CONFIG_PPC32
 #include <linux/module.h>
 #endif
+#include <linux/hw_breakpoint.h>
+#include <linux/perf_event.h>
 
 #include <asm/uaccess.h>
 #include <asm/page.h>
@@ -755,9 +757,32 @@ void user_disable_single_step(struct tas
 	clear_tsk_thread_flag(task, TIF_SINGLESTEP);
 }
 
+void ptrace_triggered(struct perf_event *bp, int nmi,
+		      struct perf_sample_data *data, struct pt_regs *regs)
+{
+	struct perf_event_attr attr;
+
+	/*
+	 * Disable the breakpoint request here since ptrace has defined a
+	 * one-shot behaviour for breakpoint exceptions in PPC64.
+	 * The SIGTRAP signal is generated automatically for us in do_dabr().
+	 * We don't have to do anything about that here
+	 */
+	attr = bp->attr;
+	attr.disabled = true;
+	modify_user_hw_breakpoint(bp, &attr);
+}
+
 int ptrace_set_debugreg(struct task_struct *task, unsigned long addr,
 			       unsigned long data)
 {
+#ifdef CONFIG_PPC64
+	int ret;
+	struct thread_struct *thread = &(task->thread);
+	struct perf_event *bp;
+	struct perf_event_attr attr;
+#endif /* CONFIG_PPC64 */
+
 	/* For ppc64 we support one DABR and no IABR's at the moment (ppc64).
 	 *  For embedded processors we support one DAC and no IAC's at the
 	 *  moment.
@@ -786,6 +811,58 @@ int ptrace_set_debugreg(struct task_stru
 	/* Ensure breakpoint translation bit is set */
 	if (data && !(data & DABR_TRANSLATION))
 		return -EIO;
+#ifdef CONFIG_PPC64
+	bp = thread->ptrace_bps[0];
+	if (data == 0) {
+		if (bp) {
+			unregister_hw_breakpoint(bp);
+			thread->ptrace_bps[0] = NULL;
+		}
+		return 0;
+	}
+	if (bp) {
+		attr = bp->attr;
+		attr.bp_addr = data & ~HW_BREAKPOINT_ALIGN;
+
+		switch (data & (DABR_DATA_WRITE | DABR_DATA_READ)) {
+		case DABR_DATA_READ:
+			attr.bp_type = HW_BREAKPOINT_R;
+			break;
+		case DABR_DATA_WRITE:
+			attr.bp_type = HW_BREAKPOINT_W;
+			break;
+		case (DABR_DATA_WRITE | DABR_DATA_READ):
+			attr.bp_type = HW_BREAKPOINT_R | HW_BREAKPOINT_W;
+			break;
+		}
+		ret =  modify_user_hw_breakpoint(bp, &attr);
+		if (ret)
+			return ret;
+		thread->ptrace_bps[0] = bp;
+		thread->dabr = data;
+		return 0;
+	}
+
+	/* Create a new breakpoint request if one doesn't exist already */
+	hw_breakpoint_init(&attr);
+	attr.bp_addr = data & ~HW_BREAKPOINT_ALIGN;
+	switch (data & (DABR_DATA_WRITE | DABR_DATA_READ)) {
+	case DABR_DATA_READ:
+		attr.bp_type = HW_BREAKPOINT_R;
+		break;
+	case DABR_DATA_WRITE:
+		attr.bp_type = HW_BREAKPOINT_W;
+		break;
+	case (DABR_DATA_WRITE | DABR_DATA_READ):
+		attr.bp_type = HW_BREAKPOINT_R | HW_BREAKPOINT_W;
+		break;
+	}
+	thread->ptrace_bps[0] = bp = register_user_hw_breakpoint(&attr,
+							ptrace_triggered, task);
+	if (IS_ERR(bp))
+		return PTR_ERR(bp);
+
+#endif /* CONFIG_PPC64 */
 
 	/* Move contents to the DABR register */
 	task->thread.dabr = data;
Index: linux-2.6-tip.new_ppc64_perf/arch/powerpc/kernel/process.c
===================================================================
--- linux-2.6-tip.new_ppc64_perf.orig/arch/powerpc/kernel/process.c
+++ linux-2.6-tip.new_ppc64_perf/arch/powerpc/kernel/process.c
@@ -48,6 +48,7 @@
 #include <asm/machdep.h>
 #include <asm/time.h>
 #include <asm/syscalls.h>
+#include <asm/hw_breakpoint.h>
 #ifdef CONFIG_PPC64
 #include <asm/firmware.h>
 #endif
@@ -376,8 +377,11 @@ struct task_struct *__switch_to(struct t
 	if (new->thread.dabr)
 		set_dabr(new->thread.dabr);
 #else
+/* For PPC64, we use the hw-breakpoint interfaces that would schedule DABR */
+#ifndef CONFIG_PPC64
 	if (unlikely(__get_cpu_var(current_dabr) != new->thread.dabr))
 		set_dabr(new->thread.dabr);
+#endif /* CONFIG_PPC64 */
 #endif
 
 
@@ -564,6 +568,7 @@ void flush_thread(void)
 		else
 			set_ti_thread_flag(t, TIF_32BIT);
 	}
+	flush_ptrace_hw_breakpoint(current);
 #endif
 
 	discard_lazy_cpu_state();

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2010-04-14  3:54 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-03-08 18:12 [Patch 0/1] PPC64-HWBKPT: Hardware Breakpoint interfaces - ver XIV K.Prasad
2010-03-08 18:14 ` [Patch 1/1] PPC64-HWBKPT: Implement hw-breakpoints for PPC64 K.Prasad
2010-03-12  6:19   ` Benjamin Herrenschmidt
2010-03-15  6:29     ` K.Prasad
2010-04-07  8:03       ` Benjamin Herrenschmidt
2010-04-14  3:53         ` K.Prasad
2010-03-23  5:33   ` Paul Mackerras
2010-03-23  7:28     ` K.Prasad
  -- strict thread matches above, loose matches on Subject: below --
2010-02-15  5:56 [Patch 0/1] PPC64-HWBKPT: Hardware Breakpoint interfaces - ver XIII K.Prasad
2010-02-15  5:59 ` [Patch 1/1] PPC64-HWBKPT: Implement hw-breakpoints for PPC64 K.Prasad
2010-02-21  1:01   ` Frederic Weisbecker
2010-02-22 13:17     ` K.Prasad
2010-02-23 10:57       ` K.Prasad
2010-02-26 17:52         ` Frederic Weisbecker
2010-02-26  1:58       ` Frederic Weisbecker
2010-03-08 23:57         ` David Gibson
2010-03-09  2:14         ` K.Prasad
2010-01-21  8:46 [Patch 0/1] PPC64-HWBKPT: Hardware Breakpoint interfaces - ver XII K.Prasad
2010-01-21  8:49 ` [Patch 1/1] PPC64-HWBKPT: Implement hw-breakpoints for PPC64 K.Prasad
2010-01-19  9:12 [Patch 0/1] PPC64-HWBKPT: Hardware Breakpoint interfaces - ver XI K.Prasad
2010-01-19  9:14 ` [Patch 1/1] PPC64-HWBKPT: Implement hw-breakpoints for PPC64 K.Prasad
2009-12-11 16:04 K.Prasad
2009-12-14  0:56 ` Roland McGrath
2009-12-14 18:03   ` K.Prasad
2009-12-14 19:26     ` Roland McGrath
2009-12-17 19:03       ` K.Prasad
2010-01-19  9:40         ` K.Prasad
2010-01-19 10:03           ` Roland McGrath
2010-01-22  7:14             ` K.Prasad

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).