* [PATCH v5 00/13] IMC Instrumentation Support
From: Madhavan Srinivasan @ 2017-03-16  7:34 UTC (permalink / raw)
  To: mpe
  Cc: linuxppc-dev, linux-kernel, Madhavan Srinivasan,
	Gautham R . Shenoy, Balbir Singh, Benjamin Herrenschmidt,
	Paul Mackerras, Anton Blanchard, Sukadev Bhattiprolu,
	Michael Neuling, Stewart Smith, Daniel Axtens, Stephane Eranian,
	Anju T Sudhakar, Hemant Kumar

Power9 has an In-Memory Collection (IMC) infrastructure, which contains
various Performance Monitoring Units (PMUs) at Nest level (these are
on-chip but off-core), Core level and Thread level.

The Nest PMU counters are handled by a Nest IMC microcode which runs
in the OCC (On-Chip Controller) complex. The microcode collects the
counter data and moves the nest IMC counter data to memory.

The Core and Thread IMC PMU counters are handled in the core. Core
level PMU counters give us the IMC counters' data per core and thread
level PMU counters give us the IMC counters' data per CPU thread.

This patchset enables the nest IMC, core IMC and thread IMC
PMUs and is based on the initial work done by Madhavan Srinivasan.
"Nest Instrumentation Support" :
https://lists.ozlabs.org/pipermail/linuxppc-dev/2015-August/132078.html

v1 for this patchset can be found here :
https://lwn.net/Articles/705475/

Nest events:
Per-chip nest instrumentation provides various per-chip metrics
such as memory, powerbus, Xlink and Alink bandwidth.

Core events:
Per-core IMC instrumentation provides various per-core metrics
such as non-idle cycles, non-idle instructions, various cache and
memory related metrics etc.

Thread events:
The thread-level events are the same as the core-level events; the
only difference is the domain. These are per-cpu metrics.

PMU Events' Information:
OPAL obtains the IMC PMU and event information from the IMC Catalog
and passes it on to the kernel via the device tree. The events'
information contains :
 - Event name
 - Event Offset
 - Event description
and, maybe :
 - Event scale
 - Event unit

Some PMUs may have common scale and unit values for all their
supported events. In those cases, the events must inherit the scale
and unit properties from the PMU.
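
The inheritance rule can be illustrated with a small user-space sketch.
This is not the kernel code from this series: "struct props" and the two
helpers are hypothetical names, and letting an event's own property take
precedence over the PMU-wide one is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the optional properties of a PMU or event node. */
struct props {
	const char *scale;	/* NULL when the node has no "scale" property */
	const char *unit;	/* NULL when the node has no "unit" property */
};

/*
 * Assumption: an event's own "scale" wins; otherwise the PMU-wide value
 * is inherited. The result may still be NULL if neither node has one.
 */
static const char *effective_scale(const struct props *pmu,
				   const struct props *ev)
{
	assert(pmu && ev);
	return ev->scale ? ev->scale : pmu->scale;
}

/* Same rule for "unit". */
static const char *effective_unit(const struct props *pmu,
				  const struct props *ev)
{
	assert(pmu && ev);
	return ev->unit ? ev->unit : pmu->unit;
}
```

With a PMU node carrying both properties and an event node carrying
neither, both helpers return the PMU's values.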

The event offset in the memory is where the counter data gets
accumulated.

The OPAL-side patches are posted upstream :
https://lists.ozlabs.org/pipermail/skiboot/2017-March/006531.html

The kernel discovers the IMC counters information in the device tree
at the "imc-counters" device node which has a compatible field
"ibm,opal-in-memory-counters".

Parsing of the Events' information:
To parse the IMC PMUs and events information, the kernel has to
discover the "imc-counters" node and walk through the pmu and event
nodes.

Here is an excerpt of the dt showing the imc-counters with
mcs0 (nest), core and thread nodes:

https://github.com/open-power/ima-catalog/blob/master/81E00612.4E0100.dts

/dts-v1/;

[...]

/dts-v1/;

/ {
        name = "";
        compatible = "ibm,opal-in-memory-counters";
        #address-cells = <0x1>;
        #size-cells = <0x1>;
        imc-nest-offset = <0x320000>;
        imc-nest-size = <0x30000>;
        version-id = "";

        NEST_MCS: nest-mcs-events {
                #address-cells = <0x1>;
                #size-cells = <0x1>;

                event@0 {
                        event-name = "RRTO_QFULL_NO_DISP" ;
                        reg = <0x0 0x8>;
                        desc = "RRTO not dispatched in MCS0 due to capacity - pulses once for each time a valid RRTO op is not dispatched due to a command list full condition" ;
                };
                event@8 {
                        event-name = "WRTO_QFULL_NO_DISP" ;
                        reg = <0x8 0x8>;
                        desc = "WRTO not dispatched in MCS0 due to capacity - pulses once for each time a valid WRTO op is not dispatched due to a command list full condition" ;
                };
		[...]
	mcs0 {
                compatible = "ibm,imc-counters-nest";
                events-prefix = "PM_MCS0_";
                unit = "";
                scale = "";
                reg = <0x118 0x8>;
                events = < &NEST_MCS >;
        };

        mcs1 {
                compatible = "ibm,imc-counters-nest";
                events-prefix = "PM_MCS1_";
                unit = "";
                scale = "";
                reg = <0x198 0x8>;
                events = < &NEST_MCS >;
        };
	[...]

	CORE_EVENTS: core-events {
                #address-cells = <0x1>;
                #size-cells = <0x1>;

                event@e0 {
                        event-name = "0THRD_NON_IDLE_PCYC" ;
                        reg = <0xe0 0x8>;
                        desc = "The number of processor cycles when all threads are idle" ;
                };
                event@120 {
                        event-name = "1THRD_NON_IDLE_PCYC" ;
                        reg = <0x120 0x8>;
                        desc = "The number of processor cycles when exactly one SMT thread is executing non-idle code" ;
                };
		[...]
       core {
                compatible = "ibm,imc-counters-core";
                events-prefix = "CPM_";
                unit = "";
                scale = "";
                reg = <0x0 0x8>;
                events = < &CORE_EVENTS >;
        };

        thread {
                compatible = "ibm,imc-counters-core";
                events-prefix = "CPM_";
                unit = "";
                scale = "";
                reg = <0x0 0x8>;
                events = < &CORE_EVENTS >;
        };
};

From the device tree, the kernel parses the PMUs and their events'
information.

After parsing the IMC PMUs and their events, the PMUs and their
attributes are registered in the kernel.

This patchset (patches 9 and 10) configures the thread level IMC PMUs
to count for tasks, which gives us the thread level metric values per
task.

Example Usage :
 # perf list

  [...]
  nest_mcs0/PM_MCS_DOWN_128B_DATA_XFER_MC0/           [Kernel PMU event]
  nest_mcs0/PM_MCS_DOWN_128B_DATA_XFER_MC0_LAST_SAMPLE/ [Kernel PMU event]
  [...]
  core_imc/CPM_NON_IDLE_INST/                        [Kernel PMU event]
  core_imc/CPM_NON_IDLE_PCYC/                        [Kernel PMU event]
  [...]
  thread_imc/CPM_NON_IDLE_INST/                      [Kernel PMU event]
  thread_imc/CPM_NON_IDLE_PCYC/                      [Kernel PMU event]

To see per chip data for nest_mcs0/PM_MCS_DOWN_128B_DATA_XFER_MC0/ :
 # perf stat -e "nest_mcs0/PM_MCS_DOWN_128B_DATA_XFER_MC0/" -a --per-socket

To see non-idle instructions for core 0 :
 # ./perf stat -e "core_imc/CPM_NON_IDLE_INST/" -C 0 -I 1000

To see non-idle cycles for a "make" :
 # ./perf stat -e "thread_imc/CPM_NON_IDLE_PCYC/" make

Comments/feedback/suggestions are welcome.

Changelog:
 v4 -> v5:
 - Updated opal call numbers
 - Added a patch to disable Core-IMC device using shutdown callback
 - Added patch to support cpuhotplug for thread-imc
 - Added patch to disable and enable core imc engine in cpuhotplug path
 v3 -> v4 :
 - Changed the events parser code to discover the PMU and events because
   of the changed format of the IMC DTS file (Patch 3).
 - Implemented the two TODOs to include core and thread IMC support with
   this patchset (Patches 7 through 10).
 - Changed the CPU hotplug code of Nest IMC PMUs to include a new state
   CPUHP_AP_PERF_POWERPC_NEST_ONLINE (Patch 6).
 v2 -> v3 :
 - Changed all references for IMA (In-Memory Accumulation) to IMC (In-Memory
   Collection).
 v1 -> v2 :
 - Account for the cases where a PMU can have a common scale and unit
   values for all its supported events (Patch 3/6).
 - Fixed a Build error (for maple_defconfig) by enabling imc_pmu.o
   only for CONFIG_PPC_POWERNV=y (Patch 4/6)
 - Read from the "event-name" property instead of "name" for an event
   node (Patch 3/6).

Cc: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Stephane Eranian <eranian@google.com>
Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>

Anju T Sudhakar (2):
  powerpc/perf: Thread imc cpuhotplug support
  powerpc/perf: Enable/disable core engine during cpuhotplug

Hemant Kumar (10):
  powerpc/powernv: Data structure and macros definitions
  powerpc/powernv: Autoload IMC device driver module
  powerpc/powernv: Detect supported IMC units and its events
  powerpc/perf: Add event attribute and group to IMC pmus
  powerpc/perf: Generic imc pmu event functions
  powerpc/perf: IMC pmu cpumask and cpu hotplug support
  powerpc/powernv: Core IMC events detection
  powerpc/perf: PMU functions for Core IMC and hotplugging
  powerpc/powernv: Thread IMC events detection
  powerpc/perf: Thread IMC PMU functions

Madhavan Srinivasan (1):
  powerpc/powernv: Add device shutdown function for Core IMC

 arch/powerpc/include/asm/imc-pmu.h             |  85 +++
 arch/powerpc/include/asm/opal-api.h            |  11 +-
 arch/powerpc/include/asm/opal.h                |   5 +
 arch/powerpc/perf/Makefile                     |   6 +-
 arch/powerpc/perf/imc-pmu.c                    | 811 +++++++++++++++++++++++++
 arch/powerpc/platforms/powernv/Makefile        |   2 +-
 arch/powerpc/platforms/powernv/opal-imc.c      | 560 +++++++++++++++++
 arch/powerpc/platforms/powernv/opal-wrappers.S |   2 +
 arch/powerpc/platforms/powernv/opal.c          |  13 +
 include/linux/cpuhotplug.h                     |   3 +
 10 files changed, 1495 insertions(+), 3 deletions(-)
 create mode 100644 arch/powerpc/include/asm/imc-pmu.h
 create mode 100644 arch/powerpc/perf/imc-pmu.c
 create mode 100644 arch/powerpc/platforms/powernv/opal-imc.c

-- 
2.7.4


* [PATCH v5 01/13] powerpc/powernv: Data structure and macros definitions
From: Madhavan Srinivasan @ 2017-03-16  7:34 UTC (permalink / raw)
  To: mpe
  Cc: linuxppc-dev, linux-kernel, Hemant Kumar, Gautham R . Shenoy,
	Balbir Singh, Benjamin Herrenschmidt, Paul Mackerras,
	Anton Blanchard, Sukadev Bhattiprolu, Michael Neuling,
	Stewart Smith, Daniel Axtens, Stephane Eranian, Anju T Sudhakar,
	Madhavan Srinivasan

From: Hemant Kumar <hemant@linux.vnet.ibm.com>

Create new header file "imc-pmu.h" to add the data structures
and macros needed for IMC pmu support.

Cc: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Stephane Eranian <eranian@google.com>
Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/imc-pmu.h | 73 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 73 insertions(+)
 create mode 100644 arch/powerpc/include/asm/imc-pmu.h

diff --git a/arch/powerpc/include/asm/imc-pmu.h b/arch/powerpc/include/asm/imc-pmu.h
new file mode 100644
index 000000000000..323232248cc4
--- /dev/null
+++ b/arch/powerpc/include/asm/imc-pmu.h
@@ -0,0 +1,73 @@
+#ifndef PPC_POWERNV_IMC_PMU_DEF_H
+#define PPC_POWERNV_IMC_PMU_DEF_H
+
+/*
+ * IMC Nest Performance Monitor counter support.
+ *
+ * Copyright (C) 2016 Madhavan Srinivasan, IBM Corporation.
+ *           (C) 2016 Hemant K Shaw, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ */
+
+#include <linux/perf_event.h>
+#include <linux/slab.h>
+#include <linux/of.h>
+#include <linux/io.h>
+#include <asm/opal.h>
+
+#define IMC_MAX_CHIPS			32
+#define IMC_MAX_PMUS			32
+#define IMC_MAX_PMU_NAME_LEN		256
+
+#define NEST_IMC_ENGINE_START		1
+#define NEST_IMC_ENGINE_STOP		0
+#define NEST_MAX_PAGES			16
+
+#define NEST_IMC_PRODUCTION_MODE	1
+
+#define IMC_DTB_COMPAT		"ibm,opal-in-memory-counters"
+#define IMC_DTB_NEST_COMPAT	"ibm,imc-counters-nest"
+
+/*
+ * Structure to hold per chip specific memory address
+ * information for nest pmus. Nest Counter data are exported
+ * in per-chip reserved memory region by the PORE Engine.
+ */
+struct perchip_nest_info {
+	u32 chip_id;
+	u64 pbase;
+	u64 vbase[NEST_MAX_PAGES];
+	u64 size;
+};
+
+/*
+ * Place holder for nest pmu events and values.
+ */
+struct imc_events {
+	char *ev_name;
+	char *ev_value;
+};
+
+/*
+ * Device tree parser code detects IMC pmu support and
+ * registers new IMC pmus. This structure will
+ * hold the pmu functions and attrs for each imc pmu and
+ * will be referenced at the time of pmu registration.
+ */
+struct imc_pmu {
+	struct pmu pmu;
+	int domain;
+	const struct attribute_group *attr_groups[4];
+};
+
+/*
+ * Domains for IMC PMUs
+ */
+#define IMC_DOMAIN_NEST		1
+
+#define UNKNOWN_DOMAIN		-1
+
+#endif /* PPC_POWERNV_IMC_PMU_DEF_H */
-- 
2.7.4


* [PATCH v5 02/13] powerpc/powernv: Autoload IMC device driver module
From: Madhavan Srinivasan @ 2017-03-16  7:34 UTC (permalink / raw)
  To: mpe
  Cc: linuxppc-dev, linux-kernel, Hemant Kumar, Gautham R . Shenoy,
	Balbir Singh, Benjamin Herrenschmidt, Paul Mackerras,
	Anton Blanchard, Sukadev Bhattiprolu, Michael Neuling,
	Stewart Smith, Daniel Axtens, Stephane Eranian, Anju T Sudhakar,
	Madhavan Srinivasan

From: Hemant Kumar <hemant@linux.vnet.ibm.com>

This patch does three things :
 - Enables "opal.c" to create a platform device for the IMC interface
   according to the appropriate compatibility string.
 - Finds the reserved-memory region details from the system device tree
   and gets the base address of the HOMER (reserved memory) region for
   each chip.
 - Gets the Nest PMU counter data offsets (in the HOMER region) and
   their sizes. The offsets for the counters' data are fixed and
   won't change from chip to chip.

The device tree parsing logic is separated from the PMU creation
functions (which are added in subsequent patches).
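
As a rough illustration of the address arithmetic described above, here is
a minimal user-space sketch. It assumes the first two cells of the "reg"
property hold the high and low 32 bits of the HOMER base, and it assumes
64K pages (SKETCH_PAGE_SIZE is a hypothetical constant, as are the helper
names). It stops at physical addresses; the patch itself goes on to call
phys_to_virt() on each page.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 0x10000ULL	/* assumption: 64K pages */

/* Combine the first two 32-bit "reg" cells into a 64-bit base address. */
static uint64_t homer_base(const uint32_t reg[4])
{
	return ((uint64_t)reg[0] << 32) | reg[1];
}

/*
 * Physical address of page number "page" of the nest counter region:
 * HOMER base + "imc-nest-offset" + page * page size.
 */
static uint64_t nest_page_addr(const uint32_t reg[4], uint32_t nest_offset,
			       uint32_t nest_size, unsigned int page)
{
	/* The requested page must lie inside the nest counter region. */
	assert((uint64_t)page * SKETCH_PAGE_SIZE < nest_size);
	return homer_base(reg) + nest_offset +
	       (uint64_t)page * SKETCH_PAGE_SIZE;
}
```

With imc-nest-size = 0x30000 as in the DTS excerpt from the cover letter,
this sketch covers three 64K pages per chip.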

Cc: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Stephane Eranian <eranian@google.com>
Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/Makefile   |   2 +-
 arch/powerpc/platforms/powernv/opal-imc.c | 117 ++++++++++++++++++++++++++++++
 arch/powerpc/platforms/powernv/opal.c     |  13 ++++
 3 files changed, 131 insertions(+), 1 deletion(-)
 create mode 100644 arch/powerpc/platforms/powernv/opal-imc.c

diff --git a/arch/powerpc/platforms/powernv/Makefile b/arch/powerpc/platforms/powernv/Makefile
index b5d98cb3f482..44909fec1121 100644
--- a/arch/powerpc/platforms/powernv/Makefile
+++ b/arch/powerpc/platforms/powernv/Makefile
@@ -2,7 +2,7 @@ obj-y			+= setup.o opal-wrappers.o opal.o opal-async.o idle.o
 obj-y			+= opal-rtc.o opal-nvram.o opal-lpc.o opal-flash.o
 obj-y			+= rng.o opal-elog.o opal-dump.o opal-sysparam.o opal-sensor.o
 obj-y			+= opal-msglog.o opal-hmi.o opal-power.o opal-irqchip.o
-obj-y			+= opal-kmsg.o
+obj-y			+= opal-kmsg.o opal-imc.o
 
 obj-$(CONFIG_SMP)	+= smp.o subcore.o subcore-asm.o
 obj-$(CONFIG_PCI)	+= pci.o pci-ioda.o npu-dma.o
diff --git a/arch/powerpc/platforms/powernv/opal-imc.c b/arch/powerpc/platforms/powernv/opal-imc.c
new file mode 100644
index 000000000000..1b99c4e2f3f8
--- /dev/null
+++ b/arch/powerpc/platforms/powernv/opal-imc.c
@@ -0,0 +1,117 @@
+/*
+ * OPAL IMC interface detection driver
+ * Supported on POWERNV platform
+ *
+ * Copyright	(C) 2016 Madhavan Srinivasan, IBM Corporation.
+ *		(C) 2016 Hemant K Shaw, IBM Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/platform_device.h>
+#include <linux/miscdevice.h>
+#include <linux/fs.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/of_platform.h>
+#include <linux/poll.h>
+#include <linux/mm.h>
+#include <linux/slab.h>
+#include <asm/opal.h>
+#include <asm/io.h>
+#include <asm/uaccess.h>
+#include <asm/cputable.h>
+#include <asm/imc-pmu.h>
+
+struct perchip_nest_info nest_perchip_info[IMC_MAX_CHIPS];
+
+static int opal_imc_counters_probe(struct platform_device *pdev)
+{
+	struct device_node *child, *imc_dev, *rm_node = NULL;
+	struct perchip_nest_info *pcni;
+	u32 reg[4], pages, nest_offset, nest_size, idx;
+	int i = 0;
+	const char *node_name;
+
+	if (!pdev || !pdev->dev.of_node)
+		return -ENODEV;
+
+	imc_dev = pdev->dev.of_node;
+
+	/*
+	 * nest_offset : where the nest-counters' data start.
+	 * size : size of the entire nest-counters region
+	 */
+	if (of_property_read_u32(imc_dev, "imc-nest-offset", &nest_offset))
+		goto err;
+	if (of_property_read_u32(imc_dev, "imc-nest-size", &nest_size))
+		goto err;
+
+	/* Find the "homer region" for each chip */
+	rm_node = of_find_node_by_path("/reserved-memory");
+	if (!rm_node)
+		goto err;
+
+	for_each_child_of_node(rm_node, child) {
+		if (of_property_read_string_index(child, "name", 0,
+						  &node_name))
+			continue;
+		if (strncmp("ibm,homer-image", node_name,
+			    strlen("ibm,homer-image")))
+			continue;
+
+		/* Get the chip id to which the above homer region belongs */
+		if (of_property_read_u32(child, "ibm,chip-id", &idx))
+			goto err;
+
+		/* reg property will have four u32 cells. */
+		if (of_property_read_u32_array(child, "reg", reg, 4))
+			goto err;
+
+		pcni = &nest_perchip_info[idx];
+
+		/* Fetch the homer region base address */
+		pcni->pbase = reg[0];
+		pcni->pbase = pcni->pbase << 32 | reg[1];
+		/* Add the nest IMC Base offset */
+		pcni->pbase = pcni->pbase + nest_offset;
+		/* Record the size of the nest counters region */
+		pcni->size = nest_size;
+
+		for (i = 0; i < (pcni->size / PAGE_SIZE); i++) {
+			pages = PAGE_SIZE * i;
+			pcni->vbase[i] = (u64)phys_to_virt(pcni->pbase +
+							   pages);
+		}
+	}
+
+	return 0;
+err:
+	return -ENODEV;
+}
+
+static const struct of_device_id opal_imc_match[] = {
+	{ .compatible = IMC_DTB_COMPAT },
+	{},
+};
+
+static struct platform_driver opal_imc_driver = {
+	.driver = {
+		.name = "opal-imc-counters",
+		.of_match_table = opal_imc_match,
+	},
+	.probe = opal_imc_counters_probe,
+};
+
+MODULE_DEVICE_TABLE(of, opal_imc_match);
+module_platform_driver(opal_imc_driver);
+MODULE_DESCRIPTION("PowerNV OPAL IMC driver");
+MODULE_LICENSE("GPL");
diff --git a/arch/powerpc/platforms/powernv/opal.c b/arch/powerpc/platforms/powernv/opal.c
index e0f856bfbfe8..2c90c570953d 100644
--- a/arch/powerpc/platforms/powernv/opal.c
+++ b/arch/powerpc/platforms/powernv/opal.c
@@ -30,6 +30,7 @@
 #include <asm/opal.h>
 #include <asm/firmware.h>
 #include <asm/mce.h>
+#include <asm/imc-pmu.h>
 
 #include "powernv.h"
 
@@ -631,6 +632,15 @@ static void opal_pdev_init(const char *compatible)
 		of_platform_device_create(np, NULL, NULL);
 }
 
+static void opal_imc_init_dev(void)
+{
+	struct device_node *np;
+
+	np = of_find_compatible_node(NULL, NULL, IMC_DTB_COMPAT);
+	if (np)
+		of_platform_device_create(np, NULL, NULL);
+}
+
 static int kopald(void *unused)
 {
 	unsigned long timeout = msecs_to_jiffies(opal_heartbeat) + 1;
@@ -704,6 +714,9 @@ static int __init opal_init(void)
 	/* Setup a heatbeat thread if requested by OPAL */
 	opal_init_heartbeat();
 
+	/* Detect IMC pmu counters support and create PMUs */
+	opal_imc_init_dev();
+
 	/* Create leds platform devices */
 	leds = of_find_node_by_path("/ibm,opal/leds");
 	if (leds) {
-- 
2.7.4


* [PATCH v5 03/13] powerpc/powernv: Detect supported IMC units and its events
From: Madhavan Srinivasan @ 2017-03-16  7:34 UTC (permalink / raw)
  To: mpe
  Cc: linuxppc-dev, linux-kernel, Hemant Kumar, Gautham R . Shenoy,
	Balbir Singh, Benjamin Herrenschmidt, Paul Mackerras,
	Anton Blanchard, Sukadev Bhattiprolu, Michael Neuling,
	Stewart Smith, Daniel Axtens, Stephane Eranian, Anju T Sudhakar,
	Madhavan Srinivasan

From: Hemant Kumar <hemant@linux.vnet.ibm.com>

Parse device tree to detect IMC units. Traverse through each IMC unit
node to find supported events and corresponding unit/scale files (if any).

Here is the DTS file for reference:

	https://github.com/open-power/ima-catalog/blob/master/81E00612.4E0100.dts

The device tree for IMC counters starts at the node "imc-counters".
This node contains all the IMC PMU nodes and the event nodes
for these IMC PMUs. Each PMU node has an "events" property which holds a
phandle value for the actual events node. The events are separated from
the PMU nodes to factor out the common events. For example, the PMU nodes
"mcs0", "mcs1" etc. point to "nest-mcs-events", since the events are
common between these PMUs. These events have a different
prefix based on their relation to different PMUs, and hence the PMU
nodes themselves contain an "events-prefix" property. The value of this
property, prepended to the event name, forms the actual event
name. Also, each PMU node has a "reg" field holding the base offset for
the events which belong to this PMU. This "reg" value is added to the
"reg" of an event in the "events" node, which gives us the location of
the counter data. Kernel code uses this offset as the event
configuration value.
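
The name and offset composition described above can be sketched in
user-space C as follows. The helper names are hypothetical and are not the
functions added by this patch; the sketch only mirrors the two rules
(prefix + event name, PMU "reg" + event "reg").

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "<events-prefix><event-name>" into buf.
 * Returns 0 on success, -1 if the result would not fit in len bytes.
 */
static int imc_full_event_name(char *buf, size_t len,
			       const char *prefix, const char *name)
{
	assert(buf && prefix && name);
	if (strlen(prefix) + strlen(name) >= len)
		return -1;
	snprintf(buf, len, "%s%s", prefix, name);
	return 0;
}

/*
 * Location of the counter data: base offset of the PMU ("reg" of the
 * PMU node) plus the offset of the event ("reg" of the event node).
 */
static uint32_t imc_event_config(uint32_t pmu_reg, uint32_t event_reg)
{
	return pmu_reg + event_reg;
}
```

Taking the values from the DTS excerpt in the cover letter, "mcs0" has
events-prefix "PM_MCS0_" and reg 0x118, and event@8 has event-name
"WRTO_QFULL_NO_DISP" and reg 0x8, so the sketch yields the name
"PM_MCS0_WRTO_QFULL_NO_DISP" and the offset 0x120.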

Device tree parser code also looks for scale/unit property in the event
node and passes on the value as an event attr for perf interface to use
in the post processing by the perf tool. Some PMUs may have common scale
and unit properties which implies that all events supported by this PMU
inherit the scale and unit properties of the PMU itself. For those
events, we need to set the common unit and scale values.

If any unit or event fails to initialize, disable that unit and
continue setting up the rest of them.

Cc: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/opal-imc.c | 383 ++++++++++++++++++++++++++++++
 1 file changed, 383 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/opal-imc.c b/arch/powerpc/platforms/powernv/opal-imc.c
index 1b99c4e2f3f8..0c076aca87e6 100644
--- a/arch/powerpc/platforms/powernv/opal-imc.c
+++ b/arch/powerpc/platforms/powernv/opal-imc.c
@@ -32,6 +32,388 @@
 #include <asm/imc-pmu.h>
 
 struct perchip_nest_info nest_perchip_info[IMC_MAX_CHIPS];
+struct imc_pmu *per_nest_pmu_arr[IMC_MAX_PMUS];
+
+static int imc_event_info(char *name, struct imc_events *events)
+{
+	char *buf;
+
+	/* memory for content */
+	buf = kzalloc(IMC_MAX_PMU_NAME_LEN, GFP_KERNEL);
+	if (!buf)
+		return -ENOMEM;
+
+	events->ev_name = name;
+	events->ev_value = buf;
+	return 0;
+}
+
+static int imc_event_info_str(struct property *pp, char *name,
+			       struct imc_events *events)
+{
+	int ret;
+
+	ret = imc_event_info(name, events);
+	if (ret)
+		return ret;
+
+	if (!pp->value || (strnlen(pp->value, pp->length) == pp->length) ||
+	   (pp->length > IMC_MAX_PMU_NAME_LEN))
+		return -EINVAL;
+	strncpy(events->ev_value, (const char *)pp->value, pp->length);
+
+	return 0;
+}
+
+static int imc_event_info_val(char *name, u32 val,
+			      struct imc_events *events)
+{
+	int ret;
+
+	ret = imc_event_info(name, events);
+	if (ret)
+		return ret;
+	sprintf(events->ev_value, "event=0x%x", val);
+
+	return 0;
+}
+
+static int set_event_property(struct property *pp, char *event_prop,
+			      struct imc_events *events, char *ev_name)
+{
+	char *buf;
+	int ret;
+
+	buf = kzalloc(IMC_MAX_PMU_NAME_LEN, GFP_KERNEL);
+	if (!buf)
+		return -ENOMEM;
+
+	sprintf(buf, "%s.%s", ev_name, event_prop);
+	ret = imc_event_info_str(pp, buf, events);
+	if (ret) {
+		kfree(events->ev_name);
+		kfree(events->ev_value);
+	}
+
+	return ret;
+}
+
+/*
+ * imc_events_node_parser: Parse the event node "dev" and assign the parsed
+ *                         information to event "events".
+ *
+ * Parses the "reg" property of this event. "reg" gives us the event offset.
+ * Also, parse the "scale" and "unit" properties, if any.
+ */
+static int imc_events_node_parser(struct device_node *dev,
+				  struct imc_events *events,
+				  struct property *event_scale,
+				  struct property *event_unit,
+				  struct property *name_prefix,
+				  u32 reg)
+{
+	struct property *name, *pp;
+	char *ev_name;
+	u32 val;
+	int idx = 0, ret;
+
+	if (!dev)
+		return -EINVAL;
+
+	/*
+	 * Find the "event-name" property of this event node
+	 */
+	name = of_find_property(dev, "event-name", NULL);
+	if (!name)
+		return -ENODEV;
+
+	if (!name->value ||
+	  (strnlen(name->value, name->length) == name->length) ||
+	  (name->length > IMC_MAX_PMU_NAME_LEN))
+		return -EINVAL;
+
+	ev_name = kzalloc(IMC_MAX_PMU_NAME_LEN, GFP_KERNEL);
+	if (!ev_name)
+		return -ENOMEM;
+
+	snprintf(ev_name, IMC_MAX_PMU_NAME_LEN, "%s%s",
+		 (char *)name_prefix->value,
+		 (char *)name->value);
+
+	/*
+	 * Parse each property of this event node "dev". Property "reg" has
+	 * the offset which is assigned to the event name. Other properties
+	 * like "scale" and "unit" are assigned to event.scale and event.unit
+	 * accordingly.
+	 */
+	for_each_property_of_node(dev, pp) {
+		/*
+		 * If there is an issue in parsing a single property of
+		 * this event, we just clean up the buffers, but we still
+		 * continue to parse.
+		 */
+		if (strncmp(pp->name, "reg", 3) == 0) {
+			of_property_read_u32(dev, pp->name, &val);
+			val += reg;
+			ret = imc_event_info_val(ev_name, val, &events[idx]);
+			if (ret) {
+				kfree(events[idx].ev_name);
+				kfree(events[idx].ev_value);
+				continue;
+			}
+			/*
+			 * If the common scale and unit properties available,
+			 * then, assign them to this event
+			 */
+			if (event_scale) {
+				idx++;
+				ret = set_event_property(event_scale, "scale",
+							 &events[idx],
+							 ev_name);
+				if (ret)
+					continue;
+				idx++;
+			}
+			if (event_unit) {
+				ret = set_event_property(event_unit, "unit",
+							 &events[idx],
+							 ev_name);
+				if (ret)
+					continue;
+			}
+			idx++;
+		} else if (strncmp(pp->name, "unit", 4) == 0) {
+			ret = set_event_property(pp, "unit", &events[idx],
+						 ev_name);
+			if (ret)
+				continue;
+			idx++;
+		} else if (strncmp(pp->name, "scale", 5) == 0) {
+			ret = set_event_property(pp, "scale", &events[idx],
+						 ev_name);
+			if (ret)
+				continue;
+			idx++;
+		}
+	}
+
+	return idx;
+}
+
+/*
+ * imc_get_domain : Returns the domain for pmu "pmu_dev".
+ */
+int imc_get_domain(struct device_node *pmu_dev)
+{
+	if (of_device_is_compatible(pmu_dev, IMC_DTB_NEST_COMPAT))
+		return IMC_DOMAIN_NEST;
+	else
+		return UNKNOWN_DOMAIN;
+}
+
+/*
+ * get_nr_children : Returns the number of children for a pmu device node.
+ */
+static int get_nr_children(struct device_node *pmu_node)
+{
+	struct device_node *child;
+	int i = 0;
+
+	for_each_child_of_node(pmu_node, child)
+		i++;
+	return i;
+}
+
+/*
+ * imc_free_events : Cleanup the "events" list having "nr_entries" entries.
+ */
+static void imc_free_events(struct imc_events *events, int nr_entries)
+{
+	int i;
+
+	/* Nothing to clean, return */
+	if (!events)
+		return;
+	for (i = 0; i < nr_entries; i++) {
+		kfree(events[i].ev_name);
+		kfree(events[i].ev_value);
+	}
+
+	kfree(events);
+}
+
+/*
+ * imc_pmu_create : Takes the parent device which is the pmu unit and a
+ *                  pmu_index as the inputs.
+ * Allocates memory for the pmu, sets up its domain (NEST or CORE), and
+ * allocates memory for the events supported by this pmu. Assigns a name for
+ * the pmu. Calls imc_events_node_parser() to setup the individual events.
+ * If everything goes fine, it calls, init_imc_pmu() to setup the pmu device
+ * and register it.
+ */
+static int imc_pmu_create(struct device_node *parent, int pmu_index)
+{
+	struct device_node *ev_node = NULL, *dir = NULL;
+	struct imc_events *events;
+	struct imc_pmu *pmu_ptr;
+	u32 prop, reg;
+	struct property *pp, *scale_pp, *unit_pp, *name_prefix;
+	char *buf;
+	int idx = 0, ret = 0, nr_children = 0;
+
+	if (!parent)
+		return -EINVAL;
+
+	/* memory for pmu */
+	pmu_ptr = kzalloc(sizeof(struct imc_pmu), GFP_KERNEL);
+	if (!pmu_ptr)
+		return -ENOMEM;
+
+	pmu_ptr->domain = imc_get_domain(parent);
+	if (pmu_ptr->domain == UNKNOWN_DOMAIN)
+		goto free_pmu;
+
+	/* Needed for hotplug/migration */
+	per_nest_pmu_arr[pmu_index] = pmu_ptr;
+
+	/*
+	 * "events" property inside a PMU node contains the phandle value
+	 * for the actual events node. The "events" node for the IMC PMU
+	 * is not in this node, rather inside "imc-counters" node, since,
+	 * we want to factor out the common events (thereby, reducing the
+	 * size of the device tree)
+	 */
+	of_property_read_u32(parent, "events", &prop);
+	if (!prop)
+		return -EINVAL;
+
+	/*
+	 * Fetch the actual node where the events for this PMU exist.
+	 */
+	dir = of_find_node_by_phandle(prop);
+	if (!dir)
+		return -EINVAL;
+
+	/*
+	 * Get the maximum no. of events in this node.
+	 * Multiply by 3 to account for .scale and .unit properties
+	 * This number suggests the amount of memory needed to setup the
+	 * events for this pmu.
+	 */
+	nr_children = get_nr_children(dir) * 3;
+
+	/* memory for pmu events */
+	events = kzalloc((sizeof(struct imc_events) * nr_children),
+			 GFP_KERNEL);
+	if (!events) {
+		ret = -ENOMEM;
+		goto free_pmu;
+	}
+
+	pp = of_find_property(parent, "name", NULL);
+	if (!pp) {
+		ret = -ENODEV;
+		goto free_events;
+	}
+
+	if (!pp->value ||
+	  (strnlen(pp->value, pp->length) == pp->length) ||
+	    (pp->length > IMC_MAX_PMU_NAME_LEN)) {
+		ret = -EINVAL;
+		goto free_events;
+	}
+
+	buf = kzalloc(IMC_MAX_PMU_NAME_LEN, GFP_KERNEL);
+	if (!buf) {
+		ret = -ENOMEM;
+		goto free_events;
+	}
+
+	/* Save the name to register it later */
+	sprintf(buf, "nest_%s", (char *)pp->value);
+	pmu_ptr->pmu.name = (char *)buf;
+
+	/*
+	 * Check if there are common "scale" and "unit" properties inside
+	 * the PMU node for all the events supported by this PMU.
+	 */
+	scale_pp = of_find_property(parent, "scale", NULL);
+	unit_pp = of_find_property(parent, "unit", NULL);
+
+	/*
+	 * Get the event-prefix property from the PMU node
+	 * which needs to be attached with the event names.
+	 */
+	name_prefix = of_find_property(parent, "events-prefix", NULL);
+	if (!name_prefix)
+		return -ENODEV;
+
+	/*
+	 * "reg" property gives out the base offset of the counters data
+	 * for this PMU.
+	 */
+	of_property_read_u32(parent, "reg", &reg);
+
+	if (!name_prefix->value ||
+	   (strnlen(name_prefix->value, name_prefix->length) == name_prefix->length) ||
+	   (name_prefix->length > IMC_MAX_PMU_NAME_LEN))
+		return -EINVAL;
+
+	/* Loop through event nodes */
+	for_each_child_of_node(dir, ev_node) {
+		ret = imc_events_node_parser(ev_node, &events[idx], scale_pp,
+					     unit_pp, name_prefix, reg);
+		if (ret < 0) {
+			/* Unable to parse this event */
+			if (ret == -ENOMEM)
+				goto free_events;
+			continue;
+		}
+
+		/*
+		 * imc_events_node_parser will return number of
+		 * event entries created for this. This could include
+		 * event scale and unit files also.
+		 */
+		idx += ret;
+	}
+
+	return 0;
+
+free_events:
+	imc_free_events(events, idx);
+free_pmu:
+	kfree(pmu_ptr);
+	return ret;
+}
+
+/*
+ * imc_pmu_setup : Setup the IMC PMUs (children of "parent").
+ */
+static void imc_pmu_setup(struct device_node *parent)
+{
+	struct device_node *child;
+	int pmu_count = 0, rc = 0;
+	const struct property *pp;
+
+	if (!parent)
+		return;
+
+	/* Setup all the IMC pmus */
+	for_each_child_of_node(parent, child) {
+		pp = of_get_property(child, "compatible", NULL);
+		if (pp) {
+			/*
+			 * If there is a node with a "compatible" field,
+			 * that's a PMU node
+			 */
+			rc = imc_pmu_create(child, pmu_count);
+			if (rc)
+				return;
+			pmu_count++;
+		}
+	}
+}
 
 static int opal_imc_counters_probe(struct platform_device *pdev)
 {
@@ -93,6 +475,7 @@ static int opal_imc_counters_probe(struct platform_device *pdev)
 		} while (i < (pcni->size / PAGE_SIZE));
 	}
 
+	imc_pmu_setup(imc_dev);
 	return 0;
 err:
 	return -ENODEV;
-- 
2.7.4

* [PATCH v5 04/13] powerpc/perf: Add event attribute and group to IMC pmus
  2017-03-16  7:34 [PATCH v5 00/13] IMC Instrumentation Support Madhavan Srinivasan
                   ` (2 preceding siblings ...)
  2017-03-16  7:34 ` [PATCH v5 03/13] powerpc/powernv: Detect supported IMC units and its events Madhavan Srinivasan
@ 2017-03-16  7:34 ` Madhavan Srinivasan
  2017-03-16  7:34 ` [PATCH v5 05/13] powerpc/perf: Generic imc pmu event functions Madhavan Srinivasan
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 20+ messages in thread
From: Madhavan Srinivasan @ 2017-03-16  7:34 UTC (permalink / raw)
  To: mpe
  Cc: linuxppc-dev, linux-kernel, Hemant Kumar, Gautham R . Shenoy,
	Balbir Singh, Benjamin Herrenschmidt, Paul Mackerras,
	Anton Blanchard, Sukadev Bhattiprolu, Michael Neuling,
	Stewart Smith, Daniel Axtens, Stephane Eranian, Anju T Sudhakar,
	Madhavan Srinivasan

From: Hemant Kumar <hemant@linux.vnet.ibm.com>

The device tree IMC driver code parses the IMC units and their events,
and passes that information to the IMC pmu code, which is placed in
powerpc/perf as "imc-pmu.c".

This patch creates only event attributes and attribute groups for the
IMC pmus.
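
For readers following along, here is a minimal userspace sketch of the
shape of the "events" group: one attribute per parsed event, stored in a
NULL-terminated array. It is not the kernel code from this patch. All
sketch_* names and the sample event strings are hypothetical, and
calloc()/malloc() stand in for kzalloc(). Unlike dev_str_attr() in the
diff, the sketch checks every allocation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct imc_events and the sysfs attribute. */
struct sketch_imc_event {
	const char *name;	/* e.g. an event name from the device tree */
	const char *value;	/* e.g. "event=0x18" */
};

struct sketch_attr {
	const char *name;
	const char *event_str;
};

/*
 * Build a NULL-terminated array with one attribute per event.
 * Returns NULL if any allocation fails; nothing is leaked in that case.
 */
static struct sketch_attr **
sketch_build_attrs(const struct sketch_imc_event *events, int nr)
{
	struct sketch_attr **attrs;
	int i;

	assert(events && nr >= 0);

	/* nr + 1 slots: calloc() zeroes them, so the last one stays NULL */
	attrs = calloc((size_t)nr + 1, sizeof(*attrs));
	if (!attrs)
		return NULL;

	for (i = 0; i < nr; i++) {
		attrs[i] = malloc(sizeof(**attrs));
		if (!attrs[i])
			goto fail;
		attrs[i]->name = events[i].name;
		attrs[i]->event_str = events[i].value;
	}
	return attrs;

fail:
	/* Free the attributes allocated so far, then the array itself */
	while (i--)
		free(attrs[i]);
	free(attrs);
	return NULL;
}

/* Release an array built by sketch_build_attrs(). */
static void sketch_free_attrs(struct sketch_attr **attrs)
{
	int i;

	if (!attrs)
		return;
	for (i = 0; attrs[i]; i++)
		free(attrs[i]);
	free(attrs);
}
```

The trailing NULL slot is what lets a consumer walk the array without a
separate length, which is why the diff allocates (idx + 1) pointers.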

Cc: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Stephane Eranian <eranian@google.com>
Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
---
 arch/powerpc/perf/Makefile                |  6 +-
 arch/powerpc/perf/imc-pmu.c               | 96 +++++++++++++++++++++++++++++++
 arch/powerpc/platforms/powernv/opal-imc.c | 12 +++-
 3 files changed, 111 insertions(+), 3 deletions(-)
 create mode 100644 arch/powerpc/perf/imc-pmu.c

diff --git a/arch/powerpc/perf/Makefile b/arch/powerpc/perf/Makefile
index 4d606b99a5cb..d0d1f04203c7 100644
--- a/arch/powerpc/perf/Makefile
+++ b/arch/powerpc/perf/Makefile
@@ -2,10 +2,14 @@ subdir-ccflags-$(CONFIG_PPC_WERROR) := -Werror
 
 obj-$(CONFIG_PERF_EVENTS)	+= callchain.o perf_regs.o
 
+imc-$(CONFIG_PPC_POWERNV)       += imc-pmu.o
+
 obj-$(CONFIG_PPC_PERF_CTRS)	+= core-book3s.o bhrb.o
 obj64-$(CONFIG_PPC_PERF_CTRS)	+= power4-pmu.o ppc970-pmu.o power5-pmu.o \
 				   power5+-pmu.o power6-pmu.o power7-pmu.o \
-				   isa207-common.o power8-pmu.o power9-pmu.o
+				   isa207-common.o power8-pmu.o power9-pmu.o \
+				   $(imc-y)
+
 obj32-$(CONFIG_PPC_PERF_CTRS)	+= mpc7450-pmu.o
 
 obj-$(CONFIG_FSL_EMB_PERF_EVENT) += core-fsl-emb.o
diff --git a/arch/powerpc/perf/imc-pmu.c b/arch/powerpc/perf/imc-pmu.c
new file mode 100644
index 000000000000..7b6ce500ddc5
--- /dev/null
+++ b/arch/powerpc/perf/imc-pmu.c
@@ -0,0 +1,96 @@
+/*
+ * Nest Performance Monitor counter support.
+ *
+ * Copyright (C) 2016 Madhavan Srinivasan, IBM Corporation.
+ *	     (C) 2016 Hemant K Shaw, IBM Corporation.
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ */
+#include <linux/perf_event.h>
+#include <linux/slab.h>
+#include <asm/opal.h>
+#include <asm/imc-pmu.h>
+#include <asm/cputhreads.h>
+#include <linux/string.h>
+
+struct perchip_nest_info nest_perchip_info[IMC_MAX_CHIPS];
+struct imc_pmu *per_nest_pmu_arr[IMC_MAX_PMUS];
+
+/* dev_str_attr : Populate event "name" and string "str" in attribute */
+static struct attribute *dev_str_attr(const char *name, const char *str)
+{
+	struct perf_pmu_events_attr *attr;
+
+	attr = kzalloc(sizeof(*attr), GFP_KERNEL);
+
+	sysfs_attr_init(&attr->attr.attr);
+
+	attr->event_str = str;
+	attr->attr.attr.name = name;
+	attr->attr.attr.mode = 0444;
+	attr->attr.show = perf_event_sysfs_show;
+
+	return &attr->attr.attr;
+}
+
+/*
+ * update_events_in_group: Update the "events" information in an attr_group
+ *                         and assign the attr_group to the pmu "pmu".
+ */
+static int update_events_in_group(struct imc_events *events,
+				  int idx, struct imc_pmu *pmu)
+{
+	struct attribute_group *attr_group;
+	struct attribute **attrs;
+	int i;
+
+	/* Allocate memory for attribute group */
+	attr_group = kzalloc(sizeof(*attr_group), GFP_KERNEL);
+	if (!attr_group)
+		return -ENOMEM;
+
+	/* Allocate memory for attributes */
+	attrs = kzalloc((sizeof(struct attribute *) * (idx + 1)), GFP_KERNEL);
+	if (!attrs) {
+		kfree(attr_group);
+		return -ENOMEM;
+	}
+
+	attr_group->name = "events";
+	attr_group->attrs = attrs;
+	for (i = 0; i < idx; i++, events++) {
+		attrs[i] = dev_str_attr((char *)events->ev_name,
+					(char *)events->ev_value);
+	}
+
+	pmu->attr_groups[0] = attr_group;
+	return 0;
+}
+
+/*
+ * init_imc_pmu : Setup the IMC pmu device in "pmu_ptr" and its events
+ *                "events".
+ * Setup the cpu mask information for these pmus and setup the state machine
+ * hotplug notifiers as well.
+ */
+int init_imc_pmu(struct imc_events *events, int idx,
+		 struct imc_pmu *pmu_ptr)
+{
+	int ret = -ENODEV;
+
+	ret = update_events_in_group(events, idx, pmu_ptr);
+	if (ret)
+		goto err_free;
+
+	return 0;
+
+err_free:
+	/* Only free the attr_groups which are dynamically allocated  */
+	if (pmu_ptr->attr_groups[0]) {
+		kfree(pmu_ptr->attr_groups[0]->attrs);
+		kfree(pmu_ptr->attr_groups[0]);
+	}
+
+	return ret;
+}
diff --git a/arch/powerpc/platforms/powernv/opal-imc.c b/arch/powerpc/platforms/powernv/opal-imc.c
index 0c076aca87e6..a15e8e64dda0 100644
--- a/arch/powerpc/platforms/powernv/opal-imc.c
+++ b/arch/powerpc/platforms/powernv/opal-imc.c
@@ -31,8 +31,11 @@
 #include <asm/cputable.h>
 #include <asm/imc-pmu.h>
 
-struct perchip_nest_info nest_perchip_info[IMC_MAX_CHIPS];
-struct imc_pmu *per_nest_pmu_arr[IMC_MAX_PMUS];
+extern struct perchip_nest_info nest_perchip_info[IMC_MAX_CHIPS];
+extern struct imc_pmu *per_nest_pmu_arr[IMC_MAX_PMUS];
+
+extern int init_imc_pmu(struct imc_events *events,
+			int idx, struct imc_pmu *pmu_ptr);
 
 static int imc_event_info(char *name, struct imc_events *events)
 {
@@ -378,6 +381,11 @@ static int imc_pmu_create(struct device_node *parent, int pmu_index)
 		idx += ret;
 	}
 
+	ret = init_imc_pmu(events, idx, pmu_ptr);
+	if (ret) {
+		pr_err("IMC PMU %s Register failed\n", pmu_ptr->pmu.name);
+		goto free_events;
+	}
 	return 0;
 
 free_events:
-- 
2.7.4

* [PATCH v5 05/13] powerpc/perf: Generic imc pmu event functions
  2017-03-16  7:34 [PATCH v5 00/13] IMC Instrumentation Support Madhavan Srinivasan
                   ` (3 preceding siblings ...)
  2017-03-16  7:34 ` [PATCH v5 04/13] powerpc/perf: Add event attribute and group to IMC pmus Madhavan Srinivasan
@ 2017-03-16  7:34 ` Madhavan Srinivasan
  2017-03-16  7:35 ` [PATCH v5 06/13] powerpc/perf: IMC pmu cpumask and cpu hotplug support Madhavan Srinivasan
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 20+ messages in thread
From: Madhavan Srinivasan @ 2017-03-16  7:34 UTC (permalink / raw)
  To: mpe
  Cc: linuxppc-dev, linux-kernel, Hemant Kumar, Gautham R . Shenoy,
	Balbir Singh, Benjamin Herrenschmidt, Paul Mackerras,
	Anton Blanchard, Sukadev Bhattiprolu, Michael Neuling,
	Stewart Smith, Daniel Axtens, Stephane Eranian, Anju T Sudhakar,
	Madhavan Srinivasan

From: Hemant Kumar <hemant@linux.vnet.ibm.com>

Since the IMC counters' data is periodically fed to a memory location,
the functions to read/update, start/stop and add/del can be generic and
can be used by all IMC PMU units.

This patch adds a set of generic imc pmu related event functions to be
used by each imc pmu unit. It also adds code to set up the format
attribute and to register imc pmus, and adds an event_init function for
nest_imc events.
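
As an illustration of why these functions can be generic, here is a
minimal userspace sketch of the snapshot-and-delta scheme: "start"
records the current counter value and "update" adds the difference
since the last snapshot. It is not the kernel code from this patch. All
sketch_* names are hypothetical, the 64K page size is an assumption,
and the counter is decoded byte by byte as a big-endian value instead
of using __be64_to_cpu().

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 65536u	/* assumption: 64K pages */

/* Decode a 64-bit big-endian value, independent of host endianness. */
static uint64_t sketch_read_be64(const unsigned char *p)
{
	uint64_t v = 0;
	int i;

	for (i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

struct sketch_event {
	const unsigned char *base;	/* where the counter lives in memory */
	uint64_t prev_count;		/* snapshot taken at start/update */
	uint64_t count;			/* accumulated delta */
};

/* start: remember the current counter value */
static void sketch_event_start(struct sketch_event *ev)
{
	assert(ev && ev->base);
	ev->prev_count = sketch_read_be64(ev->base);
}

/* read/stop: accumulate the delta since the previous snapshot */
static void sketch_event_update(struct sketch_event *ev)
{
	uint64_t now;

	assert(ev && ev->base);
	now = sketch_read_be64(ev->base);
	ev->count += now - ev->prev_count;
	ev->prev_count = now;
}

/* Split an event offset into a page index and an offset within the page. */
static void sketch_split_offset(uint32_t config, uint32_t *page,
				uint32_t *off)
{
	*page = config / SKETCH_PAGE_SIZE;
	*off = config & (SKETCH_PAGE_SIZE - 1);
}
```

The split mirrors how nest_imc_event_init() in the diff picks
vbase[config/PAGE_SIZE] and then adds (config & ~PAGE_MASK).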

Cc: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Stephane Eranian <eranian@google.com>
Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/imc-pmu.h        |   1 +
 arch/powerpc/perf/imc-pmu.c               | 121 ++++++++++++++++++++++++++++++
 arch/powerpc/platforms/powernv/opal-imc.c |  30 +++++++-
 3 files changed, 148 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/imc-pmu.h b/arch/powerpc/include/asm/imc-pmu.h
index 323232248cc4..7b58721f840e 100644
--- a/arch/powerpc/include/asm/imc-pmu.h
+++ b/arch/powerpc/include/asm/imc-pmu.h
@@ -70,4 +70,5 @@ struct imc_pmu {
 
 #define UNKNOWN_DOMAIN		-1
 
+int imc_get_domain(struct device_node *pmu_dev);
 #endif /* PPC_POWERNV_IMC_PMU_DEF_H */
diff --git a/arch/powerpc/perf/imc-pmu.c b/arch/powerpc/perf/imc-pmu.c
index 7b6ce500ddc5..f6f1ef9f56af 100644
--- a/arch/powerpc/perf/imc-pmu.c
+++ b/arch/powerpc/perf/imc-pmu.c
@@ -17,6 +17,116 @@
 struct perchip_nest_info nest_perchip_info[IMC_MAX_CHIPS];
 struct imc_pmu *per_nest_pmu_arr[IMC_MAX_PMUS];
 
+/* Needed for sanity check */
+extern u64 nest_max_offset;
+
+PMU_FORMAT_ATTR(event, "config:0-20");
+static struct attribute *imc_format_attrs[] = {
+	&format_attr_event.attr,
+	NULL,
+};
+
+static struct attribute_group imc_format_group = {
+	.name = "format",
+	.attrs = imc_format_attrs,
+};
+
+static int nest_imc_event_init(struct perf_event *event)
+{
+	int chip_id;
+	u32 config = event->attr.config;
+	struct perchip_nest_info *pcni;
+
+	if (event->attr.type != event->pmu->type)
+		return -ENOENT;
+
+	/* Sampling not supported */
+	if (event->hw.sample_period)
+		return -EINVAL;
+
+	/* unsupported modes and filters */
+	if (event->attr.exclude_user   ||
+	    event->attr.exclude_kernel ||
+	    event->attr.exclude_hv     ||
+	    event->attr.exclude_idle   ||
+	    event->attr.exclude_host   ||
+	    event->attr.exclude_guest)
+		return -EINVAL;
+
+	if (event->cpu < 0)
+		return -EINVAL;
+
+	/* Sanity check for config (event offset) */
+	if (config > nest_max_offset)
+		return -EINVAL;
+
+	chip_id = topology_physical_package_id(event->cpu);
+	pcni = &nest_perchip_info[chip_id];
+	event->hw.event_base = pcni->vbase[config/PAGE_SIZE] +
+							(config & ~PAGE_MASK);
+
+	return 0;
+}
+
+static void imc_read_counter(struct perf_event *event)
+{
+	u64 *addr, data;
+
+	addr = (u64 *)event->hw.event_base;
+	data = __be64_to_cpu(*addr);
+	local64_set(&event->hw.prev_count, data);
+}
+
+static void imc_perf_event_update(struct perf_event *event)
+{
+	u64 counter_prev, counter_new, final_count, *addr;
+
+	addr = (u64 *)event->hw.event_base;
+	counter_prev = local64_read(&event->hw.prev_count);
+	counter_new = __be64_to_cpu(*addr);
+	final_count = counter_new - counter_prev;
+
+	local64_set(&event->hw.prev_count, counter_new);
+	local64_add(final_count, &event->count);
+}
+
+static void imc_event_start(struct perf_event *event, int flags)
+{
+	imc_read_counter(event);
+}
+
+static void imc_event_stop(struct perf_event *event, int flags)
+{
+	imc_perf_event_update(event);
+}
+
+static int imc_event_add(struct perf_event *event, int flags)
+{
+	if (flags & PERF_EF_START)
+		imc_event_start(event, flags);
+
+	return 0;
+}
+
+/* update_pmu_ops : Populate the appropriate operations for "pmu" */
+static int update_pmu_ops(struct imc_pmu *pmu)
+{
+	if (!pmu)
+		return -EINVAL;
+
+	pmu->pmu.task_ctx_nr = perf_invalid_context;
+	pmu->pmu.event_init = nest_imc_event_init;
+	pmu->pmu.add = imc_event_add;
+	pmu->pmu.del = imc_event_stop;
+	pmu->pmu.start = imc_event_start;
+	pmu->pmu.stop = imc_event_stop;
+	pmu->pmu.read = imc_perf_event_update;
+	pmu->attr_groups[1] = &imc_format_group;
+	pmu->pmu.attr_groups = pmu->attr_groups;
+
+	return 0;
+}
+
 /* dev_str_attr : Populate event "name" and string "str" in attribute */
 static struct attribute *dev_str_attr(const char *name, const char *str)
 {
@@ -83,6 +193,17 @@ int init_imc_pmu(struct imc_events *events, int idx,
 	if (ret)
 		goto err_free;
 
+	ret = update_pmu_ops(pmu_ptr);
+	if (ret)
+		goto err_free;
+
+	ret = perf_pmu_register(&pmu_ptr->pmu, pmu_ptr->pmu.name, -1);
+	if (ret)
+		goto err_free;
+
+	pr_info("%s performance monitor hardware support registered\n",
+		pmu_ptr->pmu.name);
+
 	return 0;
 
 err_free:
diff --git a/arch/powerpc/platforms/powernv/opal-imc.c b/arch/powerpc/platforms/powernv/opal-imc.c
index a15e8e64dda0..894dbd17fd2f 100644
--- a/arch/powerpc/platforms/powernv/opal-imc.c
+++ b/arch/powerpc/platforms/powernv/opal-imc.c
@@ -36,6 +36,7 @@ extern struct imc_pmu *per_nest_pmu_arr[IMC_MAX_PMUS];
 
 extern int init_imc_pmu(struct imc_events *events,
 			int idx, struct imc_pmu *pmu_ptr);
+u64 nest_max_offset;
 
 static int imc_event_info(char *name, struct imc_events *events)
 {
@@ -68,8 +69,25 @@ static int imc_event_info_str(struct property *pp, char *name,
 	return 0;
 }
 
+/*
+ * Updates the maximum offset for an event in the pmu with domain
+ * "pmu_domain". Right now, only nest domain is supported.
+ */
+static void update_max_value(u32 value, int pmu_domain)
+{
+	switch (pmu_domain) {
+	case IMC_DOMAIN_NEST:
+		if (nest_max_offset < value)
+			nest_max_offset = value;
+		break;
+	default:
+		/* Unknown domain, return */
+		return;
+	}
+}
+
 static int imc_event_info_val(char *name, u32 val,
-			      struct imc_events *events)
+			      struct imc_events *events, int pmu_domain)
 {
 	int ret;
 
@@ -77,6 +95,7 @@ static int imc_event_info_val(char *name, u32 val,
 	if (ret)
 		return ret;
 	sprintf(events->ev_value, "event=0x%x", val);
+	update_max_value(val, pmu_domain);
 
 	return 0;
 }
@@ -113,7 +132,8 @@ static int imc_events_node_parser(struct device_node *dev,
 				  struct property *event_scale,
 				  struct property *event_unit,
 				  struct property *name_prefix,
-				  u32 reg)
+				  u32 reg,
+				  int pmu_domain)
 {
 	struct property *name, *pp;
 	char *ev_name;
@@ -158,7 +178,8 @@ static int imc_events_node_parser(struct device_node *dev,
 		if (strncmp(pp->name, "reg", 3) == 0) {
 			of_property_read_u32(dev, pp->name, &val);
 			val += reg;
-			ret = imc_event_info_val(ev_name, val, &events[idx]);
+			ret = imc_event_info_val(ev_name, val, &events[idx],
+				pmu_domain);
 			if (ret) {
 				kfree(events[idx].ev_name);
 				kfree(events[idx].ev_value);
@@ -365,7 +386,8 @@ static int imc_pmu_create(struct device_node *parent, int pmu_index)
 	/* Loop through event nodes */
 	for_each_child_of_node(dir, ev_node) {
 		ret = imc_events_node_parser(ev_node, &events[idx], scale_pp,
-					     unit_pp, name_prefix, reg);
+					     unit_pp, name_prefix, reg,
+					     pmu_ptr->domain);
 		if (ret < 0) {
 			/* Unable to parse this event */
 			if (ret == -ENOMEM)
-- 
2.7.4

* [PATCH v5 06/13] powerpc/perf: IMC pmu cpumask and cpu hotplug support
  2017-03-16  7:34 [PATCH v5 00/13] IMC Instrumentation Support Madhavan Srinivasan
                   ` (4 preceding siblings ...)
  2017-03-16  7:34 ` [PATCH v5 05/13] powerpc/perf: Generic imc pmu event functions Madhavan Srinivasan
@ 2017-03-16  7:35 ` Madhavan Srinivasan
  2017-03-23 11:52   ` Gautham R Shenoy
  2017-03-16  7:35 ` [PATCH v5 07/13] powerpc/powernv: Core IMC events detection Madhavan Srinivasan
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 20+ messages in thread
From: Madhavan Srinivasan @ 2017-03-16  7:35 UTC (permalink / raw)
  To: mpe
  Cc: linuxppc-dev, linux-kernel, Hemant Kumar, Gautham R . Shenoy,
	Balbir Singh, Benjamin Herrenschmidt, Paul Mackerras,
	Anton Blanchard, Sukadev Bhattiprolu, Michael Neuling,
	Stewart Smith, Daniel Axtens, Stephane Eranian, Anju T Sudhakar,
	Madhavan Srinivasan

From: Hemant Kumar <hemant@linux.vnet.ibm.com>

Adds a cpumask attribute to be used by each IMC pmu. For nest PMUs, only
one cpu (any online CPU) from each chip is designated to read counters.

On CPU hotplug, the dying CPU is checked to see whether it is one of the
designated cpus. If so, the next online cpu from the same chip (for nest
units) is designated as the new cpu to read counters. For this purpose,
we introduce a new state: CPUHP_AP_PERF_POWERPC_NEST_ONLINE.
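
The hand-over on offline can be sketched in userspace as below. This is
an illustration only, not the kernel code from this patch. All sketch_*
names are hypothetical, a 64-bit word stands in for a cpumask, and the
wrap-around search (looking below the dying cpu when nothing is found
above it) is an assumption of the sketch, not something the diff does.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NR_CPUS 64

/* Lowest cpu number greater than "cpu" set in mask, or SKETCH_NR_CPUS. */
static int sketch_next_cpu(int cpu, uint64_t mask)
{
	int i;

	for (i = cpu + 1; i < SKETCH_NR_CPUS; i++)
		if (mask & (UINT64_C(1) << i))
			return i;
	return SKETCH_NR_CPUS;
}

/*
 * "cpu" is going offline. chip_online holds the online cpus of its chip,
 * *designated holds the cpus currently designated to read counters.
 * Returns the newly designated cpu, or -1 if no hand-over took place.
 */
static int sketch_cpu_offline(int cpu, uint64_t chip_online,
			      uint64_t *designated)
{
	uint64_t bit, others;
	int target;

	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS && designated);
	bit = UINT64_C(1) << cpu;
	others = chip_online & ~bit;

	/* Not a designated reader: nothing to do */
	if (!(*designated & bit))
		return -1;
	*designated &= ~bit;

	/* Prefer the next cpu on the chip, then wrap around (assumption) */
	target = sketch_next_cpu(cpu, others);
	if (target >= SKETCH_NR_CPUS)
		target = sketch_next_cpu(-1, others);
	if (target >= SKETCH_NR_CPUS)
		return -1;	/* no online cpu left on this chip */

	*designated |= UINT64_C(1) << target;
	return target;
}
```

Note that the "not found" result is compared with >= against the cpu
count, the same convention cpumask_next() uses with nr_cpu_ids.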

Cc: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Stephane Eranian <eranian@google.com>
Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/opal-api.h            |   3 +-
 arch/powerpc/include/asm/opal.h                |   3 +
 arch/powerpc/perf/imc-pmu.c                    | 163 ++++++++++++++++++++++++-
 arch/powerpc/platforms/powernv/opal-wrappers.S |   1 +
 include/linux/cpuhotplug.h                     |   1 +
 5 files changed, 169 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
index a0aa285869b5..e1c3d4837857 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -168,7 +168,8 @@
 #define OPAL_INT_SET_MFRR			125
 #define OPAL_PCI_TCE_KILL			126
 #define OPAL_NMMU_SET_PTCR			127
-#define OPAL_LAST				127
+#define OPAL_NEST_IMC_COUNTERS_CONTROL		145
+#define OPAL_LAST				145
 
 /* Device tree flags */
 
diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 1ff03a6da76e..d93d08204243 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -227,6 +227,9 @@ int64_t opal_pci_tce_kill(uint64_t phb_id, uint32_t kill_type,
 			  uint64_t dma_addr, uint32_t npages);
 int64_t opal_nmmu_set_ptcr(uint64_t chip_id, uint64_t ptcr);
 
+int64_t opal_nest_imc_counters_control(uint64_t mode, uint64_t value1,
+				uint64_t value2, uint64_t value3);
+
 /* Internal functions */
 extern int early_init_dt_scan_opal(unsigned long node, const char *uname,
 				   int depth, void *data);
diff --git a/arch/powerpc/perf/imc-pmu.c b/arch/powerpc/perf/imc-pmu.c
index f6f1ef9f56af..e46ff6d2a584 100644
--- a/arch/powerpc/perf/imc-pmu.c
+++ b/arch/powerpc/perf/imc-pmu.c
@@ -16,6 +16,7 @@
 
 struct perchip_nest_info nest_perchip_info[IMC_MAX_CHIPS];
 struct imc_pmu *per_nest_pmu_arr[IMC_MAX_PMUS];
+static cpumask_t nest_imc_cpumask;
 
 /* Needed for sanity check */
 extern u64 nest_max_offset;
@@ -31,6 +32,160 @@ static struct attribute_group imc_format_group = {
 	.attrs = imc_format_attrs,
 };
 
+/* Get the cpumask printed to a buffer "buf" */
+static ssize_t imc_pmu_cpumask_get_attr(struct device *dev,
+				struct device_attribute *attr, char *buf)
+{
+	cpumask_t *active_mask;
+
+	active_mask = &nest_imc_cpumask;
+	return cpumap_print_to_pagebuf(true, buf, active_mask);
+}
+
+static DEVICE_ATTR(cpumask, S_IRUGO, imc_pmu_cpumask_get_attr, NULL);
+
+static struct attribute *imc_pmu_cpumask_attrs[] = {
+	&dev_attr_cpumask.attr,
+	NULL,
+};
+
+static struct attribute_group imc_pmu_cpumask_attr_group = {
+	.attrs = imc_pmu_cpumask_attrs,
+};
+
+/*
+ * nest_init : Initializes the nest imc engine for the current chip.
+ */
+static void nest_init(int *loc)
+{
+	int rc;
+
+	rc = opal_nest_imc_counters_control(NEST_IMC_PRODUCTION_MODE,
+					    NEST_IMC_ENGINE_START, 0, 0);
+	if (rc)
+		loc[smp_processor_id()] = 1;
+}
+
+static void nest_change_cpu_context(int old_cpu, int new_cpu)
+{
+	int i;
+
+	for (i = 0;
+	     (i < IMC_MAX_PMUS) && (per_nest_pmu_arr[i] != NULL); i++)
+		perf_pmu_migrate_context(&per_nest_pmu_arr[i]->pmu,
+							old_cpu, new_cpu);
+}
+
+static int ppc_nest_imc_cpu_online(unsigned int cpu)
+{
+	int nid, fcpu, ncpu;
+	struct cpumask *l_cpumask, tmp_mask;
+
+	/* Find the cpumask of this node */
+	nid = cpu_to_node(cpu);
+	l_cpumask = cpumask_of_node(nid);
+
+	/*
+	 * If any of the cpu from this node is already present in the mask,
+	 * just return, if not, then set this cpu in the mask.
+	 */
+	if (!cpumask_and(&tmp_mask, l_cpumask, &nest_imc_cpumask)) {
+		cpumask_set_cpu(cpu, &nest_imc_cpumask);
+		return 0;
+	}
+
+	fcpu = cpumask_first(l_cpumask);
+	ncpu = cpumask_next(cpu, l_cpumask);
+	if (cpu == fcpu) {
+		if (cpumask_test_and_clear_cpu(ncpu, &nest_imc_cpumask)) {
+			cpumask_set_cpu(cpu, &nest_imc_cpumask);
+			nest_change_cpu_context(ncpu, cpu);
+		}
+	}
+
+	return 0;
+}
+
+static int ppc_nest_imc_cpu_offline(unsigned int cpu)
+{
+	int nid, target = -1;
+	struct cpumask *l_cpumask;
+
+	/*
+	 * Check in the designated list for this cpu. Don't bother
+	 * if not one of them.
+	 */
+	if (!cpumask_test_and_clear_cpu(cpu, &nest_imc_cpumask))
+		return 0;
+
+	/*
+	 * Now that this cpu is one of the designated,
+	 * find a next cpu a) which is online and b) in same chip.
+	 */
+	nid = cpu_to_node(cpu);
+	l_cpumask = cpumask_of_node(nid);
+	target = cpumask_next(cpu, l_cpumask);
+
+	/*
+	 * Update the cpumask with the target cpu and
+	 * migrate the context if needed
+	 */
+	if (target >= 0 && target < nr_cpu_ids) {
+		cpumask_set_cpu(target, &nest_imc_cpumask);
+		nest_change_cpu_context(cpu, target);
+	}
+	return 0;
+}
+
+static int nest_pmu_cpumask_init(void)
+{
+	const struct cpumask *l_cpumask;
+	int cpu, nid;
+	int *cpus_opal_rc;
+
+	if (!cpumask_empty(&nest_imc_cpumask))
+		return 0;
+
+	/*
+	 * Nest PMUs are per-chip counters. So designate a cpu
+	 * from each chip for counter collection.
+	 */
+	for_each_online_node(nid) {
+		l_cpumask = cpumask_of_node(nid);
+
+		/* designate first online cpu in this node */
+		cpu = cpumask_first(l_cpumask);
+		cpumask_set_cpu(cpu, &nest_imc_cpumask);
+	}
+
+	/*
+	 * Memory for OPAL call return value.
+	 */
+	cpus_opal_rc = kzalloc((sizeof(int) * nr_cpu_ids), GFP_KERNEL);
+	if (!cpus_opal_rc)
+		goto fail;
+
+	/* Initialize Nest PMUs in each node using designated cpus */
+	on_each_cpu_mask(&nest_imc_cpumask, (smp_call_func_t)nest_init,
+						(void *)cpus_opal_rc, 1);
+
+	/* Check return value array for any OPAL call failure */
+	for_each_cpu(cpu, &nest_imc_cpumask) {
+		if (cpus_opal_rc[cpu])
+			goto fail;
+	}
+
+	cpuhp_setup_state(CPUHP_AP_PERF_POWERPC_NEST_ONLINE,
+			  "POWER_NEST_IMC_ONLINE",
+			  ppc_nest_imc_cpu_online,
+			  ppc_nest_imc_cpu_offline);
+
+	return 0;
+
+fail:
+	return -ENODEV;
+}
+
 static int nest_imc_event_init(struct perf_event *event)
 {
 	int chip_id;
@@ -63,7 +218,7 @@ static int nest_imc_event_init(struct perf_event *event)
 	chip_id = topology_physical_package_id(event->cpu);
 	pcni = &nest_perchip_info[chip_id];
 	event->hw.event_base = pcni->vbase[config/PAGE_SIZE] +
-							(config & ~PAGE_MASK);
+		(config & ~PAGE_MASK);
 
 	return 0;
 }
@@ -122,6 +277,7 @@ static int update_pmu_ops(struct imc_pmu *pmu)
 	pmu->pmu.stop = imc_event_stop;
 	pmu->pmu.read = imc_perf_event_update;
 	pmu->attr_groups[1] = &imc_format_group;
+	pmu->attr_groups[2] = &imc_pmu_cpumask_attr_group;
 	pmu->pmu.attr_groups = pmu->attr_groups;
 
 	return 0;
@@ -189,6 +345,11 @@ int init_imc_pmu(struct imc_events *events, int idx,
 {
 	int ret = -ENODEV;
 
+	/* Add cpumask and register for hotplug notification */
+	ret = nest_pmu_cpumask_init();
+	if (ret)
+		return ret;
+
 	ret = update_events_in_group(events, idx, pmu_ptr);
 	if (ret)
 		goto err_free;
diff --git a/arch/powerpc/platforms/powernv/opal-wrappers.S b/arch/powerpc/platforms/powernv/opal-wrappers.S
index da8a0f7a035c..b7208d8e6cc0 100644
--- a/arch/powerpc/platforms/powernv/opal-wrappers.S
+++ b/arch/powerpc/platforms/powernv/opal-wrappers.S
@@ -301,3 +301,4 @@ OPAL_CALL(opal_int_eoi,				OPAL_INT_EOI);
 OPAL_CALL(opal_int_set_mfrr,			OPAL_INT_SET_MFRR);
 OPAL_CALL(opal_pci_tce_kill,			OPAL_PCI_TCE_KILL);
 OPAL_CALL(opal_nmmu_set_ptcr,			OPAL_NMMU_SET_PTCR);
+OPAL_CALL(opal_nest_imc_counters_control,	OPAL_NEST_IMC_COUNTERS_CONTROL);
diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index 62d240e962f0..cfb0cedc72af 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -136,6 +136,7 @@ enum cpuhp_state {
 	CPUHP_AP_PERF_ARM_CCI_ONLINE,
 	CPUHP_AP_PERF_ARM_CCN_ONLINE,
 	CPUHP_AP_PERF_ARM_L2X0_ONLINE,
+	CPUHP_AP_PERF_POWERPC_NEST_ONLINE,
 	CPUHP_AP_PERF_ARM_QCOM_L2_ONLINE,
 	CPUHP_AP_WORKQUEUE_ONLINE,
 	CPUHP_AP_RCUTREE_ONLINE,
-- 
2.7.4

* [PATCH v5 07/13] powerpc/powernv: Core IMC events detection
  2017-03-16  7:34 [PATCH v5 00/13] IMC Instrumentation Support Madhavan Srinivasan
                   ` (5 preceding siblings ...)
  2017-03-16  7:35 ` [PATCH v5 06/13] powerpc/perf: IMC pmu cpumask and cpu hotplug support Madhavan Srinivasan
@ 2017-03-16  7:35 ` Madhavan Srinivasan
  2017-03-16  7:35 ` [PATCH v5 08/13] powerpc/perf: PMU functions for Core IMC and hotplugging Madhavan Srinivasan
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 20+ messages in thread
From: Madhavan Srinivasan @ 2017-03-16  7:35 UTC (permalink / raw)
  To: mpe
  Cc: linuxppc-dev, linux-kernel, Hemant Kumar, Gautham R . Shenoy,
	Balbir Singh, Benjamin Herrenschmidt, Paul Mackerras,
	Anton Blanchard, Sukadev Bhattiprolu, Michael Neuling,
	Stewart Smith, Daniel Axtens, Stephane Eranian, Anju T Sudhakar,
	Madhavan Srinivasan

From: Hemant Kumar <hemant@linux.vnet.ibm.com>

This patch adds support for detection of core IMC events along with the
Nest IMC events. It adds a new domain, IMC_DOMAIN_CORE, which is
determined with the help of the compatibility string
"ibm,imc-counters-core" based on the IMC device tree.
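
A minimal userspace sketch of the domain lookup and the resulting pmu
names follows. It is an illustration, not the kernel code from this
patch. The sketch_* names are hypothetical, a plain strcmp() stands in
for of_device_is_compatible(), and the unit names used with it ("mcs0",
"core") are made-up examples.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_DOMAIN_NEST	1
#define SKETCH_DOMAIN_CORE	2
#define SKETCH_UNKNOWN_DOMAIN	(-1)

/* Map a "compatible" string to an IMC domain. */
static int sketch_get_domain(const char *compat)
{
	assert(compat);
	if (strcmp(compat, "ibm,imc-counters-nest") == 0)
		return SKETCH_DOMAIN_NEST;
	if (strcmp(compat, "ibm,imc-counters-core") == 0)
		return SKETCH_DOMAIN_CORE;
	return SKETCH_UNKNOWN_DOMAIN;
}

/*
 * Format the pmu name the way the diff does: "nest_<name>" for nest
 * units, "<name>_imc" otherwise. snprintf() bounds the write to len.
 */
static int sketch_pmu_name(char *buf, size_t len, int domain,
			   const char *name)
{
	assert(buf && name);
	if (domain == SKETCH_DOMAIN_NEST)
		return snprintf(buf, len, "nest_%s", name);
	return snprintf(buf, len, "%s_imc", name);
}
```

Using snprintf() with the buffer size, rather than sprintf(), keeps the
added prefix or suffix from overrunning a fixed-size name buffer.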

Cc: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Stephane Eranian <eranian@google.com>
Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/imc-pmu.h        |  2 ++
 arch/powerpc/perf/imc-pmu.c               |  3 +++
 arch/powerpc/platforms/powernv/opal-imc.c | 18 ++++++++++++++++--
 3 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/imc-pmu.h b/arch/powerpc/include/asm/imc-pmu.h
index 7b58721f840e..59de083ed6f3 100644
--- a/arch/powerpc/include/asm/imc-pmu.h
+++ b/arch/powerpc/include/asm/imc-pmu.h
@@ -30,6 +30,7 @@
 
 #define IMC_DTB_COMPAT		"ibm,opal-in-memory-counters"
 #define IMC_DTB_NEST_COMPAT	"ibm,imc-counters-nest"
+#define IMC_DTB_CORE_COMPAT	"ibm,imc-counters-core"
 
 /*
  * Structure to hold per chip specific memory address
@@ -67,6 +68,7 @@ struct imc_pmu {
  * Domains for IMC PMUs
  */
 #define IMC_DOMAIN_NEST		1
+#define IMC_DOMAIN_CORE		2
 
 #define UNKNOWN_DOMAIN		-1
 
diff --git a/arch/powerpc/perf/imc-pmu.c b/arch/powerpc/perf/imc-pmu.c
index e46ff6d2a584..9a0e3bc932ef 100644
--- a/arch/powerpc/perf/imc-pmu.c
+++ b/arch/powerpc/perf/imc-pmu.c
@@ -18,8 +18,11 @@ struct perchip_nest_info nest_perchip_info[IMC_MAX_CHIPS];
 struct imc_pmu *per_nest_pmu_arr[IMC_MAX_PMUS];
 static cpumask_t nest_imc_cpumask;
 
+struct imc_pmu *core_imc_pmu;
+
 /* Needed for sanity check */
 extern u64 nest_max_offset;
+extern u64 core_max_offset;
 
 PMU_FORMAT_ATTR(event, "config:0-20");
 static struct attribute *imc_format_attrs[] = {
diff --git a/arch/powerpc/platforms/powernv/opal-imc.c b/arch/powerpc/platforms/powernv/opal-imc.c
index 894dbd17fd2f..8c52fffec9fe 100644
--- a/arch/powerpc/platforms/powernv/opal-imc.c
+++ b/arch/powerpc/platforms/powernv/opal-imc.c
@@ -33,10 +33,12 @@
 
 extern struct perchip_nest_info nest_perchip_info[IMC_MAX_CHIPS];
 extern struct imc_pmu *per_nest_pmu_arr[IMC_MAX_PMUS];
+extern struct imc_pmu *core_imc_pmu;
 
 extern int init_imc_pmu(struct imc_events *events,
 			int idx, struct imc_pmu *pmu_ptr);
 u64 nest_max_offset;
+u64 core_max_offset;
 
 static int imc_event_info(char *name, struct imc_events *events)
 {
@@ -80,6 +82,10 @@ static void update_max_value(u32 value, int pmu_domain)
 		if (nest_max_offset < value)
 			nest_max_offset = value;
 		break;
+	case IMC_DOMAIN_CORE:
+		if (core_max_offset < value)
+			core_max_offset = value;
+		break;
 	default:
 		/* Unknown domain, return */
 		return;
@@ -231,6 +237,8 @@ int imc_get_domain(struct device_node *pmu_dev)
 {
 	if (of_device_is_compatible(pmu_dev, IMC_DTB_NEST_COMPAT))
 		return IMC_DOMAIN_NEST;
+	if (of_device_is_compatible(pmu_dev, IMC_DTB_CORE_COMPAT))
+		return IMC_DOMAIN_CORE;
 	else
 		return UNKNOWN_DOMAIN;
 }
@@ -298,7 +306,10 @@ static int imc_pmu_create(struct device_node *parent, int pmu_index)
 		goto free_pmu;
 
 	/* Needed for hotplug/migration */
-	per_nest_pmu_arr[pmu_index] = pmu_ptr;
+	if (pmu_ptr->domain == IMC_DOMAIN_CORE)
+		core_imc_pmu = pmu_ptr;
+	else if (pmu_ptr->domain == IMC_DOMAIN_NEST)
+		per_nest_pmu_arr[pmu_index] = pmu_ptr;
 
 	/*
 	 * "events" property inside a PMU node contains the phandle value
@@ -354,7 +365,10 @@ static int imc_pmu_create(struct device_node *parent, int pmu_index)
 	}
 
 	/* Save the name to register it later */
-	sprintf(buf, "nest_%s", (char *)pp->value);
+	if (pmu_ptr->domain == IMC_DOMAIN_NEST)
+		sprintf(buf, "nest_%s", (char *)pp->value);
+	else
+		sprintf(buf, "%s_imc", (char *)pp->value);
 	pmu_ptr->pmu.name = (char *)buf;
 
 	/*
-- 
2.7.4


* [PATCH v5 08/13] powerpc/perf: PMU functions for Core IMC and hotplugging
  2017-03-16  7:34 [PATCH v5 00/13] IMC Instrumentation Support Madhavan Srinivasan
                   ` (6 preceding siblings ...)
  2017-03-16  7:35 ` [PATCH v5 07/13] powerpc/powernv: Core IMC events detection Madhavan Srinivasan
@ 2017-03-16  7:35 ` Madhavan Srinivasan
  2017-03-23 13:09   ` Gautham R Shenoy
  2017-03-16  7:35 ` [PATCH v5 09/13] powerpc/powernv: Thread IMC events detection Madhavan Srinivasan
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 20+ messages in thread
From: Madhavan Srinivasan @ 2017-03-16  7:35 UTC (permalink / raw)
  To: mpe
  Cc: linuxppc-dev, linux-kernel, Hemant Kumar, Gautham R . Shenoy,
	Balbir Singh, Benjamin Herrenschmidt, Paul Mackerras,
	Anton Blanchard, Sukadev Bhattiprolu, Michael Neuling,
	Stewart Smith, Daniel Axtens, Stephane Eranian, Anju T Sudhakar,
	Madhavan Srinivasan

From: Hemant Kumar <hemant@linux.vnet.ibm.com>

This patch adds the PMU function to initialize a core IMC event. It also
adds a cpumask initialization function for the core IMC PMU. For
initialization, a page of memory is allocated per core where the data
for core IMC counters will be accumulated. The base address of this
page is sent to OPAL via an OPAL call which initializes various SCOMs
related to Core IMC initialization. On any error, the pages are
freed and the core IMC counters are disabled using the same OPAL call.
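
The allocate-then-undo flow described above can be modelled in plain
userspace C. This is only a sketch under stated assumptions, not the
kernel code: calloc() stands in for __get_free_pages(), the
model_firmware_init() helper is a hypothetical stand-in for the OPAL
call, and the loop runs sequentially where the patch runs on one cpu per
core. All model_* names and MODEL_* constants are made up for
illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_NR_CORES	4	/* hypothetical core count */
#define MODEL_PAGE_SIZE	4096	/* hypothetical page size */

static void *core_page[MODEL_NR_CORES];	/* stand-in for per_core_pdbar_add */
static int core_enabled[MODEL_NR_CORES];

/* Hypothetical stand-in for the OPAL init call; fails for fail_core. */
static int model_firmware_init(int core, void *page, int fail_core)
{
	if (!page || core == fail_core)
		return -1;
	core_enabled[core] = 1;
	return 0;
}

/* Allocate one zeroed page per core; on any error, undo everything. */
int model_core_imc_setup(int fail_core)
{
	int core, rc = 0;

	assert(fail_core < MODEL_NR_CORES);

	for (core = 0; core < MODEL_NR_CORES && !rc; core++) {
		core_page[core] = calloc(1, MODEL_PAGE_SIZE);
		rc = model_firmware_init(core, core_page[core], fail_core);
	}
	if (!rc)
		return 0;

	/* First disable the counters, then free the pages. */
	for (core = 0; core < MODEL_NR_CORES; core++)
		core_enabled[core] = 0;
	for (core = 0; core < MODEL_NR_CORES; core++) {
		free(core_page[core]);
		core_page[core] = NULL;
	}
	return -1;
}

/* Number of cores that currently own a page. */
int model_pages_allocated(void)
{
	int core, n = 0;

	for (core = 0; core < MODEL_NR_CORES; core++)
		n += core_page[core] != NULL;
	return n;
}

/* Number of cores whose counters are marked enabled. */
int model_cores_enabled(void)
{
	int core, n = 0;

	for (core = 0; core < MODEL_NR_CORES; core++)
		n += core_enabled[core];
	return n;
}
```

Passing a negative fail_core models the success path. Disabling before
freeing mirrors the order used in the fail: path of the patch, so the
counters are never left writing into a page that has already been
released.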

For CPU hotplugging, a cpumask is initialized which contains one online
CPU from each core. If a cpu goes offline, we check whether that cpu
belongs to the core imc cpumask. If it does, we migrate the PMU
context to any other online cpu (if available) in that core. If a cpu
comes back online, it is added to the core imc cpumask only if no
other cpu from that core is already present in the cpumask.
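
The hotplug rule above can be tried out with a small userspace model.
This is a sketch under assumptions, not the kernel implementation: the
topology (8 cpus, 4 threads per core), the bool arrays standing in for
cpumasks, and every model_* name are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS		8	/* hypothetical cpu count */
#define MODEL_THREADS_PER_CORE	4	/* hypothetical threads per core */

static bool cpu_is_online[MODEL_NR_CPUS];
static bool core_imc_mask[MODEL_NR_CPUS];	/* stand-in for core_imc_cpumask */

static int first_cpu_of_core(int cpu)
{
	return (cpu / MODEL_THREADS_PER_CORE) * MODEL_THREADS_PER_CORE;
}

/* Return the cpu of this core that is in the mask, or -1 if none. */
int model_designated_cpu(int core)
{
	int first = core * MODEL_THREADS_PER_CORE;
	int cpu;

	for (cpu = first; cpu < first + MODEL_THREADS_PER_CORE; cpu++)
		if (core_imc_mask[cpu])
			return cpu;
	return -1;
}

/* Online: join the mask only if the core has no designated cpu yet. */
void model_cpu_online(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	cpu_is_online[cpu] = true;
	if (model_designated_cpu(cpu / MODEL_THREADS_PER_CORE) < 0)
		core_imc_mask[cpu] = true;
}

/*
 * Offline: if this cpu was the designated one, hand over to another
 * online sibling. Returns the migration target, or -1 if nothing moved.
 */
int model_cpu_offline(int cpu)
{
	int first, sibling;

	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	cpu_is_online[cpu] = false;
	if (!core_imc_mask[cpu])
		return -1;
	core_imc_mask[cpu] = false;

	first = first_cpu_of_core(cpu);
	for (sibling = first; sibling < first + MODEL_THREADS_PER_CORE; sibling++) {
		if (sibling != cpu && cpu_is_online[sibling]) {
			core_imc_mask[sibling] = true;
			return sibling;
		}
	}
	return -1;
}
```

In this model, taking a non-designated cpu offline changes nothing,
while taking the designated cpu offline moves the role to a sibling. A
cpu that later comes back online does not reclaim the role.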

To register the hotplug functions for core_imc, a new state
CPUHP_AP_PERF_POWERPC_COREIMC_ONLINE is added to the list of existing
states.

Cc: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
[Anju: Changed the condition for setting cpumask for core
in imc_pmu_cpumask_get_attr() ]
Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/imc-pmu.h             |   1 +
 arch/powerpc/include/asm/opal-api.h            |  10 +-
 arch/powerpc/include/asm/opal.h                |   2 +
 arch/powerpc/perf/imc-pmu.c                    | 248 ++++++++++++++++++++++++-
 arch/powerpc/platforms/powernv/opal-imc.c      |   4 +-
 arch/powerpc/platforms/powernv/opal-wrappers.S |   1 +
 include/linux/cpuhotplug.h                     |   1 +
 7 files changed, 257 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/imc-pmu.h b/arch/powerpc/include/asm/imc-pmu.h
index 59de083ed6f3..5e76cd06d6d8 100644
--- a/arch/powerpc/include/asm/imc-pmu.h
+++ b/arch/powerpc/include/asm/imc-pmu.h
@@ -21,6 +21,7 @@
 #define IMC_MAX_CHIPS			32
 #define IMC_MAX_PMUS			32
 #define IMC_MAX_PMU_NAME_LEN		256
+#define IMC_MAX_CORES			256
 
 #define NEST_IMC_ENGINE_START		1
 #define NEST_IMC_ENGINE_STOP		0
diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
index e1c3d4837857..70313b13afc6 100644
--- a/arch/powerpc/include/asm/opal-api.h
+++ b/arch/powerpc/include/asm/opal-api.h
@@ -169,7 +169,8 @@
 #define OPAL_PCI_TCE_KILL			126
 #define OPAL_NMMU_SET_PTCR			127
 #define OPAL_NEST_IMC_COUNTERS_CONTROL		145
-#define OPAL_LAST				145
+#define OPAL_CORE_IMC_COUNTERS_CONTROL		146
+#define OPAL_LAST				146
 
 /* Device tree flags */
 
@@ -929,6 +930,13 @@ enum {
 	OPAL_PCI_TCE_KILL_ALL,
 };
 
+/* Operation argument to Core IMC */
+enum {
+	OPAL_CORE_IMC_DISABLE,
+	OPAL_CORE_IMC_ENABLE,
+	OPAL_CORE_IMC_INIT,
+};
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* __OPAL_API_H */
diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index d93d08204243..c4baa6d32037 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -229,6 +229,8 @@ int64_t opal_nmmu_set_ptcr(uint64_t chip_id, uint64_t ptcr);
 
 int64_t opal_nest_imc_counters_control(uint64_t mode, uint64_t value1,
 				uint64_t value2, uint64_t value3);
+int64_t opal_core_imc_counters_control(uint64_t operation, uint64_t addr,
+				uint64_t value2, uint64_t value3);
 
 /* Internal functions */
 extern int early_init_dt_scan_opal(unsigned long node, const char *uname,
diff --git a/arch/powerpc/perf/imc-pmu.c b/arch/powerpc/perf/imc-pmu.c
index 9a0e3bc932ef..6d5fda9279c0 100644
--- a/arch/powerpc/perf/imc-pmu.c
+++ b/arch/powerpc/perf/imc-pmu.c
@@ -1,5 +1,5 @@
 /*
- * Nest Performance Monitor counter support.
+ * IMC Performance Monitor counter support.
  *
  * Copyright (C) 2016 Madhavan Srinivasan, IBM Corporation.
  *	     (C) 2016 Hemant K Shaw, IBM Corporation.
@@ -18,6 +18,9 @@ struct perchip_nest_info nest_perchip_info[IMC_MAX_CHIPS];
 struct imc_pmu *per_nest_pmu_arr[IMC_MAX_PMUS];
 static cpumask_t nest_imc_cpumask;
 
+/* Maintains base addresses for all the cores */
+static u64 per_core_pdbar_add[IMC_MAX_CHIPS][IMC_MAX_CORES];
+static cpumask_t core_imc_cpumask;
 struct imc_pmu *core_imc_pmu;
 
 /* Needed for sanity check */
@@ -37,11 +40,18 @@ static struct attribute_group imc_format_group = {
 
 /* Get the cpumask printed to a buffer "buf" */
 static ssize_t imc_pmu_cpumask_get_attr(struct device *dev,
-				struct device_attribute *attr, char *buf)
+					struct device_attribute *attr,
+					char *buf)
 {
+	struct pmu *pmu = dev_get_drvdata(dev);
 	cpumask_t *active_mask;
 
-	active_mask = &nest_imc_cpumask;
+	if (!strncmp(pmu->name, "nest_", strlen("nest_")))
+		active_mask = &nest_imc_cpumask;
+	else if (!strncmp(pmu->name, "core_", strlen("core_")))
+		active_mask = &core_imc_cpumask;
+	else
+		return 0;
 	return cpumap_print_to_pagebuf(true, buf, active_mask);
 }
 
@@ -57,6 +67,94 @@ static struct attribute_group imc_pmu_cpumask_attr_group = {
 };
 
 /*
+ * core_imc_mem_init : Initializes memory for the current core.
+ *
+ * Uses __get_free_pages() and uses the returned address as an argument to
+ * an opal call to configure the pdbar. The address sent as an argument is
+ * converted to physical address before the opal call is made. This is the
+ * base address at which the core imc counters are populated.
+ */
+static int core_imc_mem_init(void)
+{
+	int core_id, phys_id;
+	int rc = -1;
+
+	phys_id = topology_physical_package_id(smp_processor_id());
+	core_id = smp_processor_id() / threads_per_core;
+
+	per_core_pdbar_add[phys_id][core_id] =
+		(u64)__get_free_pages(GFP_KERNEL |  __GFP_ZERO, 0);
+	rc = opal_core_imc_counters_control(OPAL_CORE_IMC_INIT,
+				       (u64)virt_to_phys((void *)per_core_pdbar_add[phys_id][core_id]),
+		 0, 0);
+
+	return rc;
+}
+
+/*
+ * Calls core_imc_mem_init and checks the return value.
+ */
+static void core_imc_init(int *loc)
+{
+	int rc = 0;
+
+	rc = core_imc_mem_init();
+	if (rc)
+		loc[smp_processor_id()] = 1;
+}
+
+static void core_imc_change_cpu_context(int old_cpu, int new_cpu)
+{
+	if (!core_imc_pmu)
+		return;
+	perf_pmu_migrate_context(&core_imc_pmu->pmu, old_cpu, new_cpu);
+}
+
+
+static int ppc_core_imc_cpu_online(unsigned int cpu)
+{
+	int ret;
+
+	/* If a cpu for this core is already set, then, don't do anything */
+	ret = cpumask_any_and(&core_imc_cpumask,
+				 cpu_sibling_mask(cpu));
+	if (ret < nr_cpu_ids)
+		return 0;
+
+	/* Else, set the cpu in the mask, and change the context */
+	cpumask_set_cpu(cpu, &core_imc_cpumask);
+	core_imc_change_cpu_context(-1, cpu);
+	return 0;
+}
+
+static int ppc_core_imc_cpu_offline(unsigned int cpu)
+{
+	int target;
+	unsigned int ncpu;
+
+	/*
+	 * clear this cpu out of the mask, if not present in the mask,
+	 * don't bother doing anything.
+	 */
+	if (!cpumask_test_and_clear_cpu(cpu, &core_imc_cpumask))
+		return 0;
+
+	/* Find any online cpu in that core except the current "cpu" */
+	ncpu = cpumask_any_but(cpu_sibling_mask(cpu), cpu);
+
+	if (ncpu < nr_cpu_ids) {
+		target = ncpu;
+		cpumask_set_cpu(target, &core_imc_cpumask);
+	} else
+		target = -1;
+
+	/* migrate the context */
+	core_imc_change_cpu_context(cpu, target);
+
+	return 0;
+}
+
+/*
  * nest_init : Initializes the nest imc engine for the current chip.
  */
 static void nest_init(int *loc)
@@ -189,6 +287,86 @@ static int nest_pmu_cpumask_init(void)
 	return -ENODEV;
 }
 
+static void cleanup_core_imc_memory(void)
+{
+	int phys_id, core_id;
+	u64 addr;
+
+	phys_id = topology_physical_package_id(smp_processor_id());
+	core_id = smp_processor_id() / threads_per_core;
+
+	addr = per_core_pdbar_add[phys_id][core_id];
+
+	/* Only if the address is non-zero, shall we free it */
+	if (addr)
+		free_pages(addr, 0);
+}
+
+static void cleanup_all_core_imc_memory(void)
+{
+	on_each_cpu_mask(&core_imc_cpumask,
+			 (smp_call_func_t)cleanup_core_imc_memory, NULL, 1);
+}
+
+static void core_imc_control_disable(void)
+{
+	opal_core_imc_counters_control(OPAL_CORE_IMC_DISABLE, 0, 0, 0);
+}
+
+static void core_imc_disable(void)
+{
+	on_each_cpu_mask(&core_imc_cpumask,
+			 (smp_call_func_t)core_imc_control_disable, NULL, 1);
+}
+
+static int core_imc_pmu_cpumask_init(void)
+{
+	int cpu, *cpus_opal_rc;
+
+	/*
+	 * Get the mask of first online cpus for every core.
+	 */
+	core_imc_cpumask = cpu_online_cores_map();
+
+	/*
+	 * Memory for OPAL call return value.
+	 */
+	cpus_opal_rc = kzalloc((sizeof(int) * nr_cpu_ids), GFP_KERNEL);
+	if (!cpus_opal_rc)
+		goto fail;
+
+	/*
+	 * Initialize the core IMC PMU on each core using the
+	 * core_imc_cpumask by calling core_imc_init().
+	 */
+	on_each_cpu_mask(&core_imc_cpumask, (smp_call_func_t)core_imc_init,
+						(void *)cpus_opal_rc, 1);
+
+	/* Check return value array for any OPAL call failure */
+	for_each_cpu(cpu, &core_imc_cpumask) {
+		if (cpus_opal_rc[cpu]) {
+			kfree(cpus_opal_rc);
+			goto fail;
+		}
+	}
+
+	kfree(cpus_opal_rc);
+
+	cpuhp_setup_state(CPUHP_AP_PERF_POWERPC_COREIMC_ONLINE,
+			  "POWER_CORE_IMC_ONLINE",
+			  ppc_core_imc_cpu_online,
+			  ppc_core_imc_cpu_offline);
+
+	return 0;
+
+fail:
+	/* First, disable the core imc engine */
+	core_imc_disable();
+	/* Then, free up the allocated pages */
+	cleanup_all_core_imc_memory();
+	return -ENODEV;
+}
+
 static int nest_imc_event_init(struct perf_event *event)
 {
 	int chip_id;
@@ -226,6 +404,44 @@ static int nest_imc_event_init(struct perf_event *event)
 	return 0;
 }
 
+static int core_imc_event_init(struct perf_event *event)
+{
+	int core_id, phys_id;
+	u64 config = event->attr.config;
+
+	if (event->attr.type != event->pmu->type)
+		return -ENOENT;
+
+	/* Sampling not supported */
+	if (event->hw.sample_period)
+		return -EINVAL;
+
+	/* unsupported modes and filters */
+	if (event->attr.exclude_user   ||
+	    event->attr.exclude_kernel ||
+	    event->attr.exclude_hv     ||
+	    event->attr.exclude_idle   ||
+	    event->attr.exclude_host   ||
+	    event->attr.exclude_guest)
+		return -EINVAL;
+
+	if (event->cpu < 0)
+		return -EINVAL;
+
+	event->hw.idx = -1;
+
+	/* Sanity check for config (event offset) */
+	if (config > core_max_offset)
+		return -EINVAL;
+
+	core_id = event->cpu / threads_per_core;
+	phys_id = topology_physical_package_id(event->cpu);
+	event->hw.event_base = per_core_pdbar_add[phys_id][core_id] +
+		(config & ~PAGE_MASK);
+
+	return 0;
+}
+
 static void imc_read_counter(struct perf_event *event)
 {
 	u64 *addr, data;
@@ -273,7 +489,11 @@ static int update_pmu_ops(struct imc_pmu *pmu)
 		return -EINVAL;
 
 	pmu->pmu.task_ctx_nr = perf_invalid_context;
-	pmu->pmu.event_init = nest_imc_event_init;
+	if (pmu->domain == IMC_DOMAIN_NEST) {
+		pmu->pmu.event_init = nest_imc_event_init;
+	} else if (pmu->domain == IMC_DOMAIN_CORE) {
+		pmu->pmu.event_init = core_imc_event_init;
+	}
 	pmu->pmu.add = imc_event_add;
 	pmu->pmu.del = imc_event_stop;
 	pmu->pmu.start = imc_event_start;
@@ -349,9 +569,20 @@ int init_imc_pmu(struct imc_events *events, int idx,
 	int ret = -ENODEV;
 
 	/* Add cpumask and register for hotplug notification */
-	ret = nest_pmu_cpumask_init();
-	if (ret)
-		return ret;
+	switch (pmu_ptr->domain) {
+	case IMC_DOMAIN_NEST:
+		ret = nest_pmu_cpumask_init();
+		if (ret)
+			return ret;
+		break;
+	case IMC_DOMAIN_CORE:
+		ret = core_imc_pmu_cpumask_init();
+		if (ret)
+			return ret;
+		break;
+	default:
+		return -1;  /* Unknown domain */
+	}
 
 	ret = update_events_in_group(events, idx, pmu_ptr);
 	if (ret)
@@ -376,6 +607,9 @@ int init_imc_pmu(struct imc_events *events, int idx,
 		kfree(pmu_ptr->attr_groups[0]->attrs);
 		kfree(pmu_ptr->attr_groups[0]);
 	}
+	/* For core_imc, we have allocated memory, we need to free it */
+	if (pmu_ptr->domain == IMC_DOMAIN_CORE)
+		cleanup_all_core_imc_memory();
 
 	return ret;
 }
diff --git a/arch/powerpc/platforms/powernv/opal-imc.c b/arch/powerpc/platforms/powernv/opal-imc.c
index 8c52fffec9fe..3d60549d006f 100644
--- a/arch/powerpc/platforms/powernv/opal-imc.c
+++ b/arch/powerpc/platforms/powernv/opal-imc.c
@@ -191,12 +191,12 @@ static int imc_events_node_parser(struct device_node *dev,
 				kfree(events[idx].ev_value);
 				continue;
 			}
+			idx++;
 			/*
 			 * If the common scale and unit properties available,
 			 * then, assign them to this event
 			 */
 			if (event_scale) {
-				idx++;
 				ret = set_event_property(event_scale, "scale",
 							 &events[idx],
 							 ev_name);
@@ -210,8 +210,8 @@ static int imc_events_node_parser(struct device_node *dev,
 							 ev_name);
 				if (ret)
 					continue;
+				idx++;
 			}
-			idx++;
 		} else if (strncmp(pp->name, "unit", 4) == 0) {
 			ret = set_event_property(pp, "unit", &events[idx],
 						 ev_name);
diff --git a/arch/powerpc/platforms/powernv/opal-wrappers.S b/arch/powerpc/platforms/powernv/opal-wrappers.S
index b7208d8e6cc0..672d26ba94b7 100644
--- a/arch/powerpc/platforms/powernv/opal-wrappers.S
+++ b/arch/powerpc/platforms/powernv/opal-wrappers.S
@@ -302,3 +302,4 @@ OPAL_CALL(opal_int_set_mfrr,			OPAL_INT_SET_MFRR);
 OPAL_CALL(opal_pci_tce_kill,			OPAL_PCI_TCE_KILL);
 OPAL_CALL(opal_nmmu_set_ptcr,			OPAL_NMMU_SET_PTCR);
 OPAL_CALL(opal_nest_imc_counters_control,	OPAL_NEST_IMC_COUNTERS_CONTROL);
+OPAL_CALL(opal_core_imc_counters_control,	OPAL_CORE_IMC_COUNTERS_CONTROL);
diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index cfb0cedc72af..abde85d9511a 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -137,6 +137,7 @@ enum cpuhp_state {
 	CPUHP_AP_PERF_ARM_CCN_ONLINE,
 	CPUHP_AP_PERF_ARM_L2X0_ONLINE,
 	CPUHP_AP_PERF_POWERPC_NEST_ONLINE,
+	CPUHP_AP_PERF_POWERPC_COREIMC_ONLINE,
 	CPUHP_AP_PERF_ARM_QCOM_L2_ONLINE,
 	CPUHP_AP_WORKQUEUE_ONLINE,
 	CPUHP_AP_RCUTREE_ONLINE,
-- 
2.7.4


* [PATCH v5 09/13] powerpc/powernv: Thread IMC events detection
  2017-03-16  7:34 [PATCH v5 00/13] IMC Instrumentation Support Madhavan Srinivasan
                   ` (7 preceding siblings ...)
  2017-03-16  7:35 ` [PATCH v5 08/13] powerpc/perf: PMU functions for Core IMC and hotplugging Madhavan Srinivasan
@ 2017-03-16  7:35 ` Madhavan Srinivasan
  2017-03-16  7:35 ` [PATCH v5 10/13] powerpc/perf: Thread IMC PMU functions Madhavan Srinivasan
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 20+ messages in thread
From: Madhavan Srinivasan @ 2017-03-16  7:35 UTC (permalink / raw)
  To: mpe
  Cc: linuxppc-dev, linux-kernel, Hemant Kumar, Gautham R . Shenoy,
	Balbir Singh, Benjamin Herrenschmidt, Paul Mackerras,
	Anton Blanchard, Sukadev Bhattiprolu, Michael Neuling,
	Stewart Smith, Daniel Axtens, Stephane Eranian, Anju T Sudhakar,
	Madhavan Srinivasan

From: Hemant Kumar <hemant@linux.vnet.ibm.com>

This patch adds support for detecting thread IMC events. It adds a new
domain, IMC_DOMAIN_THREAD, which is identified by the compatibility
string "ibm,imc-counters-thread" in the IMC device tree.
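
The mapping from compatibility string to domain can be sketched in
userspace as below. The string and domain constants are copied from the
patches in this series. The function name model_imc_domain() is
hypothetical, and the plain strcmp() on a single string is an
assumption made for illustration; the kernel code goes through
of_device_is_compatible() on a device node instead.

```c
#include <assert.h>
#include <string.h>

#define IMC_DTB_NEST_COMPAT	"ibm,imc-counters-nest"
#define IMC_DTB_CORE_COMPAT	"ibm,imc-counters-core"
#define IMC_DTB_THREAD_COMPAT	"ibm,imc-counters-thread"

#define IMC_DOMAIN_NEST		1
#define IMC_DOMAIN_CORE		2
#define IMC_DOMAIN_THREAD	3
#define UNKNOWN_DOMAIN		-1

/* Hypothetical helper: map one compatible string to an IMC domain. */
int model_imc_domain(const char *compat)
{
	assert(compat != NULL);

	if (!strcmp(compat, IMC_DTB_NEST_COMPAT))
		return IMC_DOMAIN_NEST;
	if (!strcmp(compat, IMC_DTB_CORE_COMPAT))
		return IMC_DOMAIN_CORE;
	if (!strcmp(compat, IMC_DTB_THREAD_COMPAT))
		return IMC_DOMAIN_THREAD;
	return UNKNOWN_DOMAIN;
}
```

Any string outside the three known values falls through to
UNKNOWN_DOMAIN, which is the case imc_pmu_create() treats as an error.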

Cc: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Stephane Eranian <eranian@google.com>
Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/imc-pmu.h        |  2 ++
 arch/powerpc/perf/imc-pmu.c               |  1 +
 arch/powerpc/platforms/powernv/opal-imc.c | 11 +++++++++--
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/imc-pmu.h b/arch/powerpc/include/asm/imc-pmu.h
index 5e76cd06d6d8..f2b4f122e6c2 100644
--- a/arch/powerpc/include/asm/imc-pmu.h
+++ b/arch/powerpc/include/asm/imc-pmu.h
@@ -32,6 +32,7 @@
 #define IMC_DTB_COMPAT		"ibm,opal-in-memory-counters"
 #define IMC_DTB_NEST_COMPAT	"ibm,imc-counters-nest"
 #define IMC_DTB_CORE_COMPAT	"ibm,imc-counters-core"
+#define IMC_DTB_THREAD_COMPAT   "ibm,imc-counters-thread"
 
 /*
  * Structure to hold per chip specific memory address
@@ -70,6 +71,7 @@ struct imc_pmu {
  */
 #define IMC_DOMAIN_NEST		1
 #define IMC_DOMAIN_CORE		2
+#define IMC_DOMAIN_THREAD       3
 
 #define UNKNOWN_DOMAIN		-1
 
diff --git a/arch/powerpc/perf/imc-pmu.c b/arch/powerpc/perf/imc-pmu.c
index 6d5fda9279c0..32eea6941e95 100644
--- a/arch/powerpc/perf/imc-pmu.c
+++ b/arch/powerpc/perf/imc-pmu.c
@@ -26,6 +26,7 @@ struct imc_pmu *core_imc_pmu;
 /* Needed for sanity check */
 extern u64 nest_max_offset;
 extern u64 core_max_offset;
+extern u64 thread_max_offset;
 
 PMU_FORMAT_ATTR(event, "config:0-20");
 static struct attribute *imc_format_attrs[] = {
diff --git a/arch/powerpc/platforms/powernv/opal-imc.c b/arch/powerpc/platforms/powernv/opal-imc.c
index 3d60549d006f..70f4b0924fae 100644
--- a/arch/powerpc/platforms/powernv/opal-imc.c
+++ b/arch/powerpc/platforms/powernv/opal-imc.c
@@ -39,6 +39,7 @@ extern int init_imc_pmu(struct imc_events *events,
 			int idx, struct imc_pmu *pmu_ptr);
 u64 nest_max_offset;
 u64 core_max_offset;
+u64 thread_max_offset;
 
 static int imc_event_info(char *name, struct imc_events *events)
 {
@@ -86,6 +87,10 @@ static void update_max_value(u32 value, int pmu_domain)
 		if (core_max_offset < value)
 			core_max_offset = value;
 		break;
+	case IMC_DOMAIN_THREAD:
+		if (thread_max_offset < value)
+			thread_max_offset = value;
+		break;
 	default:
 		/* Unknown domain, return */
 		return;
@@ -239,6 +244,8 @@ int imc_get_domain(struct device_node *pmu_dev)
 		return IMC_DOMAIN_NEST;
 	if (of_device_is_compatible(pmu_dev, IMC_DTB_CORE_COMPAT))
 		return IMC_DOMAIN_CORE;
+	if (of_device_is_compatible(pmu_dev, IMC_DTB_THREAD_COMPAT))
+		return IMC_DOMAIN_THREAD;
 	else
 		return UNKNOWN_DOMAIN;
 }
@@ -277,7 +284,7 @@ static void imc_free_events(struct imc_events *events, int nr_entries)
 /*
  * imc_pmu_create : Takes the parent device which is the pmu unit and a
  *                  pmu_index as the inputs.
- * Allocates memory for the pmu, sets up its domain (NEST or CORE), and
+ * Allocates memory for the pmu, sets up its domain (NEST/CORE/THREAD), and
  * allocates memory for the events supported by this pmu. Assigns a name for
  * the pmu. Calls imc_events_node_parser() to setup the individual events.
  * If everything goes fine, it calls, init_imc_pmu() to setup the pmu device
@@ -305,7 +312,7 @@ static int imc_pmu_create(struct device_node *parent, int pmu_index)
 	if (pmu_ptr->domain == UNKNOWN_DOMAIN)
 		goto free_pmu;
 
-	/* Needed for hotplug/migration */
+	/* Needed for hotplug/migration for nest and core IMC PMUs */
 	if (pmu_ptr->domain == IMC_DOMAIN_CORE)
 		core_imc_pmu = pmu_ptr;
 	else if (pmu_ptr->domain == IMC_DOMAIN_NEST)
-- 
2.7.4


* [PATCH v5 10/13] powerpc/perf: Thread IMC PMU functions
  2017-03-16  7:34 [PATCH v5 00/13] IMC Instrumentation Support Madhavan Srinivasan
                   ` (8 preceding siblings ...)
  2017-03-16  7:35 ` [PATCH v5 09/13] powerpc/powernv: Thread IMC events detection Madhavan Srinivasan
@ 2017-03-16  7:35 ` Madhavan Srinivasan
  2017-03-16  7:35 ` [PATCH 11/13] powerpc/powernv: Add device shutdown function for Core IMC Madhavan Srinivasan
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 20+ messages in thread
From: Madhavan Srinivasan @ 2017-03-16  7:35 UTC (permalink / raw)
  To: mpe
  Cc: linuxppc-dev, linux-kernel, Hemant Kumar, Gautham R . Shenoy,
	Balbir Singh, Benjamin Herrenschmidt, Paul Mackerras,
	Anton Blanchard, Sukadev Bhattiprolu, Michael Neuling,
	Stewart Smith, Daniel Axtens, Stephane Eranian, Anju T Sudhakar,
	Madhavan Srinivasan

From: Hemant Kumar <hemant@linux.vnet.ibm.com>

This patch adds the PMU functions required for event initialization,
read, update, add, del etc. for thread IMC PMU. Thread IMC PMUs are used
for per-task monitoring. These PMUs don't need any hotplugging support.

For each CPU, a page of memory is allocated and is kept static, i.e.,
these pages exist until the machine shuts down. The base address of
this page is assigned to the ldbar of that cpu. As soon as we do that,
the thread IMC counters start running for that cpu and the counter
data is written to the allocated page. But we use this for
per-task monitoring. Whenever we start monitoring a task, the event is
added onto the task. At that point, we read the initial value of the
event. Whenever we stop monitoring the task, the final value is taken
and the difference is the event data.
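
How the value written to the ldbar is composed can be shown with a few
lines of userspace C. The two constants are the ones this patch adds to
imc-pmu.h. The function name model_ldbar_value() is hypothetical, and
the addresses in the usage note are made-up inputs, not real physical
addresses.

```c
#include <assert.h>
#include <stdint.h>

#define THREAD_IMC_LDBAR_MASK	0x0003ffffffffe000ULL
#define THREAD_IMC_ENABLE	0x8000000000000000ULL

/* Hypothetical helper: build the ldbar value for a physical address. */
uint64_t model_ldbar_value(uint64_t phys_addr)
{
	/* The enable bit must not overlap the address field. */
	assert((THREAD_IMC_LDBAR_MASK & THREAD_IMC_ENABLE) == 0);

	return (phys_addr & THREAD_IMC_LDBAR_MASK) | THREAD_IMC_ENABLE;
}
```

For example, model_ldbar_value(0x12340000) gives 0x8000000012340000.
Since the mask clears the low 13 bits, any address bits below 8 KiB
granularity are dropped, so 0x12341fff gives the same result.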

Now, a task can move to a different cpu. Suppose a task X is moving from
cpu A to cpu B. When the task is scheduled out of A, we get an
event_del for A, and hence the event data is updated, and we stop
updating X's event data. As soon as X moves on to B, event_add is
called for B, and we resume updating the event data. This is how the
event data keeps being updated even when the task is scheduled onto
different cpus.
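
The snapshot-and-delta accounting across a migration can be modelled as
follows. This is a userspace sketch under assumptions: the per-cpu
counter array stands in for the per-cpu counter pages, one counter per
cpu is modelled instead of a page of them, and struct model_event and
all model_* functions are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_CPUS	2	/* hypothetical: cpu A = 0, cpu B = 1 */

/* Free-running per-cpu counters; they count whatever runs there. */
static uint64_t cpu_counter[MODEL_NR_CPUS];

struct model_event {
	uint64_t prev_count;	/* snapshot taken at add time */
	uint64_t count;		/* data attributed to the task so far */
};

/* The hardware counter of 'cpu' advances by n, task or no task. */
void model_cpu_tick(int cpu, uint64_t n)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	cpu_counter[cpu] += n;
}

/* Task is scheduled in on 'cpu': remember where the counter stands. */
void model_event_add(struct model_event *ev, int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	ev->prev_count = cpu_counter[cpu];
}

/* Task is scheduled out of 'cpu': credit the difference to the task. */
void model_event_del(struct model_event *ev, int cpu)
{
	uint64_t now;

	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	now = cpu_counter[cpu];
	ev->count += now - ev->prev_count;
	ev->prev_count = now;
}
```

If the task accumulates 30 counts on cpu A and then 12 on cpu B, the
event ends up with 42, regardless of how far either cpu's counter had
advanced before the task was scheduled onto it.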

Cc: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Stephane Eranian <eranian@google.com>
Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/imc-pmu.h |   4 +
 arch/powerpc/perf/imc-pmu.c        | 161 ++++++++++++++++++++++++++++++++++++-
 2 files changed, 164 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/imc-pmu.h b/arch/powerpc/include/asm/imc-pmu.h
index f2b4f122e6c2..8b7141ba2f2b 100644
--- a/arch/powerpc/include/asm/imc-pmu.h
+++ b/arch/powerpc/include/asm/imc-pmu.h
@@ -22,6 +22,7 @@
 #define IMC_MAX_PMUS			32
 #define IMC_MAX_PMU_NAME_LEN		256
 #define IMC_MAX_CORES			256
+#define IMC_MAX_CPUS                    2048
 
 #define NEST_IMC_ENGINE_START		1
 #define NEST_IMC_ENGINE_STOP		0
@@ -34,6 +35,9 @@
 #define IMC_DTB_CORE_COMPAT	"ibm,imc-counters-core"
 #define IMC_DTB_THREAD_COMPAT   "ibm,imc-counters-thread"
 
+#define THREAD_IMC_LDBAR_MASK           0x0003ffffffffe000
+#define THREAD_IMC_ENABLE               0x8000000000000000
+
 /*
  * Structure to hold per chip specific memory address
  * information for nest pmus. Nest Counter data are exported
diff --git a/arch/powerpc/perf/imc-pmu.c b/arch/powerpc/perf/imc-pmu.c
index 32eea6941e95..6fc1fbc0067c 100644
--- a/arch/powerpc/perf/imc-pmu.c
+++ b/arch/powerpc/perf/imc-pmu.c
@@ -23,6 +23,9 @@ static u64 per_core_pdbar_add[IMC_MAX_CHIPS][IMC_MAX_CORES];
 static cpumask_t core_imc_cpumask;
 struct imc_pmu *core_imc_pmu;
 
+/* Maintains base address for all the cpus */
+static u64 per_cpu_add[IMC_MAX_CPUS];
+
 /* Needed for sanity check */
 extern u64 nest_max_offset;
 extern u64 core_max_offset;
@@ -443,6 +446,56 @@ static int core_imc_event_init(struct perf_event *event)
 	return 0;
 }
 
+static int thread_imc_event_init(struct perf_event *event)
+{
+	struct task_struct *target;
+
+	if (event->attr.type != event->pmu->type)
+		return -ENOENT;
+
+	/* Sampling not supported */
+	if (event->hw.sample_period)
+		return -EINVAL;
+
+	event->hw.idx = -1;
+
+	/* Sanity check for config (event offset) */
+	if (event->attr.config > thread_max_offset)
+		return -EINVAL;
+
+	target = event->hw.target;
+
+	if (!target)
+		return -EINVAL;
+
+	event->pmu->task_ctx_nr = perf_sw_context;
+	return 0;
+}
+
+static void thread_imc_read_counter(struct perf_event *event)
+{
+	u64 *addr, data;
+	int cpu_id = smp_processor_id();
+
+	addr = (u64 *)(per_cpu_add[cpu_id] + event->attr.config);
+	data = __be64_to_cpu(*addr);
+	local64_set(&event->hw.prev_count, data);
+}
+
+static void thread_imc_perf_event_update(struct perf_event *event)
+{
+	u64 counter_prev, counter_new, final_count, *addr;
+	int cpu_id = smp_processor_id();
+
+	addr = (u64 *)(per_cpu_add[cpu_id] + event->attr.config);
+	counter_prev = local64_read(&event->hw.prev_count);
+	counter_new = __be64_to_cpu(*addr);
+	final_count = counter_new - counter_prev;
+
+	local64_set(&event->hw.prev_count, counter_new);
+	local64_add(final_count, &event->count);
+}
+
 static void imc_read_counter(struct perf_event *event)
 {
 	u64 *addr, data;
@@ -483,6 +536,53 @@ static int imc_event_add(struct perf_event *event, int flags)
 	return 0;
 }
 
+static void thread_imc_event_start(struct perf_event *event, int flags)
+{
+	thread_imc_read_counter(event);
+}
+
+static void thread_imc_event_stop(struct perf_event *event, int flags)
+{
+	thread_imc_perf_event_update(event);
+}
+
+static void thread_imc_event_del(struct perf_event *event, int flags)
+{
+	thread_imc_perf_event_update(event);
+}
+
+static int thread_imc_event_add(struct perf_event *event, int flags)
+{
+	thread_imc_event_start(event, flags);
+
+	return 0;
+}
+
+static void thread_imc_pmu_start_txn(struct pmu *pmu,
+				     unsigned int txn_flags)
+{
+	if (txn_flags & ~PERF_PMU_TXN_ADD)
+		return;
+	perf_pmu_disable(pmu);
+}
+
+static void thread_imc_pmu_cancel_txn(struct pmu *pmu)
+{
+	perf_pmu_enable(pmu);
+}
+
+static int thread_imc_pmu_commit_txn(struct pmu *pmu)
+{
+	perf_pmu_enable(pmu);
+	return 0;
+}
+
+static void thread_imc_pmu_sched_task(struct perf_event_context *ctx,
+				  bool sched_in)
+{
+	return;
+}
+
 /* update_pmu_ops : Populate the appropriate operations for "pmu" */
 static int update_pmu_ops(struct imc_pmu *pmu)
 {
@@ -492,17 +592,31 @@ static int update_pmu_ops(struct imc_pmu *pmu)
 	pmu->pmu.task_ctx_nr = perf_invalid_context;
 	if (pmu->domain == IMC_DOMAIN_NEST) {
 		pmu->pmu.event_init = nest_imc_event_init;
+		pmu->attr_groups[2] = &imc_pmu_cpumask_attr_group;
 	} else if (pmu->domain == IMC_DOMAIN_CORE) {
 		pmu->pmu.event_init = core_imc_event_init;
+		pmu->attr_groups[2] = &imc_pmu_cpumask_attr_group;
 	}
+
 	pmu->pmu.add = imc_event_add;
 	pmu->pmu.del = imc_event_stop;
 	pmu->pmu.start = imc_event_start;
 	pmu->pmu.stop = imc_event_stop;
 	pmu->pmu.read = imc_perf_event_update;
 	pmu->attr_groups[1] = &imc_format_group;
-	pmu->attr_groups[2] = &imc_pmu_cpumask_attr_group;
 	pmu->pmu.attr_groups = pmu->attr_groups;
+	if (pmu->domain == IMC_DOMAIN_THREAD) {
+		pmu->pmu.event_init = thread_imc_event_init;
+		pmu->pmu.start = thread_imc_event_start;
+		pmu->pmu.add = thread_imc_event_add;
+		pmu->pmu.del = thread_imc_event_del;
+		pmu->pmu.stop = thread_imc_event_stop;
+		pmu->pmu.read = thread_imc_perf_event_update;
+		pmu->pmu.start_txn = thread_imc_pmu_start_txn;
+		pmu->pmu.cancel_txn = thread_imc_pmu_cancel_txn;
+		pmu->pmu.commit_txn = thread_imc_pmu_commit_txn;
+		pmu->pmu.sched_task = thread_imc_pmu_sched_task;
+	}
 
 	return 0;
 }
@@ -558,6 +672,44 @@ static int update_events_in_group(struct imc_events *events,
 	return 0;
 }
 
+static void cleanup_thread_imc_memory(void *dummy)
+{
+	int cpu_id = smp_processor_id();
+	u64 addr = per_cpu_add[cpu_id];
+
+	/* Only if the address is non-zero, shall we free it */
+	if (addr)
+		free_pages(addr, 0);
+}
+
+static void cleanup_all_thread_imc_memory(void)
+{
+	on_each_cpu(cleanup_thread_imc_memory, NULL, 1);
+}
+
+/*
+ * Allocates a page of memory for each of the online cpus, and, writes the
+ * physical base address of that page to the LDBAR for that cpu. This starts
+ * the thread IMC counters.
+ */
+static void thread_imc_mem_alloc(void *dummy)
+{
+	u64 ldbar_addr, ldbar_value;
+	int cpu_id = smp_processor_id();
+
+	per_cpu_add[cpu_id] = (u64)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
+						    0);
+	ldbar_addr = (u64)virt_to_phys((void *)per_cpu_add[cpu_id]);
+	ldbar_value = (ldbar_addr & (u64)THREAD_IMC_LDBAR_MASK) |
+		(u64)THREAD_IMC_ENABLE;
+	mtspr(SPRN_LDBAR, ldbar_value);
+}
+
+void thread_imc_cpu_init(void)
+{
+	on_each_cpu(thread_imc_mem_alloc, NULL, 1);
+}
+
 /*
  * init_imc_pmu : Setup the IMC pmu device in "pmu_ptr" and its events
  *                "events".
@@ -581,6 +733,9 @@ int init_imc_pmu(struct imc_events *events, int idx,
 		if (ret)
 			return ret;
 		break;
+	case IMC_DOMAIN_THREAD:
+		thread_imc_cpu_init();
+		break;
 	default:
 		return -1;  /* Unknown domain */
 	}
@@ -612,5 +767,9 @@ int init_imc_pmu(struct imc_events *events, int idx,
 	if (pmu_ptr->domain == IMC_DOMAIN_CORE)
 		cleanup_all_core_imc_memory();
 
+	/* For thread_imc, we have allocated memory, we need to free it */
+	if (pmu_ptr->domain == IMC_DOMAIN_THREAD)
+		cleanup_all_thread_imc_memory();
+
 	return ret;
 }
-- 
2.7.4

* [PATCH 11/13] powerpc/powernv: Add device shutdown function for Core IMC
  2017-03-16  7:34 [PATCH v5 00/13] IMC Instrumentation Support Madhavan Srinivasan
                   ` (9 preceding siblings ...)
  2017-03-16  7:35 ` [PATCH v5 10/13] powerpc/perf: Thread IMC PMU functions Madhavan Srinivasan
@ 2017-03-16  7:35 ` Madhavan Srinivasan
  2017-03-16  7:35 ` [PATCH 12/13] powerpc/perf: Thread imc cpuhotplug support Madhavan Srinivasan
  2017-03-16  7:35 ` [PATCH 13/13] powerpc/perf: Enable/disable core engine during cpuhotplug Madhavan Srinivasan
  12 siblings, 0 replies; 20+ messages in thread
From: Madhavan Srinivasan @ 2017-03-16  7:35 UTC (permalink / raw)
  To: mpe
  Cc: linuxppc-dev, linux-kernel, Madhavan Srinivasan,
	Gautham R . Shenoy, Balbir Singh, Benjamin Herrenschmidt,
	Paul Mackerras, Anton Blanchard, Sukadev Bhattiprolu,
	Michael Neuling, Stewart Smith, Daniel Axtens, Stephane Eranian,
	Anju T Sudhakar

The Core In-Memory Collection (IMC) device programs the hardware
counters and keeps them running at all times. If the counters are
not stopped at device shutdown (for example, during kexec), they
keep writing to memory and can corrupt it. This patch stops the
hardware counters from the device "shutdown" callback.

Cc: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/imc-pmu.h        |  2 ++
 arch/powerpc/perf/imc-pmu.c               | 12 +++++++++++-
 arch/powerpc/platforms/powernv/opal-imc.c |  9 +++++++++
 3 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/imc-pmu.h b/arch/powerpc/include/asm/imc-pmu.h
index 8b7141ba2f2b..00f380fce1a5 100644
--- a/arch/powerpc/include/asm/imc-pmu.h
+++ b/arch/powerpc/include/asm/imc-pmu.h
@@ -80,4 +80,6 @@ struct imc_pmu {
 #define UNKNOWN_DOMAIN		-1
 
 int imc_get_domain(struct device_node *pmu_dev);
+void core_imc_disable(void);
+void thread_imc_disable(void);
 #endif /* PPC_POWERNV_IMC_PMU_DEF_H */
diff --git a/arch/powerpc/perf/imc-pmu.c b/arch/powerpc/perf/imc-pmu.c
index 6fc1fbc0067c..6802960db51c 100644
--- a/arch/powerpc/perf/imc-pmu.c
+++ b/arch/powerpc/perf/imc-pmu.c
@@ -317,7 +317,7 @@ static void core_imc_control_disable(void)
 	opal_core_imc_counters_control(OPAL_CORE_IMC_DISABLE, 0, 0, 0);
 }
 
-static void core_imc_disable(void)
+void core_imc_disable(void)
 {
 	on_each_cpu_mask(&core_imc_cpumask,
 			 (smp_call_func_t)core_imc_control_disable, NULL, 1);
@@ -710,6 +710,16 @@ void thread_imc_cpu_init(void)
 	on_each_cpu(thread_imc_mem_alloc, NULL, 1);
 }
 
+static void thread_imc_ldbar_disable(void *dummy)
+{
+	mtspr(SPRN_LDBAR, 0);
+}
+
+void thread_imc_disable(void)
+{
+	on_each_cpu(thread_imc_ldbar_disable, NULL, 1);
+}
+
 /*
  * init_imc_pmu : Setup the IMC pmu device in "pmu_ptr" and its events
  *                "events".
diff --git a/arch/powerpc/platforms/powernv/opal-imc.c b/arch/powerpc/platforms/powernv/opal-imc.c
index 70f4b0924fae..2bc05ba19e3b 100644
--- a/arch/powerpc/platforms/powernv/opal-imc.c
+++ b/arch/powerpc/platforms/powernv/opal-imc.c
@@ -34,6 +34,8 @@
 extern struct perchip_nest_info nest_perchip_info[IMC_MAX_CHIPS];
 extern struct imc_pmu *per_nest_pmu_arr[IMC_MAX_PMUS];
 extern struct imc_pmu *core_imc_pmu;
+extern void core_imc_disable(void);
+extern void thread_imc_disable(void);
 
 extern int init_imc_pmu(struct imc_events *events,
 			int idx, struct imc_pmu *pmu_ptr);
@@ -532,6 +534,12 @@ static int opal_imc_counters_probe(struct platform_device *pdev)
 	return -ENODEV;
 }
 
+static void opal_imc_counters_shutdown(struct platform_device *pdev)
+{
+	core_imc_disable();
+	thread_imc_disable();
+}
+
 static const struct of_device_id opal_imc_match[] = {
 	{ .compatible = IMC_DTB_COMPAT },
 	{},
@@ -543,6 +551,7 @@ static struct platform_driver opal_imc_driver = {
 		.of_match_table = opal_imc_match,
 	},
 	.probe = opal_imc_counters_probe,
+	.shutdown = opal_imc_counters_shutdown,
 };
 
 MODULE_DEVICE_TABLE(of, opal_imc_match);
-- 
2.7.4

* [PATCH 12/13] powerpc/perf: Thread imc cpuhotplug support
  2017-03-16  7:34 [PATCH v5 00/13] IMC Instrumentation Support Madhavan Srinivasan
                   ` (10 preceding siblings ...)
  2017-03-16  7:35 ` [PATCH 11/13] powerpc/powernv: Add device shutdown function for Core IMC Madhavan Srinivasan
@ 2017-03-16  7:35 ` Madhavan Srinivasan
  2017-03-23 17:15   ` Gautham R Shenoy
  2017-03-16  7:35 ` [PATCH 13/13] powerpc/perf: Enable/disable core engine during cpuhotplug Madhavan Srinivasan
  12 siblings, 1 reply; 20+ messages in thread
From: Madhavan Srinivasan @ 2017-03-16  7:35 UTC (permalink / raw)
  To: mpe
  Cc: linuxppc-dev, linux-kernel, Anju T Sudhakar, Gautham R . Shenoy,
	Balbir Singh, Benjamin Herrenschmidt, Paul Mackerras,
	Anton Blanchard, Sukadev Bhattiprolu, Michael Neuling,
	Stewart Smith, Daniel Axtens, Stephane Eranian,
	Madhavan Srinivasan

From: Anju T Sudhakar <anju@linux.vnet.ibm.com>

This patch adds support for thread IMC on cpuhotplug.

When a cpu goes offline, the LDBAR for that cpu is disabled, and when it comes
back online the previous ldbar value is written back to the LDBAR for that cpu.

To register the hotplug functions for thread_imc, a new state
CPUHP_AP_PERF_POWERPC_THREADIMC_ONLINE is added to the list of existing
states.
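
A minimal userspace sketch of the offline/online behaviour described above.
It is only a model: the ldbar[] array stands in for the per-cpu LDBAR
register, and the two constants are assumed placeholder values for
illustration (the real definitions come from an earlier patch in this series
that is not shown here). The helper name thread_imc_ldbar_value() is
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed placeholder values, for illustration only. */
#define THREAD_IMC_LDBAR_MASK	0x0003ffffffffe000ULL
#define THREAD_IMC_ENABLE	0x8000000000000000ULL

#define NR_CPUS 4

static uint64_t per_cpu_add[NR_CPUS];	/* page address saved per cpu (model) */
static uint64_t ldbar[NR_CPUS];		/* stand-in for each cpu's LDBAR SPR */

/* Compose the LDBAR value: masked base address plus the enable bit. */
static uint64_t thread_imc_ldbar_value(uint64_t phys_addr)
{
	return (phys_addr & THREAD_IMC_LDBAR_MASK) | THREAD_IMC_ENABLE;
}

/* Online: rewrite the LDBAR from the address saved at allocation time. */
static int thread_imc_cpu_online(unsigned int cpu)
{
	ldbar[cpu] = thread_imc_ldbar_value(per_cpu_add[cpu]);
	return 0;
}

/* Offline: writing zero clears the enable bit and the base address. */
static int thread_imc_cpu_offline(unsigned int cpu)
{
	ldbar[cpu] = 0;
	return 0;
}
```

Because the address is saved once per cpu and never freed on offline, the
online callback can restore the same value without allocating again. With the
mask assumed here, the low address bits are dropped, so the saved page address
has to be suitably aligned.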

Cc: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
---
 arch/powerpc/perf/imc-pmu.c | 33 ++++++++++++++++++++++++++++-----
 include/linux/cpuhotplug.h  |  1 +
 2 files changed, 29 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/perf/imc-pmu.c b/arch/powerpc/perf/imc-pmu.c
index 6802960db51c..2ff39fe2a5ce 100644
--- a/arch/powerpc/perf/imc-pmu.c
+++ b/arch/powerpc/perf/imc-pmu.c
@@ -687,6 +687,16 @@ static void cleanup_all_thread_imc_memory(void)
 	on_each_cpu(cleanup_thread_imc_memory, NULL, 1);
 }
 
+static void thread_imc_update_ldbar(unsigned int cpu_id)
+{
+	u64 ldbar_addr, ldbar_value;
+
+	ldbar_addr = (u64)virt_to_phys((void *)per_cpu_add[cpu_id]);
+	ldbar_value = (ldbar_addr & (u64)THREAD_IMC_LDBAR_MASK) |
+			(u64)THREAD_IMC_ENABLE;
+	mtspr(SPRN_LDBAR, ldbar_value);
+}
+
 /*
  * Allocates a page of memory for each of the online cpus, and, writes the
  * physical base address of that page to the LDBAR for that cpu. This starts
@@ -694,20 +704,33 @@ static void cleanup_all_thread_imc_memory(void)
  */
 static void thread_imc_mem_alloc(void *dummy)
 {
-	u64 ldbar_addr, ldbar_value;
 	int cpu_id = smp_processor_id();
 
 	per_cpu_add[cpu_id] = (u64)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
 						    0);
-	ldbar_addr = (u64)virt_to_phys((void *)per_cpu_add[cpu_id]);
-	ldbar_value = (ldbar_addr & (u64)THREAD_IMC_LDBAR_MASK) |
-		(u64)THREAD_IMC_ENABLE;
-	mtspr(SPRN_LDBAR, ldbar_value);
+	thread_imc_update_ldbar(cpu_id);
+}
+
+static int ppc_thread_imc_cpu_online(unsigned int cpu)
+{
+	thread_imc_update_ldbar(cpu);
+	return 0;
+
+}
+
+static int ppc_thread_imc_cpu_offline(unsigned int cpu)
+{
+	mtspr(SPRN_LDBAR, 0);
+	return 0;
 }
 
 void thread_imc_cpu_init(void)
 {
 	on_each_cpu(thread_imc_mem_alloc, NULL, 1);
+	cpuhp_setup_state(CPUHP_AP_PERF_POWERPC_THREADIMC_ONLINE,
+			  "POWER_THREAD_IMC_ONLINE",
+			   ppc_thread_imc_cpu_online,
+			   ppc_thread_imc_cpu_offline);
 }
 
 static void thread_imc_ldbar_disable(void *dummy)
diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index abde85d9511a..724df46b2c3c 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -138,6 +138,7 @@ enum cpuhp_state {
 	CPUHP_AP_PERF_ARM_L2X0_ONLINE,
 	CPUHP_AP_PERF_POWERPC_NEST_ONLINE,
 	CPUHP_AP_PERF_POWERPC_COREIMC_ONLINE,
+	CPUHP_AP_PERF_POWERPC_THREADIMC_ONLINE,
 	CPUHP_AP_PERF_ARM_QCOM_L2_ONLINE,
 	CPUHP_AP_WORKQUEUE_ONLINE,
 	CPUHP_AP_RCUTREE_ONLINE,
-- 
2.7.4

* [PATCH 13/13] powerpc/perf: Enable/disable core engine during cpuhotplug
  2017-03-16  7:34 [PATCH v5 00/13] IMC Instrumentation Support Madhavan Srinivasan
                   ` (11 preceding siblings ...)
  2017-03-16  7:35 ` [PATCH 12/13] powerpc/perf: Thread imc cpuhotplug support Madhavan Srinivasan
@ 2017-03-16  7:35 ` Madhavan Srinivasan
  12 siblings, 0 replies; 20+ messages in thread
From: Madhavan Srinivasan @ 2017-03-16  7:35 UTC (permalink / raw)
  To: mpe
  Cc: linuxppc-dev, linux-kernel, Anju T Sudhakar, Gautham R . Shenoy,
	Balbir Singh, Benjamin Herrenschmidt, Paul Mackerras,
	Anton Blanchard, Sukadev Bhattiprolu, Michael Neuling,
	Stewart Smith, Daniel Axtens, Stephane Eranian,
	Madhavan Srinivasan

From: Anju T Sudhakar <anju@linux.vnet.ibm.com>

This patch disables the core IMC engine when all the cpus in a core go
offline, and enables it again when any cpu in that core comes back online.
Enabling and disabling is done through the opal_core_imc_counters_control()
OPAL call, using the OPAL_CORE_IMC_ENABLE and OPAL_CORE_IMC_DISABLE modes
respectively.
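
The intended state transitions can be sketched in userspace as below. This is
a model under stated assumptions, not the kernel code: struct core_model, its
fields and both functions are hypothetical names, an SMT4 core is assumed,
and the engine_enabled flag stands in for the effect of the OPAL call.

```c
#include <assert.h>
#include <stdbool.h>

#define THREADS_PER_CORE 4	/* assumed SMT4 core, for illustration */

struct core_model {
	bool online[THREADS_PER_CORE];	/* online threads of this core */
	int designated;			/* thread in the imc cpumask, or -1 */
	bool engine_enabled;		/* models the ENABLE/DISABLE OPAL modes */
};

/* First thread of a core to come online is designated and enables the engine. */
static void core_thread_online(struct core_model *c, int t)
{
	c->online[t] = true;
	if (c->designated >= 0)
		return;		/* core already has a designated thread */
	c->designated = t;
	c->engine_enabled = true;
}

/* Hand over to any online sibling; disable the engine if none is left. */
static void core_thread_offline(struct core_model *c, int t)
{
	int i;

	c->online[t] = false;
	if (c->designated != t)
		return;		/* not the designated thread, nothing to do */

	c->designated = -1;
	for (i = 0; i < THREADS_PER_CORE; i++) {
		if (c->online[i]) {
			c->designated = i;
			return;
		}
	}
	c->engine_enabled = false;	/* last thread of the core went away */
}
```

The engine is only toggled on the first-online and last-offline transitions
of a core, so intermediate hotplug events just move the designated thread.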

Cc: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
---
 arch/powerpc/perf/imc-pmu.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/perf/imc-pmu.c b/arch/powerpc/perf/imc-pmu.c
index 2ff39fe2a5ce..278c7a427b43 100644
--- a/arch/powerpc/perf/imc-pmu.c
+++ b/arch/powerpc/perf/imc-pmu.c
@@ -127,6 +127,7 @@ static int ppc_core_imc_cpu_online(unsigned int cpu)
 
 	/* Else, set the cpu in the mask, and change the context */
 	cpumask_set_cpu(cpu, &core_imc_cpumask);
+	opal_core_imc_counters_control(OPAL_CORE_IMC_ENABLE, 0, 0, 0);
 	core_imc_change_cpu_context(-1, cpu);
 	return 0;
 }
@@ -149,8 +150,10 @@ static int ppc_core_imc_cpu_offline(unsigned int cpu)
 	if (ncpu < nr_cpu_ids) {
 		target = ncpu;
 		cpumask_set_cpu(target, &core_imc_cpumask);
-	} else
+	} else {
+		opal_core_imc_counters_control(OPAL_CORE_IMC_DISABLE, 0, 0, 0);
 		target = -1;
+	}
 
 	/* migrate the context */
 	core_imc_change_cpu_context(cpu, target);
-- 
2.7.4

* Re: [PATCH v5 06/13] powerpc/perf: IMC pmu cpumask and cpu hotplug support
  2017-03-16  7:35 ` [PATCH v5 06/13] powerpc/perf: IMC pmu cpumask and cpu hotplug support Madhavan Srinivasan
@ 2017-03-23 11:52   ` Gautham R Shenoy
  2017-03-27 10:34     ` Anju T Sudhakar
  0 siblings, 1 reply; 20+ messages in thread
From: Gautham R Shenoy @ 2017-03-23 11:52 UTC (permalink / raw)
  To: Madhavan Srinivasan
  Cc: mpe, linuxppc-dev, linux-kernel, Hemant Kumar,
	Gautham R . Shenoy, Balbir Singh, Benjamin Herrenschmidt,
	Paul Mackerras, Anton Blanchard, Sukadev Bhattiprolu,
	Michael Neuling, Stewart Smith, Daniel Axtens, Stephane Eranian,
	Anju T Sudhakar

Hi Hemant, Maddy, 

On Thu, Mar 16, 2017 at 01:05:00PM +0530, Madhavan Srinivasan wrote:
> From: Hemant Kumar <hemant@linux.vnet.ibm.com>
> 
> Adds cpumask attribute to be used by each IMC pmu. Only one cpu (any
> online CPU) from each chip for nest PMUs is designated to read counters.
> 
> On CPU hotplug, dying CPU is checked to see whether it is one of the
> designated cpus, if yes, next online cpu from the same chip (for nest
> units) is designated as new cpu to read counters. For this purpose, we
> introduce a new state : CPUHP_AP_PERF_POWERPC_NEST_ONLINE.
> 
> Cc: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> Cc: Balbir Singh <bsingharora@gmail.com>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Anton Blanchard <anton@samba.org>
> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
> Cc: Michael Neuling <mikey@neuling.org>
> Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
> Cc: Daniel Axtens <dja@axtens.net>
> Cc: Stephane Eranian <eranian@google.com>
> Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
> Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
> ---
>  arch/powerpc/include/asm/opal-api.h            |   3 +-
>  arch/powerpc/include/asm/opal.h                |   3 +
>  arch/powerpc/perf/imc-pmu.c                    | 163 ++++++++++++++++++++++++-
>  arch/powerpc/platforms/powernv/opal-wrappers.S |   1 +
>  include/linux/cpuhotplug.h                     |   1 +
>  5 files changed, 169 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
> index a0aa285869b5..e1c3d4837857 100644
> --- a/arch/powerpc/include/asm/opal-api.h
> +++ b/arch/powerpc/include/asm/opal-api.h
> @@ -168,7 +168,8 @@
>  #define OPAL_INT_SET_MFRR			125
>  #define OPAL_PCI_TCE_KILL			126
>  #define OPAL_NMMU_SET_PTCR			127
> -#define OPAL_LAST				127
> +#define OPAL_NEST_IMC_COUNTERS_CONTROL		145
> +#define OPAL_LAST				145
> 
>  /* Device tree flags */
> 
> diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
> index 1ff03a6da76e..d93d08204243 100644
> --- a/arch/powerpc/include/asm/opal.h
> +++ b/arch/powerpc/include/asm/opal.h
> @@ -227,6 +227,9 @@ int64_t opal_pci_tce_kill(uint64_t phb_id, uint32_t kill_type,
>  			  uint64_t dma_addr, uint32_t npages);
>  int64_t opal_nmmu_set_ptcr(uint64_t chip_id, uint64_t ptcr);
> 
> +int64_t opal_nest_imc_counters_control(uint64_t mode, uint64_t value1,
> +				uint64_t value2, uint64_t value3);
> +
>  /* Internal functions */
>  extern int early_init_dt_scan_opal(unsigned long node, const char *uname,
>  				   int depth, void *data);
> diff --git a/arch/powerpc/perf/imc-pmu.c b/arch/powerpc/perf/imc-pmu.c
> index f6f1ef9f56af..e46ff6d2a584 100644
> --- a/arch/powerpc/perf/imc-pmu.c
> +++ b/arch/powerpc/perf/imc-pmu.c
> @@ -16,6 +16,7 @@
> +static int ppc_nest_imc_cpu_online(unsigned int cpu)
> +{

I take it that 'cpu' is coming online.

> +	int nid, fcpu, ncpu;
> +	struct cpumask *l_cpumask, tmp_mask;
> +
> +	/* Find the cpumask of this node */
> +	nid = cpu_to_node(cpu);
> +	l_cpumask = cpumask_of_node(nid);
> +
> +	/*
> +	 * If any of the cpu from this node is already present in the mask,
> +	 * just return, if not, then set this cpu in the mask.
> +	 */
> +	if (!cpumask_and(&tmp_mask, l_cpumask, &nest_imc_cpumask)) {

In this case, none of the cpus in the node are in the mask. So we set
this cpu in the imc cpumask and return.

> +		cpumask_set_cpu(cpu, &nest_imc_cpumask);
> +		return 0;
> +	}

But this case implies that there is already a CPU from the node which
is in the imc_cpumask. As per the comment above, we are supposed to
just return. So why are we doing the following ?

Either the comment above is incorrect or I am missing something here.

> +
> +	fcpu = cpumask_first(l_cpumask);
> +	ncpu = cpumask_next(cpu, l_cpumask);
> +	if (cpu == fcpu) {
> +		if (cpumask_test_and_clear_cpu(ncpu, &nest_imc_cpumask)) {
> +			cpumask_set_cpu(cpu, &nest_imc_cpumask);
> +			nest_change_cpu_context(ncpu, cpu);
> +		}
> +	}

It seems that we want to set only the smallest online cpu in the node
in the nest_imc_cpumask. So, if the newly onlined cpu is the smallest,
we replace the previous representative with cpu.

So, the comment above needs to be fixed.
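
To make the "smallest online cpu wins" reading concrete, here is a small
userspace model. It is a sketch only: cpumasks are modelled as 64-bit
bitmaps, mask_first() and nest_cpu_online() are hypothetical names, and,
unlike the quoted patch (which only checks the cpu returned by
cpumask_next()), this model replaces whichever cpu of the node is currently
designated.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the lowest set bit, or -1 if the mask is empty. */
static int mask_first(uint64_t mask)
{
	int i;

	for (i = 0; i < 64; i++)
		if (mask & (1ULL << i))
			return i;
	return -1;
}

/*
 * node_mask:  online cpus of the node, including the cpu coming online
 * designated: model of nest_imc_cpumask
 * Returns the cpu whose context would be migrated away, or -1 if none.
 */
static int nest_cpu_online(int cpu, uint64_t node_mask, uint64_t *designated)
{
	uint64_t cur = node_mask & *designated;
	int old;

	/* No cpu of this node is designated yet: take the role. */
	if (!cur) {
		*designated |= 1ULL << cpu;
		return -1;
	}

	/* Only the smallest online cpu of the node replaces the old one. */
	if (cpu != mask_first(node_mask))
		return -1;

	old = mask_first(cur);
	if (old == cpu)
		return -1;

	*designated &= ~(1ULL << old);
	*designated |= 1ULL << cpu;
	return old;
}
```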

> +
> +	return 0;
> +}
> +
> +static int ppc_nest_imc_cpu_offline(unsigned int cpu)
> +{
> +	int nid, target = -1;
> +	struct cpumask *l_cpumask;
> +
> +	/*
> +	 * Check in the designated list for this cpu. Don't bother
> +	 * if not one of them.
> +	 */
> +	if (!cpumask_test_and_clear_cpu(cpu, &nest_imc_cpumask))
> +		return 0;
> +
> +	/*
> +	 * Now that this cpu is one of the designated,
> +	 * find a next cpu a) which is online and b) in same chip.
> +	 */
> +	nid = cpu_to_node(cpu);
> +	l_cpumask = cpumask_of_node(nid);
> +	target = cpumask_next(cpu, l_cpumask);
> +
> +	/*
> +	 * Update the cpumask with the target cpu and
> +	 * migrate the context if needed
> +	 */
> +	if (target >= 0 && target <= nr_cpu_ids) {
> +		cpumask_set_cpu(target, &nest_imc_cpumask);
> +		nest_change_cpu_context(cpu, target);
> +	}
> +	return 0;
> +}
> +
> +static int nest_pmu_cpumask_init(void)
> +{
> +	const struct cpumask *l_cpumask;
> +	int cpu, nid;
> +	int *cpus_opal_rc;
> +
> +	if (!cpumask_empty(&nest_imc_cpumask))
> +		return 0;
> +

	
> +	/*
> +	 * Nest PMUs are per-chip counters. So designate a cpu
> +	 * from each chip for counter collection.
> +	 */
> +	for_each_online_node(nid) {
> +		l_cpumask = cpumask_of_node(nid);
> +
> +		/* designate first online cpu in this node */
> +		cpu = cpumask_first(l_cpumask);
> +		cpumask_set_cpu(cpu, &nest_imc_cpumask);
> +	}
> +


What happens if a cpu in nest_imc_cpumask goes offline at this point ?

Is that possible ?

> +	/*
> +	 * Memory for OPAL call return value.
> +	 */
> +	cpus_opal_rc = kzalloc((sizeof(int) * nr_cpu_ids), GFP_KERNEL);
> +	if (!cpus_opal_rc)
> +		goto fail;
> +
> +	/* Initialize Nest PMUs in each node using designated cpus */
> +	on_each_cpu_mask(&nest_imc_cpumask, (smp_call_func_t)nest_init,
> +						(void *)cpus_opal_rc, 1);
> +
> +	/* Check return value array for any OPAL call failure */
> +	for_each_cpu(cpu, &nest_imc_cpumask) {
> +		if (cpus_opal_rc[cpu])
> +			goto fail;
> +	}
> +
> +	cpuhp_setup_state(CPUHP_AP_PERF_POWERPC_NEST_ONLINE,
> +			  "POWER_NEST_IMC_ONLINE",
> +			  ppc_nest_imc_cpu_online,
> +			  ppc_nest_imc_cpu_offline);
> +
> +	return 0;
> +
> +fail:
> +	return -ENODEV;
> +}
> +
>  static int nest_imc_event_init(struct perf_event *event)
>  {
>  	int chip_id;
> @@ -63,7 +218,7 @@ static int nest_imc_event_init(struct perf_event *event)
>  	chip_id = topology_physical_package_id(event->cpu);
>  	pcni = &nest_perchip_info[chip_id];
>  	event->hw.event_base = pcni->vbase[config/PAGE_SIZE] +
> -							(config & ~PAGE_MASK);
> +		(config & ~PAGE_MASK);
> 
>  	return 0;
>  }
> @@ -122,6 +277,7 @@ static int update_pmu_ops(struct imc_pmu *pmu)
>  	pmu->pmu.stop = imc_event_stop;
>  	pmu->pmu.read = imc_perf_event_update;
>  	pmu->attr_groups[1] = &imc_format_group;
> +	pmu->attr_groups[2] = &imc_pmu_cpumask_attr_group;
>  	pmu->pmu.attr_groups = pmu->attr_groups;
> 
>  	return 0;
> @@ -189,6 +345,11 @@ int init_imc_pmu(struct imc_events *events, int idx,
>  {
>  	int ret = -ENODEV;
> 
> +	/* Add cpumask and register for hotplug notification */
> +	ret = nest_pmu_cpumask_init();
> +	if (ret)
> +		return ret;
> +
>  	ret = update_events_in_group(events, idx, pmu_ptr);
>  	if (ret)
>  		goto err_free;
> diff --git a/arch/powerpc/platforms/powernv/opal-wrappers.S b/arch/powerpc/platforms/powernv/opal-wrappers.S
> index da8a0f7a035c..b7208d8e6cc0 100644
> --- a/arch/powerpc/platforms/powernv/opal-wrappers.S
> +++ b/arch/powerpc/platforms/powernv/opal-wrappers.S
> @@ -301,3 +301,4 @@ OPAL_CALL(opal_int_eoi,				OPAL_INT_EOI);
>  OPAL_CALL(opal_int_set_mfrr,			OPAL_INT_SET_MFRR);
>  OPAL_CALL(opal_pci_tce_kill,			OPAL_PCI_TCE_KILL);
>  OPAL_CALL(opal_nmmu_set_ptcr,			OPAL_NMMU_SET_PTCR);
> +OPAL_CALL(opal_nest_imc_counters_control,	OPAL_NEST_IMC_COUNTERS_CONTROL);
> diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
> index 62d240e962f0..cfb0cedc72af 100644
> --- a/include/linux/cpuhotplug.h
> +++ b/include/linux/cpuhotplug.h
> @@ -136,6 +136,7 @@ enum cpuhp_state {
>  	CPUHP_AP_PERF_ARM_CCI_ONLINE,
>  	CPUHP_AP_PERF_ARM_CCN_ONLINE,
>  	CPUHP_AP_PERF_ARM_L2X0_ONLINE,
> +	CPUHP_AP_PERF_POWERPC_NEST_ONLINE,
>  	CPUHP_AP_PERF_ARM_QCOM_L2_ONLINE,
>  	CPUHP_AP_WORKQUEUE_ONLINE,
>  	CPUHP_AP_RCUTREE_ONLINE,
> -- 
> 2.7.4
> 

* Re: [PATCH v5 08/13] powerpc/perf: PMU functions for Core IMC and hotplugging
  2017-03-16  7:35 ` [PATCH v5 08/13] powerpc/perf: PMU functions for Core IMC and hotplugging Madhavan Srinivasan
@ 2017-03-23 13:09   ` Gautham R Shenoy
  2017-03-28  4:41     ` Madhavan Srinivasan
  0 siblings, 1 reply; 20+ messages in thread
From: Gautham R Shenoy @ 2017-03-23 13:09 UTC (permalink / raw)
  To: Madhavan Srinivasan
  Cc: mpe, linuxppc-dev, linux-kernel, Hemant Kumar,
	Gautham R . Shenoy, Balbir Singh, Benjamin Herrenschmidt,
	Paul Mackerras, Anton Blanchard, Sukadev Bhattiprolu,
	Michael Neuling, Stewart Smith, Daniel Axtens, Stephane Eranian,
	Anju T Sudhakar

Hi Maddy, Hemant, Anju,

On Thu, Mar 16, 2017 at 01:05:02PM +0530, Madhavan Srinivasan wrote:

[..snip..]

> +
> +static void core_imc_change_cpu_context(int old_cpu, int new_cpu)
> +{
> +	if (!core_imc_pmu)
> +		return;
> +	perf_pmu_migrate_context(&core_imc_pmu->pmu, old_cpu, new_cpu);
> +}
> +
> +
> +static int ppc_core_imc_cpu_online(unsigned int cpu)
> +{
> +	int ret;
> +
> +	/* If a cpu for this core is already set, then, don't do anything */
> +	ret = cpumask_any_and(&core_imc_cpumask,
> +				 cpu_sibling_mask(cpu));
> +	if (ret < nr_cpu_ids)
> +		return 0;
> +
> +	/* Else, set the cpu in the mask, and change the context */
> +	cpumask_set_cpu(cpu, &core_imc_cpumask);
> +	core_imc_change_cpu_context(-1, cpu);

So, in the core case, we are ok as long as any cpu in the core is
present in the imc_cpumask. It need not be the smallest online
cpu in the core.

Can the same logic be applied to the earlier nest case ?

We can have a single function for cpu_offline and cpu_online which
implements these checks and sets the cpu bit if required.

static int ppc_entity_imc_cpu_offline(unsigned int cpu,
			   struct cpumask *entity_imc_mask,
			   void (*entity_imc_change_cpu_context_fn)(int, int))
{
	.
	.
	.

}


static int ppc_nest_imc_cpu_offline(unsigned int cpu)
{
	return ppc_entity_imc_cpu_offline(cpu, &nest_imc_cpumask,
					  nest_change_cpu_context);
}

And similar ones for core imc and thread imc.

Does this sound reasonable ?
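
One possible shape for such a shared offline helper, as a runnable userspace
sketch. Everything here is an assumption for illustration: cpumasks are
modelled as 64-bit bitmaps (so cpu must be below 64), the caller passes the
group of cpus that share the entity (node or core), and
entity_imc_cpu_offline(), change_ctx_fn and record_change_cpu_context() are
hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

typedef void (*change_ctx_fn)(int old_cpu, int new_cpu);

/* Records the last migration so the example can be checked. */
static int last_old = -2, last_new = -2;

static void record_change_cpu_context(int old_cpu, int new_cpu)
{
	last_old = old_cpu;
	last_new = new_cpu;
}

/*
 * group_mask: online cpus sharing the entity (node or core) with "cpu"
 * imc_mask:   designated-cpu mask of the IMC domain
 * change_ctx: per-domain callback that migrates the perf context
 */
static int entity_imc_cpu_offline(int cpu, uint64_t group_mask,
				  uint64_t *imc_mask, change_ctx_fn change_ctx)
{
	uint64_t others;
	int target = -1, i;

	/* Nothing to do unless the dying cpu is the designated one. */
	if (!(*imc_mask & (1ULL << cpu)))
		return 0;
	*imc_mask &= ~(1ULL << cpu);

	/* Pick any other online cpu of the same entity. */
	others = group_mask & ~(1ULL << cpu);
	for (i = 0; i < 64 && target < 0; i++)
		if (others & (1ULL << i))
			target = i;

	if (target >= 0)
		*imc_mask |= 1ULL << target;

	/* target is -1 when no sibling is left; the callback decides what to do. */
	change_ctx(cpu, target);
	return 0;
}
```

The per-domain wrappers would then differ only in which group mask,
designated mask and callback they pass in.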

> +	return 0;
> +}
> +
> +static int ppc_core_imc_cpu_offline(unsigned int cpu)
> +{
> +	int target;
> +	unsigned int ncpu;
> +
> +	/*
> +	 * clear this cpu out of the mask, if not present in the mask,
> +	 * don't bother doing anything.
> +	 */
> +	if (!cpumask_test_and_clear_cpu(cpu, &core_imc_cpumask))
> +		return 0;
> +
> +	/* Find any online cpu in that core except the current "cpu" */
> +	ncpu = cpumask_any_but(cpu_sibling_mask(cpu), cpu);
> +
> +	if (ncpu < nr_cpu_ids) {
> +		target = ncpu;
> +		cpumask_set_cpu(target, &core_imc_cpumask);
> +	} else
> +		target = -1;
> +
> +	/* migrate the context */
> +	core_imc_change_cpu_context(cpu, target);
> +
> +	return 0;
> +}
> +

--
Thanks and Regards
gautham.

* Re: [PATCH 12/13] powerpc/perf: Thread imc cpuhotplug support
  2017-03-16  7:35 ` [PATCH 12/13] powerpc/perf: Thread imc cpuhotplug support Madhavan Srinivasan
@ 2017-03-23 17:15   ` Gautham R Shenoy
  2017-03-27 10:35     ` Anju T Sudhakar
  0 siblings, 1 reply; 20+ messages in thread
From: Gautham R Shenoy @ 2017-03-23 17:15 UTC (permalink / raw)
  To: Madhavan Srinivasan
  Cc: mpe, linuxppc-dev, linux-kernel, Anju T Sudhakar,
	Gautham R . Shenoy, Balbir Singh, Benjamin Herrenschmidt,
	Paul Mackerras, Anton Blanchard, Sukadev Bhattiprolu,
	Michael Neuling, Stewart Smith, Daniel Axtens, Stephane Eranian

Hi Maddy, Anju,

On Thu, Mar 16, 2017 at 01:05:06PM +0530, Madhavan Srinivasan wrote:
> From: Anju T Sudhakar <anju@linux.vnet.ibm.com>
> 
> This patch adds support for thread IMC on cpuhotplug.
> 
> When a cpu goes offline, the LDBAR for that cpu is disabled, and when it comes
> back online the previous ldbar value is written back to the LDBAR for that cpu.
> 
> To register the hotplug functions for thread_imc, a new state
> CPUHP_AP_PERF_POWERPC_THREADIMC_ONLINE is added to the list of existing
> states.
> 
> Cc: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
> Cc: Balbir Singh <bsingharora@gmail.com>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Anton Blanchard <anton@samba.org>
> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
> Cc: Michael Neuling <mikey@neuling.org>
> Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
> Cc: Daniel Axtens <dja@axtens.net>
> Cc: Stephane Eranian <eranian@google.com>
> Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
> ---
>  arch/powerpc/perf/imc-pmu.c | 33 ++++++++++++++++++++++++++++-----
>  include/linux/cpuhotplug.h  |  1 +
>  2 files changed, 29 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/powerpc/perf/imc-pmu.c b/arch/powerpc/perf/imc-pmu.c
> index 6802960db51c..2ff39fe2a5ce 100644
> --- a/arch/powerpc/perf/imc-pmu.c
> +++ b/arch/powerpc/perf/imc-pmu.c
> @@ -687,6 +687,16 @@ static void cleanup_all_thread_imc_memory(void)
>  	on_each_cpu(cleanup_thread_imc_memory, NULL, 1);
>  }
> 
> +static void thread_imc_update_ldbar(unsigned int cpu_id)
> +{
> +	u64 ldbar_addr, ldbar_value;
> +
> +	ldbar_addr = (u64)virt_to_phys((void *)per_cpu_add[cpu_id]);
> +	ldbar_value = (ldbar_addr & (u64)THREAD_IMC_LDBAR_MASK) |
> +			(u64)THREAD_IMC_ENABLE;
> +	mtspr(SPRN_LDBAR, ldbar_value);
> +}
> +
>  /*
>   * Allocates a page of memory for each of the online cpus, and, writes the
>   * physical base address of that page to the LDBAR for that cpu. This starts
> @@ -694,20 +704,33 @@ static void cleanup_all_thread_imc_memory(void)
>   */
>  static void thread_imc_mem_alloc(void *dummy)
>  {
> -	u64 ldbar_addr, ldbar_value;
>  	int cpu_id = smp_processor_id();
> 
>  	per_cpu_add[cpu_id] = (u64)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
>  						    0);
> -	ldbar_addr = (u64)virt_to_phys((void *)per_cpu_add[cpu_id]);
> -	ldbar_value = (ldbar_addr & (u64)THREAD_IMC_LDBAR_MASK) |
> -		(u64)THREAD_IMC_ENABLE;
> -	mtspr(SPRN_LDBAR, ldbar_value);
> +	thread_imc_update_ldbar(cpu_id);
> +}
> +
> +static int ppc_thread_imc_cpu_online(unsigned int cpu)
> +{
> +	thread_imc_update_ldbar(cpu);
> +	return 0;
> +
> +}
> +
> +static int ppc_thread_imc_cpu_offline(unsigned int cpu)
> +{
> +	mtspr(SPRN_LDBAR, 0);
> +	return 0;
>  }

This patch looks ok to me.

So it appears that in case of a full-core deep stop entry/exit you
will need to save/restore LDBAR as well. But I will take it up for the
next set of stop cleanups.

For this patch,
Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>

> 
>  void thread_imc_cpu_init(void)
>  {
>  	on_each_cpu(thread_imc_mem_alloc, NULL, 1);
> +	cpuhp_setup_state(CPUHP_AP_PERF_POWERPC_THREADIMC_ONLINE,
> +			  "POWER_THREAD_IMC_ONLINE",
> +			   ppc_thread_imc_cpu_online,
> +			   ppc_thread_imc_cpu_offline);
>  }
> 
>  static void thread_imc_ldbar_disable(void *dummy)
> diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
> index abde85d9511a..724df46b2c3c 100644
> --- a/include/linux/cpuhotplug.h
> +++ b/include/linux/cpuhotplug.h
> @@ -138,6 +138,7 @@ enum cpuhp_state {
>  	CPUHP_AP_PERF_ARM_L2X0_ONLINE,
>  	CPUHP_AP_PERF_POWERPC_NEST_ONLINE,
>  	CPUHP_AP_PERF_POWERPC_COREIMC_ONLINE,
> +	CPUHP_AP_PERF_POWERPC_THREADIMC_ONLINE,
>  	CPUHP_AP_PERF_ARM_QCOM_L2_ONLINE,
>  	CPUHP_AP_WORKQUEUE_ONLINE,
>  	CPUHP_AP_RCUTREE_ONLINE,
> -- 
> 2.7.4
> 
--
Thanks and Regards
gautham.

* Re: [PATCH v5 06/13] powerpc/perf: IMC pmu cpumask and cpu hotplug support
  2017-03-23 11:52   ` Gautham R Shenoy
@ 2017-03-27 10:34     ` Anju T Sudhakar
  0 siblings, 0 replies; 20+ messages in thread
From: Anju T Sudhakar @ 2017-03-27 10:34 UTC (permalink / raw)
  To: ego, Madhavan Srinivasan
  Cc: mpe, linuxppc-dev, linux-kernel, Hemant Kumar, Balbir Singh,
	Benjamin Herrenschmidt, Paul Mackerras, Anton Blanchard,
	Sukadev Bhattiprolu, Michael Neuling, Stewart Smith,
	Daniel Axtens, Stephane Eranian

Hi Gautham,


Thank you for reviewing the patch.


On Thursday 23 March 2017 05:22 PM, Gautham R Shenoy wrote:
> Hi Hemant, Maddy,
>
> On Thu, Mar 16, 2017 at 01:05:00PM +0530, Madhavan Srinivasan wrote:
>> From: Hemant Kumar <hemant@linux.vnet.ibm.com>
>>
>> Adds cpumask attribute to be used by each IMC pmu. Only one cpu (any
>> online CPU) from each chip for nest PMUs is designated to read counters.
>>
>> On CPU hotplug, dying CPU is checked to see whether it is one of the
>> designated cpus, if yes, next online cpu from the same chip (for nest
>> units) is designated as new cpu to read counters. For this purpose, we
>> introduce a new state : CPUHP_AP_PERF_POWERPC_NEST_ONLINE.
>>
>> Cc: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
>> Cc: Balbir Singh <bsingharora@gmail.com>
>> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>> Cc: Paul Mackerras <paulus@samba.org>
>> Cc: Anton Blanchard <anton@samba.org>
>> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
>> Cc: Michael Neuling <mikey@neuling.org>
>> Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
>> Cc: Daniel Axtens <dja@axtens.net>
>> Cc: Stephane Eranian <eranian@google.com>
>> Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
>> Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
>> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
>> ---
>>   arch/powerpc/include/asm/opal-api.h            |   3 +-
>>   arch/powerpc/include/asm/opal.h                |   3 +
>>   arch/powerpc/perf/imc-pmu.c                    | 163 ++++++++++++++++++++++++-
>>   arch/powerpc/platforms/powernv/opal-wrappers.S |   1 +
>>   include/linux/cpuhotplug.h                     |   1 +
>>   5 files changed, 169 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/powerpc/include/asm/opal-api.h b/arch/powerpc/include/asm/opal-api.h
>> index a0aa285869b5..e1c3d4837857 100644
>> --- a/arch/powerpc/include/asm/opal-api.h
>> +++ b/arch/powerpc/include/asm/opal-api.h
>> @@ -168,7 +168,8 @@
>>   #define OPAL_INT_SET_MFRR			125
>>   #define OPAL_PCI_TCE_KILL			126
>>   #define OPAL_NMMU_SET_PTCR			127
>> -#define OPAL_LAST				127
>> +#define OPAL_NEST_IMC_COUNTERS_CONTROL		145
>> +#define OPAL_LAST				145
>>
>>   /* Device tree flags */
>>
>> diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
>> index 1ff03a6da76e..d93d08204243 100644
>> --- a/arch/powerpc/include/asm/opal.h
>> +++ b/arch/powerpc/include/asm/opal.h
>> @@ -227,6 +227,9 @@ int64_t opal_pci_tce_kill(uint64_t phb_id, uint32_t kill_type,
>>   			  uint64_t dma_addr, uint32_t npages);
>>   int64_t opal_nmmu_set_ptcr(uint64_t chip_id, uint64_t ptcr);
>>
>> +int64_t opal_nest_imc_counters_control(uint64_t mode, uint64_t value1,
>> +				uint64_t value2, uint64_t value3);
>> +
>>   /* Internal functions */
>>   extern int early_init_dt_scan_opal(unsigned long node, const char *uname,
>>   				   int depth, void *data);
>> diff --git a/arch/powerpc/perf/imc-pmu.c b/arch/powerpc/perf/imc-pmu.c
>> index f6f1ef9f56af..e46ff6d2a584 100644
>> --- a/arch/powerpc/perf/imc-pmu.c
>> +++ b/arch/powerpc/perf/imc-pmu.c
>> @@ -16,6 +16,7 @@
>> +static int ppc_nest_imc_cpu_online(unsigned int cpu)
>> +{
> I take it that 'cpu' is coming online.
>
>> +	int nid, fcpu, ncpu;
>> +	struct cpumask *l_cpumask, tmp_mask;
>> +
>> +	/* Find the cpumask of this node */
>> +	nid = cpu_to_node(cpu);
>> +	l_cpumask = cpumask_of_node(nid);
>> +
>> +	/*
>> +	 * If any of the cpu from this node is already present in the mask,
>> +	 * just return, if not, then set this cpu in the mask.
>> +	 */
>> +	if (!cpumask_and(&tmp_mask, l_cpumask, &nest_imc_cpumask)) {
> In this case, none of the cpus in the node are in the mask. So we set
> this cpu in the imc cpumask and return.
>
>> +		cpumask_set_cpu(cpu, &nest_imc_cpumask);
>> +		return 0;
>> +	}
> But this case implies that there is already a CPU from the node which
> is in the imc_cpumask. As per the comment above, we are supposed to
> just return. So why are we doing the following ?
>
> Either the comment above is incorrect or I am missing something here.
>
>> +
>> +	fcpu = cpumask_first(l_cpumask);
>> +	ncpu = cpumask_next(cpu, l_cpumask);
>> +	if (cpu == fcpu) {
>> +		if (cpumask_test_and_clear_cpu(ncpu, &nest_imc_cpumask)) {
>> +			cpumask_set_cpu(cpu, &nest_imc_cpumask);
>> +			nest_change_cpu_context(ncpu, cpu);
>> +		}
>> +	}
> It seems that we want to set only the smallest online cpu in the node
> in the nest_imc_cpumask. So, if the newly onlined cpu is the smallest,
> we replace the previous representative with cpu.

Yes, you are right. Here we are designating the smallest online cpu in
the node in the nest_imc_cpumask. The comment above applies only to the
'if' code block.


> So, the comment above needs to be fixed.

Will update the comment to avoid confusion. :-)
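
For reference, the intended rule (the smallest online cpu of each node
is the designated one) can be modelled in plain userspace C. This is
only an illustrative sketch under stated assumptions: cpumasks are
modelled as 64-bit words, and lowest_cpu() and model_nest_cpu_online()
are hypothetical names, not kernel APIs. Unlike the posted code, which
only tests the cpu returned by cpumask_next(), the model compares
against whichever cpu of the node is currently designated.

```c
#include <stdint.h>

/* Hypothetical model: bit N of a mask represents cpu N (N < 64). */

/* Return the lowest set bit index in mask, or -1 if mask is empty. */
static int lowest_cpu(uint64_t mask)
{
	int cpu;

	for (cpu = 0; cpu < 64; cpu++)
		if (mask & (UINT64_C(1) << cpu))
			return cpu;
	return -1;
}

/*
 * Model of the online callback. node_mask holds the online cpus of the
 * node that 'cpu' belongs to (including 'cpu'); *designated is the
 * model's stand-in for nest_imc_cpumask.
 * Returns the cpu whose context would be migrated to 'cpu', or -1 if
 * no migration is needed.
 */
static int model_nest_cpu_online(uint64_t *designated, uint64_t node_mask,
				 int cpu)
{
	int cur = lowest_cpu(*designated & node_mask);

	if (cur < 0) {
		/* No cpu from this node is designated yet: take this one. */
		*designated |= UINT64_C(1) << cpu;
		return -1;
	}

	if (cpu < cur) {
		/* New cpu is smaller: it replaces the old representative. */
		*designated &= ~(UINT64_C(1) << cur);
		*designated |= UINT64_C(1) << cpu;
		return cur;
	}

	/* Node already has a smaller designated cpu: nothing to do. */
	return -1;
}
```

In this model, onlining cpu 0 on a node whose designated cpu is 2
returns 2, the point at which the real callback would migrate the perf
context from the old representative to the new one.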


Thanks,

Anju
>> +
>> +	return 0;
>> +}
>> +
>> +static int ppc_nest_imc_cpu_offline(unsigned int cpu)
>> +{
>> +	int nid, target = -1;
>> +	struct cpumask *l_cpumask;
>> +
>> +	/*
>> +	 * Check in the designated list for this cpu. Don't bother
>> +	 * if not one of them.
>> +	 */
>> +	if (!cpumask_test_and_clear_cpu(cpu, &nest_imc_cpumask))
>> +		return 0;
>> +
>> +	/*
>> +	 * Now that this cpu is one of the designated,
>> +	 * find a next cpu a) which is online and b) in same chip.
>> +	 */
>> +	nid = cpu_to_node(cpu);
>> +	l_cpumask = cpumask_of_node(nid);
>> +	target = cpumask_next(cpu, l_cpumask);
>> +
>> +	/*
>> +	 * Update the cpumask with the target cpu and
>> +	 * migrate the context if needed
>> +	 */
>> +	if (target >= 0 && target < nr_cpu_ids) {
>> +		cpumask_set_cpu(target, &nest_imc_cpumask);
>> +		nest_change_cpu_context(cpu, target);
>> +	}
>> +	return 0;
>> +}
>> +
>> +static int nest_pmu_cpumask_init(void)
>> +{
>> +	const struct cpumask *l_cpumask;
>> +	int cpu, nid;
>> +	int *cpus_opal_rc;
>> +
>> +	if (!cpumask_empty(&nest_imc_cpumask))
>> +		return 0;
>> +
> 	
>> +	/*
>> +	 * Nest PMUs are per-chip counters. So designate a cpu
>> +	 * from each chip for counter collection.
>> +	 */
>> +	for_each_online_node(nid) {
>> +		l_cpumask = cpumask_of_node(nid);
>> +
>> +		/* designate first online cpu in this node */
>> +		cpu = cpumask_first(l_cpumask);
>> +		cpumask_set_cpu(cpu, &nest_imc_cpumask);
>> +	}
>> +
>
> What happens if a cpu in nest_imc_cpumask goes offline at this point ?
>
> Is that possible ?
>
>> +	/*
>> +	 * Memory for OPAL call return value.
>> +	 */
>> +	cpus_opal_rc = kzalloc((sizeof(int) * nr_cpu_ids), GFP_KERNEL);
>> +	if (!cpus_opal_rc)
>> +		goto fail;
>> +
>> +	/* Initialize Nest PMUs in each node using designated cpus */
>> +	on_each_cpu_mask(&nest_imc_cpumask, (smp_call_func_t)nest_init,
>> +						(void *)cpus_opal_rc, 1);
>> +
>> +	/* Check return value array for any OPAL call failure */
>> +	for_each_cpu(cpu, &nest_imc_cpumask) {
>> +		if (cpus_opal_rc[cpu])
>> +			goto fail;
>> +	}
>> +
>> +	cpuhp_setup_state(CPUHP_AP_PERF_POWERPC_NEST_ONLINE,
>> +			  "POWER_NEST_IMC_ONLINE",
>> +			  ppc_nest_imc_cpu_online,
>> +			  ppc_nest_imc_cpu_offline);
>> +
>> +	return 0;
>> +
>> +fail:
>> +	return -ENODEV;
>> +}
>> +
>>   static int nest_imc_event_init(struct perf_event *event)
>>   {
>>   	int chip_id;
>> @@ -63,7 +218,7 @@ static int nest_imc_event_init(struct perf_event *event)
>>   	chip_id = topology_physical_package_id(event->cpu);
>>   	pcni = &nest_perchip_info[chip_id];
>>   	event->hw.event_base = pcni->vbase[config/PAGE_SIZE] +
>> -							(config & ~PAGE_MASK);
>> +		(config & ~PAGE_MASK);
>>
>>   	return 0;
>>   }
>> @@ -122,6 +277,7 @@ static int update_pmu_ops(struct imc_pmu *pmu)
>>   	pmu->pmu.stop = imc_event_stop;
>>   	pmu->pmu.read = imc_perf_event_update;
>>   	pmu->attr_groups[1] = &imc_format_group;
>> +	pmu->attr_groups[2] = &imc_pmu_cpumask_attr_group;
>>   	pmu->pmu.attr_groups = pmu->attr_groups;
>>
>>   	return 0;
>> @@ -189,6 +345,11 @@ int init_imc_pmu(struct imc_events *events, int idx,
>>   {
>>   	int ret = -ENODEV;
>>
>> +	/* Add cpumask and register for hotplug notification */
>> +	ret = nest_pmu_cpumask_init();
>> +	if (ret)
>> +		return ret;
>> +
>>   	ret = update_events_in_group(events, idx, pmu_ptr);
>>   	if (ret)
>>   		goto err_free;
>> diff --git a/arch/powerpc/platforms/powernv/opal-wrappers.S b/arch/powerpc/platforms/powernv/opal-wrappers.S
>> index da8a0f7a035c..b7208d8e6cc0 100644
>> --- a/arch/powerpc/platforms/powernv/opal-wrappers.S
>> +++ b/arch/powerpc/platforms/powernv/opal-wrappers.S
>> @@ -301,3 +301,4 @@ OPAL_CALL(opal_int_eoi,				OPAL_INT_EOI);
>>   OPAL_CALL(opal_int_set_mfrr,			OPAL_INT_SET_MFRR);
>>   OPAL_CALL(opal_pci_tce_kill,			OPAL_PCI_TCE_KILL);
>>   OPAL_CALL(opal_nmmu_set_ptcr,			OPAL_NMMU_SET_PTCR);
>> +OPAL_CALL(opal_nest_imc_counters_control,	OPAL_NEST_IMC_COUNTERS_CONTROL);
>> diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
>> index 62d240e962f0..cfb0cedc72af 100644
>> --- a/include/linux/cpuhotplug.h
>> +++ b/include/linux/cpuhotplug.h
>> @@ -136,6 +136,7 @@ enum cpuhp_state {
>>   	CPUHP_AP_PERF_ARM_CCI_ONLINE,
>>   	CPUHP_AP_PERF_ARM_CCN_ONLINE,
>>   	CPUHP_AP_PERF_ARM_L2X0_ONLINE,
>> +	CPUHP_AP_PERF_POWERPC_NEST_ONLINE,
>>   	CPUHP_AP_PERF_ARM_QCOM_L2_ONLINE,
>>   	CPUHP_AP_WORKQUEUE_ONLINE,
>>   	CPUHP_AP_RCUTREE_ONLINE,
>> -- 
>> 2.7.4
>>


* Re: [PATCH 12/13] powerpc/perf: Thread imc cpuhotplug support
  2017-03-23 17:15   ` Gautham R Shenoy
@ 2017-03-27 10:35     ` Anju T Sudhakar
  0 siblings, 0 replies; 20+ messages in thread
From: Anju T Sudhakar @ 2017-03-27 10:35 UTC (permalink / raw)
  To: ego, Madhavan Srinivasan
  Cc: mpe, linuxppc-dev, linux-kernel, Balbir Singh,
	Benjamin Herrenschmidt, Paul Mackerras, Anton Blanchard,
	Sukadev Bhattiprolu, Michael Neuling, Stewart Smith,
	Daniel Axtens, Stephane Eranian



On Thursday 23 March 2017 10:45 PM, Gautham R Shenoy wrote:
> Hi Maddy, Anju,
>
> On Thu, Mar 16, 2017 at 01:05:06PM +0530, Madhavan Srinivasan wrote:
>> From: Anju T Sudhakar <anju@linux.vnet.ibm.com>
>>
>> This patch adds support for thread IMC on cpuhotplug.
>>
>> When a cpu goes offline, the LDBAR for that cpu is disabled, and when it comes
>> back online the previous ldbar value is written back to the LDBAR for that cpu.
>>
>> To register the hotplug functions for thread_imc, a new state
>> CPUHP_AP_PERF_POWERPC_THREADIMC_ONLINE is added to the list of existing
>> states.
>>
>> Cc: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
>> Cc: Balbir Singh <bsingharora@gmail.com>
>> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>> Cc: Paul Mackerras <paulus@samba.org>
>> Cc: Anton Blanchard <anton@samba.org>
>> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
>> Cc: Michael Neuling <mikey@neuling.org>
>> Cc: Stewart Smith <stewart@linux.vnet.ibm.com>
>> Cc: Daniel Axtens <dja@axtens.net>
>> Cc: Stephane Eranian <eranian@google.com>
>> Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
>> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
>> ---
>>   arch/powerpc/perf/imc-pmu.c | 33 ++++++++++++++++++++++++++++-----
>>   include/linux/cpuhotplug.h  |  1 +
>>   2 files changed, 29 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/powerpc/perf/imc-pmu.c b/arch/powerpc/perf/imc-pmu.c
>> index 6802960db51c..2ff39fe2a5ce 100644
>> --- a/arch/powerpc/perf/imc-pmu.c
>> +++ b/arch/powerpc/perf/imc-pmu.c
>> @@ -687,6 +687,16 @@ static void cleanup_all_thread_imc_memory(void)
>>   	on_each_cpu(cleanup_thread_imc_memory, NULL, 1);
>>   }
>>
>> +static void thread_imc_update_ldbar(unsigned int cpu_id)
>> +{
>> +	u64 ldbar_addr, ldbar_value;
>> +
>> +	ldbar_addr = (u64)virt_to_phys((void *)per_cpu_add[cpu_id]);
>> +	ldbar_value = (ldbar_addr & (u64)THREAD_IMC_LDBAR_MASK) |
>> +			(u64)THREAD_IMC_ENABLE;
>> +	mtspr(SPRN_LDBAR, ldbar_value);
>> +}
>> +
>>   /*
>>    * Allocates a page of memory for each of the online cpus, and, writes the
>>    * physical base address of that page to the LDBAR for that cpu. This starts
>> @@ -694,20 +704,33 @@ static void cleanup_all_thread_imc_memory(void)
>>    */
>>   static void thread_imc_mem_alloc(void *dummy)
>>   {
>> -	u64 ldbar_addr, ldbar_value;
>>   	int cpu_id = smp_processor_id();
>>
>>   	per_cpu_add[cpu_id] = (u64)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
>>   						    0);
>> -	ldbar_addr = (u64)virt_to_phys((void *)per_cpu_add[cpu_id]);
>> -	ldbar_value = (ldbar_addr & (u64)THREAD_IMC_LDBAR_MASK) |
>> -		(u64)THREAD_IMC_ENABLE;
>> -	mtspr(SPRN_LDBAR, ldbar_value);
>> +	thread_imc_update_ldbar(cpu_id);
>> +}
>> +
>> +static int ppc_thread_imc_cpu_online(unsigned int cpu)
>> +{
>> +	thread_imc_update_ldbar(cpu);
>> +	return 0;
>> +
>> +}
>> +
>> +static int ppc_thread_imc_cpu_offline(unsigned int cpu)
>> +{
>> +	mtspr(SPRN_LDBAR, 0);
>> +	return 0;
>>   }
> This patch looks ok to me.
>
> So it appears that in case of a full-core deep stop entry/exit you
> will need to save/restore LDBAR as well. But I will take it up for the
> next set of stop cleanups.
>
> For this patch,
> Reviewed-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>

Thank you for reviewing the patch, Gautham.
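
For reference, the LDBAR handling across hotplug can be modelled in
userspace C as below. This is a sketch under stated assumptions: the
two constants are placeholders chosen for illustration (the real
THREAD_IMC_LDBAR_MASK and THREAD_IMC_ENABLE are defined elsewhere in
the series and may differ), and model_ldbar_online() and
model_ldbar_offline() are hypothetical names that only compute the
value the callbacks would pass to mtspr().

```c
#include <stdint.h>

/*
 * Placeholder values for illustration only; the real
 * THREAD_IMC_LDBAR_MASK and THREAD_IMC_ENABLE may differ.
 */
#define MODEL_LDBAR_MASK	UINT64_C(0x0003ffffffffe000)
#define MODEL_LDBAR_ENABLE	UINT64_C(0x8000000000000000)

/* Value written on cpu online: masked physical address plus enable bit. */
static uint64_t model_ldbar_online(uint64_t phys_addr)
{
	return (phys_addr & MODEL_LDBAR_MASK) | MODEL_LDBAR_ENABLE;
}

/* Value written on cpu offline: zero, which clears the enable bit. */
static uint64_t model_ldbar_offline(void)
{
	return 0;
}
```

Because the per-cpu page address is kept in per_cpu_add[], the online
path can recompute the same value each time instead of saving the
register contents at offline.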

-Anju

>
>>   void thread_imc_cpu_init(void)
>>   {
>>   	on_each_cpu(thread_imc_mem_alloc, NULL, 1);
>> +	cpuhp_setup_state(CPUHP_AP_PERF_POWERPC_THREADIMC_ONLINE,
>> +			  "POWER_THREAD_IMC_ONLINE",
>> +			   ppc_thread_imc_cpu_online,
>> +			   ppc_thread_imc_cpu_offline);
>>   }
>>
>>   static void thread_imc_ldbar_disable(void *dummy)
>> diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
>> index abde85d9511a..724df46b2c3c 100644
>> --- a/include/linux/cpuhotplug.h
>> +++ b/include/linux/cpuhotplug.h
>> @@ -138,6 +138,7 @@ enum cpuhp_state {
>>   	CPUHP_AP_PERF_ARM_L2X0_ONLINE,
>>   	CPUHP_AP_PERF_POWERPC_NEST_ONLINE,
>>   	CPUHP_AP_PERF_POWERPC_COREIMC_ONLINE,
>> +	CPUHP_AP_PERF_POWERPC_THREADIMC_ONLINE,
>>   	CPUHP_AP_PERF_ARM_QCOM_L2_ONLINE,
>>   	CPUHP_AP_WORKQUEUE_ONLINE,
>>   	CPUHP_AP_RCUTREE_ONLINE,
>> -- 
>> 2.7.4
>>
> --
> Thanks and Regards
> gautham.


* Re: [PATCH v5 08/13] powerpc/perf: PMU functions for Core IMC and hotplugging
  2017-03-23 13:09   ` Gautham R Shenoy
@ 2017-03-28  4:41     ` Madhavan Srinivasan
  0 siblings, 0 replies; 20+ messages in thread
From: Madhavan Srinivasan @ 2017-03-28  4:41 UTC (permalink / raw)
  To: ego
  Cc: mpe, linuxppc-dev, linux-kernel, Hemant Kumar, Balbir Singh,
	Benjamin Herrenschmidt, Paul Mackerras, Anton Blanchard,
	Sukadev Bhattiprolu, Michael Neuling, Stewart Smith,
	Daniel Axtens, Stephane Eranian, Anju T Sudhakar



On Thursday 23 March 2017 06:39 PM, Gautham R Shenoy wrote:
> Hi Maddy, Hemant, Anju,
>
> On Thu, Mar 16, 2017 at 01:05:02PM +0530, Madhavan Srinivasan wrote:
>
> [..snip..]
>
>> +
>> +static void core_imc_change_cpu_context(int old_cpu, int new_cpu)
>> +{
>> +	if (!core_imc_pmu)
>> +		return;
>> +	perf_pmu_migrate_context(&core_imc_pmu->pmu, old_cpu, new_cpu);
>> +}
>> +
>> +
>> +static int ppc_core_imc_cpu_online(unsigned int cpu)
>> +{
>> +	int ret;
>> +
>> +	/* If a cpu for this core is already set, then, don't do anything */
>> +	ret = cpumask_any_and(&core_imc_cpumask,
>> +				 cpu_sibling_mask(cpu));
>> +	if (ret < nr_cpu_ids)
>> +		return 0;
>> +
>> +	/* Else, set the cpu in the mask, and change the context */
>> +	cpumask_set_cpu(cpu, &core_imc_cpumask);
>> +	core_imc_change_cpu_context(-1, cpu);
> So, in the core case, we are ok as long as any cpu in the core is
> present in the imc_cpumask. It need not be the smallest online
> cpu in the core.
>
> Can the same logic be applied to the earlier nest case ?

Yes, this makes sense. Let me look into it.

Thanks for the review,
Maddy

>
> We can have a single function for cpu_offline and cpu_online which
> implements these checks and sets the cpu bit if required.
>
> static int ppc_entity_imc_cpu_offline(unsigned int cpu,
> 			   cpumask_t entity_imc_mask,
> 			   entity_imc_change_cpu_context_fn)
> {
> 	.
> 	.
> 	.
> 	
> }
>
>
> static int ppc_nest_imc_cpu_offline(unsigned int cpu)
> {
> 	return ppc_entity_imc_cpu_offline(cpu, nest_imc_mask,
> 					  nest_imc_change_cpu_context);
> }
>
> And similar ones for core imc and thread imc.
>
> Does this sound reasonable ?
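
One way the shared offline helper suggested above could look, modelled
in userspace C. This is a sketch under stated assumptions, not a
proposed implementation: masks are 64-bit words, the caller passes the
set of online cpus in the same entity (node or core), and
model_change_ctx_fn and model_entity_cpu_offline() are hypothetical
names standing in for the callback and helper.

```c
#include <stdint.h>

/* Hypothetical callback type: migrate perf context from old to new cpu. */
typedef void (*model_change_ctx_fn)(int old_cpu, int new_cpu);

/*
 * Model of a shared offline helper. Masks are 64-bit words (bit N is
 * cpu N). 'online_peers' is the set of online cpus in the same entity
 * (node or core) as 'cpu'. Returns the new designated cpu, -1 if the
 * entity has no other online cpu, or -2 if 'cpu' was not designated.
 */
static int model_entity_cpu_offline(int cpu, uint64_t *entity_mask,
				    uint64_t online_peers,
				    model_change_ctx_fn change_ctx)
{
	uint64_t bit = UINT64_C(1) << cpu;
	uint64_t others = online_peers & ~bit;
	int target = -1;
	int i;

	/* Not a designated cpu: nothing to hand over. */
	if (!(*entity_mask & bit))
		return -2;
	*entity_mask &= ~bit;

	/* Pick any other online cpu of the same entity (lowest here). */
	for (i = 0; i < 64; i++) {
		if (others & (UINT64_C(1) << i)) {
			target = i;
			break;
		}
	}

	if (target >= 0)
		*entity_mask |= UINT64_C(1) << target;

	/* Migrate the context; target is -1 when no peer is left. */
	if (change_ctx)
		change_ctx(cpu, target);
	return target;
}
```

The nest and core wrappers would then differ only in the mask, the
peer set and the callback they pass in.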

>> +	return 0;
>> +}
>> +
>> +static int ppc_core_imc_cpu_offline(unsigned int cpu)
>> +{
>> +	int target;
>> +	unsigned int ncpu;
>> +
>> +	/*
>> +	 * clear this cpu out of the mask, if not present in the mask,
>> +	 * don't bother doing anything.
>> +	 */
>> +	if (!cpumask_test_and_clear_cpu(cpu, &core_imc_cpumask))
>> +		return 0;
>> +
>> +	/* Find any online cpu in that core except the current "cpu" */
>> +	ncpu = cpumask_any_but(cpu_sibling_mask(cpu), cpu);
>> +
>> +	if (ncpu < nr_cpu_ids) {
>> +		target = ncpu;
>> +		cpumask_set_cpu(target, &core_imc_cpumask);
>> +	} else
>> +		target = -1;
>> +
>> +	/* migrate the context */
>> +	core_imc_change_cpu_context(cpu, target);
>> +
>> +	return 0;
>> +}
>> +
> --
> Thanks and Regards
> gautham.


Thread overview: 20+ messages
2017-03-16  7:34 [PATCH v5 00/13] IMC Instrumentation Support Madhavan Srinivasan
2017-03-16  7:34 ` [PATCH v5 01/13] powerpc/powernv: Data structure and macros definitions Madhavan Srinivasan
2017-03-16  7:34 ` [PATCH v5 02/13] powerpc/powernv: Autoload IMC device driver module Madhavan Srinivasan
2017-03-16  7:34 ` [PATCH v5 03/13] powerpc/powernv: Detect supported IMC units and its events Madhavan Srinivasan
2017-03-16  7:34 ` [PATCH v5 04/13] powerpc/perf: Add event attribute and group to IMC pmus Madhavan Srinivasan
2017-03-16  7:34 ` [PATCH v5 05/13] powerpc/perf: Generic imc pmu event functions Madhavan Srinivasan
2017-03-16  7:35 ` [PATCH v5 06/13] powerpc/perf: IMC pmu cpumask and cpu hotplug support Madhavan Srinivasan
2017-03-23 11:52   ` Gautham R Shenoy
2017-03-27 10:34     ` Anju T Sudhakar
2017-03-16  7:35 ` [PATCH v5 07/13] powerpc/powernv: Core IMC events detection Madhavan Srinivasan
2017-03-16  7:35 ` [PATCH v5 08/13] powerpc/perf: PMU functions for Core IMC and hotplugging Madhavan Srinivasan
2017-03-23 13:09   ` Gautham R Shenoy
2017-03-28  4:41     ` Madhavan Srinivasan
2017-03-16  7:35 ` [PATCH v5 09/13] powerpc/powernv: Thread IMC events detection Madhavan Srinivasan
2017-03-16  7:35 ` [PATCH v5 10/13] powerpc/perf: Thread IMC PMU functions Madhavan Srinivasan
2017-03-16  7:35 ` [PATCH 11/13] powerpc/powernv: Add device shutdown function for Core IMC Madhavan Srinivasan
2017-03-16  7:35 ` [PATCH 12/13] powerpc/perf: Thread imc cpuhotplug support Madhavan Srinivasan
2017-03-23 17:15   ` Gautham R Shenoy
2017-03-27 10:35     ` Anju T Sudhakar
2017-03-16  7:35 ` [PATCH 13/13] powerpc/perf: Enable/disable core engine during cpuhotplug Madhavan Srinivasan
