Linux-PCI Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH 0/7] Support PCIe3 uncore PMU on Snow Ridge
@ 2020-07-02 17:05 kan.liang
  2020-07-02 17:05 ` [PATCH 1/7] PCI/portdrv: Create a platform device for the perf uncore driver kan.liang
                   ` (6 more replies)
  0 siblings, 7 replies; 11+ messages in thread
From: kan.liang @ 2020-07-02 17:05 UTC (permalink / raw)
  To: peterz, bhelgaas, mingo, linux-kernel, linux-pci
  Cc: jeffrey.t.kirsher, olof, dan.j.williams, ak, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

The Snow Ridge integrated PCIe3 uncore performance monitoring unit (PMU)
can be used to collect the performance data, e.g., the utilization
between the PCIe devices and the components (in M2IOSF) which are
responsible for translating and managing the requests to/from the
device. The performance data is very useful for analyzing the
performance of the PCIe devices.

The PCIe3 uncore PMU was once supported in the Linux kernel, but it was
removed by the commit id 2167f1625c2f ("perf/x86/intel/uncore: Remove
PCIe3 unit for SNR") due to the conflict between the PCIe Root Port
driver and the perf uncore driver. The counters of the PCIe3 uncore PMU
are located in the configuration space of the PCIe Root Port device,
which already has a bonded driver portdrv_pci. One device can only have
one bonded driver. The uncore driver is always failed to be loaded.

To re-enable the PCIe3 uncore PMU support in the uncore driver, a new
solution should be introduced, which has to meet the requirements as
below:
- must have a reliable way to find the PCIe Root Port device from the
  uncore driver;
- must be able to access the uncore counters of the PCIe Root Port
  device from the uncore driver;
- must support hotplug. When the PCIe Root Port device is removed, the
  uncore driver has to be notified and unregisters the PCIe3 uncore
  PMU.

A new platform device 'perf_uncore_pcieport' as a child device of the
PCIe Root Port device is introduced to address the issue, which:
- can be probed by a unique name, 'perf_uncore_pcieport'. The uncore
  driver is the only driver for the new platform device.
- can provide the struct pci_dev of its parent, aka the PCIe Root Port
  device. The struct pci_dev can be used by the uncore driver to
  register a PMU for accessing the counters in the PCIe Root Port
  device.
- can be notified if its parent PCIe Root Port device is removed.
  The uncore driver can unregister the PMU accordingly.

Kan Liang (7):
  PCI/portdrv: Create a platform device for the perf uncore driver
  perf/x86/intel/uncore: Factor out uncore_pci_get_die_info()
  perf/x86/intel/uncore: Factor out uncore_find_pmu_by_pci_dev()
  perf/x86/intel/uncore: Factor out uncore_pci_pmu_register()
  perf/x86/intel/uncore: Factor out uncore_pci_pmu_unregister()
  perf/x86/intel/uncore: Generic support for the platform device
  perf/x86/intel/uncore: Support PCIe3 unit on Snow Ridge

 arch/x86/events/intel/uncore.c       | 213 +++++++++++++++++++++++++----------
 arch/x86/events/intel/uncore.h       |  19 ++++
 arch/x86/events/intel/uncore_snbep.c |  60 ++++++++++
 drivers/pci/pcie/portdrv_pci.c       |  38 +++++++
 4 files changed, 271 insertions(+), 59 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/7] PCI/portdrv: Create a platform device for the perf uncore driver
  2020-07-02 17:05 [PATCH 0/7] Support PCIe3 uncore PMU on Snow Ridge kan.liang
@ 2020-07-02 17:05 ` kan.liang
  2020-07-07 19:48   ` Bjorn Helgaas
  2020-07-02 17:05 ` [PATCH 2/7] perf/x86/intel/uncore: Factor out uncore_pci_get_die_info() kan.liang
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 11+ messages in thread
From: kan.liang @ 2020-07-02 17:05 UTC (permalink / raw)
  To: peterz, bhelgaas, mingo, linux-kernel, linux-pci
  Cc: jeffrey.t.kirsher, olof, dan.j.williams, ak, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

On Snow Ridge server, several performance monitoring counters are added
in the Root Port Configuration Space of CPU Complex PCIe Root Ports A,
which can be used to collect the performance data between the PCIe
devices and the components (in M2IOSF) which are responsible for
translating and managing the requests to/from the device. The
performance data is very useful for analyzing the performance of the
PCIe devices.

However, the perf uncore driver cannot be loaded to register a
performance monitoring unit (PMU) for the counters, because the PCIe
Root Ports device already has a bonded driver portdrv_pci.

To enable the uncore PMU support for these counters on the uncore
driver, a new solution should be introduced, which has to meet the
requirements as below:
- must have a reliable way to find the PCIe Root Port device from the
  uncore driver;
- must be able to access the uncore counters of the PCIe Root Port
  device from the uncore driver;
- must support hotplug. When the PCIe Root Port device is removed, the
  uncore driver has to be notified and unregisters the uncore PMU.

A new platform device 'perf_uncore_pcieport' is introduced as part of
the new solution, which can facilitate the enabling of the uncore PMU in
the uncore driver. The new platform device
- is a child device of the PCIe Root Port device. It's allocated when
  the PCIe Root Ports A device is probed. (For SNR, the PMU counters are
  only located in the configuration space of the PCIe Root Ports A.)
- stores its pdev as the private driver data pointer of the PCIe Root
  Ports A. The pdev can be easily retrieved to check the existence of
  the platform device when removing the PCIe Root Ports A.
- is unregistered when the PCIe Root Port A is removed. The remove()
  method which is provided in the uncore driver will be invoked. The
  uncore PMU will be unregistered as well.
- doesn't share any memory and IRQ resources. The uncore driver will
  only touch the PMU counters in the configuration space of the PCIe
  Root Port A.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
 drivers/pci/pcie/portdrv_pci.c | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index 3acf151..47e33b2 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -15,6 +15,7 @@
 #include <linux/init.h>
 #include <linux/aer.h>
 #include <linux/dmi.h>
+#include <linux/platform_device.h>
 
 #include "../pci.h"
 #include "portdrv.h"
@@ -90,6 +91,40 @@ static const struct dev_pm_ops pcie_portdrv_pm_ops = {
 #define PCIE_PORTDRV_PM_OPS	NULL
 #endif /* !PM */
 
+static const struct pci_device_id perf_uncore_pcieport_ids[] = {
+	{ PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x334a) },
+	{ },
+};
+
+static void perf_platform_device_register(struct pci_dev *dev)
+{
+	struct platform_device *pdev;
+
+	if (!pci_match_id(perf_uncore_pcieport_ids, dev))
+		return;
+
+	pdev = platform_device_alloc("perf_uncore_pcieport", PLATFORM_DEVID_AUTO);
+	if (!pdev)
+		return;
+
+	pdev->dev.parent = &dev->dev;
+
+	if (platform_device_add(pdev)) {
+		platform_device_put(pdev);
+		return;
+	}
+
+	pci_set_drvdata(dev, pdev);
+}
+
+static void perf_platform_device_unregister(struct pci_dev *dev)
+{
+	struct platform_device *pdev = pci_get_drvdata(dev);
+
+	if (pdev)
+		platform_device_unregister(pdev);
+}
+
 /*
  * pcie_portdrv_probe - Probe PCI-Express port devices
  * @dev: PCI-Express port device being probed
@@ -113,6 +148,8 @@ static int pcie_portdrv_probe(struct pci_dev *dev,
 	if (status)
 		return status;
 
+	perf_platform_device_register(dev);
+
 	pci_save_state(dev);
 
 	dev_pm_set_driver_flags(&dev->dev, DPM_FLAG_NO_DIRECT_COMPLETE |
@@ -142,6 +179,7 @@ static void pcie_portdrv_remove(struct pci_dev *dev)
 		pm_runtime_dont_use_autosuspend(&dev->dev);
 	}
 
+	perf_platform_device_unregister(dev);
 	pcie_port_device_remove(dev);
 }
 
-- 
2.7.4


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 2/7] perf/x86/intel/uncore: Factor out uncore_pci_get_die_info()
  2020-07-02 17:05 [PATCH 0/7] Support PCIe3 uncore PMU on Snow Ridge kan.liang
  2020-07-02 17:05 ` [PATCH 1/7] PCI/portdrv: Create a platform device for the perf uncore driver kan.liang
@ 2020-07-02 17:05 ` kan.liang
  2020-07-02 17:05 ` [PATCH 3/7] perf/x86/intel/uncore: Factor out uncore_find_pmu_by_pci_dev() kan.liang
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: kan.liang @ 2020-07-02 17:05 UTC (permalink / raw)
  To: peterz, bhelgaas, mingo, linux-kernel, linux-pci
  Cc: jeffrey.t.kirsher, olof, dan.j.williams, ak, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

The socket and die information is required when probing the platform
device.

Factor out the codes to get the socket and die information from a BUS
number into a separate function. The function will be used later.

There is no functional change.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
 arch/x86/events/intel/uncore.c | 25 +++++++++++++++++--------
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index d5c6d3b..0651ab7 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -988,6 +988,20 @@ uncore_types_init(struct intel_uncore_type **types, bool setid)
 	return 0;
 }
 
+static int uncore_pci_get_die_info(struct pci_dev *pdev,
+				   int *phys_id, int *die)
+{
+	*phys_id = uncore_pcibus_to_physid(pdev->bus);
+	if (*phys_id < 0)
+		return -ENODEV;
+
+	*die = (topology_max_die_per_package() > 1) ? *phys_id :
+				topology_phys_to_logical_pkg(*phys_id);
+	if (*die < 0)
+		return -EINVAL;
+
+	return 0;
+}
 /*
  * add a pci uncore device
  */
@@ -998,14 +1012,9 @@ static int uncore_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id
 	struct intel_uncore_box *box;
 	int phys_id, die, ret;
 
-	phys_id = uncore_pcibus_to_physid(pdev->bus);
-	if (phys_id < 0)
-		return -ENODEV;
-
-	die = (topology_max_die_per_package() > 1) ? phys_id :
-					topology_phys_to_logical_pkg(phys_id);
-	if (die < 0)
-		return -EINVAL;
+	ret = uncore_pci_get_die_info(pdev, &phys_id, &die);
+	if (ret)
+		return ret;
 
 	if (UNCORE_PCI_DEV_TYPE(id->driver_data) == UNCORE_EXTRA_PCI_DEV) {
 		int idx = UNCORE_PCI_DEV_IDX(id->driver_data);
-- 
2.7.4


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 3/7] perf/x86/intel/uncore: Factor out uncore_find_pmu_by_pci_dev()
  2020-07-02 17:05 [PATCH 0/7] Support PCIe3 uncore PMU on Snow Ridge kan.liang
  2020-07-02 17:05 ` [PATCH 1/7] PCI/portdrv: Create a platform device for the perf uncore driver kan.liang
  2020-07-02 17:05 ` [PATCH 2/7] perf/x86/intel/uncore: Factor out uncore_pci_get_die_info() kan.liang
@ 2020-07-02 17:05 ` kan.liang
  2020-07-02 17:05 ` [PATCH 4/7] perf/x86/intel/uncore: Factor out uncore_pci_pmu_register() kan.liang
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: kan.liang @ 2020-07-02 17:05 UTC (permalink / raw)
  To: peterz, bhelgaas, mingo, linux-kernel, linux-pci
  Cc: jeffrey.t.kirsher, olof, dan.j.williams, ak, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

When a platform device is probed, the corresponding PMU has to be
registered.

Factor out the codes to find the corresponding PMU by comparing the
pci_device_id table. The function will be used later.

There is no functional change.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
 arch/x86/events/intel/uncore.c | 43 +++++++++++++++++++++++++++---------------
 1 file changed, 28 insertions(+), 15 deletions(-)

diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index 0651ab7..6a87c1a 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -1002,6 +1002,32 @@ static int uncore_pci_get_die_info(struct pci_dev *pdev,
 
 	return 0;
 }
+
+static struct intel_uncore_pmu *
+uncore_find_pmu_by_pci_dev(struct pci_dev *pdev, const struct pci_device_id *ids)
+{
+	struct intel_uncore_pmu *pmu = NULL;
+	struct intel_uncore_type *type;
+	kernel_ulong_t data;
+	unsigned int devfn;
+
+	while (ids && ids->vendor) {
+		if ((ids->vendor == pdev->vendor) &&
+		    (ids->device == pdev->device)) {
+			data = ids->driver_data;
+			devfn = PCI_DEVFN(UNCORE_PCI_DEV_DEV(data),
+					  UNCORE_PCI_DEV_FUNC(data));
+			if (devfn == pdev->devfn) {
+				type = uncore_pci_uncores[UNCORE_PCI_DEV_TYPE(data)];
+				pmu = &type->pmus[UNCORE_PCI_DEV_IDX(data)];
+				break;
+			}
+		}
+		ids++;
+	}
+	return pmu;
+}
+
 /*
  * add a pci uncore device
  */
@@ -1033,21 +1059,8 @@ static int uncore_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id
 	 */
 	if (id->driver_data & ~0xffff) {
 		struct pci_driver *pci_drv = pdev->driver;
-		const struct pci_device_id *ids = pci_drv->id_table;
-		unsigned int devfn;
-
-		while (ids && ids->vendor) {
-			if ((ids->vendor == pdev->vendor) &&
-			    (ids->device == pdev->device)) {
-				devfn = PCI_DEVFN(UNCORE_PCI_DEV_DEV(ids->driver_data),
-						  UNCORE_PCI_DEV_FUNC(ids->driver_data));
-				if (devfn == pdev->devfn) {
-					pmu = &type->pmus[UNCORE_PCI_DEV_IDX(ids->driver_data)];
-					break;
-				}
-			}
-			ids++;
-		}
+
+		pmu = uncore_find_pmu_by_pci_dev(pdev, pci_drv->id_table);
 		if (pmu == NULL)
 			return -ENODEV;
 	} else {
-- 
2.7.4


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 4/7] perf/x86/intel/uncore: Factor out uncore_pci_pmu_register()
  2020-07-02 17:05 [PATCH 0/7] Support PCIe3 uncore PMU on Snow Ridge kan.liang
                   ` (2 preceding siblings ...)
  2020-07-02 17:05 ` [PATCH 3/7] perf/x86/intel/uncore: Factor out uncore_find_pmu_by_pci_dev() kan.liang
@ 2020-07-02 17:05 ` kan.liang
  2020-07-02 17:05 ` [PATCH 5/7] perf/x86/intel/uncore: Factor out uncore_pci_pmu_unregister() kan.liang
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: kan.liang @ 2020-07-02 17:05 UTC (permalink / raw)
  To: peterz, bhelgaas, mingo, linux-kernel, linux-pci
  Cc: jeffrey.t.kirsher, olof, dan.j.williams, ak, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

The PMU registration of a platform device is similar as a PCI device.

Factor out the codes of register PCI PMU into a separate function. The
function will be used later.

pci_set_drvdata() is not included in uncore_pci_pmu_register(). Because
a platform device has its own function to set the drvdata.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
 arch/x86/events/intel/uncore.c | 74 ++++++++++++++++++++++++------------------
 1 file changed, 43 insertions(+), 31 deletions(-)

diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index 6a87c1a..ccc90b0 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -1028,6 +1028,47 @@ uncore_find_pmu_by_pci_dev(struct pci_dev *pdev, const struct pci_device_id *ids
 	return pmu;
 }
 
+static int uncore_pci_pmu_register(struct pci_dev *pdev,
+				   struct intel_uncore_type *type,
+				   struct intel_uncore_pmu *pmu,
+				   int phys_id, int die)
+{
+	struct intel_uncore_box *box;
+	int ret;
+
+	if (WARN_ON_ONCE(pmu->boxes[die] != NULL))
+		return -EINVAL;
+
+	box = uncore_alloc_box(type, NUMA_NO_NODE);
+	if (!box)
+		return -ENOMEM;
+
+	if (pmu->func_id < 0)
+		pmu->func_id = pdev->devfn;
+	else
+		 WARN_ON_ONCE(pmu->func_id != pdev->devfn);
+
+	atomic_inc(&box->refcnt);
+	box->pci_phys_id = phys_id;
+	box->dieid = die;
+	box->pci_dev = pdev;
+	box->pmu = pmu;
+	uncore_box_init(box);
+
+	pmu->boxes[die] = box;
+	if (atomic_inc_return(&pmu->activeboxes) > 1)
+		return 0;
+
+	/* First active box registers the pmu */
+	ret = uncore_pmu_register(pmu);
+	if (ret) {
+		pmu->boxes[die] = NULL;
+		uncore_box_exit(box);
+		kfree(box);
+	}
+	return ret;
+}
+
 /*
  * add a pci uncore device
  */
@@ -1035,7 +1076,6 @@ static int uncore_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id
 {
 	struct intel_uncore_type *type;
 	struct intel_uncore_pmu *pmu = NULL;
-	struct intel_uncore_box *box;
 	int phys_id, die, ret;
 
 	ret = uncore_pci_get_die_info(pdev, &phys_id, &die);
@@ -1071,38 +1111,10 @@ static int uncore_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id
 		pmu = &type->pmus[UNCORE_PCI_DEV_IDX(id->driver_data)];
 	}
 
-	if (WARN_ON_ONCE(pmu->boxes[die] != NULL))
-		return -EINVAL;
-
-	box = uncore_alloc_box(type, NUMA_NO_NODE);
-	if (!box)
-		return -ENOMEM;
-
-	if (pmu->func_id < 0)
-		pmu->func_id = pdev->devfn;
-	else
-		WARN_ON_ONCE(pmu->func_id != pdev->devfn);
-
-	atomic_inc(&box->refcnt);
-	box->pci_phys_id = phys_id;
-	box->dieid = die;
-	box->pci_dev = pdev;
-	box->pmu = pmu;
-	uncore_box_init(box);
-	pci_set_drvdata(pdev, box);
+	ret = uncore_pci_pmu_register(pdev, type, pmu, phys_id, die);
 
-	pmu->boxes[die] = box;
-	if (atomic_inc_return(&pmu->activeboxes) > 1)
-		return 0;
+	pci_set_drvdata(pdev, pmu->boxes[die]);
 
-	/* First active box registers the pmu */
-	ret = uncore_pmu_register(pmu);
-	if (ret) {
-		pci_set_drvdata(pdev, NULL);
-		pmu->boxes[die] = NULL;
-		uncore_box_exit(box);
-		kfree(box);
-	}
 	return ret;
 }
 
-- 
2.7.4


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 5/7] perf/x86/intel/uncore: Factor out uncore_pci_pmu_unregister()
  2020-07-02 17:05 [PATCH 0/7] Support PCIe3 uncore PMU on Snow Ridge kan.liang
                   ` (3 preceding siblings ...)
  2020-07-02 17:05 ` [PATCH 4/7] perf/x86/intel/uncore: Factor out uncore_pci_pmu_register() kan.liang
@ 2020-07-02 17:05 ` kan.liang
  2020-07-02 17:05 ` [PATCH 6/7] perf/x86/intel/uncore: Generic support for the platform device kan.liang
  2020-07-02 17:05 ` [PATCH 7/7] perf/x86/intel/uncore: Support PCIe3 unit on Snow Ridge kan.liang
  6 siblings, 0 replies; 11+ messages in thread
From: kan.liang @ 2020-07-02 17:05 UTC (permalink / raw)
  To: peterz, bhelgaas, mingo, linux-kernel, linux-pci
  Cc: jeffrey.t.kirsher, olof, dan.j.williams, ak, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

The PMU unregistration of a platform device is similar as a PCI device.

Factor out the codes of unregister a PCI PMU into a separate function. The
function will be used later.

There is no functional change.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
 arch/x86/events/intel/uncore.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index ccc90b0..b702dc3 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -1118,6 +1118,16 @@ static int uncore_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id
 	return ret;
 }
 
+static void uncore_pci_pmu_unregister(struct intel_uncore_pmu *pmu,
+				      struct intel_uncore_box *box)
+{
+	pmu->boxes[box->dieid] = NULL;
+	if (atomic_dec_return(&pmu->activeboxes) == 0)
+		uncore_pmu_unregister(pmu);
+	uncore_box_exit(box);
+	kfree(box);
+}
+
 static void uncore_pci_remove(struct pci_dev *pdev)
 {
 	struct intel_uncore_box *box;
@@ -1145,11 +1155,8 @@ static void uncore_pci_remove(struct pci_dev *pdev)
 		return;
 
 	pci_set_drvdata(pdev, NULL);
-	pmu->boxes[box->dieid] = NULL;
-	if (atomic_dec_return(&pmu->activeboxes) == 0)
-		uncore_pmu_unregister(pmu);
-	uncore_box_exit(box);
-	kfree(box);
+
+	uncore_pci_pmu_unregister(pmu, box);
 }
 
 static int __init uncore_pci_init(void)
-- 
2.7.4


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 6/7] perf/x86/intel/uncore: Generic support for the platform device
  2020-07-02 17:05 [PATCH 0/7] Support PCIe3 uncore PMU on Snow Ridge kan.liang
                   ` (4 preceding siblings ...)
  2020-07-02 17:05 ` [PATCH 5/7] perf/x86/intel/uncore: Factor out uncore_pci_pmu_unregister() kan.liang
@ 2020-07-02 17:05 ` kan.liang
  2020-07-02 17:05 ` [PATCH 7/7] perf/x86/intel/uncore: Support PCIe3 unit on Snow Ridge kan.liang
  6 siblings, 0 replies; 11+ messages in thread
From: kan.liang @ 2020-07-02 17:05 UTC (permalink / raw)
  To: peterz, bhelgaas, mingo, linux-kernel, linux-pci
  Cc: jeffrey.t.kirsher, olof, dan.j.williams, ak, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

Some uncore counters may be located in the configuration space of a PCI
device, which already has a bonded driver. Currently, the uncore driver
cannot register a PCI uncore PMU for these counters. Because, to
register a PCI uncore PMU, the uncore driver must be bond to the device.
However, one device can only have one bonded driver.

The PCI device can create a platform device as its child device. The
uncore driver can bind to the platform device instead.

The probe() and remove() methods are implemented to support the platform
device. When probing a platform device, its parent PCI device is
searched in an id table. If a matched device is found, a PMU for the
platform device will be registered, which will be used to monitor its
parents' PCI device. When the parent device is removed, the remove()
method will be triggered, which unregister the PMU.

Other modules' loading/unloading will trigger the loading/unloading of
the platform device, and impact the uncore driver. The potential race
condition can be prevented by the device_lock() of the platform device,
which guarantees the probe(), and remove() are sequential.

The id table varies on different platforms, which will be implemented in
the following platform-specific patch.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
 arch/x86/events/intel/uncore.c | 54 ++++++++++++++++++++++++++++++++++++++++++
 arch/x86/events/intel/uncore.h | 19 +++++++++++++++
 2 files changed, 73 insertions(+)

diff --git a/arch/x86/events/intel/uncore.c b/arch/x86/events/intel/uncore.c
index b702dc3..30fe187 100644
--- a/arch/x86/events/intel/uncore.c
+++ b/arch/x86/events/intel/uncore.c
@@ -12,6 +12,8 @@ struct intel_uncore_type **uncore_mmio_uncores = empty_uncore;
 
 static bool pcidrv_registered;
 struct pci_driver *uncore_pci_driver;
+struct uncore_platform_driver *uncore_platform_driver;
+
 /* pci bus to socket mapping */
 DEFINE_RAW_SPINLOCK(pci2phy_map_lock);
 struct list_head pci2phy_map_head = LIST_HEAD_INIT(pci2phy_map_head);
@@ -1159,6 +1161,42 @@ static void uncore_pci_remove(struct pci_dev *pdev)
 	uncore_pci_pmu_unregister(pmu, box);
 }
 
+static int uncore_platform_probe(struct platform_device *pdev)
+{
+	struct pci_dev *pci_dev = to_pci_dev(pdev->dev.parent);
+	struct intel_uncore_pmu *pmu;
+	int phys_id, die, ret;
+
+	pmu = uncore_find_pmu_by_pci_dev(pci_dev, uncore_platform_driver->pci_ids);
+	if (!pmu)
+		return -ENODEV;
+
+	ret = uncore_pci_get_die_info(pci_dev, &phys_id, &die);
+	if (ret)
+		return ret;
+
+	ret = uncore_pci_pmu_register(pci_dev, pmu->type, pmu, phys_id, die);
+
+	platform_set_drvdata(pdev, pmu->boxes[die]);
+
+	return ret;
+}
+
+static int uncore_platform_remove(struct platform_device *pdev)
+{
+	struct intel_uncore_box *box;
+
+	box = platform_get_drvdata(pdev);
+	if (!box)
+		return 0;
+
+	uncore_pci_pmu_unregister(box->pmu, box);
+
+	platform_set_drvdata(pdev, NULL);
+
+	return 0;
+}
+
 static int __init uncore_pci_init(void)
 {
 	size_t size;
@@ -1183,6 +1221,19 @@ static int __init uncore_pci_init(void)
 		goto errtype;
 
 	pcidrv_registered = true;
+
+	if (uncore_platform_driver) {
+		uncore_platform_driver->driver->probe = uncore_platform_probe;
+		uncore_platform_driver->driver->remove = uncore_platform_remove;
+
+		ret = platform_driver_register(uncore_platform_driver->driver);
+		if (ret) {
+			pr_warn("Failed to register platform driver. "
+				"Disable %s uncore unit.\n",
+				uncore_platform_driver->driver->driver.name);
+			uncore_platform_driver = NULL;
+		}
+	}
 	return 0;
 
 errtype:
@@ -1197,6 +1248,9 @@ static int __init uncore_pci_init(void)
 
 static void uncore_pci_exit(void)
 {
+	if (uncore_platform_driver)
+		platform_driver_unregister(uncore_platform_driver->driver);
+
 	if (pcidrv_registered) {
 		pcidrv_registered = false;
 		pci_unregister_driver(uncore_pci_driver);
diff --git a/arch/x86/events/intel/uncore.h b/arch/x86/events/intel/uncore.h
index 105fdc6..da4cb36 100644
--- a/arch/x86/events/intel/uncore.h
+++ b/arch/x86/events/intel/uncore.h
@@ -3,6 +3,7 @@
 #include <linux/pci.h>
 #include <asm/apicdef.h>
 #include <linux/io-64-nonatomic-lo-hi.h>
+#include <linux/platform_device.h>
 
 #include <linux/perf_event.h>
 #include "../perf_event.h"
@@ -161,6 +162,24 @@ struct uncore_event_desc {
 	const char *config;
 };
 
+/*
+ * A platform device created by other drivers for uncore monitoring
+ * @driver: platform_driver for the platform device
+ * @pci_ids: id_table for supported PCI devices
+ *           Used to compare with the platform device's parent PCI device
+ *
+ * Other drivers may create a platform device for uncore monitoring,
+ * e.g. pcieport driver on SNR. To match the platform device, the probe
+ * function has to compare the platform device's parent PCI device with
+ * pci_ids.
+ */
+struct uncore_platform_driver {
+	struct platform_driver		*driver;
+	const struct pci_device_id	*pci_ids;
+};
+
+extern struct uncore_platform_driver *uncore_platform_driver;
+
 struct freerunning_counters {
 	unsigned int counter_base;
 	unsigned int counter_offset;
-- 
2.7.4


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 7/7] perf/x86/intel/uncore: Support PCIe3 unit on Snow Ridge
  2020-07-02 17:05 [PATCH 0/7] Support PCIe3 uncore PMU on Snow Ridge kan.liang
                   ` (5 preceding siblings ...)
  2020-07-02 17:05 ` [PATCH 6/7] perf/x86/intel/uncore: Generic support for the platform device kan.liang
@ 2020-07-02 17:05 ` kan.liang
  6 siblings, 0 replies; 11+ messages in thread
From: kan.liang @ 2020-07-02 17:05 UTC (permalink / raw)
  To: peterz, bhelgaas, mingo, linux-kernel, linux-pci
  Cc: jeffrey.t.kirsher, olof, dan.j.williams, ak, Kan Liang

From: Kan Liang <kan.liang@linux.intel.com>

The Snow Ridge integrated PCIe3 uncore unit can be used to collect
performance data, e.g. utilization, between PCIe devices, plugged into
the PCIe port, and the components (in M2IOSF) responsible for
translating and managing requests to/from the device. The performance
data is very useful for analyzing the performance of PCIe devices.

The portdrv_pci driver creates a platform device,
"perf_uncore_pcieport", for PCIe3 uncore PMON unit. The uncore driver
probes the platform device, instead of device ID.

Here are some difference between PCIe3 uncore unit and other uncore
pci units.
- There may be several Root Ports on a system. But the uncore counters
  only exist in the Root Port A. A user can configure the channel mask
  to collect the data from other Root Ports.
- The event format of the PCIe3 uncore unit is the same as IIO unit of
  SKX.
- The Control Register of PCIe3 uncore unit is 64 bits.
- The offset of each counters is 8, which is the same as M2M unit of SNR.
- New MSR addresses for unit control, counter and counter config.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
---
 arch/x86/events/intel/uncore_snbep.c | 60 ++++++++++++++++++++++++++++++++++++
 1 file changed, 60 insertions(+)

diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c
index 62e88ad..34fc9d7 100644
--- a/arch/x86/events/intel/uncore_snbep.c
+++ b/arch/x86/events/intel/uncore_snbep.c
@@ -393,6 +393,11 @@
 #define SNR_M2M_PCI_PMON_BOX_CTL		0x438
 #define SNR_M2M_PCI_PMON_UMASK_EXT		0xff
 
+/* SNR PCIE3 */
+#define SNR_PCIE3_PCI_PMON_CTL0			0x508
+#define SNR_PCIE3_PCI_PMON_CTR0			0x4e8
+#define SNR_PCIE3_PCI_PMON_BOX_CTL		0x4e0
+
 /* SNR IMC */
 #define SNR_IMC_MMIO_PMON_FIXED_CTL		0x54
 #define SNR_IMC_MMIO_PMON_FIXED_CTR		0x38
@@ -4551,12 +4556,46 @@ static struct intel_uncore_type snr_uncore_m2m = {
 	.format_group	= &snr_m2m_uncore_format_group,
 };
 
+static void snr_uncore_pci_enable_event(struct intel_uncore_box *box, struct perf_event *event)
+{
+	struct pci_dev *pdev = box->pci_dev;
+	struct hw_perf_event *hwc = &event->hw;
+
+	pci_write_config_dword(pdev, hwc->config_base, (u32)(hwc->config | SNBEP_PMON_CTL_EN));
+	pci_write_config_dword(pdev, hwc->config_base + 4, (u32)(hwc->config >> 32));
+}
+
+static struct intel_uncore_ops snr_pcie3_uncore_pci_ops = {
+	.init_box	= snr_m2m_uncore_pci_init_box,
+	.disable_box	= snbep_uncore_pci_disable_box,
+	.enable_box	= snbep_uncore_pci_enable_box,
+	.disable_event	= snbep_uncore_pci_disable_event,
+	.enable_event	= snr_uncore_pci_enable_event,
+	.read_counter	= snbep_uncore_pci_read_counter,
+};
+
+static struct intel_uncore_type snr_uncore_pcie3 = {
+	.name		= "pcie3",
+	.num_counters	= 4,
+	.num_boxes	= 1,
+	.perf_ctr_bits	= 48,
+	.perf_ctr	= SNR_PCIE3_PCI_PMON_CTR0,
+	.event_ctl	= SNR_PCIE3_PCI_PMON_CTL0,
+	.event_mask	= SKX_IIO_PMON_RAW_EVENT_MASK,
+	.event_mask_ext	= SKX_IIO_PMON_RAW_EVENT_MASK_EXT,
+	.box_ctl	= SNR_PCIE3_PCI_PMON_BOX_CTL,
+	.ops		= &snr_pcie3_uncore_pci_ops,
+	.format_group	= &skx_uncore_iio_format_group,
+};
+
 enum {
 	SNR_PCI_UNCORE_M2M,
+	SNR_PCI_UNCORE_PCIE3,
 };
 
 static struct intel_uncore_type *snr_pci_uncores[] = {
 	[SNR_PCI_UNCORE_M2M]		= &snr_uncore_m2m,
+	[SNR_PCI_UNCORE_PCIE3]		= &snr_uncore_pcie3,
 	NULL,
 };
 
@@ -4573,6 +4612,25 @@ static struct pci_driver snr_uncore_pci_driver = {
 	.id_table	= snr_uncore_pci_ids,
 };
 
+static const struct pci_device_id snr_uncore_platform_pci_ids[] = {
+	{ /* PCIe3 RP */
+		PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x334a),
+		.driver_data = UNCORE_PCI_DEV_FULL_DATA(4, 0, SNR_PCI_UNCORE_PCIE3, 0),
+	},
+	{ /* end: all zeroes */ }
+};
+
+static struct platform_driver snr_uncore_pcieport = {
+	.driver = {
+		.name = "perf_uncore_pcieport",
+	},
+};
+
+static struct uncore_platform_driver snr_uncore_platform_driver = {
+	.driver		= &snr_uncore_pcieport,
+	.pci_ids	= snr_uncore_platform_pci_ids,
+};
+
 int snr_uncore_pci_init(void)
 {
 	/* SNR UBOX DID */
@@ -4584,6 +4642,8 @@ int snr_uncore_pci_init(void)
 
 	uncore_pci_uncores = snr_pci_uncores;
 	uncore_pci_driver = &snr_uncore_pci_driver;
+	uncore_platform_driver = &snr_uncore_platform_driver;
+
 	return 0;
 }
 
-- 
2.7.4


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/7] PCI/portdrv: Create a platform device for the perf uncore driver
  2020-07-02 17:05 ` [PATCH 1/7] PCI/portdrv: Create a platform device for the perf uncore driver kan.liang
@ 2020-07-07 19:48   ` Bjorn Helgaas
  2020-07-08  3:01     ` Liang, Kan
  0 siblings, 1 reply; 11+ messages in thread
From: Bjorn Helgaas @ 2020-07-07 19:48 UTC (permalink / raw)
  To: kan.liang
  Cc: peterz, bhelgaas, mingo, linux-kernel, linux-pci,
	jeffrey.t.kirsher, olof, dan.j.williams, ak, Stephane Eranian

[+cc Stephane in case he has thoughts on the perf driver claim issue]

On Thu, Jul 02, 2020 at 10:05:11AM -0700, kan.liang@linux.intel.com wrote:
> From: Kan Liang <kan.liang@linux.intel.com>
> 
> On Snow Ridge server, several performance monitoring counters are added
> in the Root Port Configuration Space of CPU Complex PCIe Root Ports A,
> which can be used to collect the performance data between the PCIe
> devices and the components (in M2IOSF) which are responsible for
> translating and managing the requests to/from the device. The
> performance data is very useful for analyzing the performance of the
> PCIe devices.
> 
> However, the perf uncore driver cannot be loaded to register a
> performance monitoring unit (PMU) for the counters, because the PCIe
> Root Ports device already has a bonded driver portdrv_pci.
> 
> To enable the uncore PMU support for these counters on the uncore
> driver, a new solution should be introduced, which has to meet the
> requirements as below:
> - must have a reliable way to find the PCIe Root Port device from the
>   uncore driver;
> - must be able to access the uncore counters of the PCIe Root Port
>   device from the uncore driver;
> - must support hotplug. When the PCIe Root Port device is removed, the
>   uncore driver has to be notified and unregisters the uncore PMU.
> 
> A new platform device 'perf_uncore_pcieport' is introduced as part of
> the new solution, which can facilitate the enabling of the uncore PMU in
> the uncore driver. The new platform device
> - is a child device of the PCIe Root Port device. It's allocated when
>   the PCIe Root Ports A device is probed. (For SNR, the PMU counters are
>   only located in the configuration space of the PCIe Root Ports A.)
> - stores its pdev as the private driver data pointer of the PCIe Root
>   Ports A. The pdev can be easily retrieved to check the existence of
>   the platform device when removing the PCIe Root Ports A.
> - is unregistered when the PCIe Root Port A is removed. The remove()
>   method which is provided in the uncore driver will be invoked. The
>   uncore PMU will be unregistered as well.
> - doesn't share any memory and IRQ resources. The uncore driver will
>   only touch the PMU counters in the configuration space of the PCIe
>   Root Port A.

I have to admit this is clever.  I don't really *like* it, but we
don't have any very good alternatives at the moment.

I don't like the idea of a list of PCI IDs
(perf_uncore_pcieport_ids[]) below that must be updated for every
device that needs something like this.  That PCI ID information is
normally in the drivers themselves, not in bus-level code like this.

And I don't like the way this subverts the device ownership model.
Now we have several drivers (pciehp, aer, dpc, etc, plus this new perf
driver) that share the same PCI device.  And we rely on the assumption
that none of these drivers interferes with the others.

I think the best way to deal with this would be to incorporate the
existing portdrv users (pciehp, aer, dpc, etc) directly into the PCI
core so portdrv would not use pci_register_driver(), leaving the Root
Port device available for the perf driver to claim it the normal way.
But realistically I don't know when or even whether this will be done.

I think Stephane has worked around this problem in a different way,
IIRC by using pci_get_device() in a perf driver to find Ports of
interest.  That also subverts the device ownership model, and it
doesn't work naturally with hotplug, but at least it gets the device
IDs out of the PCI core and into the driver where they belong.  And
there's value in solving the same problem in the same way.

Wait a minute!  You've already used the pci_get_device() strategy
several times:

  2b3b76b5ec67 ("perf/x86/intel/uncore: Add Ice Lake server uncore support")
  fdb64822443e ("perf/x86: Add Intel Tiger Lake uncore support")
  ee49532b38dd ("perf/x86/intel/uncore: Add IMC uncore support for Snow Ridge")

So what's really different about *this* situation?  Why would you not
just continue using the same strategy?

> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
> ---
>  drivers/pci/pcie/portdrv_pci.c | 38 ++++++++++++++++++++++++++++++++++++++
>  1 file changed, 38 insertions(+)
> 
> diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
> index 3acf151..47e33b2 100644
> --- a/drivers/pci/pcie/portdrv_pci.c
> +++ b/drivers/pci/pcie/portdrv_pci.c
> @@ -15,6 +15,7 @@
>  #include <linux/init.h>
>  #include <linux/aer.h>
>  #include <linux/dmi.h>
> +#include <linux/platform_device.h>
>  
>  #include "../pci.h"
>  #include "portdrv.h"
> @@ -90,6 +91,40 @@ static const struct dev_pm_ops pcie_portdrv_pm_ops = {
>  #define PCIE_PORTDRV_PM_OPS	NULL
>  #endif /* !PM */
>  
> +static const struct pci_device_id perf_uncore_pcieport_ids[] = {
> +	{ PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x334a) },
> +	{ },
> +};
> +
> +static void perf_platform_device_register(struct pci_dev *dev)
> +{
> +	struct platform_device *pdev;
> +
> +	if (!pci_match_id(perf_uncore_pcieport_ids, dev))
> +		return;
> +
> +	pdev = platform_device_alloc("perf_uncore_pcieport", PLATFORM_DEVID_AUTO);
> +	if (!pdev)
> +		return;
> +
> +	pdev->dev.parent = &dev->dev;
> +
> +	if (platform_device_add(pdev)) {
> +		platform_device_put(pdev);
> +		return;
> +	}
> +
> +	pci_set_drvdata(dev, pdev);
> +}
> +
> +static void perf_platform_device_unregister(struct pci_dev *dev)
> +{
> +	struct platform_device *pdev = pci_get_drvdata(dev);
> +
> +	if (pdev)
> +		platform_device_unregister(pdev);
> +}
> +
>  /*
>   * pcie_portdrv_probe - Probe PCI-Express port devices
>   * @dev: PCI-Express port device being probed
> @@ -113,6 +148,8 @@ static int pcie_portdrv_probe(struct pci_dev *dev,
>  	if (status)
>  		return status;
>  
> +	perf_platform_device_register(dev);
> +
>  	pci_save_state(dev);
>  
>  	dev_pm_set_driver_flags(&dev->dev, DPM_FLAG_NO_DIRECT_COMPLETE |
> @@ -142,6 +179,7 @@ static void pcie_portdrv_remove(struct pci_dev *dev)
>  		pm_runtime_dont_use_autosuspend(&dev->dev);
>  	}
>  
> +	perf_platform_device_unregister(dev);
>  	pcie_port_device_remove(dev);
>  }
>  
> -- 
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/7] PCI/portdrv: Create a platform device for the perf uncore driver
  2020-07-07 19:48   ` Bjorn Helgaas
@ 2020-07-08  3:01     ` Liang, Kan
  2020-07-08 18:30       ` Bjorn Helgaas
  0 siblings, 1 reply; 11+ messages in thread
From: Liang, Kan @ 2020-07-08  3:01 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: peterz, bhelgaas, mingo, linux-kernel, linux-pci,
	jeffrey.t.kirsher, olof, dan.j.williams, ak, Stephane Eranian



On 7/7/2020 3:48 PM, Bjorn Helgaas wrote:
> [+cc Stephane in case he has thoughts on the perf driver claim issue]
> 
> On Thu, Jul 02, 2020 at 10:05:11AM -0700, kan.liang@linux.intel.com wrote:
>> From: Kan Liang <kan.liang@linux.intel.com>
>>
>> On Snow Ridge server, several performance monitoring counters are added
>> in the Root Port Configuration Space of CPU Complex PCIe Root Ports A,
>> which can be used to collect the performance data between the PCIe
>> devices and the components (in M2IOSF) which are responsible for
>> translating and managing the requests to/from the device. The
>> performance data is very useful for analyzing the performance of the
>> PCIe devices.
>>
>> However, the perf uncore driver cannot be loaded to register a
>> performance monitoring unit (PMU) for the counters, because the PCIe
>> Root Ports device already has a bonded driver portdrv_pci.
>>
>> To enable the uncore PMU support for these counters on the uncore
>> driver, a new solution should be introduced, which has to meet the
>> requirements as below:
>> - must have a reliable way to find the PCIe Root Port device from the
>>    uncore driver;
>> - must be able to access the uncore counters of the PCIe Root Port
>>    device from the uncore driver;
>> - must support hotplug. When the PCIe Root Port device is removed, the
>>    uncore driver has to be notified and unregisters the uncore PMU.
>>
>> A new platform device 'perf_uncore_pcieport' is introduced as part of
>> the new solution, which can facilitate the enabling of the uncore PMU in
>> the uncore driver. The new platform device
>> - is a child device of the PCIe Root Port device. It's allocated when
>>    the PCIe Root Ports A device is probed. (For SNR, the PMU counters are
>>    only located in the configuration space of the PCIe Root Ports A.)
>> - stores its pdev as the private driver data pointer of the PCIe Root
>>    Ports A. The pdev can be easily retrieved to check the existence of
>>    the platform device when removing the PCIe Root Ports A.
>> - is unregistered when the PCIe Root Port A is removed. The remove()
>>    method which is provided in the uncore driver will be invoked. The
>>    uncore PMU will be unregistered as well.
>> - doesn't share any memory and IRQ resources. The uncore driver will
>>    only touch the PMU counters in the configuration space of the PCIe
>>    Root Port A.
> 
> I have to admit this is clever.  I don't really *like* it, but we
> don't have any very good alternatives at the moment.
> 
> I don't like the idea of a list of PCI IDs
> (perf_uncore_pcieport_ids[]) below that must be updated for every
> device that needs something like this.  That PCI ID information is
> normally in the drivers themselves, not in bus-level code like this.
>

I don't want to create a platform device for every single device. So I 
added a check here. Yes, it doesn't look pretty, but I don't have a 
better solution for now.

> And I don't like the way this subverts the device ownership model.
> Now we have several drivers (pciehp, aer, dpc, etc, plus this new perf
> driver) that share the same PCI device.  And we rely on the assumption
> that none of these drivers interferes with the others.
> 
> I think the best way to deal with this would be to incorporate the
> existing portdrv users (pciehp, aer, dpc, etc) directly into the PCI
> core so portdrv would not use pci_register_driver(), leaving the Root
> Port device available for the perf driver to claim it the normal way.
> But realistically I don't know when or even whether this will be done.
> 
> I think Stephane has worked around this problem in a different way,
> IIRC by using pci_get_device() in a perf driver to find Ports of
> interest.  That also subverts the device ownership model, and it
> doesn't work naturally with hotplug, but at least it gets the device
> IDs out of the PCI core and into the driver where they belong.  And
> there's value in solving the same problem in the same way.
> 
> Wait a minute!  You've already used the pci_get_device() strategy
> several times:
> 
>    2b3b76b5ec67 ("perf/x86/intel/uncore: Add Ice Lake server uncore support")
>    fdb64822443e ("perf/x86: Add Intel Tiger Lake uncore support")
>    ee49532b38dd ("perf/x86/intel/uncore: Add IMC uncore support for Snow Ridge")
> 
> So what's really different about *this* situation?  Why would you not
> just continue using the same strategy?
>

There are three different methods (MSR, PCICFG, and MMIO) to access the 
uncore counter, while each counter can only accessed using one of the 
method. Many devices have uncore counters. All counters on the same 
device must be accessed using the same method.

The perf uncore driver abstracts three code paths for the above three 
different methods. Each code path is shared among the devices which have 
the corresponding access method.

The pci_get_device() strategy is used by the device in which counters 
can be accessed by MMIO. Currently, the only such device is the IMC 
(Integrated Memory Controller) device. The problem is that the BAR 
address of the IMC counters is located in the PCI Configuration Space of 
the Configuration Agent (Ubox) device. The perf driver is not supposed 
to bind the Ubox device while accessing the counters in another device, 
the IMC device. So the pci_get_device() is used to retrieve the pci_dev 
of the Ubox. The perf driver reads the BAR address from the Ubox and 
maps it to access the IMC counters.

The counters in the Root Port device are in the PCI configuration space 
of the device. For a device whose counters are in the PCI configuration 
space of itself, the perf driver should probe and bind the device. 
However, the Root Port device is already bound by the portdrv driver.
To maximize the code reuse in the perf driver, a platform device is 
introduced as a child of the Root Port device. So the perf driver can 
probe and bind the platform device in a similar method.

The pci_get_device() strategy may work for the case, but from the 
perspective of the perf driver, I don't thing it's a good solution. We 
have to specially handle the pci_get_device() strategy in the code path 
of the PCICFG access method. Also, it's not a complete solution, e.g. as 
you said it doesn't work naturally with hotplug.


Thanks,
Kan

>> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
>> ---
>>   drivers/pci/pcie/portdrv_pci.c | 38 ++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 38 insertions(+)
>>
>> diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
>> index 3acf151..47e33b2 100644
>> --- a/drivers/pci/pcie/portdrv_pci.c
>> +++ b/drivers/pci/pcie/portdrv_pci.c
>> @@ -15,6 +15,7 @@
>>   #include <linux/init.h>
>>   #include <linux/aer.h>
>>   #include <linux/dmi.h>
>> +#include <linux/platform_device.h>
>>   
>>   #include "../pci.h"
>>   #include "portdrv.h"
>> @@ -90,6 +91,40 @@ static const struct dev_pm_ops pcie_portdrv_pm_ops = {
>>   #define PCIE_PORTDRV_PM_OPS	NULL
>>   #endif /* !PM */
>>   
>> +static const struct pci_device_id perf_uncore_pcieport_ids[] = {
>> +	{ PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x334a) },
>> +	{ },
>> +};
>> +
>> +static void perf_platform_device_register(struct pci_dev *dev)
>> +{
>> +	struct platform_device *pdev;
>> +
>> +	if (!pci_match_id(perf_uncore_pcieport_ids, dev))
>> +		return;
>> +
>> +	pdev = platform_device_alloc("perf_uncore_pcieport", PLATFORM_DEVID_AUTO);
>> +	if (!pdev)
>> +		return;
>> +
>> +	pdev->dev.parent = &dev->dev;
>> +
>> +	if (platform_device_add(pdev)) {
>> +		platform_device_put(pdev);
>> +		return;
>> +	}
>> +
>> +	pci_set_drvdata(dev, pdev);
>> +}
>> +
>> +static void perf_platform_device_unregister(struct pci_dev *dev)
>> +{
>> +	struct platform_device *pdev = pci_get_drvdata(dev);
>> +
>> +	if (pdev)
>> +		platform_device_unregister(pdev);
>> +}
>> +
>>   /*
>>    * pcie_portdrv_probe - Probe PCI-Express port devices
>>    * @dev: PCI-Express port device being probed
>> @@ -113,6 +148,8 @@ static int pcie_portdrv_probe(struct pci_dev *dev,
>>   	if (status)
>>   		return status;
>>   
>> +	perf_platform_device_register(dev);
>> +
>>   	pci_save_state(dev);
>>   
>>   	dev_pm_set_driver_flags(&dev->dev, DPM_FLAG_NO_DIRECT_COMPLETE |
>> @@ -142,6 +179,7 @@ static void pcie_portdrv_remove(struct pci_dev *dev)
>>   		pm_runtime_dont_use_autosuspend(&dev->dev);
>>   	}
>>   
>> +	perf_platform_device_unregister(dev);
>>   	pcie_port_device_remove(dev);
>>   }
>>   
>> -- 
>> 2.7.4
>>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/7] PCI/portdrv: Create a platform device for the perf uncore driver
  2020-07-08  3:01     ` Liang, Kan
@ 2020-07-08 18:30       ` Bjorn Helgaas
  0 siblings, 0 replies; 11+ messages in thread
From: Bjorn Helgaas @ 2020-07-08 18:30 UTC (permalink / raw)
  To: Liang, Kan
  Cc: peterz, bhelgaas, mingo, linux-kernel, linux-pci,
	jeffrey.t.kirsher, olof, dan.j.williams, ak, Stephane Eranian

On Tue, Jul 07, 2020 at 11:01:38PM -0400, Liang, Kan wrote:
> On 7/7/2020 3:48 PM, Bjorn Helgaas wrote:
> > [+cc Stephane in case he has thoughts on the perf driver claim issue]
> > 
> > On Thu, Jul 02, 2020 at 10:05:11AM -0700, kan.liang@linux.intel.com wrote:
> > > From: Kan Liang <kan.liang@linux.intel.com>
> > > 
> > > On Snow Ridge server, several performance monitoring counters are added
> > > in the Root Port Configuration Space of CPU Complex PCIe Root Ports A,
> > > which can be used to collect the performance data between the PCIe
> > > devices and the components (in M2IOSF) which are responsible for
> > > translating and managing the requests to/from the device. The
> > > performance data is very useful for analyzing the performance of the
> > > PCIe devices.
> > > 
> > > However, the perf uncore driver cannot be loaded to register a
> > > performance monitoring unit (PMU) for the counters, because the PCIe
> > > Root Ports device already has a bonded driver portdrv_pci.
> > > 
> > > To enable the uncore PMU support for these counters on the uncore
> > > driver, a new solution should be introduced, which has to meet the
> > > requirements as below:
> > > - must have a reliable way to find the PCIe Root Port device from the
> > >    uncore driver;
> > > - must be able to access the uncore counters of the PCIe Root Port
> > >    device from the uncore driver;
> > > - must support hotplug. When the PCIe Root Port device is removed, the
> > >    uncore driver has to be notified and unregisters the uncore PMU.
> > > 
> > > A new platform device 'perf_uncore_pcieport' is introduced as part of
> > > the new solution, which can facilitate the enabling of the uncore PMU in
> > > the uncore driver. The new platform device
> > > - is a child device of the PCIe Root Port device. It's allocated when
> > >    the PCIe Root Ports A device is probed. (For SNR, the PMU counters are
> > >    only located in the configuration space of the PCIe Root Ports A.)
> > > - stores its pdev as the private driver data pointer of the PCIe Root
> > >    Ports A. The pdev can be easily retrieved to check the existence of
> > >    the platform device when removing the PCIe Root Ports A.
> > > - is unregistered when the PCIe Root Port A is removed. The remove()
> > >    method which is provided in the uncore driver will be invoked. The
> > >    uncore PMU will be unregistered as well.
> > > - doesn't share any memory and IRQ resources. The uncore driver will
> > >    only touch the PMU counters in the configuration space of the PCIe
> > >    Root Port A.
> > 
> > I have to admit this is clever.  I don't really *like* it, but we
> > don't have any very good alternatives at the moment.
> > 
> > I don't like the idea of a list of PCI IDs
> > (perf_uncore_pcieport_ids[]) below that must be updated for every
> > device that needs something like this.  That PCI ID information is
> > normally in the drivers themselves, not in bus-level code like this.
> 
> I don't want to create a platform device for every single device. So I added
> a check here. Yes, it doesn't look pretty, but I don't have a better
> solution for now.

I do not want to merge a stream of device IDs in portdrv_pci.c.  As
far as I'm concerned, that's a non-starter.  New devices should not
require changes in the PCI core.

> > And I don't like the way this subverts the device ownership model.
> > Now we have several drivers (pciehp, aer, dpc, etc, plus this new perf
> > driver) that share the same PCI device.  And we rely on the assumption
> > that none of these drivers interferes with the others.
> > 
> > I think the best way to deal with this would be to incorporate the
> > existing portdrv users (pciehp, aer, dpc, etc) directly into the PCI
> > core so portdrv would not use pci_register_driver(), leaving the Root
> > Port device available for the perf driver to claim it the normal way.
> > But realistically I don't know when or even whether this will be done.
> > 
> > I think Stephane has worked around this problem in a different way,
> > IIRC by using pci_get_device() in a perf driver to find Ports of
> > interest.  That also subverts the device ownership model, and it
> > doesn't work naturally with hotplug, but at least it gets the device
> > IDs out of the PCI core and into the driver where they belong.  And
> > there's value in solving the same problem in the same way.
> > 
> > Wait a minute!  You've already used the pci_get_device() strategy
> > several times:
> > 
> >    2b3b76b5ec67 ("perf/x86/intel/uncore: Add Ice Lake server uncore support")
> >    fdb64822443e ("perf/x86: Add Intel Tiger Lake uncore support")
> >    ee49532b38dd ("perf/x86/intel/uncore: Add IMC uncore support for Snow Ridge")
> > 
> > So what's really different about *this* situation?  Why would you not
> > just continue using the same strategy?
> 
> There are three different methods (MSR, PCICFG, and MMIO) to access the
> uncore counter, while each counter can only accessed using one of the
> method. Many devices have uncore counters. All counters on the same device
> must be accessed using the same method.
> 
> The perf uncore driver abstracts three code paths for the above three
> different methods. Each code path is shared among the devices which have the
> corresponding access method.
> 
> The pci_get_device() strategy is used by the device in which counters can be
> accessed by MMIO. Currently, the only such device is the IMC (Integrated
> Memory Controller) device. The problem is that the BAR address of the IMC
> counters is located in the PCI Configuration Space of the Configuration
> Agent (Ubox) device. The perf driver is not supposed to bind the Ubox device
> while accessing the counters in another device, the IMC device. So the
> pci_get_device() is used to retrieve the pci_dev of the Ubox. The perf
> driver reads the BAR address from the Ubox and maps it to access the IMC
> counters.
> 
> The counters in the Root Port device are in the PCI configuration space of
> the device. For a device whose counters are in the PCI configuration space
> of itself, the perf driver should probe and bind the device. However, the
> Root Port device is already bound by the portdrv driver.

Understood.  That's why I said above that the current portdrv design
is a problem.

> To maximize the code reuse in the perf driver, a platform device is
> introduced as a child of the Root Port device. So the perf driver can probe
> and bind the platform device in a similar method.

The current portdrv design has two struct devices for each port: the
usual one in struct pci_dev and a second in struct pcie_device.
That's already one too many, and this platform device plan would
add a third one.

It also would add an arbitrary new way to bind a driver to a PCI
device, so we end up with *two* drivers bound to the same device.  I
don't think that's a good design.

> The pci_get_device() strategy may work for the case, but from the
> perspective of the perf driver, I don't think it's a good solution. We have
> to specially handle the pci_get_device() strategy in the code path of the
> PCICFG access method. Also, it's not a complete solution, e.g. as you said
> it doesn't work naturally with hotplug.

I think pci_get_device() is also a pretty poor solution, but I think
it's the best of the current poor options because it's already pretty
widely used by EDAC drivers and others, and at least it doesn't give
the illusion that we're using a safe driver binding model.

I think the hotplug issue could be handled with bus-level
notifications (also ugly, but at least workable).  xen_pci_notifier()
is a possible example.

I would welcome any help in reworking portdrv so we could do this more
cleanly.

> > > Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
> > > ---
> > >   drivers/pci/pcie/portdrv_pci.c | 38 ++++++++++++++++++++++++++++++++++++++
> > >   1 file changed, 38 insertions(+)
> > > 
> > > diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
> > > index 3acf151..47e33b2 100644
> > > --- a/drivers/pci/pcie/portdrv_pci.c
> > > +++ b/drivers/pci/pcie/portdrv_pci.c
> > > @@ -15,6 +15,7 @@
> > >   #include <linux/init.h>
> > >   #include <linux/aer.h>
> > >   #include <linux/dmi.h>
> > > +#include <linux/platform_device.h>
> > >   #include "../pci.h"
> > >   #include "portdrv.h"
> > > @@ -90,6 +91,40 @@ static const struct dev_pm_ops pcie_portdrv_pm_ops = {
> > >   #define PCIE_PORTDRV_PM_OPS	NULL
> > >   #endif /* !PM */
> > > +static const struct pci_device_id perf_uncore_pcieport_ids[] = {
> > > +	{ PCI_DEVICE(PCI_VENDOR_ID_INTEL, 0x334a) },
> > > +	{ },
> > > +};
> > > +
> > > +static void perf_platform_device_register(struct pci_dev *dev)
> > > +{
> > > +	struct platform_device *pdev;
> > > +
> > > +	if (!pci_match_id(perf_uncore_pcieport_ids, dev))
> > > +		return;
> > > +
> > > +	pdev = platform_device_alloc("perf_uncore_pcieport", PLATFORM_DEVID_AUTO);
> > > +	if (!pdev)
> > > +		return;
> > > +
> > > +	pdev->dev.parent = &dev->dev;
> > > +
> > > +	if (platform_device_add(pdev)) {
> > > +		platform_device_put(pdev);
> > > +		return;
> > > +	}
> > > +
> > > +	pci_set_drvdata(dev, pdev);
> > > +}
> > > +
> > > +static void perf_platform_device_unregister(struct pci_dev *dev)
> > > +{
> > > +	struct platform_device *pdev = pci_get_drvdata(dev);
> > > +
> > > +	if (pdev)
> > > +		platform_device_unregister(pdev);
> > > +}
> > > +
> > >   /*
> > >    * pcie_portdrv_probe - Probe PCI-Express port devices
> > >    * @dev: PCI-Express port device being probed
> > > @@ -113,6 +148,8 @@ static int pcie_portdrv_probe(struct pci_dev *dev,
> > >   	if (status)
> > >   		return status;
> > > +	perf_platform_device_register(dev);
> > > +
> > >   	pci_save_state(dev);
> > >   	dev_pm_set_driver_flags(&dev->dev, DPM_FLAG_NO_DIRECT_COMPLETE |
> > > @@ -142,6 +179,7 @@ static void pcie_portdrv_remove(struct pci_dev *dev)
> > >   		pm_runtime_dont_use_autosuspend(&dev->dev);
> > >   	}
> > > +	perf_platform_device_unregister(dev);
> > >   	pcie_port_device_remove(dev);
> > >   }
> > > -- 
> > > 2.7.4
> > > 

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, back to index

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-02 17:05 [PATCH 0/7] Support PCIe3 uncore PMU on Snow Ridge kan.liang
2020-07-02 17:05 ` [PATCH 1/7] PCI/portdrv: Create a platform device for the perf uncore driver kan.liang
2020-07-07 19:48   ` Bjorn Helgaas
2020-07-08  3:01     ` Liang, Kan
2020-07-08 18:30       ` Bjorn Helgaas
2020-07-02 17:05 ` [PATCH 2/7] perf/x86/intel/uncore: Factor out uncore_pci_get_die_info() kan.liang
2020-07-02 17:05 ` [PATCH 3/7] perf/x86/intel/uncore: Factor out uncore_find_pmu_by_pci_dev() kan.liang
2020-07-02 17:05 ` [PATCH 4/7] perf/x86/intel/uncore: Factor out uncore_pci_pmu_register() kan.liang
2020-07-02 17:05 ` [PATCH 5/7] perf/x86/intel/uncore: Factor out uncore_pci_pmu_unregister() kan.liang
2020-07-02 17:05 ` [PATCH 6/7] perf/x86/intel/uncore: Generic support for the platform device kan.liang
2020-07-02 17:05 ` [PATCH 7/7] perf/x86/intel/uncore: Support PCIe3 unit on Snow Ridge kan.liang

Linux-PCI Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-pci/0 linux-pci/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-pci linux-pci/ https://lore.kernel.org/linux-pci \
		linux-pci@vger.kernel.org
	public-inbox-index linux-pci

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-pci


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git