All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/8] Support for 24x7 hcall interface version 2
@ 2017-06-28 23:10 Thiago Jung Bauermann
  2017-06-28 23:10 ` [PATCH v2 1/8] powerpc/perf/hv-24x7: Fix passing of catalog version number Thiago Jung Bauermann
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: Thiago Jung Bauermann @ 2017-06-28 23:10 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Sukadev Bhattiprolu, Thiago Jung Bauermann

Hello,

The hypervisor interface to access 24x7 performance counters (which collect
performance information from system power on to system power off) has been
extended in POWER9 adding new fields to the request and result element
structures.

Also, results for some domains now return more than one result element and
those need to be added to get a total count.

The first two patches fix bugs in the existing code. The following 4 patches
are code improvements and the last two finally implement support for the
changes in POWER9 described above.

POWER8 systems only support version 1 of the interface, while POWER9 systems
only support version 2. I tested these patches on POWER8 to verify that
there are no regressions, and also on POWER9 DD1.

Changes since v1:

Patch "powerpc/perf/hv-24x7: Support v2 of the hypervisor API"
- Use function instead of API version-specific macros to obtain the number
  of requests that fit in the request buffer (suggested by Sukadev).
- Added interface version parameter to request size macro instead of
  defining one macro per version (suggested by Sukadev).
- Return early with -EIO instead of -ENOTSUPP after finding the start of the
  next result in get_count_from_result if there's more than one result
  element.
- Changed h_24x7_event_commit_txn to bail out if any result has an error
  (suggested by Sukadev).

Thiago Jung Bauermann (8):
  powerpc/perf/hv-24x7: Fix passing of catalog version number
  powerpc/perf/hv-24x7: Fix off-by-one error in request_buffer check
  powerpc/perf/hv-24x7: Properly iterate through results
  powerpc-perf/hx-24x7: Don't log failed hcall twice
  powerpc/perf/hv-24x7: Fix return value of hcalls
  powerpc/perf/hv-24x7: Minor improvements
  powerpc/perf/hv-24x7: Support v2 of the hypervisor API
  powerpc/perf/hv-24x7: Aggregate result elements on POWER9 SMT8

 arch/powerpc/perf/hv-24x7.c            | 240 ++++++++++++++++++++++++---------
 arch/powerpc/perf/hv-24x7.h            |  69 ++++++++--
 arch/powerpc/platforms/pseries/Kconfig |   2 +-
 3 files changed, 239 insertions(+), 72 deletions(-)

-- 
2.13.0

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v2 1/8] powerpc/perf/hv-24x7: Fix passing of catalog version number
  2017-06-28 23:10 [PATCH v2 0/8] Support for 24x7 hcall interface version 2 Thiago Jung Bauermann
@ 2017-06-28 23:10 ` Thiago Jung Bauermann
  2017-06-28 23:10 ` [PATCH v2 2/8] powerpc/perf/hv-24x7: Fix off-by-one error in request_buffer check Thiago Jung Bauermann
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Thiago Jung Bauermann @ 2017-06-28 23:10 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Sukadev Bhattiprolu, Thiago Jung Bauermann

H_GET_24X7_CATALOG_PAGE needs to be passed the version number obtained from
the first catalog page obtained previously. This is a 64 bit number, but
create_events_from_catalog truncates it to 32-bit.

This worked on POWER8, but POWER9 actually uses the upper bits so the call
fails with H_P3 because the hypervisor doesn't recognize the version.

This patch also adds the hcall return code to the error message, which is
helpful when debugging the problem.

Fixes: 5c5cd7b50259 ("powerpc/perf/hv-24x7: parse catalog and populate sysfs with events")
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
---
 arch/powerpc/perf/hv-24x7.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index 7b2ca16b1eb4..1354cde7095c 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -670,7 +670,7 @@ static int create_events_from_catalog(struct attribute ***events_,
 	       event_data_bytes, junk_events, event_idx, event_attr_ct, i,
 	       attr_max, event_idx_last, desc_ct, long_desc_ct;
 	ssize_t ct, ev_len;
-	uint32_t catalog_version_num;
+	uint64_t catalog_version_num;
 	struct attribute **events, **event_descs, **event_long_descs;
 	struct hv_24x7_catalog_page_0 *page_0 =
 		kmem_cache_alloc(hv_page_cache, GFP_KERNEL);
@@ -706,8 +706,8 @@ static int create_events_from_catalog(struct attribute ***events_,
 	event_data_offs   = be16_to_cpu(page_0->event_data_offs);
 	event_data_len    = be16_to_cpu(page_0->event_data_len);
 
-	pr_devel("cv %zu cl %zu eec %zu edo %zu edl %zu\n",
-			(size_t)catalog_version_num, catalog_len,
+	pr_devel("cv %llu cl %zu eec %zu edo %zu edl %zu\n",
+			catalog_version_num, catalog_len,
 			event_entry_count, event_data_offs, event_data_len);
 
 	if ((MAX_4K < event_data_len)
@@ -761,8 +761,8 @@ static int create_events_from_catalog(struct attribute ***events_,
 				catalog_version_num,
 				i + event_data_offs);
 		if (hret) {
-			pr_err("failed to get event data in page %zu\n",
-					i + event_data_offs);
+			pr_err("Failed to get event data in page %zu: rc=%ld\n",
+			       i + event_data_offs, hret);
 			ret = -EIO;
 			goto e_event_data;
 		}
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 2/8] powerpc/perf/hv-24x7: Fix off-by-one error in request_buffer check
  2017-06-28 23:10 [PATCH v2 0/8] Support for 24x7 hcall interface version 2 Thiago Jung Bauermann
  2017-06-28 23:10 ` [PATCH v2 1/8] powerpc/perf/hv-24x7: Fix passing of catalog version number Thiago Jung Bauermann
@ 2017-06-28 23:10 ` Thiago Jung Bauermann
  2017-06-28 23:10 ` [PATCH v2 3/8] powerpc/perf/hv-24x7: Properly iterate through results Thiago Jung Bauermann
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Thiago Jung Bauermann @ 2017-06-28 23:10 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Sukadev Bhattiprolu, Thiago Jung Bauermann

request_buffer can hold 254 requests, so if it already has that number of
entries we can't add a new one.

Also, define constant to show where the number comes from.

Fixes: e3ee15dc5d19 ("powerpc/perf/hv-24x7: Define add_event_to_24x7_request()")
Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
---
 arch/powerpc/perf/hv-24x7.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index 1354cde7095c..141de0f708f7 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -166,6 +166,10 @@ DEFINE_PER_CPU(struct hv_24x7_hw, hv_24x7_hw);
 DEFINE_PER_CPU(char, hv_24x7_reqb[H24x7_DATA_BUFFER_SIZE]) __aligned(4096);
 DEFINE_PER_CPU(char, hv_24x7_resb[H24x7_DATA_BUFFER_SIZE]) __aligned(4096);
 
+#define MAX_NUM_REQUESTS	((H24x7_DATA_BUFFER_SIZE -		       \
+					sizeof(struct hv_24x7_request_buffer)) \
+					/ sizeof(struct hv_24x7_request))
+
 static char *event_name(struct hv_24x7_event_data *ev, int *len)
 {
 	*len = be16_to_cpu(ev->event_name_len) - 2;
@@ -1107,7 +1111,7 @@ static int add_event_to_24x7_request(struct perf_event *event,
 	int i;
 	struct hv_24x7_request *req;
 
-	if (request_buffer->num_requests > 254) {
+	if (request_buffer->num_requests >= MAX_NUM_REQUESTS) {
 		pr_devel("Too many requests for 24x7 HCALL %d\n",
 				request_buffer->num_requests);
 		return -EINVAL;
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 3/8] powerpc/perf/hv-24x7: Properly iterate through results
  2017-06-28 23:10 [PATCH v2 0/8] Support for 24x7 hcall interface version 2 Thiago Jung Bauermann
  2017-06-28 23:10 ` [PATCH v2 1/8] powerpc/perf/hv-24x7: Fix passing of catalog version number Thiago Jung Bauermann
  2017-06-28 23:10 ` [PATCH v2 2/8] powerpc/perf/hv-24x7: Fix off-by-one error in request_buffer check Thiago Jung Bauermann
@ 2017-06-28 23:10 ` Thiago Jung Bauermann
  2017-06-28 23:10 ` [PATCH v2 4/8] powerpc-perf/hx-24x7: Don't log failed hcall twice Thiago Jung Bauermann
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Thiago Jung Bauermann @ 2017-06-28 23:10 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Sukadev Bhattiprolu, Thiago Jung Bauermann

hv-24x7.h has a comment mentioning that result_buffer->results can't be
indexed as a normal array because it may contain results of variable sizes,
so fix the loop in h_24x7_event_commit_txn to take the variation into
account when iterating through results.

Another problem in that loop is that it sets h24x7hw->events[i] to NULL.
This assumes that only the i'th result maps to the i'th request, but that
is not guaranteed to be true. We need to leave the event in the array so
that we don't dereference a NULL pointer in case more than one result maps
to one request.

We still assume that each result has only one result element, so warn if
that assumption is violated.

Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
---
 arch/powerpc/perf/hv-24x7.c | 32 +++++++++++++++++++++++---------
 1 file changed, 23 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index 141de0f708f7..8b1a3e849d23 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -1144,7 +1144,9 @@ static int add_event_to_24x7_request(struct perf_event *event,
 
 static unsigned long single_24x7_request(struct perf_event *event, u64 *count)
 {
+	u16 num_elements;
 	unsigned long ret;
+	struct hv_24x7_result *result;
 	struct hv_24x7_request_buffer *request_buffer;
 	struct hv_24x7_data_result_buffer *result_buffer;
 
@@ -1166,8 +1168,14 @@ static unsigned long single_24x7_request(struct perf_event *event, u64 *count)
 		goto out;
 	}
 
+	result = result_buffer->results;
+
+	/* This code assumes that a result has only one element. */
+	num_elements = be16_to_cpu(result->num_elements_returned);
+	WARN_ON_ONCE(num_elements != 1);
+
 	/* process result from hcall */
-	*count = be64_to_cpu(result_buffer->results[0].elements[0].element_data[0]);
+	*count = be64_to_cpu(result->elements[0].element_data[0]);
 
 out:
 	put_cpu_var(hv_24x7_reqb);
@@ -1400,8 +1408,7 @@ static int h_24x7_event_commit_txn(struct pmu *pmu)
 {
 	struct hv_24x7_request_buffer *request_buffer;
 	struct hv_24x7_data_result_buffer *result_buffer;
-	struct hv_24x7_result *resb;
-	struct perf_event *event;
+	struct hv_24x7_result *res, *next_res;
 	u64 count;
 	int i, ret, txn_flags;
 	struct hv_24x7_hw *h24x7hw;
@@ -1428,13 +1435,20 @@ static int h_24x7_event_commit_txn(struct pmu *pmu)
 
 	h24x7hw = &get_cpu_var(hv_24x7_hw);
 
-	/* Update event counts from hcall */
-	for (i = 0; i < request_buffer->num_requests; i++) {
-		resb = &result_buffer->results[i];
-		count = be64_to_cpu(resb->elements[0].element_data[0]);
-		event = h24x7hw->events[i];
-		h24x7hw->events[i] = NULL;
+	/* Go through results in the result buffer to update event counts. */
+	for (i = 0, res = result_buffer->results;
+	     i < result_buffer->num_results; i++, res = next_res) {
+		struct perf_event *event = h24x7hw->events[res->result_ix];
+		u16 num_elements = be16_to_cpu(res->num_elements_returned);
+		u16 data_size = be16_to_cpu(res->result_element_data_size);
+
+		/* This code assumes that a result has only one element. */
+		WARN_ON_ONCE(num_elements != 1);
+
+		count = be64_to_cpu(res->elements[0].element_data[0]);
 		update_event_count(event, count);
+
+		next_res = (void *) res->elements[0].element_data + data_size;
 	}
 
 	put_cpu_var(hv_24x7_hw);
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 4/8] powerpc-perf/hx-24x7: Don't log failed hcall twice
  2017-06-28 23:10 [PATCH v2 0/8] Support for 24x7 hcall interface version 2 Thiago Jung Bauermann
                   ` (2 preceding siblings ...)
  2017-06-28 23:10 ` [PATCH v2 3/8] powerpc/perf/hv-24x7: Properly iterate through results Thiago Jung Bauermann
@ 2017-06-28 23:10 ` Thiago Jung Bauermann
  2017-06-28 23:10 ` [PATCH v2 5/8] powerpc/perf/hv-24x7: Fix return value of hcalls Thiago Jung Bauermann
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Thiago Jung Bauermann @ 2017-06-28 23:10 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Sukadev Bhattiprolu, Thiago Jung Bauermann

make_24x7_request already calls log_24x7_hcall if it fails, so callers
don't have to do it again.

In fact, since the latter is now only called from the former, there's no
need for a separate log_24x7_hcall anymore so remove it.

Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
---
 arch/powerpc/perf/hv-24x7.c | 35 ++++++++++++-----------------------
 1 file changed, 12 insertions(+), 23 deletions(-)

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index 8b1a3e849d23..111c61ea7416 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -1044,21 +1044,6 @@ static const struct attribute_group *attr_groups[] = {
 	NULL,
 };
 
-static void log_24x7_hcall(struct hv_24x7_request_buffer *request_buffer,
-			   struct hv_24x7_data_result_buffer *result_buffer,
-			   unsigned long ret)
-{
-	struct hv_24x7_request *req;
-
-	req = &request_buffer->requests[0];
-	pr_notice_ratelimited("hcall failed: [%d %#x %#x %d] => "
-			"ret 0x%lx (%ld) detail=0x%x failing ix=%x\n",
-			req->performance_domain, req->data_offset,
-			req->starting_ix, req->starting_lpar_ix, ret, ret,
-			result_buffer->detailed_rc,
-			result_buffer->failing_request_ix);
-}
-
 /*
  * Start the process for a new H_GET_24x7_DATA hcall.
  */
@@ -1091,8 +1076,16 @@ static int make_24x7_request(struct hv_24x7_request_buffer *request_buffer,
 			virt_to_phys(request_buffer), H24x7_DATA_BUFFER_SIZE,
 			virt_to_phys(result_buffer),  H24x7_DATA_BUFFER_SIZE);
 
-	if (ret)
-		log_24x7_hcall(request_buffer, result_buffer, ret);
+	if (ret) {
+		struct hv_24x7_request *req;
+
+		req = &request_buffer->requests[0];
+		pr_notice_ratelimited("hcall failed: [%d %#x %#x %d] => ret 0x%lx (%ld) detail=0x%x failing ix=%x\n",
+				      req->performance_domain, req->data_offset,
+				      req->starting_ix, req->starting_lpar_ix,
+				      ret, ret, result_buffer->detailed_rc,
+				      result_buffer->failing_request_ix);
+	}
 
 	return ret;
 }
@@ -1163,10 +1156,8 @@ static unsigned long single_24x7_request(struct perf_event *event, u64 *count)
 		goto out;
 
 	ret = make_24x7_request(request_buffer, result_buffer);
-	if (ret) {
-		log_24x7_hcall(request_buffer, result_buffer, ret);
+	if (ret)
 		goto out;
-	}
 
 	result = result_buffer->results;
 
@@ -1428,10 +1419,8 @@ static int h_24x7_event_commit_txn(struct pmu *pmu)
 	result_buffer = (void *)get_cpu_var(hv_24x7_resb);
 
 	ret = make_24x7_request(request_buffer, result_buffer);
-	if (ret) {
-		log_24x7_hcall(request_buffer, result_buffer, ret);
+	if (ret)
 		goto put_reqb;
-	}
 
 	h24x7hw = &get_cpu_var(hv_24x7_hw);
 
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 5/8] powerpc/perf/hv-24x7: Fix return value of hcalls
  2017-06-28 23:10 [PATCH v2 0/8] Support for 24x7 hcall interface version 2 Thiago Jung Bauermann
                   ` (3 preceding siblings ...)
  2017-06-28 23:10 ` [PATCH v2 4/8] powerpc-perf/hx-24x7: Don't log failed hcall twice Thiago Jung Bauermann
@ 2017-06-28 23:10 ` Thiago Jung Bauermann
  2017-06-28 23:10 ` [PATCH v2 6/8] powerpc/perf/hv-24x7: Minor improvements Thiago Jung Bauermann
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: Thiago Jung Bauermann @ 2017-06-28 23:10 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Sukadev Bhattiprolu, Thiago Jung Bauermann

The H_GET_24X7_CATALOG_PAGE hcall can return a signed error code, so fix
this in the code.

The H_GET_24X7_DATA hcall can return a signed error code, so fix this in
the code. Also, don't truncate it to 32 bit to use as return value for
make_24x7_request. In case of error h_24x7_event_commit_txn passes that
return value to generic code, so it should be a proper errno. The other
caller of make_24x7_request is single_24x7_request, whose callers don't
actually care which error code is returned so they are not affected by this
change.

Finally, h_24x7_get_value doesn't use the error code from
single_24x7_request, so there's no need to store it.

Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
---
 arch/powerpc/perf/hv-24x7.c | 28 +++++++++++++---------------
 1 file changed, 13 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index 111c61ea7416..7e47dcdea62a 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -264,9 +264,8 @@ static void *event_end(struct hv_24x7_event_data *ev, void *end)
 	return start + nl + dl + ldl;
 }
 
-static unsigned long h_get_24x7_catalog_page_(unsigned long phys_4096,
-					      unsigned long version,
-					      unsigned long index)
+static long h_get_24x7_catalog_page_(unsigned long phys_4096,
+				     unsigned long version, unsigned long index)
 {
 	pr_devel("h_get_24x7_catalog_page(0x%lx, %lu, %lu)",
 			phys_4096, version, index);
@@ -277,8 +276,7 @@ static unsigned long h_get_24x7_catalog_page_(unsigned long phys_4096,
 			phys_4096, version, index);
 }
 
-static unsigned long h_get_24x7_catalog_page(char page[],
-					     u64 version, u32 index)
+static long h_get_24x7_catalog_page(char page[], u64 version, u32 index)
 {
 	return h_get_24x7_catalog_page_(virt_to_phys(page),
 					version, index);
@@ -668,7 +666,7 @@ static int create_events_from_catalog(struct attribute ***events_,
 				      struct attribute ***event_descs_,
 				      struct attribute ***event_long_descs_)
 {
-	unsigned long hret;
+	long hret;
 	size_t catalog_len, catalog_page_len, event_entry_count,
 	       event_data_len, event_data_offs,
 	       event_data_bytes, junk_events, event_idx, event_attr_ct, i,
@@ -907,7 +905,7 @@ static ssize_t catalog_read(struct file *filp, struct kobject *kobj,
 			    struct bin_attribute *bin_attr, char *buf,
 			    loff_t offset, size_t count)
 {
-	unsigned long hret;
+	long hret;
 	ssize_t ret = 0;
 	size_t catalog_len = 0, catalog_page_len = 0;
 	loff_t page_offset = 0;
@@ -992,7 +990,7 @@ static ssize_t _name##_show(struct device *dev,			\
 			    struct device_attribute *dev_attr,	\
 			    char *buf)				\
 {								\
-	unsigned long hret;					\
+	long hret;						\
 	ssize_t ret = 0;					\
 	void *page = kmem_cache_alloc(hv_page_cache, GFP_USER);	\
 	struct hv_24x7_catalog_page_0 *page_0 = page;		\
@@ -1065,7 +1063,7 @@ static void init_24x7_request(struct hv_24x7_request_buffer *request_buffer,
 static int make_24x7_request(struct hv_24x7_request_buffer *request_buffer,
 			     struct hv_24x7_data_result_buffer *result_buffer)
 {
-	unsigned long ret;
+	long ret;
 
 	/*
 	 * NOTE: Due to variable number of array elements in request and
@@ -1085,9 +1083,10 @@ static int make_24x7_request(struct hv_24x7_request_buffer *request_buffer,
 				      req->starting_ix, req->starting_lpar_ix,
 				      ret, ret, result_buffer->detailed_rc,
 				      result_buffer->failing_request_ix);
+		return -EIO;
 	}
 
-	return ret;
+	return 0;
 }
 
 /*
@@ -1135,10 +1134,10 @@ static int add_event_to_24x7_request(struct perf_event *event,
 	return 0;
 }
 
-static unsigned long single_24x7_request(struct perf_event *event, u64 *count)
+static int single_24x7_request(struct perf_event *event, u64 *count)
 {
+	int ret;
 	u16 num_elements;
-	unsigned long ret;
 	struct hv_24x7_result *result;
 	struct hv_24x7_request_buffer *request_buffer;
 	struct hv_24x7_data_result_buffer *result_buffer;
@@ -1253,10 +1252,9 @@ static int h_24x7_event_init(struct perf_event *event)
 
 static u64 h_24x7_get_value(struct perf_event *event)
 {
-	unsigned long ret;
 	u64 ct;
-	ret = single_24x7_request(event, &ct);
-	if (ret)
+
+	if (single_24x7_request(event, &ct))
 		/* We checked this in event init, shouldn't fail here... */
 		return 0;
 
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 6/8] powerpc/perf/hv-24x7: Minor improvements
  2017-06-28 23:10 [PATCH v2 0/8] Support for 24x7 hcall interface version 2 Thiago Jung Bauermann
                   ` (4 preceding siblings ...)
  2017-06-28 23:10 ` [PATCH v2 5/8] powerpc/perf/hv-24x7: Fix return value of hcalls Thiago Jung Bauermann
@ 2017-06-28 23:10 ` Thiago Jung Bauermann
  2017-06-28 23:10 ` [PATCH v2 7/8] powerpc/perf/hv-24x7: Support v2 of the hypervisor API Thiago Jung Bauermann
  2017-06-28 23:10 ` [PATCH v2 8/8] powerpc/perf/hv-24x7: Aggregate result elements on POWER9 SMT8 Thiago Jung Bauermann
  7 siblings, 0 replies; 9+ messages in thread
From: Thiago Jung Bauermann @ 2017-06-28 23:10 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Sukadev Bhattiprolu, Thiago Jung Bauermann

There's an H24x7_DATA_BUFFER_SIZE constant, so use it in init_24x7_request.

There's also an HV_PERF_DOMAIN_MAX constant, so use it in
h_24x7_event_init. This makes the comment above the check redundant,
so remove it.

In add_event_to_24x7_request, a statement is terminated with a comma
instead of a semicolon. Fix it.

In hv-24x7.h, improve comments in struct hv_24x7_result.

Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
---
 arch/powerpc/perf/hv-24x7.c |  9 ++++-----
 arch/powerpc/perf/hv-24x7.h | 11 ++++++++++-
 2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index 7e47dcdea62a..043cbc78be98 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -1049,8 +1049,8 @@ static void init_24x7_request(struct hv_24x7_request_buffer *request_buffer,
 			      struct hv_24x7_data_result_buffer *result_buffer)
 {
 
-	memset(request_buffer, 0, 4096);
-	memset(result_buffer, 0, 4096);
+	memset(request_buffer, 0, H24x7_DATA_BUFFER_SIZE);
+	memset(result_buffer, 0, H24x7_DATA_BUFFER_SIZE);
 
 	request_buffer->interface_version = HV_24X7_IF_VERSION_CURRENT;
 	/* memset above set request_buffer->num_requests to 0 */
@@ -1126,7 +1126,7 @@ static int add_event_to_24x7_request(struct perf_event *event,
 	req->performance_domain = event_get_domain(event);
 	req->data_size = cpu_to_be16(8);
 	req->data_offset = cpu_to_be32(event_get_offset(event));
-	req->starting_lpar_ix = cpu_to_be16(event_get_lpar(event)),
+	req->starting_lpar_ix = cpu_to_be16(event_get_lpar(event));
 	req->max_num_lpars = cpu_to_be16(1);
 	req->starting_ix = cpu_to_be16(idx);
 	req->max_ix = cpu_to_be16(1);
@@ -1218,9 +1218,8 @@ static int h_24x7_event_init(struct perf_event *event)
 		return -EINVAL;
 	}
 
-	/* Domains above 6 are invalid */
 	domain = event_get_domain(event);
-	if (domain > 6) {
+	if (domain > HV_PERF_DOMAIN_MAX) {
 		pr_devel("invalid domain %d\n", domain);
 		return -EINVAL;
 	}
diff --git a/arch/powerpc/perf/hv-24x7.h b/arch/powerpc/perf/hv-24x7.h
index 634ef4082cdc..b95909400b2a 100644
--- a/arch/powerpc/perf/hv-24x7.h
+++ b/arch/powerpc/perf/hv-24x7.h
@@ -71,6 +71,10 @@ struct hv_24x7_result_element {
 } __packed;
 
 struct hv_24x7_result {
+	/*
+	 * The index of the 24x7 Request Structure in the 24x7 Request Buffer
+	 * used to request this result.
+	 */
 	__u8 result_ix;
 
 	/*
@@ -81,7 +85,12 @@ struct hv_24x7_result {
 	__u8 results_complete;
 	__be16 num_elements_returned;
 
-	/* This is a copy of @data_size from the corresponding hv_24x7_request */
+	/*
+	 * This is a copy of @data_size from the corresponding hv_24x7_request
+	 *
+	 * Warning: to obtain the size of each element in @elements you have
+	 * to add the size of the other members of the result_element struct.
+	 */
 	__be16 result_element_data_size;
 	__u8 reserved[0x2];
 
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 7/8] powerpc/perf/hv-24x7: Support v2 of the hypervisor API
  2017-06-28 23:10 [PATCH v2 0/8] Support for 24x7 hcall interface version 2 Thiago Jung Bauermann
                   ` (5 preceding siblings ...)
  2017-06-28 23:10 ` [PATCH v2 6/8] powerpc/perf/hv-24x7: Minor improvements Thiago Jung Bauermann
@ 2017-06-28 23:10 ` Thiago Jung Bauermann
  2017-06-28 23:10 ` [PATCH v2 8/8] powerpc/perf/hv-24x7: Aggregate result elements on POWER9 SMT8 Thiago Jung Bauermann
  7 siblings, 0 replies; 9+ messages in thread
From: Thiago Jung Bauermann @ 2017-06-28 23:10 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Sukadev Bhattiprolu, Thiago Jung Bauermann

POWER9 introduces a new version of the hypervisor API to access the 24x7
perf counters. The new version changed some of the structures used for
requests and results.

Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
---
 arch/powerpc/perf/hv-24x7.c            | 132 +++++++++++++++++++++++++++------
 arch/powerpc/perf/hv-24x7.h            |  58 ++++++++++++---
 arch/powerpc/platforms/pseries/Kconfig |   2 +-
 3 files changed, 158 insertions(+), 34 deletions(-)

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index 043cbc78be98..38f6a81683a8 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -18,6 +18,7 @@
 #include <linux/slab.h>
 #include <linux/vmalloc.h>
 
+#include <asm/cputhreads.h>
 #include <asm/firmware.h>
 #include <asm/hvcall.h>
 #include <asm/io.h>
@@ -27,6 +28,9 @@
 #include "hv-24x7-catalog.h"
 #include "hv-common.h"
 
+/* Version of the 24x7 hypervisor API that we should use in this machine. */
+static int interface_version;
+
 static bool domain_is_valid(unsigned domain)
 {
 	switch (domain) {
@@ -74,7 +78,11 @@ static const char *domain_name(unsigned domain)
 
 static bool catalog_entry_domain_is_valid(unsigned domain)
 {
-	return is_physical_domain(domain);
+	/* POWER8 doesn't support virtual domains. */
+	if (interface_version == 1)
+		return is_physical_domain(domain);
+	else
+		return domain_is_valid(domain);
 }
 
 /*
@@ -166,9 +174,11 @@ DEFINE_PER_CPU(struct hv_24x7_hw, hv_24x7_hw);
 DEFINE_PER_CPU(char, hv_24x7_reqb[H24x7_DATA_BUFFER_SIZE]) __aligned(4096);
 DEFINE_PER_CPU(char, hv_24x7_resb[H24x7_DATA_BUFFER_SIZE]) __aligned(4096);
 
-#define MAX_NUM_REQUESTS	((H24x7_DATA_BUFFER_SIZE -		       \
-					sizeof(struct hv_24x7_request_buffer)) \
-					/ sizeof(struct hv_24x7_request))
+static unsigned int max_num_requests(int interface_version)
+{
+	return (H24x7_DATA_BUFFER_SIZE - sizeof(struct hv_24x7_request_buffer))
+		/ H24x7_REQUEST_SIZE(interface_version);
+}
 
 static char *event_name(struct hv_24x7_event_data *ev, int *len)
 {
@@ -1052,7 +1062,7 @@ static void init_24x7_request(struct hv_24x7_request_buffer *request_buffer,
 	memset(request_buffer, 0, H24x7_DATA_BUFFER_SIZE);
 	memset(result_buffer, 0, H24x7_DATA_BUFFER_SIZE);
 
-	request_buffer->interface_version = HV_24X7_IF_VERSION_CURRENT;
+	request_buffer->interface_version = interface_version;
 	/* memset above set request_buffer->num_requests to 0 */
 }
 
@@ -1077,7 +1087,7 @@ static int make_24x7_request(struct hv_24x7_request_buffer *request_buffer,
 	if (ret) {
 		struct hv_24x7_request *req;
 
-		req = &request_buffer->requests[0];
+		req = request_buffer->requests;
 		pr_notice_ratelimited("hcall failed: [%d %#x %#x %d] => ret 0x%lx (%ld) detail=0x%x failing ix=%x\n",
 				      req->performance_domain, req->data_offset,
 				      req->starting_ix, req->starting_lpar_ix,
@@ -1101,9 +1111,11 @@ static int add_event_to_24x7_request(struct perf_event *event,
 {
 	u16 idx;
 	int i;
+	size_t req_size;
 	struct hv_24x7_request *req;
 
-	if (request_buffer->num_requests >= MAX_NUM_REQUESTS) {
+	if (request_buffer->num_requests >=
+	    max_num_requests(request_buffer->interface_version)) {
 		pr_devel("Too many requests for 24x7 HCALL %d\n",
 				request_buffer->num_requests);
 		return -EINVAL;
@@ -1120,8 +1132,10 @@ static int add_event_to_24x7_request(struct perf_event *event,
 		idx = event_get_vcpu(event);
 	}
 
+	req_size = H24x7_REQUEST_SIZE(request_buffer->interface_version);
+
 	i = request_buffer->num_requests++;
-	req = &request_buffer->requests[i];
+	req = (void *) request_buffer->requests + i * req_size;
 
 	req->performance_domain = event_get_domain(event);
 	req->data_size = cpu_to_be16(8);
@@ -1131,14 +1145,86 @@ static int add_event_to_24x7_request(struct perf_event *event,
 	req->starting_ix = cpu_to_be16(idx);
 	req->max_ix = cpu_to_be16(1);
 
+	if (request_buffer->interface_version > 1 &&
+	    req->performance_domain != HV_PERF_DOMAIN_PHYS_CHIP) {
+		req->starting_thread_group_ix = idx % 2;
+		req->max_num_thread_groups = 1;
+	}
+
+	return 0;
+}
+
+/**
+ * get_count_from_result - get event count from the given result
+ *
+ * @event:	Event associated with @res.
+ * @resb:	Result buffer containing @res.
+ * @res:	Result to work on.
+ * @countp:	Output variable containing the event count.
+ * @next:	Optional output variable pointing to the next result in @resb.
+ */
+static int get_count_from_result(struct perf_event *event,
+				 struct hv_24x7_data_result_buffer *resb,
+				 struct hv_24x7_result *res, u64 *countp,
+				 struct hv_24x7_result **next)
+{
+	u16 num_elements = be16_to_cpu(res->num_elements_returned);
+	u16 data_size = be16_to_cpu(res->result_element_data_size);
+	unsigned int data_offset;
+	void *element_data;
+
+	/*
+	 * We can bail out early if the result is empty.
+	 */
+	if (!num_elements) {
+		pr_debug("Result of request %hhu is empty, nothing to do\n",
+			 res->result_ix);
+
+		if (next)
+			*next = (struct hv_24x7_result *) res->elements;
+
+		return -ENODATA;
+	}
+
+	/*
+	 * Since we always specify 1 as the maximum for the smallest resource
+	 * we're requesting, there should to be only one element per result.
+	 */
+	if (num_elements != 1) {
+		pr_err("Error: result of request %hhu has %hu elements\n",
+		       res->result_ix, num_elements);
+
+		return -EIO;
+	}
+
+	if (data_size != sizeof(u64)) {
+		pr_debug("Error: result of request %hhu has data of %hu bytes\n",
+			 res->result_ix, data_size);
+
+		return -ENOTSUPP;
+	}
+
+	if (resb->interface_version == 1)
+		data_offset = offsetof(struct hv_24x7_result_element_v1,
+				       element_data);
+	else
+		data_offset = offsetof(struct hv_24x7_result_element_v2,
+				       element_data);
+
+	element_data = res->elements + data_offset;
+
+	*countp = be64_to_cpu(*((u64 *) element_data));
+
+	/* The next result is after the result element. */
+	if (next)
+		*next = element_data + data_size;
+
 	return 0;
 }
 
 static int single_24x7_request(struct perf_event *event, u64 *count)
 {
 	int ret;
-	u16 num_elements;
-	struct hv_24x7_result *result;
 	struct hv_24x7_request_buffer *request_buffer;
 	struct hv_24x7_data_result_buffer *result_buffer;
 
@@ -1158,14 +1244,9 @@ static int single_24x7_request(struct perf_event *event, u64 *count)
 	if (ret)
 		goto out;
 
-	result = result_buffer->results;
-
-	/* This code assumes that a result has only one element. */
-	num_elements = be16_to_cpu(result->num_elements_returned);
-	WARN_ON_ONCE(num_elements != 1);
-
 	/* process result from hcall */
-	*count = be64_to_cpu(result->elements[0].element_data[0]);
+	ret = get_count_from_result(event, result_buffer,
+				    result_buffer->results, count, NULL);
 
 out:
 	put_cpu_var(hv_24x7_reqb);
@@ -1425,16 +1506,13 @@ static int h_24x7_event_commit_txn(struct pmu *pmu)
 	for (i = 0, res = result_buffer->results;
 	     i < result_buffer->num_results; i++, res = next_res) {
 		struct perf_event *event = h24x7hw->events[res->result_ix];
-		u16 num_elements = be16_to_cpu(res->num_elements_returned);
-		u16 data_size = be16_to_cpu(res->result_element_data_size);
 
-		/* This code assumes that a result has only one element. */
-		WARN_ON_ONCE(num_elements != 1);
+		ret = get_count_from_result(event, result_buffer, res, &count,
+					    &next_res);
+		if (ret)
+			break;
 
-		count = be64_to_cpu(res->elements[0].element_data[0]);
 		update_event_count(event, count);
-
-		next_res = (void *) res->elements[0].element_data + data_size;
 	}
 
 	put_cpu_var(hv_24x7_hw);
@@ -1486,6 +1564,12 @@ static int hv_24x7_init(void)
 		return -ENODEV;
 	}
 
+	/* POWER8 only supports v1, while POWER9 only supports v2. */
+	if (cpu_has_feature(CPU_FTR_ARCH_300))
+		interface_version = 2;
+	else
+		interface_version = 1;
+
 	hret = hv_perf_caps_get(&caps);
 	if (hret) {
 		pr_debug("could not obtain capabilities, not enabling, rc=%ld\n",
diff --git a/arch/powerpc/perf/hv-24x7.h b/arch/powerpc/perf/hv-24x7.h
index b95909400b2a..5092c4a222a6 100644
--- a/arch/powerpc/perf/hv-24x7.h
+++ b/arch/powerpc/perf/hv-24x7.h
@@ -10,6 +10,8 @@ enum hv_perf_domains {
 	HV_PERF_DOMAIN_MAX,
 };
 
+#define H24x7_REQUEST_SIZE(iface_version)	(iface_version == 1 ? 16 : 32)
+
 struct hv_24x7_request {
 	/* PHYSICAL domains require enabling via phyp/hmc. */
 	__u8 performance_domain;
@@ -42,19 +44,47 @@ struct hv_24x7_request {
 	/* chip, core, or virtual processor based on @performance_domain */
 	__be16 starting_ix;
 	__be16 max_ix;
+
+	/* The following fields were added in v2 of the 24x7 interface. */
+
+	__u8 starting_thread_group_ix;
+
+	/* -1 means all thread groups starting at @starting_thread_group_ix */
+	__u8 max_num_thread_groups;
+
+	__u8 reserved2[0xE];
 } __packed;
 
 struct hv_24x7_request_buffer {
 	/* 0 - ? */
 	/* 1 - ? */
-#define HV_24X7_IF_VERSION_CURRENT 0x01
 	__u8 interface_version;
 	__u8 num_requests;
 	__u8 reserved[0xE];
-	struct hv_24x7_request requests[1];
+	struct hv_24x7_request requests[];
+} __packed;
+
+struct hv_24x7_result_element_v1 {
+	__be16 lpar_ix;
+
+	/*
+	 * represents the core, chip, or virtual processor based on the
+	 * request's @performance_domain
+	 */
+	__be16 domain_ix;
+
+	/* -1 if @performance_domain does not refer to a virtual processor */
+	__be32 lpar_cfg_instance_id;
+
+	/* size = @result_element_data_size of containing result. */
+	__u64 element_data[];
 } __packed;
 
-struct hv_24x7_result_element {
+/*
+ * We need a separate struct for v2 because the offset of @element_data changed
+ * between versions.
+ */
+struct hv_24x7_result_element_v2 {
 	__be16 lpar_ix;
 
 	/*
@@ -66,8 +96,12 @@ struct hv_24x7_result_element {
 	/* -1 if @performance_domain does not refer to a virtual processor */
 	__be32 lpar_cfg_instance_id;
 
+	__u8 thread_group_ix;
+
+	__u8 reserved[7];
+
 	/* size = @result_element_data_size of containing result. */
-	__u64 element_data[1];
+	__u64 element_data[];
 } __packed;
 
 struct hv_24x7_result {
@@ -94,10 +128,16 @@ struct hv_24x7_result {
 	__be16 result_element_data_size;
 	__u8 reserved[0x2];
 
-	/* WARNING: only valid for first result element due to variable sizes
-	 *          of result elements */
-	/* struct hv_24x7_result_element[@num_elements_returned] */
-	struct hv_24x7_result_element elements[1];
+	/*
+	 * Either
+	 *	struct hv_24x7_result_element_v1[@num_elements_returned]
+	 * or
+	 *	struct hv_24x7_result_element_v2[@num_elements_returned]
+	 *
+	 * depending on the interface_version field of the
+	 * struct hv_24x7_data_result_buffer containing this result.
+	 */
+	char elements[];
 } __packed;
 
 struct hv_24x7_data_result_buffer {
@@ -113,7 +153,7 @@ struct hv_24x7_data_result_buffer {
 	__u8 reserved2[0x8];
 	/* WARNING: only valid for the first result due to variable sizes of
 	 *	    results */
-	struct hv_24x7_result results[1]; /* [@num_results] */
+	struct hv_24x7_result results[]; /* [@num_results] */
 } __packed;
 
 #endif
diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig
index 913c54e23eea..3a6dfd14f64b 100644
--- a/arch/powerpc/platforms/pseries/Kconfig
+++ b/arch/powerpc/platforms/pseries/Kconfig
@@ -124,7 +124,7 @@ config HV_PERF_CTRS
 	  Enable access to hypervisor supplied counters in perf. Currently,
 	  this enables code that uses the hcall GetPerfCounterInfo and 24x7
 	  interfaces to retrieve counters. GPCI exists on Power 6 and later
-	  systems. 24x7 is available on Power 8 systems.
+	  systems. 24x7 is available on Power 8 and later systems.
 
           If unsure, select Y.
 
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v2 8/8] powerpc/perf/hv-24x7: Aggregate result elements on POWER9 SMT8
  2017-06-28 23:10 [PATCH v2 0/8] Support for 24x7 hcall interface version 2 Thiago Jung Bauermann
                   ` (6 preceding siblings ...)
  2017-06-28 23:10 ` [PATCH v2 7/8] powerpc/perf/hv-24x7: Support v2 of the hypervisor API Thiago Jung Bauermann
@ 2017-06-28 23:10 ` Thiago Jung Bauermann
  7 siblings, 0 replies; 9+ messages in thread
From: Thiago Jung Bauermann @ 2017-06-28 23:10 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Sukadev Bhattiprolu, Thiago Jung Bauermann

On POWER9 SMT8 the 24x7 API returns two result elements for physical core
and virtual CPU events and we need to add their counts to get the final
result.

Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
---
 arch/powerpc/perf/hv-24x7.c | 54 +++++++++++++++++++++++++++++++++++----------
 1 file changed, 42 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index 38f6a81683a8..8a61b2415186 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -31,6 +31,9 @@
 /* Version of the 24x7 hypervisor API that we should use in this machine. */
 static int interface_version;
 
+/* Whether we have to aggregate result data for some domains. */
+static bool aggregate_result_elements;
+
 static bool domain_is_valid(unsigned domain)
 {
 	switch (domain) {
@@ -58,6 +61,15 @@ static bool is_physical_domain(unsigned domain)
 	}
 }
 
+/* Domains for which more than one result element are returned for each event. */
+static bool domain_needs_aggregation(unsigned int domain)
+{
+	return aggregate_result_elements &&
+			(domain == HV_PERF_DOMAIN_PHYS_CORE ||
+			 (domain >= HV_PERF_DOMAIN_VCPU_HOME_CORE &&
+			  domain <= HV_PERF_DOMAIN_VCPU_REMOTE_NODE));
+}
+
 static const char *domain_name(unsigned domain)
 {
 	if (!domain_is_valid(domain))
@@ -1145,17 +1157,23 @@ static int add_event_to_24x7_request(struct perf_event *event,
 	req->starting_ix = cpu_to_be16(idx);
 	req->max_ix = cpu_to_be16(1);
 
-	if (request_buffer->interface_version > 1 &&
-	    req->performance_domain != HV_PERF_DOMAIN_PHYS_CHIP) {
-		req->starting_thread_group_ix = idx % 2;
-		req->max_num_thread_groups = 1;
+	if (request_buffer->interface_version > 1) {
+		if (domain_needs_aggregation(req->performance_domain))
+			req->max_num_thread_groups = -1;
+		else if (req->performance_domain != HV_PERF_DOMAIN_PHYS_CHIP) {
+			req->starting_thread_group_ix = idx % 2;
+			req->max_num_thread_groups = 1;
+		}
 	}
 
 	return 0;
 }
 
 /**
- * get_count_from_result - get event count from the given result
+ * get_count_from_result - get event count from all result elements in result
+ *
+ * If the event corresponding to this result needs aggregation of the result
+ * element values, then this function does that.
  *
  * @event:	Event associated with @res.
  * @resb:	Result buffer containing @res.
@@ -1172,6 +1190,8 @@ static int get_count_from_result(struct perf_event *event,
 	u16 data_size = be16_to_cpu(res->result_element_data_size);
 	unsigned int data_offset;
 	void *element_data;
+	int i;
+	u64 count;
 
 	/*
 	 * We can bail out early if the result is empty.
@@ -1189,8 +1209,10 @@ static int get_count_from_result(struct perf_event *event,
 	/*
 	 * Since we always specify 1 as the maximum for the smallest resource
 	 * we're requesting, there should to be only one element per result.
+	 * Except when an event needs aggregation, in which case there are more.
 	 */
-	if (num_elements != 1) {
+	if (num_elements != 1 &&
+	    !domain_needs_aggregation(event_get_domain(event))) {
 		pr_err("Error: result of request %hhu has %hu elements\n",
 		       res->result_ix, num_elements);
 
@@ -1211,13 +1233,17 @@ static int get_count_from_result(struct perf_event *event,
 		data_offset = offsetof(struct hv_24x7_result_element_v2,
 				       element_data);
 
-	element_data = res->elements + data_offset;
+	/* Go through the result elements in the result. */
+	for (i = count = 0, element_data = res->elements + data_offset;
+	     i < num_elements;
+	     i++, element_data += data_size + data_offset)
+		count += be64_to_cpu(*((u64 *) element_data));
 
-	*countp = be64_to_cpu(*((u64 *) element_data));
+	*countp = count;
 
-	/* The next result is after the result element. */
+	/* The next result is after the last result element. */
 	if (next)
-		*next = element_data + data_size;
+		*next = element_data - data_offset;
 
 	return 0;
 }
@@ -1565,9 +1591,13 @@ static int hv_24x7_init(void)
 	}
 
 	/* POWER8 only supports v1, while POWER9 only supports v2. */
-	if (cpu_has_feature(CPU_FTR_ARCH_300))
+	if (cpu_has_feature(CPU_FTR_ARCH_300)) {
 		interface_version = 2;
-	else
+
+		/* SMT8 in POWER9 needs to aggregate result elements. */
+		if (threads_per_core == 8)
+			aggregate_result_elements = true;
+	} else
 		interface_version = 1;
 
 	hret = hv_perf_caps_get(&caps);
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2017-06-28 23:11 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-28 23:10 [PATCH v2 0/8] Support for 24x7 hcall interface version 2 Thiago Jung Bauermann
2017-06-28 23:10 ` [PATCH v2 1/8] powerpc/perf/hv-24x7: Fix passing of catalog version number Thiago Jung Bauermann
2017-06-28 23:10 ` [PATCH v2 2/8] powerpc/perf/hv-24x7: Fix off-by-one error in request_buffer check Thiago Jung Bauermann
2017-06-28 23:10 ` [PATCH v2 3/8] powerpc/perf/hv-24x7: Properly iterate through results Thiago Jung Bauermann
2017-06-28 23:10 ` [PATCH v2 4/8] powerpc-perf/hx-24x7: Don't log failed hcall twice Thiago Jung Bauermann
2017-06-28 23:10 ` [PATCH v2 5/8] powerpc/perf/hv-24x7: Fix return value of hcalls Thiago Jung Bauermann
2017-06-28 23:10 ` [PATCH v2 6/8] powerpc/perf/hv-24x7: Minor improvements Thiago Jung Bauermann
2017-06-28 23:10 ` [PATCH v2 7/8] powerpc/perf/hv-24x7: Support v2 of the hypervisor API Thiago Jung Bauermann
2017-06-28 23:10 ` [PATCH v2 8/8] powerpc/perf/hv-24x7: Aggregate result elements on POWER9 SMT8 Thiago Jung Bauermann

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.