linux-kernel.vger.kernel.org archive mirror
* [PATCHv2 0/6] x86/platform/uv/BAU: UV4 message completion and initialization updates
@ 2017-02-17 22:06 Andrew Banman
  2017-02-17 22:06 ` [PATCH 1/6] x86/platform/uv/BAU: Add uv_bau_version enumerated constants Andrew Banman
                   ` (6 more replies)
  0 siblings, 7 replies; 13+ messages in thread
From: Andrew Banman @ 2017-02-17 22:06 UTC (permalink / raw)
  To: tglx
  Cc: mingo, akpm, hpa, mike.travis, rja, sivanich, x86, linux-kernel, abanman

The following patch series adds the necessary functionality to make the BAU
on UV4 operational. The purpose of these patches is to implement the correct
message completion logic on UV4. Also included is a bug fix to add a field
to the INTD payload. This is needed to verify the source of each message.

As of this patch set, the BAU operates without errors and performance tests
show TLB shootdowns take up to 42% less time with the BAU enabled.

The patches are summarized as follows:

(1) Populate a message payload field to verify messages at the destination.
    Without this verification, the destination agent triggers a HUB error,
    resulting in an NMI.

    [PATCH 1/6] x86/platform/uv/BAU: Add uv_bau_version enumerated constants
    [PATCH v2 2/6] x86/platform/uv/BAU: Add payload descriptor qualifier

    This bug fix is included at the start of the series to avoid conflicts
    in a code path shared by the rest of the series.

(2) Make the wait_completion routine part of the bau_operations interface,
    and add a uv4_wait_completion routine to employ new completion logic.

    The message completion logic for previous generations relies on software-
    defined timeouts that are not implemented on UV4. Without these patches,
    the BAU driver on UV4 erroneously identifies a UV2-WAR timeout during
    normal operation.

    [PATCH 3/6] x86/platform/uv/BAU: Cleanup bau_operations declaration
    [PATCH 4/6] x86/platform/uv/BAU: Add status mmr location fields to bau_control
    [PATCH v2 5/6] x86/platform/uv/BAU: Add wait_completion to bau_operations
    [PATCH v2 6/6] x86/platform/uv/BAU: Implement uv4_wait_completion with read_status


Please see the commit messages for details on the motivation and content of
each patch.

Thank you,

Andrew Banman
HPE, Linux Kernel Engineer
<abanman@hpe.com>

* [PATCH 1/6] x86/platform/uv/BAU: Add uv_bau_version enumerated constants
  2017-02-17 22:06 [PATCHv2 0/6] x86/platform/uv/BAU: UV4 message completion and initialization updates Andrew Banman
@ 2017-02-17 22:06 ` Andrew Banman
  2017-02-17 22:06 ` [PATCH v2 2/6] x86/platform/uv/BAU: Add payload descriptor qualifier Andrew Banman
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Andrew Banman @ 2017-02-17 22:06 UTC (permalink / raw)
  To: tglx
  Cc: mingo, akpm, hpa, mike.travis, rja, sivanich, x86, linux-kernel, abanman

Define enumerated constants for each UV hub version and replace magic
numbers with the appropriate constant. This makes our checks against
uvhub_version more robust, and any use of unsupported archs will be
caught during compilation.

Signed-off-by: Andrew Banman <abanman@hpe.com>
Acked-by: Mike Travis <mike.travis@hpe.com>
---
 arch/x86/include/asm/uv/uv_bau.h |  7 +++++++
 arch/x86/platform/uv/tlb_uv.c    | 16 ++++++++--------
 2 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/uv/uv_bau.h b/arch/x86/include/asm/uv/uv_bau.h
index 57ab86d..768093f 100644
--- a/arch/x86/include/asm/uv/uv_bau.h
+++ b/arch/x86/include/asm/uv/uv_bau.h
@@ -185,6 +185,13 @@
 #define MSG_REGULAR			1
 #define MSG_RETRY			2
 
+enum uv_bau_version {
+	UV_BAU_V1 = 1,
+	UV_BAU_V2,
+	UV_BAU_V3,
+	UV_BAU_V4,
+};
+
 /*
  * Distribution: 32 bytes (256 bits) (bytes 0-0x1f of descriptor)
  * If the 'multilevel' flag in the header portion of the descriptor
diff --git a/arch/x86/platform/uv/tlb_uv.c b/arch/x86/platform/uv/tlb_uv.c
index f25982c..f4f5aa6 100644
--- a/arch/x86/platform/uv/tlb_uv.c
+++ b/arch/x86/platform/uv/tlb_uv.c
@@ -724,7 +724,7 @@ static int wait_completion(struct bau_desc *bau_desc, struct bau_control *bcp, l
 		right_shift = ((desc - UV_CPUS_PER_AS) * UV_ACT_STATUS_SIZE);
 	}
 
-	if (bcp->uvhub_version == 1)
+	if (bcp->uvhub_version == UV_BAU_V1)
 		return uv1_wait_completion(bau_desc, mmr_offset, right_shift, bcp, try);
 	else
 		return uv2_3_wait_completion(bau_desc, mmr_offset, right_shift, bcp, try);
@@ -918,7 +918,7 @@ int uv_flush_send_and_wait(struct cpumask *flush_mask, struct bau_control *bcp,
 	struct uv1_bau_msg_header *uv1_hdr = NULL;
 	struct uv2_3_bau_msg_header *uv2_3_hdr = NULL;
 
-	if (bcp->uvhub_version == 1) {
+	if (bcp->uvhub_version == UV_BAU_V1) {
 		uv1 = 1;
 		uv1_throttle(hmaster, stat);
 	}
@@ -1296,7 +1296,7 @@ void uv_bau_message_interrupt(struct pt_regs *regs)
 
 		msgdesc.msg_slot = msg - msgdesc.queue_first;
 		msgdesc.msg = msg;
-		if (bcp->uvhub_version == 2)
+		if (bcp->uvhub_version == UV_BAU_V2)
 			process_uv2_message(&msgdesc, bcp);
 		else
 			/* no error workaround for uv1 or uv3 */
@@ -1838,7 +1838,7 @@ static void pq_init(int node, int pnode)
 	 * and the payload queue tail must be maintained by the kernel.
 	 */
 	bcp = &per_cpu(bau_control, smp_processor_id());
-	if (bcp->uvhub_version <= 3) {
+	if (bcp->uvhub_version <= UV_BAU_V3) {
 		tail = first;
 		gnode = uv_gpa_to_gnode(uv_gpa(pqp));
 		first = (gnode << UV_PAYLOADQ_GNODE_SHIFT) | tail;
@@ -2052,13 +2052,13 @@ static int scan_sock(struct socket_desc *sdp, struct uvhub_desc *bdp,
 		bcp->socket_master = *smasterp;
 		bcp->uvhub = bdp->uvhub;
 		if (is_uv1_hub())
-			bcp->uvhub_version = 1;
+			bcp->uvhub_version = UV_BAU_V1;
 		else if (is_uv2_hub())
-			bcp->uvhub_version = 2;
+			bcp->uvhub_version = UV_BAU_V2;
 		else if (is_uv3_hub())
-			bcp->uvhub_version = 3;
+			bcp->uvhub_version = UV_BAU_V3;
 		else if (is_uv4_hub())
-			bcp->uvhub_version = 4;
+			bcp->uvhub_version = UV_BAU_V4;
 		else {
 			pr_emerg("uvhub version not 1, 2, 3, or 4\n");
 			return 1;
-- 
1.8.2.1
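
A minimal user-space sketch of why named constants help here, assuming a compiler that warns about unhandled enumerators in a switch (for example GCC or Clang with -Wswitch). The enum is copied from the patch; bau_version_name() and kernel_maintains_pq_tail() are hypothetical helpers written for illustration and do not exist in the driver.

```c
#include <stdio.h>

/* Copied from the patch: one constant per supported UV hub generation. */
enum uv_bau_version {
	UV_BAU_V1 = 1,
	UV_BAU_V2,
	UV_BAU_V3,
	UV_BAU_V4,
};

/*
 * Hypothetical helper. Because the switch has no default label, adding a
 * new enumerator without handling it here produces a -Wswitch warning.
 */
static const char *bau_version_name(enum uv_bau_version v)
{
	switch (v) {
	case UV_BAU_V1: return "UV1";
	case UV_BAU_V2: return "UV2";
	case UV_BAU_V3: return "UV3";
	case UV_BAU_V4: return "UV4";
	}
	return "unknown";
}

/* Hypothetical predicate modelled on the "<= UV_BAU_V3" check in pq_init(). */
static int kernel_maintains_pq_tail(enum uv_bau_version v)
{
	return v <= UV_BAU_V3;
}
```

Ordered comparisons such as the one in pq_init() keep working because the enumerators are assigned consecutive values starting at 1.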

* [PATCH v2 2/6] x86/platform/uv/BAU: Add payload descriptor qualifier
  2017-02-17 22:06 [PATCHv2 0/6] x86/platform/uv/BAU: UV4 message completion and initialization updates Andrew Banman
  2017-02-17 22:06 ` [PATCH 1/6] x86/platform/uv/BAU: Add uv_bau_version enumerated constants Andrew Banman
@ 2017-02-17 22:06 ` Andrew Banman
  2017-02-17 22:06 ` [PATCH 3/6] x86/platform/uv/BAU: Cleanup bau_operations declaration and instances Andrew Banman
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Andrew Banman @ 2017-02-17 22:06 UTC (permalink / raw)
  To: tglx
  Cc: mingo, akpm, hpa, mike.travis, rja, sivanich, x86, linux-kernel, abanman

On UV4, the destination agent verifies each message by checking the
descriptor qualifier field of the message payload. Messages without this
field set to 0x534749 will cause a hub error to assert.

Split bau_message_payload into uv1_2_3 and uv4 versions to account for the
different payload formats. Enforce the size of each field by using the
appropriate integer type. Replace extraneous comments with a kernel-doc
comment for each struct.

Populate the descriptor qualifier field before the INTD broadcast is
initiated by the sender. This ensures each message sent with the BAU
driver is verified.

Signed-off-by: Andrew Banman <abanman@hpe.com>
Acked-by: Mike Travis <mike.travis@hpe.com>
---
 arch/x86/include/asm/uv/uv_bau.h | 40 ++++++++++++++++++++++++++++------------
 arch/x86/platform/uv/tlb_uv.c    | 27 +++++++++++++++++++--------
 2 files changed, 47 insertions(+), 20 deletions(-)

diff --git a/arch/x86/include/asm/uv/uv_bau.h b/arch/x86/include/asm/uv/uv_bau.h
index 768093f..1ed0574 100644
--- a/arch/x86/include/asm/uv/uv_bau.h
+++ b/arch/x86/include/asm/uv/uv_bau.h
@@ -185,6 +185,8 @@
 #define MSG_REGULAR			1
 #define MSG_RETRY			2
 
+#define BAU_DESC_QUALIFIER		0x534749
+
 enum uv_bau_version {
 	UV_BAU_V1 = 1,
 	UV_BAU_V2,
@@ -229,20 +231,31 @@ struct bau_local_cpumask {
  *   the s/w ack bit vector  ]
  */
 
-/*
- * The payload is software-defined for INTD transactions
+/**
+ * struct uv1_2_3_bau_msg_payload - defines payload for INTD transactions
+ * @address: signifies a page or all TLB's of the cpu
+ * @acknowledge_count: CPUs on the destination Hub that received the interrupt
  */
-struct bau_msg_payload {
-	unsigned long	address;		/* signifies a page or all
-						   TLB's of the cpu */
-	/* 64 bits */
-	unsigned short	sending_cpu;		/* filled in by sender */
-	/* 16 bits */
-	unsigned short	acknowledge_count;	/* filled in by destination */
-	/* 16 bits */
-	unsigned int	reserved1:32;		/* not usable */
+struct uv1_2_3_bau_msg_payload {
+	u64 address;
+	u16 sending_cpu;
+	u16 acknowledge_count;
+	u32 reserved1;
 };
 
+/**
+ * struct uv4_bau_msg_payload - defines payload for INTD transactions
+ * @address: signifies a page or all TLB's of the cpu
+ * @acknowledge_count: CPUs on the destination Hub that received the interrupt
+ * @qualifier: set by source to verify origin of INTD broadcast
+ */
+struct uv4_bau_msg_payload {
+	u64 address;
+	u16 sending_cpu;
+	u16 acknowledge_count;
+	u32 reserved1:8;
+	u32 qualifier:24;
+};
 
 /*
  * UV1 Message header:  16 bytes (128 bits) (bytes 0x30-0x3f of descriptor)
@@ -418,7 +431,10 @@ struct bau_desc {
 		struct uv2_3_bau_msg_header	uv2_3_hdr;
 	} header;
 
-	struct bau_msg_payload			payload;
+	union bau_payload_header {
+		struct uv1_2_3_bau_msg_payload	uv1_2_3;
+		struct uv4_bau_msg_payload	uv4;
+	} payload;
 };
 /* UV1:
  *   -payload--    ---------header------
diff --git a/arch/x86/platform/uv/tlb_uv.c b/arch/x86/platform/uv/tlb_uv.c
index f4f5aa6..70721c4 100644
--- a/arch/x86/platform/uv/tlb_uv.c
+++ b/arch/x86/platform/uv/tlb_uv.c
@@ -1114,15 +1114,12 @@ const struct cpumask *uv_flush_tlb_others(const struct cpumask *cpumask,
 						unsigned long end,
 						unsigned int cpu)
 {
-	int locals = 0;
-	int remotes = 0;
-	int hubs = 0;
+	int locals = 0, remotes = 0, hubs = 0;
 	struct bau_desc *bau_desc;
 	struct cpumask *flush_mask;
 	struct ptc_stats *stat;
 	struct bau_control *bcp;
-	unsigned long descriptor_status;
-	unsigned long status;
+	unsigned long descriptor_status, status, address;
 
 	bcp = &per_cpu(bau_control, cpu);
 
@@ -1171,10 +1168,24 @@ const struct cpumask *uv_flush_tlb_others(const struct cpumask *cpumask,
 	record_send_statistics(stat, locals, hubs, remotes, bau_desc);
 
 	if (!end || (end - start) <= PAGE_SIZE)
-		bau_desc->payload.address = start;
+		address = start;
 	else
-		bau_desc->payload.address = TLB_FLUSH_ALL;
-	bau_desc->payload.sending_cpu = cpu;
+		address = TLB_FLUSH_ALL;
+
+	switch (bcp->uvhub_version) {
+	case UV_BAU_V1:
+	case UV_BAU_V2:
+	case UV_BAU_V3:
+		bau_desc->payload.uv1_2_3.address = address;
+		bau_desc->payload.uv1_2_3.sending_cpu = cpu;
+		break;
+	case UV_BAU_V4:
+		bau_desc->payload.uv4.address = address;
+		bau_desc->payload.uv4.sending_cpu = cpu;
+		bau_desc->payload.uv4.qualifier = BAU_DESC_QUALIFIER;
+		break;
+	}
+
 	/*
 	 * uv_flush_send_and_wait returns 0 if all cpu's were messaged,
 	 * or 1 if it gave up and the original cpumask should be returned.
-- 
1.8.2.1
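
A user-space sketch of the two payload layouts from this patch, assuming fixed-width types from <stdint.h> in place of the kernel's u64/u16/u32 and a GCC/Clang x86-64 ABI (bit-field placement is implementation-defined, so the size checks below are assumptions about that ABI). The union bau_payload name and the fill_payload() helper are hypothetical; the helper only mirrors the switch added to uv_flush_tlb_others().

```c
#include <stdint.h>

#define BAU_DESC_QUALIFIER	0x534749

enum uv_bau_version { UV_BAU_V1 = 1, UV_BAU_V2, UV_BAU_V3, UV_BAU_V4 };

struct uv1_2_3_bau_msg_payload {
	uint64_t address;
	uint16_t sending_cpu;
	uint16_t acknowledge_count;
	uint32_t reserved1;
};

struct uv4_bau_msg_payload {
	uint64_t address;
	uint16_t sending_cpu;
	uint16_t acknowledge_count;
	uint32_t reserved1:8;
	uint32_t qualifier:24;	/* replaces 24 of the 32 reserved bits */
};

/* Hypothetical name; the patch uses an anonymous member of struct bau_desc. */
union bau_payload {
	struct uv1_2_3_bau_msg_payload	uv1_2_3;
	struct uv4_bau_msg_payload	uv4;
};

/* Both layouts must stay 16 bytes so the descriptor size does not change. */
_Static_assert(sizeof(struct uv1_2_3_bau_msg_payload) == 16, "payload size");
_Static_assert(sizeof(struct uv4_bau_msg_payload) == 16, "payload size");

/* Hypothetical helper: fill the payload variant that matches the hub. */
static void fill_payload(union bau_payload *p, enum uv_bau_version v,
			 uint64_t address, uint16_t cpu)
{
	switch (v) {
	case UV_BAU_V1:
	case UV_BAU_V2:
	case UV_BAU_V3:
		p->uv1_2_3.address = address;
		p->uv1_2_3.sending_cpu = cpu;
		break;
	case UV_BAU_V4:
		p->uv4.address = address;
		p->uv4.sending_cpu = cpu;
		p->uv4.qualifier = BAU_DESC_QUALIFIER;
		break;
	}
}
```

Only the UV4 branch writes the qualifier, so the bits that are reserved on UV1 through UV3 are left untouched on those hubs.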

* [PATCH 3/6] x86/platform/uv/BAU: Cleanup bau_operations declaration and instances
  2017-02-17 22:06 [PATCHv2 0/6] x86/platform/uv/BAU: UV4 message completion and initialization updates Andrew Banman
  2017-02-17 22:06 ` [PATCH 1/6] x86/platform/uv/BAU: Add uv_bau_version enumerated constants Andrew Banman
  2017-02-17 22:06 ` [PATCH v2 2/6] x86/platform/uv/BAU: Add payload descriptor qualifier Andrew Banman
@ 2017-02-17 22:06 ` Andrew Banman
  2017-02-17 22:06 ` [PATCH 4/6] x86/platform/uv/BAU: Add status mmr location fields to bau_control Andrew Banman
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Andrew Banman @ 2017-02-17 22:06 UTC (permalink / raw)
  To: tglx
  Cc: mingo, akpm, hpa, mike.travis, rja, sivanich, x86, linux-kernel, abanman

Move the bau_operations declaration after bau struct declarations so the
bau structs can be referenced when adding new functions to
bau_operations. That way we avoid forward declarations of the bau
structs.

Likewise, move uv*_bau_ops structs down to avoid forward declarations of
new functions defined in the same file. Declare these structs __initconst
since they are only used during initialization. Similarly, declare the
bau_operations ops instance __ro_after_init as it is read-only after
initialization.

This is a preparatory patch for adding wait_completion to bau_operations.

Signed-off-by: Andrew Banman <abanman@hpe.com>
Acked-by: Mike Travis <mike.travis@hpe.com>
---
 arch/x86/include/asm/uv/uv_bau.h | 22 ++++++++++----------
 arch/x86/platform/uv/tlb_uv.c    | 43 ++++++++++++++++++++--------------------
 2 files changed, 32 insertions(+), 33 deletions(-)

diff --git a/arch/x86/include/asm/uv/uv_bau.h b/arch/x86/include/asm/uv/uv_bau.h
index 1ed0574..59ae8a7 100644
--- a/arch/x86/include/asm/uv/uv_bau.h
+++ b/arch/x86/include/asm/uv/uv_bau.h
@@ -405,17 +405,6 @@ struct uv2_3_bau_msg_header {
 	/* bits 127:120 */
 };
 
-/* Abstracted BAU functions */
-struct bau_operations {
-	unsigned long (*read_l_sw_ack)(void);
-	unsigned long (*read_g_sw_ack)(int pnode);
-	unsigned long (*bau_gpa_to_offset)(unsigned long vaddr);
-	void (*write_l_sw_ack)(unsigned long mmr);
-	void (*write_g_sw_ack)(int pnode, unsigned long mmr);
-	void (*write_payload_first)(int pnode, unsigned long mmr);
-	void (*write_payload_last)(int pnode, unsigned long mmr);
-};
-
 /*
  * The activation descriptor:
  * The format of the message to send, plus all accompanying control
@@ -667,6 +656,17 @@ struct bau_control {
 	struct hub_and_pnode	*thp;
 };
 
+/* Abstracted BAU functions */
+struct bau_operations {
+	unsigned long	(*read_l_sw_ack)(void);
+	unsigned long	(*read_g_sw_ack)(int pnode);
+	unsigned long	(*bau_gpa_to_offset)(unsigned long vaddr);
+	void		(*write_l_sw_ack)(unsigned long mmr);
+	void		(*write_g_sw_ack)(int pnode, unsigned long mmr);
+	void		(*write_payload_first)(int pnode, unsigned long mmr);
+	void		(*write_payload_last)(int pnode, unsigned long mmr);
+};
+
 static inline void write_mmr_data_broadcast(int pnode, unsigned long mmr_image)
 {
 	write_gmmr(pnode, UVH_BAU_DATA_BROADCAST, mmr_image);
diff --git a/arch/x86/platform/uv/tlb_uv.c b/arch/x86/platform/uv/tlb_uv.c
index 70721c4..e6994fd 100644
--- a/arch/x86/platform/uv/tlb_uv.c
+++ b/arch/x86/platform/uv/tlb_uv.c
@@ -23,28 +23,7 @@
 #include <asm/irq_vectors.h>
 #include <asm/timer.h>
 
-static struct bau_operations ops;
-
-static struct bau_operations uv123_bau_ops = {
-	.bau_gpa_to_offset       = uv_gpa_to_offset,
-	.read_l_sw_ack           = read_mmr_sw_ack,
-	.read_g_sw_ack           = read_gmmr_sw_ack,
-	.write_l_sw_ack          = write_mmr_sw_ack,
-	.write_g_sw_ack          = write_gmmr_sw_ack,
-	.write_payload_first     = write_mmr_payload_first,
-	.write_payload_last      = write_mmr_payload_last,
-};
-
-static struct bau_operations uv4_bau_ops = {
-	.bau_gpa_to_offset       = uv_gpa_to_soc_phys_ram,
-	.read_l_sw_ack           = read_mmr_proc_sw_ack,
-	.read_g_sw_ack           = read_gmmr_proc_sw_ack,
-	.write_l_sw_ack          = write_mmr_proc_sw_ack,
-	.write_g_sw_ack          = write_gmmr_proc_sw_ack,
-	.write_payload_first     = write_mmr_proc_payload_first,
-	.write_payload_last      = write_mmr_proc_payload_last,
-};
-
+static struct bau_operations ops __ro_after_init;
 
 /* timeouts in nanoseconds (indexed by UVH_AGING_PRESCALE_SEL urgency7 30:28) */
 static int timeout_base_ns[] = {
@@ -2158,6 +2137,26 @@ static int __init init_per_cpu(int nuvhubs, int base_part_pnode)
 	return 1;
 }
 
+static const struct bau_operations uv123_bau_ops __initconst = {
+	.bau_gpa_to_offset       = uv_gpa_to_offset,
+	.read_l_sw_ack           = read_mmr_sw_ack,
+	.read_g_sw_ack           = read_gmmr_sw_ack,
+	.write_l_sw_ack          = write_mmr_sw_ack,
+	.write_g_sw_ack          = write_gmmr_sw_ack,
+	.write_payload_first     = write_mmr_payload_first,
+	.write_payload_last      = write_mmr_payload_last,
+};
+
+static const struct bau_operations uv4_bau_ops __initconst = {
+	.bau_gpa_to_offset       = uv_gpa_to_soc_phys_ram,
+	.read_l_sw_ack           = read_mmr_proc_sw_ack,
+	.read_g_sw_ack           = read_gmmr_proc_sw_ack,
+	.write_l_sw_ack          = write_mmr_proc_sw_ack,
+	.write_g_sw_ack          = write_gmmr_proc_sw_ack,
+	.write_payload_first     = write_mmr_proc_payload_first,
+	.write_payload_last      = write_mmr_proc_payload_last,
+};
+
 /*
  * Initialization of BAU-related structures
  */
-- 
1.8.2.1

* [PATCH 4/6] x86/platform/uv/BAU: Add status mmr location fields to bau_control
  2017-02-17 22:06 [PATCHv2 0/6] x86/platform/uv/BAU: UV4 message completion and initialization updates Andrew Banman
                   ` (2 preceding siblings ...)
  2017-02-17 22:06 ` [PATCH 3/6] x86/platform/uv/BAU: Cleanup bau_operations declaration and instances Andrew Banman
@ 2017-02-17 22:06 ` Andrew Banman
  2017-02-27 19:14   ` Andrew Banman
  2017-02-17 22:06 ` [PATCH v2 5/6] x86/platform/uv/BAU: Add wait_completion to bau_operations Andrew Banman
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 13+ messages in thread
From: Andrew Banman @ 2017-02-17 22:06 UTC (permalink / raw)
  To: tglx
  Cc: mingo, akpm, hpa, mike.travis, rja, sivanich, x86, linux-kernel, abanman

The location of the ERROR and BUSY status bits depends on the descriptor
index, i.e. the CPU, of the message. Since this index does not change,
there is no need to calculate the MMR and index location during message
processing. The less work we do in the hot path the better.

Add status_mmr and status_index fields to bau_control and compute their
values during initialization. Update uv*_wait_completion to use these
fields rather than receiving the information as parameters.

Signed-off-by: Andrew Banman <abanman@hpe.com>
Acked-by: Mike Travis <mike.travis@hpe.com>

---
 arch/x86/include/asm/uv/uv_bau.h | 10 +++++++--
 arch/x86/platform/uv/tlb_uv.c    | 48 +++++++++++++++++++---------------------
 2 files changed, 31 insertions(+), 27 deletions(-)

diff --git a/arch/x86/include/asm/uv/uv_bau.h b/arch/x86/include/asm/uv/uv_bau.h
index 59ae8a7..a02019a 100644
--- a/arch/x86/include/asm/uv/uv_bau.h
+++ b/arch/x86/include/asm/uv/uv_bau.h
@@ -600,8 +600,12 @@ struct uvhub_desc {
 	struct socket_desc	socket[2];
 };
 
-/*
- * one per-cpu; to locate the software tables
+/**
+ * struct bau_control
+ * @status_mmr: location of status mmr, determined by uvhub_cpu
+ * @status_index: index of ERR|BUSY bits in status mmr, determined by uvhub_cpu
+ *
+ * Per-cpu control struct containing CPU topology information and BAU tuneables.
  */
 struct bau_control {
 	struct bau_desc		*descriptor_base;
@@ -619,6 +623,8 @@ struct bau_control {
 	int			timeout_tries;
 	int			ipi_attempts;
 	int			conseccompletes;
+	u64			status_mmr;
+	int			status_index;
 	bool			nobau;
 	short			baudisabled;
 	short			cpu;
diff --git a/arch/x86/platform/uv/tlb_uv.c b/arch/x86/platform/uv/tlb_uv.c
index e6994fd..b8d3830 100644
--- a/arch/x86/platform/uv/tlb_uv.c
+++ b/arch/x86/platform/uv/tlb_uv.c
@@ -527,11 +527,12 @@ static unsigned long uv1_read_status(unsigned long mmr_offset, int right_shift)
  * return COMPLETE, RETRY(PLUGGED or TIMEOUT) or GIVEUP
  */
 static int uv1_wait_completion(struct bau_desc *bau_desc,
-				unsigned long mmr_offset, int right_shift,
 				struct bau_control *bcp, long try)
 {
 	unsigned long descriptor_status;
 	cycles_t ttm;
+	u64 mmr_offset = bcp->status_mmr;
+	int right_shift = bcp->status_index;
 	struct ptc_stats *stat = bcp->statp;
 
 	descriptor_status = uv1_read_status(mmr_offset, right_shift);
@@ -619,11 +620,12 @@ int handle_uv2_busy(struct bau_control *bcp)
 }
 
 static int uv2_3_wait_completion(struct bau_desc *bau_desc,
-				unsigned long mmr_offset, int right_shift,
 				struct bau_control *bcp, long try)
 {
 	unsigned long descriptor_stat;
 	cycles_t ttm;
+	u64 mmr_offset = bcp->status_mmr;
+	int right_shift = bcp->status_index;
 	int desc = bcp->uvhub_cpu;
 	long busy_reps = 0;
 	struct ptc_stats *stat = bcp->statp;
@@ -684,29 +686,12 @@ static int uv2_3_wait_completion(struct bau_desc *bau_desc,
 	return FLUSH_COMPLETE;
 }
 
-/*
- * There are 2 status registers; each and array[32] of 2 bits. Set up for
- * which register to read and position in that register based on cpu in
- * current hub.
- */
 static int wait_completion(struct bau_desc *bau_desc, struct bau_control *bcp, long try)
 {
-	int right_shift;
-	unsigned long mmr_offset;
-	int desc = bcp->uvhub_cpu;
-
-	if (desc < UV_CPUS_PER_AS) {
-		mmr_offset = UVH_LB_BAU_SB_ACTIVATION_STATUS_0;
-		right_shift = desc * UV_ACT_STATUS_SIZE;
-	} else {
-		mmr_offset = UVH_LB_BAU_SB_ACTIVATION_STATUS_1;
-		right_shift = ((desc - UV_CPUS_PER_AS) * UV_ACT_STATUS_SIZE);
-	}
-
-	if (bcp->uvhub_version == UV_BAU_V1)
-		return uv1_wait_completion(bau_desc, mmr_offset, right_shift, bcp, try);
+	if (bcp->uvhub_version == 1)
+		return uv1_wait_completion(bau_desc, bcp, try);
 	else
-		return uv2_3_wait_completion(bau_desc, mmr_offset, right_shift, bcp, try);
+		return uv2_3_wait_completion(bau_desc, bcp, try);
 }
 
 /*
@@ -2024,8 +2009,7 @@ static int scan_sock(struct socket_desc *sdp, struct uvhub_desc *bdp,
 			struct bau_control **smasterp,
 			struct bau_control **hmasterp)
 {
-	int i;
-	int cpu;
+	int i, cpu, uvhub_cpu;
 	struct bau_control *bcp;
 
 	for (i = 0; i < sdp->num_cpus; i++) {
@@ -2054,7 +2038,21 @@ static int scan_sock(struct socket_desc *sdp, struct uvhub_desc *bdp,
 			return 1;
 		}
 		bcp->uvhub_master = *hmasterp;
-		bcp->uvhub_cpu = uv_cpu_blade_processor_id(cpu);
+		uvhub_cpu = uv_cpu_blade_processor_id(cpu);
+		bcp->uvhub_cpu = uvhub_cpu;
+
+		/*
+		 * The ERROR and BUSY status registers are located pairwise over
+		 * the STATUS_0 and STATUS_1 mmrs; each an array[32] of 2 bits.
+		 */
+		if (uvhub_cpu < UV_CPUS_PER_AS) {
+			bcp->status_mmr = UVH_LB_BAU_SB_ACTIVATION_STATUS_0;
+			bcp->status_index = uvhub_cpu * UV_ACT_STATUS_SIZE;
+		} else {
+			bcp->status_mmr = UVH_LB_BAU_SB_ACTIVATION_STATUS_1;
+			bcp->status_index = (uvhub_cpu - UV_CPUS_PER_AS)
+						* UV_ACT_STATUS_SIZE;
+		}
 
 		if (bcp->uvhub_cpu >= MAX_CPUS_PER_UVHUB) {
 			pr_emerg("%d cpus per uvhub invalid\n",
-- 
1.8.2.1
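
The precomputation moved into scan_sock() can be sketched in user space as follows. The values 32 and 2 follow the patch comment ("each an array[32] of 2 bits"); the two MMR addresses are placeholders, not the real register offsets, and struct status_loc and status_location() are hypothetical names used only for this illustration.

```c
#include <stdint.h>

#define UV_CPUS_PER_AS		32	/* 2-bit status slots per status MMR */
#define UV_ACT_STATUS_SIZE	2	/* bits per slot: ERROR and BUSY */

/* Placeholder addresses standing in for the STATUS_0 and STATUS_1 MMRs. */
#define STATUS_0_MMR		0x1000ULL
#define STATUS_1_MMR		0x2000ULL

/* Hypothetical stand-in for the two fields added to struct bau_control. */
struct status_loc {
	uint64_t status_mmr;
	int	 status_index;
};

/* Computed once per CPU at init time instead of on every message. */
static struct status_loc status_location(int uvhub_cpu)
{
	struct status_loc loc;

	if (uvhub_cpu < UV_CPUS_PER_AS) {
		loc.status_mmr = STATUS_0_MMR;
		loc.status_index = uvhub_cpu * UV_ACT_STATUS_SIZE;
	} else {
		loc.status_mmr = STATUS_1_MMR;
		loc.status_index = (uvhub_cpu - UV_CPUS_PER_AS)
					* UV_ACT_STATUS_SIZE;
	}
	return loc;
}
```

For example, hub CPU 40 maps to the second register at bit offset 16, because it is the ninth slot (index 8) of that register and each slot is 2 bits wide.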

* [PATCH v2 5/6] x86/platform/uv/BAU: Add wait_completion to bau_operations
  2017-02-17 22:06 [PATCHv2 0/6] x86/platform/uv/BAU: UV4 message completion and initialization updates Andrew Banman
                   ` (3 preceding siblings ...)
  2017-02-17 22:06 ` [PATCH 4/6] x86/platform/uv/BAU: Add status mmr location fields to bau_control Andrew Banman
@ 2017-02-17 22:06 ` Andrew Banman
  2017-02-27 19:26   ` Andrew Banman
  2017-02-17 22:06 ` [PATCH v2 6/6] x86/platform/uv/BAU: Implement uv4_wait_completion with read_status Andrew Banman
  2017-03-09  1:42 ` [PATCHv2 0/6] x86/platform/uv/BAU: UV4 message completion and initialization updates Andrew Banman
  6 siblings, 1 reply; 13+ messages in thread
From: Andrew Banman @ 2017-02-17 22:06 UTC (permalink / raw)
  To: tglx
  Cc: mingo, akpm, hpa, mike.travis, rja, sivanich, x86, linux-kernel, abanman

Remove the present wait_completion routine and add a function pointer by
the same name to the bau_operations struct. Rather than switching on the
UV hub version during message processing, set the architecture-specific
uv*_wait_completion during initialization.

The uv123_bau_ops struct must be split into uv1 and uv2_3 versions to
accommodate the corresponding wait_completion routines.

Signed-off-by: Andrew Banman <abanman@hpe.com>
Acked-by: Mike Travis <mike.travis@hpe.com>
---
 arch/x86/include/asm/uv/uv_bau.h |  2 ++
 arch/x86/platform/uv/tlb_uv.c    | 31 ++++++++++++++++++-------------
 2 files changed, 20 insertions(+), 13 deletions(-)

diff --git a/arch/x86/include/asm/uv/uv_bau.h b/arch/x86/include/asm/uv/uv_bau.h
index a02019a..b52b356 100644
--- a/arch/x86/include/asm/uv/uv_bau.h
+++ b/arch/x86/include/asm/uv/uv_bau.h
@@ -671,6 +671,8 @@ struct bau_operations {
 	void		(*write_g_sw_ack)(int pnode, unsigned long mmr);
 	void		(*write_payload_first)(int pnode, unsigned long mmr);
 	void		(*write_payload_last)(int pnode, unsigned long mmr);
+	int		(*wait_completion)(struct bau_desc*,
+				struct bau_control*, long try);
 };
 
 static inline void write_mmr_data_broadcast(int pnode, unsigned long mmr_image)
diff --git a/arch/x86/platform/uv/tlb_uv.c b/arch/x86/platform/uv/tlb_uv.c
index b8d3830..2a826dd 100644
--- a/arch/x86/platform/uv/tlb_uv.c
+++ b/arch/x86/platform/uv/tlb_uv.c
@@ -686,14 +686,6 @@ static int uv2_3_wait_completion(struct bau_desc *bau_desc,
 	return FLUSH_COMPLETE;
 }
 
-static int wait_completion(struct bau_desc *bau_desc, struct bau_control *bcp, long try)
-{
-	if (bcp->uvhub_version == 1)
-		return uv1_wait_completion(bau_desc, bcp, try);
-	else
-		return uv2_3_wait_completion(bau_desc, bcp, try);
-}
-
 /*
  * Our retries are blocked by all destination sw ack resources being
  * in use, and a timeout is pending. In that case hardware immediately
@@ -922,7 +914,7 @@ int uv_flush_send_and_wait(struct cpumask *flush_mask, struct bau_control *bcp,
 		write_mmr_activation(index);
 
 		try++;
-		completion_stat = wait_completion(bau_desc, bcp, try);
+		completion_stat = ops.wait_completion(bau_desc, bcp, try);
 
 		handle_cmplt(completion_stat, bau_desc, bcp, hmaster, stat);
 
@@ -2135,7 +2127,18 @@ static int __init init_per_cpu(int nuvhubs, int base_part_pnode)
 	return 1;
 }
 
-static const struct bau_operations uv123_bau_ops __initconst = {
+static const struct bau_operations uv1_bau_ops __initconst = {
+	.bau_gpa_to_offset       = uv_gpa_to_offset,
+	.read_l_sw_ack           = read_mmr_sw_ack,
+	.read_g_sw_ack           = read_gmmr_sw_ack,
+	.write_l_sw_ack          = write_mmr_sw_ack,
+	.write_g_sw_ack          = write_gmmr_sw_ack,
+	.write_payload_first     = write_mmr_payload_first,
+	.write_payload_last      = write_mmr_payload_last,
+	.wait_completion	 = uv1_wait_completion,
+};
+
+static const struct bau_operations uv2_3_bau_ops __initconst = {
 	.bau_gpa_to_offset       = uv_gpa_to_offset,
 	.read_l_sw_ack           = read_mmr_sw_ack,
 	.read_g_sw_ack           = read_gmmr_sw_ack,
@@ -2143,6 +2146,7 @@ static int __init init_per_cpu(int nuvhubs, int base_part_pnode)
 	.write_g_sw_ack          = write_gmmr_sw_ack,
 	.write_payload_first     = write_mmr_payload_first,
 	.write_payload_last      = write_mmr_payload_last,
+	.wait_completion	 = uv2_3_wait_completion,
 };
 
 static const struct bau_operations uv4_bau_ops __initconst = {
@@ -2153,6 +2157,7 @@ static int __init init_per_cpu(int nuvhubs, int base_part_pnode)
 	.write_g_sw_ack          = write_gmmr_proc_sw_ack,
 	.write_payload_first     = write_mmr_proc_payload_first,
 	.write_payload_last      = write_mmr_proc_payload_last,
+	.wait_completion	 = uv2_3_wait_completion,
 };
 
 /*
@@ -2174,11 +2179,11 @@ static int __init uv_bau_init(void)
 	if (is_uv4_hub())
 		ops = uv4_bau_ops;
 	else if (is_uv3_hub())
-		ops = uv123_bau_ops;
+		ops = uv2_3_bau_ops;
 	else if (is_uv2_hub())
-		ops = uv123_bau_ops;
+		ops = uv2_3_bau_ops;
 	else if (is_uv1_hub())
-		ops = uv123_bau_ops;
+		ops = uv1_bau_ops;
 
 	for_each_possible_cpu(cur_cpu) {
 		mask = &per_cpu(uv_flush_tlb_mask, cur_cpu);
-- 
1.8.2.1
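
The dispatch change in this patch follows a common pattern: pick a table of function pointers once at init, then call through it in the hot path. A stripped-down sketch, assuming a one-member operations struct and stub completion routines that return distinct placeholder codes (the *_stub functions, the return values, and bau_select_ops() are hypothetical and exist only for this illustration):

```c
#include <stddef.h>

/* Reduced to the one member this patch adds; the real struct has more. */
struct bau_operations {
	int (*wait_completion)(long try);
};

/* Stubs standing in for the per-generation completion routines. */
static int uv1_wait_completion_stub(long try)   { (void)try; return 1; }
static int uv2_3_wait_completion_stub(long try) { (void)try; return 23; }

static const struct bau_operations uv1_bau_ops = {
	.wait_completion = uv1_wait_completion_stub,
};

static const struct bau_operations uv2_3_bau_ops = {
	.wait_completion = uv2_3_wait_completion_stub,
};

/* As of this patch, UV4 still points at the uv2_3 routine. */
static const struct bau_operations uv4_bau_ops = {
	.wait_completion = uv2_3_wait_completion_stub,
};

/* The single live table, filled in once during initialization. */
static struct bau_operations ops;

/* Hypothetical init helper: returns 0 on success, -1 for an unknown hub. */
static int bau_select_ops(int hub_version)
{
	switch (hub_version) {
	case 1: ops = uv1_bau_ops;   return 0;
	case 2:
	case 3: ops = uv2_3_bau_ops; return 0;
	case 4: ops = uv4_bau_ops;   return 0;
	default: return -1;
	}
}
```

After selection, callers use ops.wait_completion(...) directly, so no version check remains in the send path.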

* [PATCH v2 6/6] x86/platform/uv/BAU: Implement uv4_wait_completion with read_status
  2017-02-17 22:06 [PATCHv2 0/6] x86/platform/uv/BAU: UV4 message completion and initialization updates Andrew Banman
                   ` (4 preceding siblings ...)
  2017-02-17 22:06 ` [PATCH v2 5/6] x86/platform/uv/BAU: Add wait_completion to bau_operations Andrew Banman
@ 2017-02-17 22:06 ` Andrew Banman
  2017-03-09  1:42 ` [PATCHv2 0/6] x86/platform/uv/BAU: UV4 message completion and initialization updates Andrew Banman
  6 siblings, 0 replies; 13+ messages in thread
From: Andrew Banman @ 2017-02-17 22:06 UTC (permalink / raw)
  To: tglx
  Cc: mingo, akpm, hpa, mike.travis, rja, sivanich, x86, linux-kernel, abanman

UV4 does not employ a software timeout as previous generations do, so a new
wait_completion routine without this logic is required. Certain completion
statuses require the AUX status bit in addition to ERROR and BUSY.

Add the read_status routine to construct the full completion status. Use
read_status in the uv4_wait_completion routine to handle all possible
completion statuses.

Signed-off-by: Andrew Banman <abanman@hpe.com>
Acked-by: Mike Travis <mike.travis@hpe.com>
---
 arch/x86/platform/uv/tlb_uv.c | 58 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 57 insertions(+), 1 deletion(-)

diff --git a/arch/x86/platform/uv/tlb_uv.c b/arch/x86/platform/uv/tlb_uv.c
index 2a826dd..42e65fe 100644
--- a/arch/x86/platform/uv/tlb_uv.c
+++ b/arch/x86/platform/uv/tlb_uv.c
@@ -687,6 +687,62 @@ static int uv2_3_wait_completion(struct bau_desc *bau_desc,
 }
 
 /*
+ * Returns the status of current BAU message for cpu desc as a bit field
+ * [Error][Busy][Aux]
+ */
+static u64 read_status(u64 status_mmr, int index, int desc)
+{
+	u64 stat;
+
+	stat = ((read_lmmr(status_mmr) >> index) & UV_ACT_STATUS_MASK) << 1;
+	stat |= (read_lmmr(UVH_LB_BAU_SB_ACTIVATION_STATUS_2) >> desc) & 0x1;
+
+	return stat;
+}
+
+static int uv4_wait_completion(struct bau_desc *bau_desc,
+				struct bau_control *bcp, long try)
+{
+	struct ptc_stats *stat = bcp->statp;
+	u64 descriptor_stat;
+	u64 mmr = bcp->status_mmr;
+	int index = bcp->status_index;
+	int desc = bcp->uvhub_cpu;
+
+	descriptor_stat = read_status(mmr, index, desc);
+
+	/* spin on the status MMR, waiting for it to go idle */
+	while (descriptor_stat != UV2H_DESC_IDLE) {
+		switch (descriptor_stat) {
+		case UV2H_DESC_SOURCE_TIMEOUT:
+			stat->s_stimeout++;
+			return FLUSH_GIVEUP;
+
+		case UV2H_DESC_DEST_TIMEOUT:
+			stat->s_dtimeout++;
+			bcp->conseccompletes = 0;
+			return FLUSH_RETRY_TIMEOUT;
+
+		case UV2H_DESC_DEST_STRONG_NACK:
+			stat->s_plugged++;
+			bcp->conseccompletes = 0;
+			return FLUSH_RETRY_PLUGGED;
+
+		case UV2H_DESC_DEST_PUT_ERR:
+			bcp->conseccompletes = 0;
+			return FLUSH_GIVEUP;
+
+		default:
+			/* descriptor_stat is still BUSY */
+			cpu_relax();
+		}
+		descriptor_stat = read_status(mmr, index, desc);
+	}
+	bcp->conseccompletes++;
+	return FLUSH_COMPLETE;
+}
+
+/*
  * Our retries are blocked by all destination sw ack resources being
  * in use, and a timeout is pending. In that case hardware immediately
  * returns the ERROR that looks like a destination timeout.
@@ -2157,7 +2213,7 @@ static int __init init_per_cpu(int nuvhubs, int base_part_pnode)
 	.write_g_sw_ack          = write_gmmr_proc_sw_ack,
 	.write_payload_first     = write_mmr_proc_payload_first,
 	.write_payload_last      = write_mmr_proc_payload_last,
-	.wait_completion	 = uv2_3_wait_completion,
+	.wait_completion         = uv4_wait_completion,
 };
 
 /*
-- 
1.8.2.1
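
The bit assembly done by read_status() can be tried in user space with simulated registers. This is a sketch under stated assumptions: fake_mmr[] and read_lmmr_sim() are hypothetical stand-ins for the hardware MMRs and read_lmmr(), with slots 0 and 1 playing the role of the two ERROR/BUSY status registers and slot 2 the AUX register; the mask value 0x3 is assumed from the 2-bit field width.

```c
#include <stdint.h>

#define UV_ACT_STATUS_MASK	0x3	/* assumed: 2-bit ERROR/BUSY field */
#define AUX_MMR_SLOT		2	/* simulated AUX status register */

/* Simulated registers: [0], [1] hold ERROR/BUSY pairs, [2] holds AUX bits. */
static uint64_t fake_mmr[3];

static uint64_t read_lmmr_sim(int slot)
{
	return fake_mmr[slot];
}

/*
 * Same arithmetic as read_status() in the patch: shift the 2-bit
 * ERROR/BUSY field up by one and place the AUX bit in bit 0, giving
 * the 3-bit value [Error][Busy][Aux].
 */
static uint64_t read_status_sim(int status_slot, int index, int desc)
{
	uint64_t stat;

	stat = ((read_lmmr_sim(status_slot) >> index) & UV_ACT_STATUS_MASK) << 1;
	stat |= (read_lmmr_sim(AUX_MMR_SLOT) >> desc) & 0x1;

	return stat;
}
```

Note that the ERROR/BUSY field is addressed by the precomputed bit index (2 bits per CPU), while the AUX bit is addressed by the descriptor number itself (1 bit per CPU).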

* Re: [PATCH 4/6] x86/platform/uv/BAU: Add status mmr location fields to bau_control
  2017-02-17 22:06 ` [PATCH 4/6] x86/platform/uv/BAU: Add status mmr location fields to bau_control Andrew Banman
@ 2017-02-27 19:14   ` Andrew Banman
  2017-02-27 19:14     ` [PATCH v2 " Andrew Banman
  0 siblings, 1 reply; 13+ messages in thread
From: Andrew Banman @ 2017-02-27 19:14 UTC (permalink / raw)
  To: tglx
  Cc: mingo, akpm, hpa, mike.travis, rja, sivanich, x86, linux-kernel, abanman

The above patch has a conflict in wait_completion with the enumerated UV_BAU
symbol because I let an old version of the patch creep into the set. Following
is the correct version.

Sorry about that!

Andrew


* [PATCH v2 4/6] x86/platform/uv/BAU: Add status mmr location fields to bau_control
  2017-02-27 19:14   ` Andrew Banman
@ 2017-02-27 19:14     ` Andrew Banman
  0 siblings, 0 replies; 13+ messages in thread
From: Andrew Banman @ 2017-02-27 19:14 UTC (permalink / raw)
  To: tglx
  Cc: mingo, akpm, hpa, mike.travis, rja, sivanich, x86, linux-kernel, abanman

The location of the ERROR and BUSY status bits depends on the descriptor
index, i.e. the CPU, of the message. Since this index does not change,
there is no need to calculate the mmr and index location during message
processing. The less work we do in the hot path the better.

Add status_mmr and status_index fields to bau_control and compute their
values during initialization. Update uv*_wait_completion to use these
fields rather than receiving the information as parameters.

Signed-off-by: Andrew Banman <abanman@hpe.com>
Acked-by: Mike Travis <mike.travis@hpe.com>
---
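Note (not part of the commit message): the precomputation described above
can be sketched in plain user-space C. This is only an illustration, assuming
two status registers that each hold 32 two-bit fields, as the comment added
to scan_sock() in this patch states. The names status_loc, status_location(),
status_field() and the STATUS_REG_* identifiers are hypothetical; the driver
itself stores the MMR offset in bcp->status_mmr and the shift in
bcp->status_index.

```c
#include <stdint.h>

/* Assumed layout: two status registers, each an array[32] of 2-bit fields. */
#define CPUS_PER_AS     32
#define ACT_STATUS_SIZE 2
#define ACT_STATUS_MASK 0x3ULL

/* Placeholder register identifiers, not real MMR offsets. */
enum status_reg { STATUS_REG_0 = 0, STATUS_REG_1 = 1 };

struct status_loc {
	enum status_reg reg;	/* which status register holds the field */
	int shift;		/* bit position of the 2-bit field */
};

/* Computed once per CPU at init time, from the hub-local CPU number. */
static struct status_loc status_location(int hub_cpu)
{
	struct status_loc loc;

	if (hub_cpu < CPUS_PER_AS) {
		loc.reg = STATUS_REG_0;
		loc.shift = hub_cpu * ACT_STATUS_SIZE;
	} else {
		loc.reg = STATUS_REG_1;
		loc.shift = (hub_cpu - CPUS_PER_AS) * ACT_STATUS_SIZE;
	}
	return loc;
}

/* Hot path: extract the 2-bit status field from a register value. */
static unsigned int status_field(uint64_t reg_value, int shift)
{
	return (unsigned int)((reg_value >> shift) & ACT_STATUS_MASK);
}
```

With this layout, hub-local CPU 5 maps to register 0 at shift 10, and CPU 40
maps to register 1 at shift 16. The hot path then only needs the stored
register and shift to read its field.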
 arch/x86/include/asm/uv/uv_bau.h | 10 +++++++--
 arch/x86/platform/uv/tlb_uv.c    | 46 +++++++++++++++++++---------------------
 2 files changed, 30 insertions(+), 26 deletions(-)

diff --git a/arch/x86/include/asm/uv/uv_bau.h b/arch/x86/include/asm/uv/uv_bau.h
index 59ae8a7..a02019a 100644
--- a/arch/x86/include/asm/uv/uv_bau.h
+++ b/arch/x86/include/asm/uv/uv_bau.h
@@ -600,8 +600,12 @@ struct uvhub_desc {
 	struct socket_desc	socket[2];
 };
 
-/*
- * one per-cpu; to locate the software tables
+/**
+ * struct bau_control
+ * @status_mmr: location of status mmr, determined by uvhub_cpu
+ * @status_index: index of ERR|BUSY bits in status mmr, determined by uvhub_cpu
+ *
+ * Per-cpu control struct containing CPU topology information and BAU tuneables.
  */
 struct bau_control {
 	struct bau_desc		*descriptor_base;
@@ -619,6 +623,8 @@ struct bau_control {
 	int			timeout_tries;
 	int			ipi_attempts;
 	int			conseccompletes;
+	u64			status_mmr;
+	int			status_index;
 	bool			nobau;
 	short			baudisabled;
 	short			cpu;
diff --git a/arch/x86/platform/uv/tlb_uv.c b/arch/x86/platform/uv/tlb_uv.c
index e6994fd..13a7055 100644
--- a/arch/x86/platform/uv/tlb_uv.c
+++ b/arch/x86/platform/uv/tlb_uv.c
@@ -527,11 +527,12 @@ static unsigned long uv1_read_status(unsigned long mmr_offset, int right_shift)
  * return COMPLETE, RETRY(PLUGGED or TIMEOUT) or GIVEUP
  */
 static int uv1_wait_completion(struct bau_desc *bau_desc,
-				unsigned long mmr_offset, int right_shift,
 				struct bau_control *bcp, long try)
 {
 	unsigned long descriptor_status;
 	cycles_t ttm;
+	u64 mmr_offset = bcp->status_mmr;
+	int right_shift = bcp->status_index;
 	struct ptc_stats *stat = bcp->statp;
 
 	descriptor_status = uv1_read_status(mmr_offset, right_shift);
@@ -619,11 +620,12 @@ int handle_uv2_busy(struct bau_control *bcp)
 }
 
 static int uv2_3_wait_completion(struct bau_desc *bau_desc,
-				unsigned long mmr_offset, int right_shift,
 				struct bau_control *bcp, long try)
 {
 	unsigned long descriptor_stat;
 	cycles_t ttm;
+	u64 mmr_offset = bcp->status_mmr;
+	int right_shift = bcp->status_index;
 	int desc = bcp->uvhub_cpu;
 	long busy_reps = 0;
 	struct ptc_stats *stat = bcp->statp;
@@ -684,29 +686,12 @@ static int uv2_3_wait_completion(struct bau_desc *bau_desc,
 	return FLUSH_COMPLETE;
 }
 
-/*
- * There are 2 status registers; each and array[32] of 2 bits. Set up for
- * which register to read and position in that register based on cpu in
- * current hub.
- */
 static int wait_completion(struct bau_desc *bau_desc, struct bau_control *bcp, long try)
 {
-	int right_shift;
-	unsigned long mmr_offset;
-	int desc = bcp->uvhub_cpu;
-
-	if (desc < UV_CPUS_PER_AS) {
-		mmr_offset = UVH_LB_BAU_SB_ACTIVATION_STATUS_0;
-		right_shift = desc * UV_ACT_STATUS_SIZE;
-	} else {
-		mmr_offset = UVH_LB_BAU_SB_ACTIVATION_STATUS_1;
-		right_shift = ((desc - UV_CPUS_PER_AS) * UV_ACT_STATUS_SIZE);
-	}
-
 	if (bcp->uvhub_version == UV_BAU_V1)
-		return uv1_wait_completion(bau_desc, mmr_offset, right_shift, bcp, try);
+		return uv1_wait_completion(bau_desc, bcp, try);
 	else
-		return uv2_3_wait_completion(bau_desc, mmr_offset, right_shift, bcp, try);
+		return uv2_3_wait_completion(bau_desc, bcp, try);
 }
 
 /*
@@ -2024,8 +2009,7 @@ static int scan_sock(struct socket_desc *sdp, struct uvhub_desc *bdp,
 			struct bau_control **smasterp,
 			struct bau_control **hmasterp)
 {
-	int i;
-	int cpu;
+	int i, cpu, uvhub_cpu;
 	struct bau_control *bcp;
 
 	for (i = 0; i < sdp->num_cpus; i++) {
@@ -2054,7 +2038,21 @@ static int scan_sock(struct socket_desc *sdp, struct uvhub_desc *bdp,
 			return 1;
 		}
 		bcp->uvhub_master = *hmasterp;
-		bcp->uvhub_cpu = uv_cpu_blade_processor_id(cpu);
+		uvhub_cpu = uv_cpu_blade_processor_id(cpu);
+		bcp->uvhub_cpu = uvhub_cpu;
+
+		/*
+		 * The ERROR and BUSY status registers are located pairwise over
+		 * the STATUS_0 and STATUS_1 mmrs; each an array[32] of 2 bits.
+		 */
+		if (uvhub_cpu < UV_CPUS_PER_AS) {
+			bcp->status_mmr = UVH_LB_BAU_SB_ACTIVATION_STATUS_0;
+			bcp->status_index = uvhub_cpu * UV_ACT_STATUS_SIZE;
+		} else {
+			bcp->status_mmr = UVH_LB_BAU_SB_ACTIVATION_STATUS_1;
+			bcp->status_index = (uvhub_cpu - UV_CPUS_PER_AS)
+						* UV_ACT_STATUS_SIZE;
+		}
 
 		if (bcp->uvhub_cpu >= MAX_CPUS_PER_UVHUB) {
 			pr_emerg("%d cpus per uvhub invalid\n",
-- 
1.8.2.1


* Re: [PATCH v2 5/6] x86/platform/uv/BAU: Add wait_completion to bau_operations
  2017-02-17 22:06 ` [PATCH v2 5/6] x86/platform/uv/BAU: Add wait_completion to bau_operations Andrew Banman
@ 2017-02-27 19:26   ` Andrew Banman
  2017-02-27 19:26     ` [PATCH v3 " Andrew Banman
  0 siblings, 1 reply; 13+ messages in thread
From: Andrew Banman @ 2017-02-27 19:26 UTC (permalink / raw)
  To: tglx
  Cc: mingo, akpm, hpa, mike.travis, rja, sivanich, x86, linux-kernel, abanman

The above patch has a conflict in wait_completion with the enumerated UV_BAU
symbol because I let an old version of the patch creep into the set. Following
is the correct version.

Sorry about that!

Andrew


* [PATCH v3 5/6] x86/platform/uv/BAU: Add wait_completion to bau_operations
  2017-02-27 19:26   ` Andrew Banman
@ 2017-02-27 19:26     ` Andrew Banman
  0 siblings, 0 replies; 13+ messages in thread
From: Andrew Banman @ 2017-02-27 19:26 UTC (permalink / raw)
  To: tglx
  Cc: mingo, akpm, hpa, mike.travis, rja, sivanich, x86, linux-kernel, abanman

Remove the present wait_completion routine and add a function pointer by
the same name to the bau_operations struct. Rather than switching on the
UV hub version during message processing, set the architecture-specific
uv*_wait_completion during initialization.

The uv123_bau_ops struct must be split into uv1 and uv2_3 versions to
accommodate the corresponding wait_completion routines.

Signed-off-by: Andrew Banman <abanman@hpe.com>
Acked-by: Mike Travis <mike.travis@hpe.com>
---
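Note (not part of the commit message): the change described above replaces a
per-message version check with an operations table chosen once at init. A
minimal sketch of that pattern, assuming stand-in types and return values
(demo_ops, demo_init(), and the demo_*_wait() stubs are hypothetical and do
not touch hardware):

```c
/* Hub generations, for illustration only. */
enum hub_version { HUB_V1 = 1, HUB_V2, HUB_V3, HUB_V4 };

/* Operations table with a single function pointer. */
struct demo_ops {
	int (*wait_completion)(int try);
};

/* Stub completion routines; each returns a distinct value for the demo. */
static int demo_uv1_wait(int try)   { (void)try; return 1; }
static int demo_uv2_3_wait(int try) { (void)try; return 2; }

static const struct demo_ops demo_uv1_ops = {
	.wait_completion = demo_uv1_wait,
};

static const struct demo_ops demo_uv2_3_ops = {
	.wait_completion = demo_uv2_3_wait,
};

/* The table in use, copied from one of the constants at init. */
static struct demo_ops demo_ops;

/* Pick the table once; later callers never look at the version again. */
static void demo_init(enum hub_version v)
{
	demo_ops = (v == HUB_V1) ? demo_uv1_ops : demo_uv2_3_ops;
}
```

After demo_init(), a caller simply invokes demo_ops.wait_completion(try),
which mirrors how uv_flush_send_and_wait() calls ops.wait_completion() in
this patch. The branch on the hub version moves out of the message path and
into initialization.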
 arch/x86/include/asm/uv/uv_bau.h |  2 ++
 arch/x86/platform/uv/tlb_uv.c    | 31 ++++++++++++++++++-------------
 2 files changed, 20 insertions(+), 13 deletions(-)

diff --git a/arch/x86/include/asm/uv/uv_bau.h b/arch/x86/include/asm/uv/uv_bau.h
index a02019a..b52b356 100644
--- a/arch/x86/include/asm/uv/uv_bau.h
+++ b/arch/x86/include/asm/uv/uv_bau.h
@@ -671,6 +671,8 @@ struct bau_operations {
 	void		(*write_g_sw_ack)(int pnode, unsigned long mmr);
 	void		(*write_payload_first)(int pnode, unsigned long mmr);
 	void		(*write_payload_last)(int pnode, unsigned long mmr);
+	int		(*wait_completion)(struct bau_desc*,
+				struct bau_control*, long try);
 };
 
 static inline void write_mmr_data_broadcast(int pnode, unsigned long mmr_image)
diff --git a/arch/x86/platform/uv/tlb_uv.c b/arch/x86/platform/uv/tlb_uv.c
index 13a7055..2a826dd 100644
--- a/arch/x86/platform/uv/tlb_uv.c
+++ b/arch/x86/platform/uv/tlb_uv.c
@@ -686,14 +686,6 @@ static int uv2_3_wait_completion(struct bau_desc *bau_desc,
 	return FLUSH_COMPLETE;
 }
 
-static int wait_completion(struct bau_desc *bau_desc, struct bau_control *bcp, long try)
-{
-	if (bcp->uvhub_version == UV_BAU_V1)
-		return uv1_wait_completion(bau_desc, bcp, try);
-	else
-		return uv2_3_wait_completion(bau_desc, bcp, try);
-}
-
 /*
  * Our retries are blocked by all destination sw ack resources being
  * in use, and a timeout is pending. In that case hardware immediately
@@ -922,7 +914,7 @@ int uv_flush_send_and_wait(struct cpumask *flush_mask, struct bau_control *bcp,
 		write_mmr_activation(index);
 
 		try++;
-		completion_stat = wait_completion(bau_desc, bcp, try);
+		completion_stat = ops.wait_completion(bau_desc, bcp, try);
 
 		handle_cmplt(completion_stat, bau_desc, bcp, hmaster, stat);
 
@@ -2135,7 +2127,18 @@ static int __init init_per_cpu(int nuvhubs, int base_part_pnode)
 	return 1;
 }
 
-static const struct bau_operations uv123_bau_ops __initconst = {
+static const struct bau_operations uv1_bau_ops __initconst = {
+	.bau_gpa_to_offset       = uv_gpa_to_offset,
+	.read_l_sw_ack           = read_mmr_sw_ack,
+	.read_g_sw_ack           = read_gmmr_sw_ack,
+	.write_l_sw_ack          = write_mmr_sw_ack,
+	.write_g_sw_ack          = write_gmmr_sw_ack,
+	.write_payload_first     = write_mmr_payload_first,
+	.write_payload_last      = write_mmr_payload_last,
+	.wait_completion	 = uv1_wait_completion,
+};
+
+static const struct bau_operations uv2_3_bau_ops __initconst = {
 	.bau_gpa_to_offset       = uv_gpa_to_offset,
 	.read_l_sw_ack           = read_mmr_sw_ack,
 	.read_g_sw_ack           = read_gmmr_sw_ack,
@@ -2143,6 +2146,7 @@ static int __init init_per_cpu(int nuvhubs, int base_part_pnode)
 	.write_g_sw_ack          = write_gmmr_sw_ack,
 	.write_payload_first     = write_mmr_payload_first,
 	.write_payload_last      = write_mmr_payload_last,
+	.wait_completion	 = uv2_3_wait_completion,
 };
 
 static const struct bau_operations uv4_bau_ops __initconst = {
@@ -2153,6 +2157,7 @@ static int __init init_per_cpu(int nuvhubs, int base_part_pnode)
 	.write_g_sw_ack          = write_gmmr_proc_sw_ack,
 	.write_payload_first     = write_mmr_proc_payload_first,
 	.write_payload_last      = write_mmr_proc_payload_last,
+	.wait_completion	 = uv2_3_wait_completion,
 };
 
 /*
@@ -2174,11 +2179,11 @@ static int __init uv_bau_init(void)
 	if (is_uv4_hub())
 		ops = uv4_bau_ops;
 	else if (is_uv3_hub())
-		ops = uv123_bau_ops;
+		ops = uv2_3_bau_ops;
 	else if (is_uv2_hub())
-		ops = uv123_bau_ops;
+		ops = uv2_3_bau_ops;
 	else if (is_uv1_hub())
-		ops = uv123_bau_ops;
+		ops = uv1_bau_ops;
 
 	for_each_possible_cpu(cur_cpu) {
 		mask = &per_cpu(uv_flush_tlb_mask, cur_cpu);
-- 
1.8.2.1


* Re: [PATCHv2 0/6] x86/platform/uv/BAU: UV4 message completion and initialization updates
  2017-02-17 22:06 [PATCHv2 0/6] x86/platform/uv/BAU: UV4 message completion and initialization updates Andrew Banman
                   ` (5 preceding siblings ...)
  2017-02-17 22:06 ` [PATCH v2 6/6] x86/platform/uv/BAU: Implement uv4_wait_completion with read_status Andrew Banman
@ 2017-03-09  1:42 ` Andrew Banman
  2017-03-09 13:22   ` Ingo Molnar
  6 siblings, 1 reply; 13+ messages in thread
From: Andrew Banman @ 2017-03-09  1:42 UTC (permalink / raw)
  To: tglx, mingo; +Cc: abanman, akpm, mike.travis, rja, sivanich, x86, linux-kernel

Hi Ingo and Thomas,

Are these patches acceptable to you? We want to get these upstream as soon as 
possible, so please send along any more comments you have. If you're annoyed by 
the format of the emails just let me know and I'll resubmit.

Thank you,

Andrew

On 2/17/17 4:06 PM, Andrew Banman wrote:
> The following patch series adds the necessary functionality to make the BAU
> on UV4 operational. The purpose of these patches is to implement the correct
> message completion logic on UV4. Also included is a bug fix to add a field
> to the INTD payload. This is needed to verify the source of each message.
>
> As of this patch set, the BAU operates without errors and performance tests
> show TLB shootdowns take up to 42% less time with the BAU enabled.
>
> The patches are summarized as follows:
>
> (1) Populate a message payload field to verify messages at the destination.
>     Without this verification, the destination agent triggers a HUB error,
>     resulting in an NMI.
>
>     [PATCH 1/6] x86/platform/uv/BAU: Add uv_bau_version enumerated
>     [PATCH v2 2/6] x86/platform/uv/BAU: Add payload descriptor qualifier
>
>     This bug fix is included at the start of the series to avoid conflicts
>     in a code path shared by the rest of the series.
>
> (2) Make the wait_completion routine part of the bau_operations interface,
>     and add a uv4_wait_completion routine to employ new completion logic.
>
>     The message completion logic for previous generations relies on software-
>     defined timeouts that are not implemented on UV4. Without these patches,
>     the BAU driver on UV4 erroneously identifies a UV2-WAR timeout during
>     normal operation.
>
>     [PATCH 3/6] x86/platform/uv/BAU: Cleanup bau_operations declaration
>     [PATCH 4/6] x86/platform/uv/BAU: Add status mmr location fields to
>     [PATCH v2 5/6] x86/platform/uv/BAU: Add wait_completion to
>     [PATCH v2 6/6] x86/platform/uv/BAU: Implement uv4_wait_completion with
>
>
> Please see the commit messages for details on the motivation and content of
> each patch.
>
> Thank you,
>
> Andrew Banman
> HPE, Linux Kernel Engineer
> <abanman@hpe.com>
>


* Re: [PATCHv2 0/6] x86/platform/uv/BAU: UV4 message completion and initialization updates
  2017-03-09  1:42 ` [PATCHv2 0/6] x86/platform/uv/BAU: UV4 message completion and initialization updates Andrew Banman
@ 2017-03-09 13:22   ` Ingo Molnar
  0 siblings, 0 replies; 13+ messages in thread
From: Ingo Molnar @ 2017-03-09 13:22 UTC (permalink / raw)
  To: Andrew Banman
  Cc: tglx, mingo, akpm, mike.travis, rja, sivanich, x86, linux-kernel


* Andrew Banman <abanman@hpe.com> wrote:

> Hi Ingo and Thomas,
> 
> Are these patches acceptable to you? We want to get these upstream as soon
> as possible, so please send along any more comments you have. If you're
> annoyed by the format of the emails just let me know and I'll resubmit.
> 
> Thank you,

It's not entirely clear to me which are the latest patches (there were several 
v1/v2/v3 updates within the same discussion), so please re-send the latest against 
current -tip.

I had a quick look and the patches appear to be OK, so I don't expect any big 
problems.

Thanks,

	Ingo


Thread overview: 13+ messages
2017-02-17 22:06 [PATCHv2 0/6] x86/platform/uv/BAU: UV4 message completion and initialization updates Andrew Banman
2017-02-17 22:06 ` [PATCH 1/6] x86/platform/uv/BAU: Add uv_bau_version enumerated constants Andrew Banman
2017-02-17 22:06 ` [PATCH v2 2/6] x86/platform/uv/BAU: Add payload descriptor qualifier Andrew Banman
2017-02-17 22:06 ` [PATCH 3/6] x86/platform/uv/BAU: Cleanup bau_operations declaration and instances Andrew Banman
2017-02-17 22:06 ` [PATCH 4/6] x86/platform/uv/BAU: Add status mmr location fields to bau_control Andrew Banman
2017-02-27 19:14   ` Andrew Banman
2017-02-27 19:14     ` [PATCH v2 " Andrew Banman
2017-02-17 22:06 ` [PATCH v2 5/6] x86/platform/uv/BAU: Add wait_completion to bau_operations Andrew Banman
2017-02-27 19:26   ` Andrew Banman
2017-02-27 19:26     ` [PATCH v3 " Andrew Banman
2017-02-17 22:06 ` [PATCH v2 6/6] x86/platform/uv/BAU: Implement uv4_wait_completion with read_status Andrew Banman
2017-03-09  1:42 ` [PATCHv2 0/6] x86/platform/uv/BAU: UV4 message completion and initialization updates Andrew Banman
2017-03-09 13:22   ` Ingo Molnar
