linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH V5 00/14] powerpc/vas: Page fault handling for user space NX requests
@ 2020-01-22  7:56 Haren Myneni
  2020-01-22  8:06 ` [PATCH V5 01/14] powerpc/xive: Define xive_native_alloc_irq_on_chip() Haren Myneni
                   ` (13 more replies)
  0 siblings, 14 replies; 21+ messages in thread
From: Haren Myneni @ 2020-01-22  7:56 UTC (permalink / raw)
  To: mpe, linuxppc-dev, hch, mikey, herbert, npiggin, oohall; +Cc: sukadev


On POWER9, the Virtual Accelerator Switchboard (VAS) allows user space or
the kernel to communicate with the Nest Accelerator (NX) directly using
COPY/PASTE instructions. NX provides various functions such as compression
and encryption, but only compression (842 and GZIP formats) is supported
in the Linux kernel on POWER9.

The 842 compression driver (drivers/crypto/nx/nx-842-powernv.c) is already
included in Linux. Only GZIP support will be available from user space.

Applications can issue GZIP compression / decompression requests to NX with
COPY/PASTE instructions. While processing a request, NX can hit a fault on
the request buffer (the page is not in memory). In that case NX issues an
interrupt and pastes the fault CRB into the fault FIFO. It expects the
kernel to handle the fault and to return credits for both the send and
fault windows after processing.

This patch series adds IRQ and fault window setup, and NX fault handling:
- Allocate an IRQ and trigger port address, and configure the IRQ per VAS
  instance.
- Set the port# for each window so that an interrupt is generated when a
  fault is noticed.
- Set up the fault window and the FIFO into which NX pastes fault CRBs.
- Set up a threaded IRQ fault handler per VAS instance.
- On receiving an interrupt, read CRBs from the fault FIFO and update the
  coprocessor_status_block (CSB) in the corresponding CRB with translation
  failure (CSB_CC_TRANSLATION). After issuing an NX request, the process
  polls on the CSB address. When it sees the translation error, it can
  touch the request buffer to bring the page into memory and reissue the
  NX request.
- If copy_to_user fails on the user space CSB address, the OS sends a SEGV
  signal.
These patches were tested with the NX-GZIP support series, which will be
posted soon.

Patches 1 & 2: Define functions to allocate an IRQ and get the port address
               per chip, which are needed to allocate an IRQ per VAS
               instance.
Patch 3: Define nx_fault_stamp, in which NX writes the fault status for the
         fault CRB.
Patch 4: Allocate and set up the IRQ and trigger port address for each VAS
         instance.
Patch 5: Set up a fault window per VAS instance. NX uses this window to
         paste fault CRBs into the FIFO.
Patches 6 & 7: Set up a threaded IRQ per VAS instance and register NX with
         the fault window ID and port number for each send window so that
         NX pastes fault CRBs in this window.
Patch 8: Take a reference to the pid and mm so that the pid is not reused
         until the window is closed. Needed for multi-threaded applications
         where a child can open a window that is used by the parent later.
Patches 9 and 10: Process CRBs from the fault FIFO and notify tasks by
         updating the CSB or through signals.
Patches 11 and 12: Return credits for send and fault windows after handling
         faults.
Patch 14: Fix closing of the send window so that it happens after all
         credits are returned. This issue occurs only for user space
         requests, since there are no page faults on kernel request
         buffers.

Changelog:
V2:
  - Use threaded IRQ instead of own kernel thread handler
  - Use pswid instead of user space CSB address to find valid CRB
  - Removed unused macros and other changes as suggested by Christoph Hellwig

V3:
  - Rebased to 5.5-rc2
  - Use struct pid * instead of pid_t for vas_window tgid
  - Code cleanup as suggested by Christoph Hellwig

V4:
  - Define xive alloc and get IRQ info based on chip ID and use these
    functions for IRQ setup per VAS instance. It eliminates skiboot
    dependency as suggested by Oliver.

V5:
  - Do not update CSB if the process is exiting (patch9)

Haren Myneni (14):
  powerpc/xive: Define xive_native_alloc_irq_on_chip()
  powerpc/xive: Define xive_native_alloc_get_irq_info()
  powerpc/vas: Define nx_fault_stamp in coprocessor_request_block
  powerpc/vas: Alloc and setup IRQ and trigger port address
  powerpc/vas: Setup fault window per VAS instance
  powerpc/vas: Setup thread IRQ handler per VAS instance
  powerpc/vas: Register NX with fault window ID and IRQ port value
  powerpc/vas: Take reference to PID and mm for user space windows
  powerpc/vas: Update CSB and notify process for fault CRBs
  powerpc/vas: Print CRB and FIFO values
  powerpc/vas: Do not use default credits for receive window
  powerpc/VAS: Return credits after handling fault
  powerpc/vas: Display process stuck message
  powerpc/vas: Free send window in VAS instance after credits returned

 arch/powerpc/include/asm/icswx.h            |  18 +-
 arch/powerpc/include/asm/xive.h             |  11 +-
 arch/powerpc/platforms/powernv/Makefile     |   2 +-
 arch/powerpc/platforms/powernv/ocxl.c       |  20 +-
 arch/powerpc/platforms/powernv/vas-debug.c  |   2 +-
 arch/powerpc/platforms/powernv/vas-fault.c  | 325 ++++++++++++++++++++++++++++
 arch/powerpc/platforms/powernv/vas-window.c | 184 ++++++++++++++--
 arch/powerpc/platforms/powernv/vas.c        |  73 ++++++-
 arch/powerpc/platforms/powernv/vas.h        |  38 +++-
 arch/powerpc/sysdev/xive/native.c           |  29 ++-
 10 files changed, 655 insertions(+), 47 deletions(-)
 create mode 100644 arch/powerpc/platforms/powernv/vas-fault.c

-- 
1.8.3.1





* [PATCH V5 01/14] powerpc/xive: Define xive_native_alloc_irq_on_chip()
  2020-01-22  7:56 [PATCH V5 00/14] powerpc/vas: Page fault handling for user space NX requests Haren Myneni
@ 2020-01-22  8:06 ` Haren Myneni
  2020-01-22  8:06 ` [PATCH V5 02/14] powerpc/xive: Define xive_native_alloc_get_irq_info() Haren Myneni
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Haren Myneni @ 2020-01-22  8:06 UTC (permalink / raw)
  To: mpe; +Cc: mikey, herbert, npiggin, hch, oohall, sukadev, linuxppc-dev


This function allocates an IRQ on a specific chip. VAS needs per-chip
IRQ allocation and will have an IRQ handler per VAS instance.

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
---
 arch/powerpc/include/asm/xive.h   | 9 ++++++++-
 arch/powerpc/sysdev/xive/native.c | 6 +++---
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/xive.h b/arch/powerpc/include/asm/xive.h
index 24cdf97..7ba6a90 100644
--- a/arch/powerpc/include/asm/xive.h
+++ b/arch/powerpc/include/asm/xive.h
@@ -5,6 +5,8 @@
 #ifndef _ASM_POWERPC_XIVE_H
 #define _ASM_POWERPC_XIVE_H
 
+#include <asm/opal-api.h>
+
 #define XIVE_INVALID_VP	0xffffffff
 
 #ifdef CONFIG_PPC_XIVE
@@ -108,7 +110,6 @@ struct xive_q {
 extern int xive_native_populate_irq_data(u32 hw_irq,
 					 struct xive_irq_data *data);
 extern void xive_cleanup_irq_data(struct xive_irq_data *xd);
-extern u32 xive_native_alloc_irq(void);
 extern void xive_native_free_irq(u32 irq);
 extern int xive_native_configure_irq(u32 hw_irq, u32 target, u8 prio, u32 sw_irq);
 
@@ -137,6 +138,12 @@ extern int xive_native_set_queue_state(u32 vp_id, uint32_t prio, u32 qtoggle,
 				       u32 qindex);
 extern int xive_native_get_vp_state(u32 vp_id, u64 *out_state);
 extern bool xive_native_has_queue_state_support(void);
+extern u32 xive_native_alloc_irq_on_chip(u32 chip_id);
+
+static inline u32 xive_native_alloc_irq(void)
+{
+	return xive_native_alloc_irq_on_chip(OPAL_XIVE_ANY_CHIP);
+}
 
 #else
 
diff --git a/arch/powerpc/sysdev/xive/native.c b/arch/powerpc/sysdev/xive/native.c
index 0ff6b73..14d4406 100644
--- a/arch/powerpc/sysdev/xive/native.c
+++ b/arch/powerpc/sysdev/xive/native.c
@@ -279,12 +279,12 @@ static int xive_native_get_ipi(unsigned int cpu, struct xive_cpu *xc)
 }
 #endif /* CONFIG_SMP */
 
-u32 xive_native_alloc_irq(void)
+u32 xive_native_alloc_irq_on_chip(u32 chip_id)
 {
 	s64 rc;
 
 	for (;;) {
-		rc = opal_xive_allocate_irq(OPAL_XIVE_ANY_CHIP);
+		rc = opal_xive_allocate_irq(chip_id);
 		if (rc != OPAL_BUSY)
 			break;
 		msleep(OPAL_BUSY_DELAY_MS);
@@ -293,7 +293,7 @@ u32 xive_native_alloc_irq(void)
 		return 0;
 	return rc;
 }
-EXPORT_SYMBOL_GPL(xive_native_alloc_irq);
+EXPORT_SYMBOL_GPL(xive_native_alloc_irq_on_chip);
 
 void xive_native_free_irq(u32 irq)
 {
-- 
1.8.3.1





* [PATCH V5 02/14] powerpc/xive: Define xive_native_alloc_get_irq_info()
  2020-01-22  7:56 [PATCH V5 00/14] powerpc/vas: Page fault handling for user space NX requests Haren Myneni
  2020-01-22  8:06 ` [PATCH V5 01/14] powerpc/xive: Define xive_native_alloc_irq_on_chip() Haren Myneni
@ 2020-01-22  8:06 ` Haren Myneni
  2020-01-22  8:07 ` [PATCH V5 03/14] powerpc/vas: Define nx_fault_stamp in coprocessor_request_block Haren Myneni
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Haren Myneni @ 2020-01-22  8:06 UTC (permalink / raw)
  To: mpe; +Cc: mikey, herbert, npiggin, hch, oohall, sukadev, linuxppc-dev


pnv_ocxl_alloc_xive_irq() in ocxl.c allocates an IRQ and gets its trigger
port address. VAS also needs this functionality, but based on chip ID. So
move the common code to xive/native.c.

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
---
 arch/powerpc/include/asm/xive.h       |  2 ++
 arch/powerpc/platforms/powernv/ocxl.c | 20 ++------------------
 arch/powerpc/sysdev/xive/native.c     | 23 +++++++++++++++++++++++
 3 files changed, 27 insertions(+), 18 deletions(-)

diff --git a/arch/powerpc/include/asm/xive.h b/arch/powerpc/include/asm/xive.h
index 7ba6a90..382f5ed 100644
--- a/arch/powerpc/include/asm/xive.h
+++ b/arch/powerpc/include/asm/xive.h
@@ -139,6 +139,8 @@ extern int xive_native_set_queue_state(u32 vp_id, uint32_t prio, u32 qtoggle,
 extern int xive_native_get_vp_state(u32 vp_id, u64 *out_state);
 extern bool xive_native_has_queue_state_support(void);
 extern u32 xive_native_alloc_irq_on_chip(u32 chip_id);
+extern int xive_native_alloc_get_irq_info(u32 chip_id, u32 *irq,
+					u64 *trigger_addr);
 
 static inline u32 xive_native_alloc_irq(void)
 {
diff --git a/arch/powerpc/platforms/powernv/ocxl.c b/arch/powerpc/platforms/powernv/ocxl.c
index 8c65aac..fb8f99a 100644
--- a/arch/powerpc/platforms/powernv/ocxl.c
+++ b/arch/powerpc/platforms/powernv/ocxl.c
@@ -487,24 +487,8 @@ int pnv_ocxl_spa_remove_pe_from_cache(void *platform_data, int pe_handle)
 
 int pnv_ocxl_alloc_xive_irq(u32 *irq, u64 *trigger_addr)
 {
-	__be64 flags, trigger_page;
-	s64 rc;
-	u32 hwirq;
-
-	hwirq = xive_native_alloc_irq();
-	if (!hwirq)
-		return -ENOENT;
-
-	rc = opal_xive_get_irq_info(hwirq, &flags, NULL, &trigger_page, NULL,
-				NULL);
-	if (rc || !trigger_page) {
-		xive_native_free_irq(hwirq);
-		return -ENOENT;
-	}
-	*irq = hwirq;
-	*trigger_addr = be64_to_cpu(trigger_page);
-	return 0;
-
+	return xive_native_alloc_get_irq_info(OPAL_XIVE_ANY_CHIP, irq,
+						trigger_addr);
 }
 EXPORT_SYMBOL_GPL(pnv_ocxl_alloc_xive_irq);
 
diff --git a/arch/powerpc/sysdev/xive/native.c b/arch/powerpc/sysdev/xive/native.c
index 14d4406..abdd892 100644
--- a/arch/powerpc/sysdev/xive/native.c
+++ b/arch/powerpc/sysdev/xive/native.c
@@ -295,6 +295,29 @@ u32 xive_native_alloc_irq_on_chip(u32 chip_id)
 }
 EXPORT_SYMBOL_GPL(xive_native_alloc_irq_on_chip);
 
+int xive_native_alloc_get_irq_info(u32 chip_id, u32 *irq, u64 *trigger_addr)
+{
+	__be64 flags, trigger_page;
+	u32 hwirq;
+	s64 rc;
+
+	hwirq = xive_native_alloc_irq_on_chip(chip_id);
+	if (!hwirq)
+		return -ENOENT;
+
+	rc = opal_xive_get_irq_info(hwirq, &flags, NULL, &trigger_page, NULL,
+				NULL);
+	if (rc || !trigger_page) {
+		xive_native_free_irq(hwirq);
+		return -ENOENT;
+	}
+	*irq = hwirq;
+	*trigger_addr = be64_to_cpu(trigger_page);
+
+	return 0;
+}
+EXPORT_SYMBOL(xive_native_alloc_get_irq_info);
+
 void xive_native_free_irq(u32 irq)
 {
 	for (;;) {
-- 
1.8.3.1





* [PATCH V5 03/14] powerpc/vas: Define nx_fault_stamp in coprocessor_request_block
  2020-01-22  7:56 [PATCH V5 00/14] powerpc/vas: Page fault handling for user space NX requests Haren Myneni
  2020-01-22  8:06 ` [PATCH V5 01/14] powerpc/xive: Define xive_native_alloc_irq_on_chip() Haren Myneni
  2020-01-22  8:06 ` [PATCH V5 02/14] powerpc/xive: Define xive_native_alloc_get_irq_info() Haren Myneni
@ 2020-01-22  8:07 ` Haren Myneni
  2020-01-22  8:08 ` [PATCH V5 04/14] powerpc/vas: Alloc and setup IRQ and trigger port address Haren Myneni
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Haren Myneni @ 2020-01-22  8:07 UTC (permalink / raw)
  To: mpe; +Cc: mikey, herbert, npiggin, hch, oohall, sukadev, linuxppc-dev


After processing an NX page fault on a user space address, the kernel
sets the fault address and status in the CRB. User space gets the signal,
handles the fault reported in the CRB by bringing the page into memory,
and sends the NX request again.
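
The layout introduced by this patch can be checked from user space with a
small sketch like the one below. It mirrors the field order of
nx_fault_stamp from the diff, but uses plain fixed-width integer types in
place of the kernel's __be64/__be32 types, and the sketch_* names are
hypothetical. It assumes a GCC or Clang style compiler for the packed and
aligned attributes.

```c
#include <stddef.h>
#include <stdint.h>

/* User space mirror of nx_fault_stamp; fields hold big-endian data. */
struct sketch_nx_fault_stamp {
	uint64_t fault_storage_addr;
	uint16_t reserved;
	uint8_t  flags;
	uint8_t  fault_status;
	uint32_t pswid;
} __attribute__((packed, aligned(16)));

/* The stamp must fit exactly in the 16 bytes carved out of the CRB. */
_Static_assert(sizeof(struct sketch_nx_fault_stamp) == 16,
	       "fault stamp must be 16 bytes");
_Static_assert(offsetof(struct sketch_nx_fault_stamp, pswid) == 12,
	       "pswid must be the last 4 bytes of the stamp");

/* 16-byte stamp plus 32 reserved bytes replace the old reserved[48]. */
_Static_assert(sizeof(struct sketch_nx_fault_stamp) + 32 == 48,
	       "CRB size must not change");

/* Read the big-endian pswid from a raw 16-byte stamp on any host. */
uint32_t sketch_stamp_pswid(const void *raw_stamp)
{
	const uint8_t *b = (const uint8_t *)raw_stamp +
			   offsetof(struct sketch_nx_fault_stamp, pswid);

	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}
```

Decoding byte by byte avoids depending on the host byte order, which is
what be32_to_cpu() provides inside the kernel.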

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Haren Myneni <haren@linux.ibm.com>
---
 arch/powerpc/include/asm/icswx.h | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/icswx.h b/arch/powerpc/include/asm/icswx.h
index 9872f85..b233d1e 100644
--- a/arch/powerpc/include/asm/icswx.h
+++ b/arch/powerpc/include/asm/icswx.h
@@ -108,6 +108,17 @@ struct data_descriptor_entry {
 	__be64 address;
 } __packed __aligned(DDE_ALIGN);
 
+/* 4.3.2 NX-stamped Fault CRB */
+
+#define NX_STAMP_ALIGN          (0x10)
+
+struct nx_fault_stamp {
+	__be64 fault_storage_addr;
+	__be16 reserved;
+	__u8   flags;
+	__u8   fault_status;
+	__be32 pswid;
+} __packed __aligned(NX_STAMP_ALIGN);
 
 /* Chapter 6.5.2 Coprocessor-Request Block (CRB) */
 
@@ -135,7 +146,12 @@ struct coprocessor_request_block {
 
 	struct coprocessor_completion_block ccb;
 
-	u8 reserved[48];
+	union {
+		struct nx_fault_stamp nx;
+		u8 reserved[16];
+	} stamp;
+
+	u8 reserved[32];
 
 	struct coprocessor_status_block csb;
 } __packed __aligned(CRB_ALIGN);
-- 
1.8.3.1





* [PATCH V5 04/14] powerpc/vas: Alloc and setup IRQ and trigger port address
  2020-01-22  7:56 [PATCH V5 00/14] powerpc/vas: Page fault handling for user space NX requests Haren Myneni
                   ` (2 preceding siblings ...)
  2020-01-22  8:07 ` [PATCH V5 03/14] powerpc/vas: Define nx_fault_stamp in coprocessor_request_block Haren Myneni
@ 2020-01-22  8:08 ` Haren Myneni
  2020-01-22  8:08 ` [PATCH V5 05/14] powerpc/vas: Setup fault window per VAS instance Haren Myneni
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Haren Myneni @ 2020-01-22  8:08 UTC (permalink / raw)
  To: mpe; +Cc: mikey, herbert, npiggin, hch, oohall, sukadev, linuxppc-dev


Allocate an IRQ and get the trigger port address for each VAS instance.
The kernel registers this IRQ per VAS instance and sets this port for each
send window. NX interrupts the kernel when it sees a page fault.

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
---
 arch/powerpc/platforms/powernv/vas.c | 34 ++++++++++++++++++++++++++++------
 arch/powerpc/platforms/powernv/vas.h |  2 ++
 2 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/vas.c b/arch/powerpc/platforms/powernv/vas.c
index ed9cc6d..168ab68 100644
--- a/arch/powerpc/platforms/powernv/vas.c
+++ b/arch/powerpc/platforms/powernv/vas.c
@@ -15,6 +15,7 @@
 #include <linux/of_address.h>
 #include <linux/of.h>
 #include <asm/prom.h>
+#include <asm/xive.h>
 
 #include "vas.h"
 
@@ -25,10 +26,12 @@
 
 static int init_vas_instance(struct platform_device *pdev)
 {
-	int rc, cpu, vasid;
-	struct resource *res;
-	struct vas_instance *vinst;
 	struct device_node *dn = pdev->dev.of_node;
+	struct vas_instance *vinst;
+	uint32_t chipid, irq;
+	struct resource *res;
+	int rc, cpu, vasid;
+	uint64_t port;
 
 	rc = of_property_read_u32(dn, "ibm,vas-id", &vasid);
 	if (rc) {
@@ -36,6 +39,12 @@ static int init_vas_instance(struct platform_device *pdev)
 		return -ENODEV;
 	}
 
+	rc = of_property_read_u32(dn, "ibm,chip-id", &chipid);
+	if (rc) {
+		pr_err("No ibm,chip-id property for %s?\n", pdev->name);
+		return -ENODEV;
+	}
+
 	if (pdev->num_resources != 4) {
 		pr_err("Unexpected DT configuration for [%s, %d]\n",
 				pdev->name, vasid);
@@ -69,9 +78,22 @@ static int init_vas_instance(struct platform_device *pdev)
 
 	vinst->paste_win_id_shift = 63 - res->end;
 
-	pr_devel("Initialized instance [%s, %d], paste_base 0x%llx, "
-			"paste_win_id_shift 0x%llx\n", pdev->name, vasid,
-			vinst->paste_base_addr, vinst->paste_win_id_shift);
+	rc = xive_native_alloc_get_irq_info(chipid, &irq, &port);
+	if (rc)
+		return rc;
+
+	vinst->virq = irq_create_mapping(NULL, irq);
+	if (!vinst->virq) {
+		pr_err("Inst%d: Unable to map global irq %d\n",
+				vinst->vas_id, irq);
+		return -EINVAL;
+	}
+
+	vinst->irq_port = port;
+	pr_devel("Initialized instance [%s, %d] paste_base 0x%llx paste_win_id_shift 0x%llx IRQ %d Port 0x%llx\n",
+			pdev->name, vasid, vinst->paste_base_addr,
+			vinst->paste_win_id_shift, vinst->virq,
+			vinst->irq_port);
 
 	for_each_possible_cpu(cpu) {
 		if (cpu_to_chip_id(cpu) == of_get_ibm_chip_id(dn))
diff --git a/arch/powerpc/platforms/powernv/vas.h b/arch/powerpc/platforms/powernv/vas.h
index 5574aec..598608b 100644
--- a/arch/powerpc/platforms/powernv/vas.h
+++ b/arch/powerpc/platforms/powernv/vas.h
@@ -313,6 +313,8 @@ struct vas_instance {
 	u64 paste_base_addr;
 	u64 paste_win_id_shift;
 
+	u64 irq_port;
+	int virq;
 	struct mutex mutex;
 	struct vas_window *rxwin[VAS_COP_TYPE_MAX];
 	struct vas_window *windows[VAS_WINDOWS_PER_CHIP];
-- 
1.8.3.1





* [PATCH V5 05/14] powerpc/vas: Setup fault window per VAS instance
  2020-01-22  7:56 [PATCH V5 00/14] powerpc/vas: Page fault handling for user space NX requests Haren Myneni
                   ` (3 preceding siblings ...)
  2020-01-22  8:08 ` [PATCH V5 04/14] powerpc/vas: Alloc and setup IRQ and trigger port address Haren Myneni
@ 2020-01-22  8:08 ` Haren Myneni
  2020-01-22  8:10 ` [PATCH V5 06/14] powerpc/vas: Setup thread IRQ handler " Haren Myneni
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Haren Myneni @ 2020-01-22  8:08 UTC (permalink / raw)
  To: mpe; +Cc: mikey, herbert, npiggin, hch, oohall, sukadev, linuxppc-dev


Set up a fault window for each VAS instance. When NX gets a fault on a
request buffer, it writes the fault CRB into the corresponding fault FIFO
and then sends an interrupt to the OS.
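
The credit count for the fault window follows from the FIFO size, as the
comment in vas_setup_fault_window() explains. A minimal sketch of that
arithmetic is shown below. It assumes a CRB size of 128 bytes (0x80),
which is not stated in this patch, and the explicit clamp to the 16-bit
credit field is added here for illustration only; the patch itself just
divides the FIFO size by CRB_SIZE. The sketch_* names are hypothetical.

```c
#include <stdint.h>

#define SKETCH_CRB_SIZE		0x80u	/* assumption: 128-byte CRBs */
#define SKETCH_MAX_RX_CREDITS	0xffffu	/* receive credits field is 16 bits */

/* Number of fault CRBs that fit in the FIFO, limited to the field width. */
uint32_t sketch_fault_win_credits(uint32_t fifo_size)
{
	uint32_t creds = fifo_size / SKETCH_CRB_SIZE;

	return creds > SKETCH_MAX_RX_CREDITS ? SKETCH_MAX_RX_CREDITS : creds;
}
```

Under these assumptions a 4MB FIFO gives 32768 credits, while an 8MB FIFO
would give 65536 and is therefore limited to 0xffff.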

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Haren Myneni <haren@linux.ibm.com>
---
 arch/powerpc/platforms/powernv/Makefile     |  2 +-
 arch/powerpc/platforms/powernv/vas-fault.c  | 73 +++++++++++++++++++++++++++++
 arch/powerpc/platforms/powernv/vas-window.c |  4 +-
 arch/powerpc/platforms/powernv/vas.c        | 20 ++++++++
 arch/powerpc/platforms/powernv/vas.h        |  5 ++
 5 files changed, 101 insertions(+), 3 deletions(-)
 create mode 100644 arch/powerpc/platforms/powernv/vas-fault.c

diff --git a/arch/powerpc/platforms/powernv/Makefile b/arch/powerpc/platforms/powernv/Makefile
index c0f8120..395789f 100644
--- a/arch/powerpc/platforms/powernv/Makefile
+++ b/arch/powerpc/platforms/powernv/Makefile
@@ -17,7 +17,7 @@ obj-$(CONFIG_MEMORY_FAILURE)	+= opal-memory-errors.o
 obj-$(CONFIG_OPAL_PRD)	+= opal-prd.o
 obj-$(CONFIG_PERF_EVENTS) += opal-imc.o
 obj-$(CONFIG_PPC_MEMTRACE)	+= memtrace.o
-obj-$(CONFIG_PPC_VAS)	+= vas.o vas-window.o vas-debug.o
+obj-$(CONFIG_PPC_VAS)	+= vas.o vas-window.o vas-debug.o vas-fault.o
 obj-$(CONFIG_OCXL_BASE)	+= ocxl.o
 obj-$(CONFIG_SCOM_DEBUGFS) += opal-xscom.o
 obj-$(CONFIG_PPC_SECURE_BOOT) += opal-secvar.o
diff --git a/arch/powerpc/platforms/powernv/vas-fault.c b/arch/powerpc/platforms/powernv/vas-fault.c
new file mode 100644
index 0000000..b0258ed
--- /dev/null
+++ b/arch/powerpc/platforms/powernv/vas-fault.c
@@ -0,0 +1,73 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * VAS Fault handling.
+ * Copyright 2019, IBM Corporation
+ */
+
+#define pr_fmt(fmt) "vas: " fmt
+
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+#include <linux/kthread.h>
+#include <asm/icswx.h>
+
+#include "vas.h"
+
+/*
+ * The maximum FIFO size for fault window can be 8MB
+ * (VAS_RX_FIFO_SIZE_MAX). Using 4MB FIFO since each VAS
+ * instance will be having fault window.
+ * 8MB FIFO can be used if expects more faults for each VAS
+ * instance.
+ */
+#define VAS_FAULT_WIN_FIFO_SIZE	(4 << 20)
+
+/*
+ * Fault window is opened per VAS instance. NX pastes fault CRB in fault
+ * FIFO upon page faults.
+ */
+int vas_setup_fault_window(struct vas_instance *vinst)
+{
+	struct vas_rx_win_attr attr;
+
+	vinst->fault_fifo_size = VAS_FAULT_WIN_FIFO_SIZE;
+	vinst->fault_fifo = kzalloc(vinst->fault_fifo_size, GFP_KERNEL);
+	if (!vinst->fault_fifo) {
+		pr_err("Unable to alloc %d bytes for fault_fifo\n",
+				vinst->fault_fifo_size);
+		return -ENOMEM;
+	}
+
+	vas_init_rx_win_attr(&attr, VAS_COP_TYPE_FAULT);
+
+	attr.rx_fifo_size = vinst->fault_fifo_size;
+	attr.rx_fifo = vinst->fault_fifo;
+
+	/*
+	 * Max creds is based on number of CRBs can fit in the FIFO.
+	 * (fault_fifo_size/CRB_SIZE). If 8MB FIFO is used, max creds
+	 * will be 0xffff since the receive creds field is 16bits wide.
+	 */
+	attr.wcreds_max = vinst->fault_fifo_size / CRB_SIZE;
+	attr.lnotify_lpid = 0;
+	attr.lnotify_pid = mfspr(SPRN_PID);
+	attr.lnotify_tid = mfspr(SPRN_PID);
+
+	vinst->fault_win = vas_rx_win_open(vinst->vas_id, VAS_COP_TYPE_FAULT,
+					&attr);
+
+	if (IS_ERR(vinst->fault_win)) {
+		pr_err("VAS: Error %ld opening FaultWin\n",
+			PTR_ERR(vinst->fault_win));
+		kfree(vinst->fault_fifo);
+		return PTR_ERR(vinst->fault_win);
+	}
+
+	pr_devel("VAS: Created FaultWin %d, LPID/PID/TID [%d/%d/%d]\n",
+			vinst->fault_win->winid, attr.lnotify_lpid,
+			attr.lnotify_pid, attr.lnotify_tid);
+
+	return 0;
+}
diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
index 0c0d27d..1783fa9 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -827,9 +827,9 @@ void vas_init_rx_win_attr(struct vas_rx_win_attr *rxattr, enum vas_cop_type cop)
 		rxattr->fault_win = true;
 		rxattr->notify_disable = true;
 		rxattr->rx_wcred_mode = true;
-		rxattr->tx_wcred_mode = true;
 		rxattr->rx_win_ord_mode = true;
-		rxattr->tx_win_ord_mode = true;
+		rxattr->rej_no_credit = true;
+		rxattr->tc_mode = VAS_THRESH_DISABLED;
 	} else if (cop == VAS_COP_TYPE_FTW) {
 		rxattr->user_win = true;
 		rxattr->intr_disable = true;
diff --git a/arch/powerpc/platforms/powernv/vas.c b/arch/powerpc/platforms/powernv/vas.c
index 168ab68..557c8e4 100644
--- a/arch/powerpc/platforms/powernv/vas.c
+++ b/arch/powerpc/platforms/powernv/vas.c
@@ -24,6 +24,11 @@
 
 static DEFINE_PER_CPU(int, cpu_vas_id);
 
+static int vas_irq_fault_window_setup(struct vas_instance *vinst)
+{
+	return vas_setup_fault_window(vinst);
+}
+
 static int init_vas_instance(struct platform_device *pdev)
 {
 	struct device_node *dn = pdev->dev.of_node;
@@ -104,6 +109,21 @@ static int init_vas_instance(struct platform_device *pdev)
 	list_add(&vinst->node, &vas_instances);
 	mutex_unlock(&vas_mutex);
 
+	/*
+	 * IRQ and fault handling setup is needed only for user space
+	 * send windows.
+	 */
+	if (vinst->virq) {
+		rc = vas_irq_fault_window_setup(vinst);
+		/*
+		 * Fault window is used only for user space send windows.
+		 * So if vinst->virq is 0, tx_win_open returns -ENODEV
+		 * for user space.
+		 */
+		if (rc)
+			vinst->virq = 0;
+	}
+
 	vas_instance_init_dbgdir(vinst);
 
 	dev_set_drvdata(&pdev->dev, vinst);
diff --git a/arch/powerpc/platforms/powernv/vas.h b/arch/powerpc/platforms/powernv/vas.h
index 598608b..9f08daa 100644
--- a/arch/powerpc/platforms/powernv/vas.h
+++ b/arch/powerpc/platforms/powernv/vas.h
@@ -315,6 +315,10 @@ struct vas_instance {
 
 	u64 irq_port;
 	int virq;
+	int fault_fifo_size;
+	void *fault_fifo;
+	struct vas_window *fault_win; /* Fault window */
+
 	struct mutex mutex;
 	struct vas_window *rxwin[VAS_COP_TYPE_MAX];
 	struct vas_window *windows[VAS_WINDOWS_PER_CHIP];
@@ -408,6 +412,7 @@ struct vas_winctx {
 extern void vas_instance_init_dbgdir(struct vas_instance *vinst);
 extern void vas_window_init_dbgdir(struct vas_window *win);
 extern void vas_window_free_dbgdir(struct vas_window *win);
+extern int vas_setup_fault_window(struct vas_instance *vinst);
 
 static inline void vas_log_write(struct vas_window *win, char *name,
 			void *regptr, u64 val)
-- 
1.8.3.1





* [PATCH V5 06/14] powerpc/vas: Setup thread IRQ handler per VAS instance
  2020-01-22  7:56 [PATCH V5 00/14] powerpc/vas: Page fault handling for user space NX requests Haren Myneni
                   ` (4 preceding siblings ...)
  2020-01-22  8:08 ` [PATCH V5 05/14] powerpc/vas: Setup fault window per VAS instance Haren Myneni
@ 2020-01-22  8:10 ` Haren Myneni
  2020-02-07  5:57   ` Michael Neuling
  2020-01-22  8:11 ` [PATCH V5 07/14] powerpc/vas: Register NX with fault window ID and IRQ port value Haren Myneni
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 21+ messages in thread
From: Haren Myneni @ 2020-01-22  8:10 UTC (permalink / raw)
  To: mpe; +Cc: mikey, herbert, npiggin, hch, oohall, sukadev, linuxppc-dev


Set up a threaded IRQ handler for each VAS instance. When NX sees a fault
on a CRB, the kernel gets an interrupt and vas_fault_handler is executed
to process the fault CRBs. It reads all valid CRBs from the fault FIFO,
determines the corresponding send window from each CRB and processes the
fault requests.
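
The FIFO walk described above (read entries until an invalid one is
found, zero each slot after reading it, wrap at the end of the FIFO) can
be sketched in user space as follows. This is a simplified model and not
the kernel code: the sketch_crb layout does not match the real CRB, the
pswid is kept in host byte order, and the locking done by the handler is
left out. All sketch_* names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_CRB_SIZE 128

/* Simplified fault CRB: only the pswid matters for this sketch. */
struct sketch_crb {
	uint32_t pswid;		/* non-zero marks a valid entry */
	uint8_t payload[SKETCH_CRB_SIZE - sizeof(uint32_t)];
};

_Static_assert(sizeof(struct sketch_crb) == SKETCH_CRB_SIZE,
	       "sketch CRB must fill one FIFO slot");

struct sketch_fifo {
	uint8_t *base;	/* start of the FIFO memory */
	size_t size;	/* size in bytes, a multiple of SKETCH_CRB_SIZE */
	size_t next;	/* index of the next slot to read (fault_crbs) */
};

/* Stand-in for NX pasting a fault CRB into a given slot. */
void sketch_paste(struct sketch_fifo *f, size_t slot, uint32_t pswid)
{
	memcpy(f->base + slot * SKETCH_CRB_SIZE, &pswid, sizeof(pswid));
}

/* Process all valid entries; returns how many were consumed. */
size_t sketch_drain(struct sketch_fifo *f)
{
	size_t nslots = f->size / SKETCH_CRB_SIZE;
	size_t handled = 0;

	for (;;) {
		uint8_t *slot = f->base + f->next * SKETCH_CRB_SIZE;
		struct sketch_crb copy;

		/* Work on a private copy so the slot can be reused. */
		memcpy(&copy, slot, sizeof(copy));
		if (!copy.pswid)
			return handled;	/* reached an invalid (zeroed) entry */

		f->next = (f->next + 1) % nslots;
		memset(slot, 0, SKETCH_CRB_SIZE);
		handled++;
	}
}
```

Because every slot is zeroed once it has been read, the loop ends after
at most one pass over the FIFO even if all slots were valid.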

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Haren Myneni <haren@linux.ibm.com>
---
 arch/powerpc/platforms/powernv/vas-fault.c  | 85 +++++++++++++++++++++++++++++
 arch/powerpc/platforms/powernv/vas-window.c | 60 ++++++++++++++++++++
 arch/powerpc/platforms/powernv/vas.c        | 21 ++++++-
 arch/powerpc/platforms/powernv/vas.h        |  4 ++
 4 files changed, 169 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/powernv/vas-fault.c b/arch/powerpc/platforms/powernv/vas-fault.c
index b0258ed..5c2cada 100644
--- a/arch/powerpc/platforms/powernv/vas-fault.c
+++ b/arch/powerpc/platforms/powernv/vas-fault.c
@@ -11,6 +11,7 @@
 #include <linux/slab.h>
 #include <linux/uaccess.h>
 #include <linux/kthread.h>
+#include <linux/mmu_context.h>
 #include <asm/icswx.h>
 
 #include "vas.h"
@@ -25,6 +26,90 @@
 #define VAS_FAULT_WIN_FIFO_SIZE	(4 << 20)
 
 /*
+ * Process CRBs that we receive on the fault window.
+ */
+irqreturn_t vas_fault_handler(int irq, void *data)
+{
+	struct vas_instance *vinst = data;
+	struct coprocessor_request_block buf, *crb;
+	struct vas_window *window;
+	void *fifo;
+
+	/*
+	 * VAS can interrupt with multiple page faults. So process all
+	 * valid CRBs within fault FIFO until reaches invalid CRB.
+	 * NX updates nx_fault_stamp in CRB and pastes in fault FIFO.
+	 * kernel retrieves send window from partition send window ID
+	 * (pswid) in nx_fault_stamp. So pswid should be non-zero and
+	 * use this to check whether CRB is valid.
+	 * After reading CRB entry, it is reset with 0's in fault FIFO.
+	 *
+	 * In case kernel receives another interrupt with different page
+	 * fault and CRBs are processed by the previous handling, will be
+	 * returned from this function when it sees invalid CRB (means 0's).
+	 */
+	do {
+		mutex_lock(&vinst->mutex);
+
+		/*
+		 * Advance the fault fifo pointer to next CRB.
+		 * Use CRB_SIZE rather than sizeof(*crb) since the latter is
+		 * aligned to CRB_ALIGN (256) but the CRB written to by VAS is
+		 * only CRB_SIZE in len.
+		 */
+		fifo = vinst->fault_fifo + (vinst->fault_crbs * CRB_SIZE);
+		crb = fifo;
+
+		/*
+		 * NX pastes nx_fault_stamp in fault FIFO for each fault.
+		 * So use pswid to check whether fault CRB is valid.
+		 * pswid returned from NX will be in __be32, but just
+		 * checking non-zero value to make sure the CRB is valid.
+		 * Return if reached invalid CRB.
+		 */
+		if (!crb->stamp.nx.pswid) {
+			mutex_unlock(&vinst->mutex);
+			return IRQ_HANDLED;
+		}
+
+		vinst->fault_crbs++;
+		if (vinst->fault_crbs == (vinst->fault_fifo_size / CRB_SIZE))
+			vinst->fault_crbs = 0;
+
+		crb = &buf;
+		memcpy(crb, fifo, CRB_SIZE);
+		memset(fifo, 0, CRB_SIZE);
+		mutex_unlock(&vinst->mutex);
+
+		pr_devel("VAS[%d] fault_fifo %p, fifo %p, fault_crbs %d\n",
+				vinst->vas_id, vinst->fault_fifo, fifo,
+				vinst->fault_crbs);
+
+		window = vas_pswid_to_window(vinst,
+				be32_to_cpu(crb->stamp.nx.pswid));
+
+		if (IS_ERR(window)) {
+			/*
+			 * We got an interrupt about a specific send
+			 * window but we can't find that window and we can't
+			 * even clean it up (return credit).
+			 * But we should not get here.
+			 */
+			pr_err("VAS[%d] fault_fifo %p, fifo %p, pswid 0x%x, fault_crbs %d bad CRB?\n",
+				vinst->vas_id, vinst->fault_fifo, fifo,
+				be32_to_cpu(crb->stamp.nx.pswid),
+				vinst->fault_crbs);
+
+			WARN_ON_ONCE(1);
+			return IRQ_HANDLED;
+		}
+
+	} while (true);
+
+	return IRQ_HANDLED;
+}
+
+/*
  * Fault window is opened per VAS instance. NX pastes fault CRB in fault
  * FIFO upon page faults.
  */
diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
index 1783fa9..7c6f55f 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -1040,6 +1040,15 @@ struct vas_window *vas_tx_win_open(int vasid, enum vas_cop_type cop,
 		}
 	} else {
 		/*
+		 * Interrupt handler or fault window setup failed. Means
+		 * NX can not generate fault for page fault. So not
+		 * opening for user space tx window.
+		 */
+		if (!vinst->virq) {
+			rc = -ENODEV;
+			goto free_window;
+		}
+		/*
 		 * A user mapping must ensure that context switch issues
 		 * CP_ABORT for this thread.
 		 */
@@ -1254,3 +1263,54 @@ int vas_win_close(struct vas_window *window)
 	return 0;
 }
 EXPORT_SYMBOL_GPL(vas_win_close);
+
+struct vas_window *vas_pswid_to_window(struct vas_instance *vinst,
+		uint32_t pswid)
+{
+	int winid;
+	struct vas_window *window;
+
+	if (!pswid) {
+		pr_devel("%s: called for pswid 0!\n", __func__);
+		return ERR_PTR(-ESRCH);
+	}
+
+	decode_pswid(pswid, NULL, &winid);
+
+	if (winid >= VAS_WINDOWS_PER_CHIP)
+		return ERR_PTR(-ESRCH);
+
+	/*
+	 * If application closes the window before the hardware
+	 * returns the fault CRB, we should wait in vas_win_close()
+	 * for the pending requests. so the window must be active
+	 * and the process alive.
+	 *
+	 * If its a kernel process, we should not get any faults and
+	 * should not get here.
+	 */
+	window = vinst->windows[winid];
+
+	if (!window) {
+		pr_err("PSWID decode: Could not find window for winid %d pswid %d vinst 0x%p\n",
+			winid, pswid, vinst);
+		return ERR_PTR(-ESRCH);
+	}
+
+	/*
+	 * Do some sanity checks on the decoded window.  Window should be
+	 * NX GZIP user send window. FTW windows should not incur faults
+	 * since their CRBs are ignored (not queued on FIFO or processed
+	 * by NX).
+	 */
+	if (!window->tx_win || !window->user_win || !window->nx_win ||
+			window->cop == VAS_COP_TYPE_FAULT ||
+			window->cop == VAS_COP_TYPE_FTW) {
+		pr_err("PSWID decode: id %d, tx %d, user %d, nx %d, cop %d\n",
+			winid, window->tx_win, window->user_win,
+			window->nx_win, window->cop);
+		WARN_ON(1);
+	}
+
+	return window;
+}
diff --git a/arch/powerpc/platforms/powernv/vas.c b/arch/powerpc/platforms/powernv/vas.c
index 557c8e4..46ea57d 100644
--- a/arch/powerpc/platforms/powernv/vas.c
+++ b/arch/powerpc/platforms/powernv/vas.c
@@ -14,6 +14,8 @@
 #include <linux/of_platform.h>
 #include <linux/of_address.h>
 #include <linux/of.h>
+#include <linux/irqdomain.h>
+#include <linux/interrupt.h>
 #include <asm/prom.h>
 #include <asm/xive.h>
 
@@ -26,7 +28,24 @@
 
 static int vas_irq_fault_window_setup(struct vas_instance *vinst)
 {
-	return vas_setup_fault_window(vinst);
+	char devname[64];
+	int rc = 0;
+
+	snprintf(devname, sizeof(devname), "vas-%d", vinst->vas_id);
+	rc = request_threaded_irq(vinst->virq, NULL, vas_fault_handler,
+					IRQF_ONESHOT, devname, vinst);
+	if (rc) {
+		pr_err("VAS[%d]: Request IRQ(%d) failed with %d\n",
+				vinst->vas_id, vinst->virq, rc);
+		goto out;
+	}
+
+	rc = vas_setup_fault_window(vinst);
+	if (rc)
+		free_irq(vinst->virq, vinst);
+
+out:
+	return rc;
 }
 
 static int init_vas_instance(struct platform_device *pdev)
diff --git a/arch/powerpc/platforms/powernv/vas.h b/arch/powerpc/platforms/powernv/vas.h
index 9f08daa..879f5b4 100644
--- a/arch/powerpc/platforms/powernv/vas.h
+++ b/arch/powerpc/platforms/powernv/vas.h
@@ -315,6 +315,7 @@ struct vas_instance {
 
 	u64 irq_port;
 	int virq;
+	int fault_crbs;
 	int fault_fifo_size;
 	void *fault_fifo;
 	struct vas_window *fault_win; /* Fault window */
@@ -413,6 +414,9 @@ struct vas_winctx {
 extern void vas_window_init_dbgdir(struct vas_window *win);
 extern void vas_window_free_dbgdir(struct vas_window *win);
 extern int vas_setup_fault_window(struct vas_instance *vinst);
+extern irqreturn_t vas_fault_handler(int irq, void *data);
+extern struct vas_window *vas_pswid_to_window(struct vas_instance *vinst,
+						uint32_t pswid);
 
 static inline void vas_log_write(struct vas_window *win, char *name,
 			void *regptr, u64 val)
-- 
1.8.3.1





* [PATCH V5 07/14] powerpc/vas: Register NX with fault window ID and IRQ port value
  2020-01-22  7:56 [PATCH V5 00/14] powerpc/vas: Page fault handling for user space NX requests Haren Myneni
                   ` (5 preceding siblings ...)
  2020-01-22  8:10 ` [PATCH V5 06/14] powerpc/vas: Setup thread IRQ handler " Haren Myneni
@ 2020-01-22  8:11 ` Haren Myneni
  2020-01-22  8:12 ` [PATCH V5 08/14] powerpc/vas: Take reference to PID and mm for user space windows Haren Myneni
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Haren Myneni @ 2020-01-22  8:11 UTC (permalink / raw)
  To: mpe; +Cc: mikey, herbert, npiggin, hch, oohall, sukadev, linuxppc-dev


For each user space send window, register the fault window ID and IRQ
port value with NX so that NX pastes CRBs into the fault FIFO when it
sees a fault on the request buffer.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Haren Myneni <haren@linux.ibm.com>
---
 arch/powerpc/platforms/powernv/vas-window.c | 15 +++++++++++++--
 arch/powerpc/platforms/powernv/vas.h        | 15 +++++++++++++++
 2 files changed, 28 insertions(+), 2 deletions(-)
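
Illustration only, not part of the patch: a minimal user-space sketch of
the PSWID layout described in the vas.h comment in this patch. It uses
IBM bit numbering (bit 0 is the most significant bit), so the VAS id
lands in the top byte and the window id in the low 16 bits. The names
sketch_encode_pswid() and sketch_decode_pswid() are hypothetical, and
the decode side is assumed to be the plain inverse of the encoding.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for encode_pswid(): VAS id in the top 8 bits
 * (IBM bits 0:7), window id in the low 16 bits (IBM bits 16:31).
 */
static uint32_t sketch_encode_pswid(uint32_t vasid, uint32_t winid)
{
	return ((vasid & 0xFF) << (31 - 7)) | (winid & 0xFFFF);
}

/* Assumed inverse of the encoding above; either output may be NULL. */
static void sketch_decode_pswid(uint32_t pswid, int *vasid, int *winid)
{
	if (vasid)
		*vasid = (int)((pswid >> (31 - 7)) & 0xFF);
	if (winid)
		*winid = (int)(pswid & 0xFFFF);
}
```

Under these assumptions, VAS id 1 with window id 5 encodes to
0x01000005. The sketch masks and shifts unsigned values, so a VAS id of
128 or more does not shift into the sign bit.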

diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
index 7c6f55f..a45d81d 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -373,7 +373,7 @@ int init_winctx_regs(struct vas_window *window, struct vas_winctx *winctx)
 	init_xlate_regs(window, winctx->user_win);
 
 	val = 0ULL;
-	val = SET_FIELD(VAS_FAULT_TX_WIN, val, 0);
+	val = SET_FIELD(VAS_FAULT_TX_WIN, val, winctx->fault_win_id);
 	write_hvwc_reg(window, VREG(FAULT_TX_WIN), val);
 
 	/* In PowerNV, interrupts go to HV. */
@@ -748,6 +748,8 @@ static void init_winctx_for_rxwin(struct vas_window *rxwin,
 
 	winctx->min_scope = VAS_SCOPE_LOCAL;
 	winctx->max_scope = VAS_SCOPE_VECTORED_GROUP;
+	if (rxwin->vinst->virq)
+		winctx->irq_port = rxwin->vinst->irq_port;
 }
 
 static bool rx_win_args_valid(enum vas_cop_type cop,
@@ -944,13 +946,22 @@ static void init_winctx_for_txwin(struct vas_window *txwin,
 	winctx->lpid = txattr->lpid;
 	winctx->pidr = txattr->pidr;
 	winctx->rx_win_id = txwin->rxwin->winid;
+	/*
+	 * IRQ and fault window setup succeeded. Set the fault window
+	 * for the send window so that it is ready to handle faults.
+	 */
+	if (txwin->vinst->virq)
+		winctx->fault_win_id = txwin->vinst->fault_win->winid;
 
 	winctx->dma_type = VAS_DMA_TYPE_INJECT;
 	winctx->tc_mode = txattr->tc_mode;
 	winctx->min_scope = VAS_SCOPE_LOCAL;
 	winctx->max_scope = VAS_SCOPE_VECTORED_GROUP;
+	if (txwin->vinst->virq)
+		winctx->irq_port = txwin->vinst->irq_port;
 
-	winctx->pswid = 0;
+	winctx->pswid = txattr->pswid ? txattr->pswid :
+			encode_pswid(txwin->vinst->vas_id, txwin->winid);
 }
 
 static bool tx_win_args_valid(enum vas_cop_type cop,
diff --git a/arch/powerpc/platforms/powernv/vas.h b/arch/powerpc/platforms/powernv/vas.h
index 879f5b4..2621df1 100644
--- a/arch/powerpc/platforms/powernv/vas.h
+++ b/arch/powerpc/platforms/powernv/vas.h
@@ -455,6 +455,21 @@ static inline u64 read_hvwc_reg(struct vas_window *win,
 	return in_be64(win->hvwc_map+reg);
 }
 
+/*
+ * Encode/decode the Partition Send Window ID (PSWID) for a window in
+ * a way that we can uniquely identify any window in the system. i.e.
+ * we should be able to locate the 'struct vas_window' given the PSWID.
+ *
+ *	Bits	Usage
+ *	0:7	VAS id (8 bits)
+ *	8:15	Unused, 0 (8 bits)
+ *	16:31	Window id (16 bits)
+ */
+static inline u32 encode_pswid(int vasid, int winid)
+{
+	return ((u32)winid | (vasid << (31 - 7)));
+}
+
 static inline void decode_pswid(u32 pswid, int *vasid, int *winid)
 {
 	if (vasid)
-- 
1.8.3.1





* [PATCH V5 08/14] powerpc/vas: Take reference to PID and mm for user space windows
  2020-01-22  7:56 [PATCH V5 00/14] powerpc/vas: Page fault handling for user space NX requests Haren Myneni
                   ` (6 preceding siblings ...)
  2020-01-22  8:11 ` [PATCH V5 07/14] powerpc/vas: Register NX with fault window ID and IRQ port value Haren Myneni
@ 2020-01-22  8:12 ` Haren Myneni
  2020-01-22  8:17 ` [PATCH V5 09/14] powerpc/vas: Update CSB and notify process for fault CRBs Haren Myneni
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Haren Myneni @ 2020-01-22  8:12 UTC (permalink / raw)
  To: mpe; +Cc: mikey, herbert, npiggin, hch, oohall, sukadev, linuxppc-dev


A process closes its windows after its requests are completed. In a
multi-threaded application, a child thread can open a window, but the
FD release will not be called when that thread exits. The parent
thread closes the window later, upon its own exit.

The parent can also send NX requests through this window, and NX can
generate page faults for them. After the kernel handles a page fault,
it sends a signal to the process, using the saved PID, if the CSB
address is invalid. The parent thread would not receive this signal
because its PID differs from the one saved in vas_window. So if the
task for the PID saved in the window is no longer running, use the
tgid and send the signal to the parent.

To prevent the pid from being reused until the window is closed, take
a reference to the pid and to the task's mm.

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
---
 arch/powerpc/platforms/powernv/vas-debug.c  |  2 +-
 arch/powerpc/platforms/powernv/vas-window.c | 53 ++++++++++++++++++++++++++---
 arch/powerpc/platforms/powernv/vas.h        |  9 ++++-
 3 files changed, 57 insertions(+), 7 deletions(-)
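
Illustration only, not part of the patch: a toy user-space model of the
get_task_mm() / mmgrab() / mmput() sequence used when a user window is
opened. The idea, as I understand it, is to keep the mm_struct itself
alive (mm_count) without pinning the whole address space (mm_users).
The struct and the toy_* helpers are hypothetical and only model the
two counters; they are not the kernel implementation.

```c
#include <stdbool.h>

/* Toy stand-in for the two reference counts kept in an mm_struct. */
struct toy_mm {
	int mm_users;	/* users of the address space */
	int mm_count;	/* references to the structure itself */
};

/* Models get_task_mm(): pin the address space; fails once it is gone. */
static bool toy_get_task_mm(struct toy_mm *mm)
{
	if (mm->mm_users == 0)
		return false;
	mm->mm_users++;
	return true;
}

/* Models mmgrab(): pin only the structure. */
static void toy_mmgrab(struct toy_mm *mm)
{
	mm->mm_count++;
}

/*
 * Models mmput(): all address-space users together hold one mm_count
 * reference, which is dropped when the last user goes away.
 */
static void toy_mmput(struct toy_mm *mm)
{
	if (--mm->mm_users == 0)
		mm->mm_count--;
}

/* Models mmdrop(): release a structure reference. */
static void toy_mmdrop(struct toy_mm *mm)
{
	mm->mm_count--;
}

/* Open-time sequence from the patch: get_task_mm, mmgrab, mmput. */
static bool toy_window_open(struct toy_mm *mm)
{
	if (!toy_get_task_mm(mm))
		return false;
	toy_mmgrab(mm);
	toy_mmput(mm);
	return true;
}

/* Close-time counterpart: drop the structure reference. */
static void toy_window_close(struct toy_mm *mm)
{
	toy_mmdrop(mm);
}
```

In this model, opening a window leaves mm_users unchanged and raises
mm_count by one, so the structure outlives the process exit until the
window is closed.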

diff --git a/arch/powerpc/platforms/powernv/vas-debug.c b/arch/powerpc/platforms/powernv/vas-debug.c
index 09e63df..ef9a717 100644
--- a/arch/powerpc/platforms/powernv/vas-debug.c
+++ b/arch/powerpc/platforms/powernv/vas-debug.c
@@ -38,7 +38,7 @@ static int info_show(struct seq_file *s, void *private)
 
 	seq_printf(s, "Type: %s, %s\n", cop_to_str(window->cop),
 					window->tx_win ? "Send" : "Receive");
-	seq_printf(s, "Pid : %d\n", window->pid);
+	seq_printf(s, "Pid : %d\n", vas_window_pid(window));
 
 unlock:
 	mutex_unlock(&vas_mutex);
diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
index a45d81d..7587258 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -12,6 +12,8 @@
 #include <linux/log2.h>
 #include <linux/rcupdate.h>
 #include <linux/cred.h>
+#include <linux/sched/mm.h>
+#include <linux/mmu_context.h>
 #include <asm/switch_to.h>
 #include <asm/ppc-opcode.h>
 #include "vas.h"
@@ -876,8 +878,6 @@ struct vas_window *vas_rx_win_open(int vasid, enum vas_cop_type cop,
 	rxwin->user_win = rxattr->user_win;
 	rxwin->cop = cop;
 	rxwin->wcreds_max = rxattr->wcreds_max ?: VAS_WCREDS_DEFAULT;
-	if (rxattr->user_win)
-		rxwin->pid = task_pid_vnr(current);
 
 	init_winctx_for_rxwin(rxwin, rxattr, &winctx);
 	init_winctx_regs(rxwin, &winctx);
@@ -1027,7 +1027,6 @@ struct vas_window *vas_tx_win_open(int vasid, enum vas_cop_type cop,
 	txwin->tx_win = 1;
 	txwin->rxwin = rxwin;
 	txwin->nx_win = txwin->rxwin->nx_win;
-	txwin->pid = attr->pid;
 	txwin->user_win = attr->user_win;
 	txwin->wcreds_max = attr->wcreds_max ?: VAS_WCREDS_DEFAULT;
 
@@ -1068,8 +1067,43 @@ struct vas_window *vas_tx_win_open(int vasid, enum vas_cop_type cop,
 			goto free_window;
 	}
 
-	set_vinst_win(vinst, txwin);
+	if (txwin->user_win) {
+		/*
+		 * Window opened by child thread may not be closed when
+		 * it exits. So take reference to its pid and release it
+		 * when the window is freed by the parent thread.
+		 * Acquire a reference to the task's pid to make sure
+		 * pid will not be re-used - needed only for multithread
+		 * applications.
+		 */
+		txwin->pid = get_task_pid(current, PIDTYPE_PID);
+		/*
+		 * Acquire a reference to the task's mm.
+		 */
+		txwin->mm = get_task_mm(current);
 
+		if (!txwin->mm) {
+			put_pid(txwin->pid);
+			pr_err("VAS: pid(%d): mm_struct is not found\n",
+					current->pid);
+			rc = -EPERM;
+			goto free_window;
+		}
+
+		mmgrab(txwin->mm);
+		mmput(txwin->mm);
+		mm_context_add_copro(txwin->mm);
+		/*
+		 * Process closes window during exit. In the case of
+		 * multithread application, child can open window and
+		 * can exit without closing it. Expects parent thread
+		 * to use and close the window. So do not need to take
+		 * pid reference for parent thread.
+		 */
+		txwin->tgid = find_get_pid(task_tgid_vnr(current));
+	}
+
+	set_vinst_win(vinst, txwin);
 	return txwin;
 
 free_window:
@@ -1266,8 +1300,17 @@ int vas_win_close(struct vas_window *window)
 	poll_window_castout(window);
 
 	/* if send window, drop reference to matching receive window */
-	if (window->tx_win)
+	if (window->tx_win) {
+		if (window->user_win) {
+			/* Drop references to pid and mm */
+			put_pid(window->pid);
+			if (window->mm) {
+				mm_context_remove_copro(window->mm);
+				mmdrop(window->mm);
+			}
+		}
 		put_rx_win(window->rxwin);
+	}
 
 	vas_window_free(window);
 
diff --git a/arch/powerpc/platforms/powernv/vas.h b/arch/powerpc/platforms/powernv/vas.h
index 2621df1..af03aa0 100644
--- a/arch/powerpc/platforms/powernv/vas.h
+++ b/arch/powerpc/platforms/powernv/vas.h
@@ -340,7 +340,9 @@ struct vas_window {
 	bool user_win;		/* True if user space window */
 	void *hvwc_map;		/* HV window context */
 	void *uwc_map;		/* OS/User window context */
-	pid_t pid;		/* Linux process id of owner */
+	struct pid *pid;	/* Linux process id of owner */
+	struct pid *tgid;	/* Thread group ID of owner */
+	struct mm_struct *mm;	/* Linux process mm_struct */
 	int wcreds_max;		/* Window credits */
 
 	char *dbgname;
@@ -418,6 +420,11 @@ struct vas_winctx {
 extern struct vas_window *vas_pswid_to_window(struct vas_instance *vinst,
 						uint32_t pswid);
 
+static inline int vas_window_pid(struct vas_window *window)
+{
+	return pid_vnr(window->pid);
+}
+
 static inline void vas_log_write(struct vas_window *win, char *name,
 			void *regptr, u64 val)
 {
-- 
1.8.3.1





* [PATCH V5 09/14] powerpc/vas: Update CSB and notify process for fault CRBs
  2020-01-22  7:56 [PATCH V5 00/14] powerpc/vas: Page fault handling for user space NX requests Haren Myneni
                   ` (7 preceding siblings ...)
  2020-01-22  8:12 ` [PATCH V5 08/14] powerpc/vas: Take reference to PID and mm for user space windows Haren Myneni
@ 2020-01-22  8:17 ` Haren Myneni
  2020-02-07  5:46   ` Michael Neuling
  2020-01-22  8:18 ` [PATCH V5 10/14] powerpc/vas: Print CRB and FIFO values Haren Myneni
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 21+ messages in thread
From: Haren Myneni @ 2020-01-22  8:17 UTC (permalink / raw)
  To: mpe; +Cc: mikey, herbert, npiggin, hch, oohall, sukadev, linuxppc-dev


For each fault CRB, update the CSB with the fault address taken from
the CRB (fault_storage_addr) and with the translation error status, so
that user space can touch the fault address and resend the request. If
user space passed an invalid CSB address, send SIGSEGV to the process.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Haren Myneni <haren@linux.ibm.com>
---
 arch/powerpc/platforms/powernv/vas-fault.c | 116 +++++++++++++++++++++++++++++
 1 file changed, 116 insertions(+)
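
Illustration only, not part of the patch: a user-space sketch of the
publish order used by update_csb(), where the CSB body is written
first and the valid flag in the first byte is set last, so a poller
never sees the flag before the fields it guards. C11 release/acquire
atomics stand in here for the smp_mb() plus second copy_to_user(). The
struct layout, the toy_* names and the TOY_* values are assumptions
made for this sketch, not the real CSB definition.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_CSB_V		0x80	/* placeholder "valid" bit */
#define TOY_CC_TRANSLATION	5	/* placeholder completion code */

/* Simplified status block: the poller watches the first byte. */
struct toy_csb {
	_Atomic uint8_t flags;
	uint8_t cc;
	uint64_t address;
};

/* Writer side: fill in the body, then publish the valid flag last. */
static void toy_publish_fault(struct toy_csb *csb, uint64_t fault_addr)
{
	csb->cc = TOY_CC_TRANSLATION;
	csb->address = fault_addr;
	atomic_store_explicit(&csb->flags, TOY_CSB_V, memory_order_release);
}

/* Reader side: only trust the body once the valid flag is observed. */
static bool toy_poll_fault(struct toy_csb *csb, uint64_t *fault_addr)
{
	uint8_t flags;

	flags = atomic_load_explicit(&csb->flags, memory_order_acquire);
	if (!(flags & TOY_CSB_V))
		return false;
	if (csb->cc != TOY_CC_TRANSLATION)
		return false;
	*fault_addr = csb->address;
	return true;
}
```

A caller would poll with toy_poll_fault() after submitting a request
and, on a translation fault, touch the returned address before
resubmitting.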

diff --git a/arch/powerpc/platforms/powernv/vas-fault.c b/arch/powerpc/platforms/powernv/vas-fault.c
index 5c2cada..2cfab0c 100644
--- a/arch/powerpc/platforms/powernv/vas-fault.c
+++ b/arch/powerpc/platforms/powernv/vas-fault.c
@@ -11,6 +11,7 @@
 #include <linux/slab.h>
 #include <linux/uaccess.h>
 #include <linux/kthread.h>
+#include <linux/sched/signal.h>
 #include <linux/mmu_context.h>
 #include <asm/icswx.h>
 
@@ -26,6 +27,120 @@
 #define VAS_FAULT_WIN_FIFO_SIZE	(4 << 20)
 
 /*
+ * Update the CSB to indicate a translation error.
+ *
+ * If the fault is in the CSB address itself or if we are unable to
+ * update the CSB, send a signal to the process, because we have no
+ * other way of notifying the user process.
+ *
+ * Remaining settings in the CSB are based on wait_for_csb() of
+ * NX-GZIP.
+ */
+static void update_csb(struct vas_window *window,
+			struct coprocessor_request_block *crb)
+{
+	int rc;
+	struct pid *pid;
+	void __user *csb_addr;
+	struct task_struct *tsk;
+	struct kernel_siginfo info;
+	struct coprocessor_status_block csb;
+
+	/*
+	 * NX user space windows can not be opened for task->mm=NULL
+	 * and faults will not be generated for kernel requests.
+	 */
+	if (!window->mm || !window->user_win)
+		return;
+
+	csb_addr = (void __user *)be64_to_cpu(crb->csb_addr);
+
+	csb.cc = CSB_CC_TRANSLATION;
+	csb.ce = CSB_CE_TERMINATION;
+	csb.cs = 0;
+	csb.count = 0;
+
+	/*
+	 * Returns the fault address in CPU format since it is passed with
+	 * signal. But if the user space expects BE format, need changes.
+	 * i.e either kernel (here) or user should convert to CPU format.
+	 * Not both!
+	 */
+	csb.address = be64_to_cpu(crb->stamp.nx.fault_storage_addr);
+	csb.flags = 0;
+
+	pid = window->pid;
+	tsk = get_pid_task(pid, PIDTYPE_PID);
+	/*
+	 * Send window will be closed after processing all NX requests
+	 * and process exits after closing all windows. In multi-thread
+	 * applications, a thread may no longer exist, yet it did not close
+	 * the FD (i.e. the send window) upon exit. Parent thread (tgid) can
+	 * use and close the window later.
+	 * pid and mm references are taken when window is opened by
+	 * process (pid). So tgid is used only when child thread opens
+	 * a window and exits without closing it in multithread tasks.
+	 */
+	if (!tsk) {
+		pid = window->tgid;
+		tsk = get_pid_task(pid, PIDTYPE_PID);
+		/*
+		 * Parent thread will be closing window during its exit.
+		 * So should not get here.
+		 */
+		if (!tsk)
+			return;
+	}
+
+	/* Return if the task is exiting. */
+	if (tsk->flags & PF_EXITING) {
+		put_task_struct(tsk);
+		return;
+	}
+
+	use_mm(window->mm);
+	rc = copy_to_user(csb_addr, &csb, sizeof(csb));
+	/*
+	 * User space polls on csb.flags (first byte). So add barrier
+	 * then copy first byte with csb flags update.
+	 */
+	smp_mb();
+	if (!rc) {
+		csb.flags = CSB_V;
+		rc = copy_to_user(csb_addr, &csb, sizeof(u8));
+	}
+	unuse_mm(window->mm);
+	put_task_struct(tsk);
+
+	/* Success */
+	if (!rc)
+		return;
+
+	pr_err("Invalid CSB address 0x%p signalling pid(%d)\n",
+			csb_addr, pid_vnr(pid));
+
+	clear_siginfo(&info);
+	info.si_signo = SIGSEGV;
+	info.si_errno = EFAULT;
+	info.si_code = SEGV_MAPERR;
+	info.si_addr = csb_addr;
+
+	/*
+	 * process will be polling on csb.flags after request is sent to
+	 * NX. So generally CSB update should not fail except when an
+	 * application does not follow the process properly. So an error
+	 * message will be displayed and leave it to user space whether
+	 * to ignore or handle this signal.
+	 */
+	rcu_read_lock();
+	rc = kill_pid_info(SIGSEGV, &info, pid);
+	rcu_read_unlock();
+
+	pr_devel("%s(): pid %d kill_pid_info() rc %d\n", __func__,
+			pid_vnr(pid), rc);
+}
+
+/*
  * Process CRBs that we receive on the fault window.
  */
 irqreturn_t vas_fault_handler(int irq, void *data)
@@ -104,6 +219,7 @@ irqreturn_t vas_fault_handler(int irq, void *data)
 			return IRQ_HANDLED;
 		}
 
+		update_csb(window, crb);
 	} while (true);
 
 	return IRQ_HANDLED;
-- 
1.8.3.1





* [PATCH V5 10/14] powerpc/vas: Print CRB and FIFO values
  2020-01-22  7:56 [PATCH V5 00/14] powerpc/vas: Page fault handling for user space NX requests Haren Myneni
                   ` (8 preceding siblings ...)
  2020-01-22  8:17 ` [PATCH V5 09/14] powerpc/vas: Update CSB and notify process for fault CRBs Haren Myneni
@ 2020-01-22  8:18 ` Haren Myneni
  2020-01-22  8:21 ` [PATCH V5 11/14] powerpc/vas: Do not use default credits for receive window Haren Myneni
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Haren Myneni @ 2020-01-22  8:18 UTC (permalink / raw)
  To: mpe; +Cc: mikey, herbert, npiggin, hch, oohall, sukadev, linuxppc-dev


Dump the FIFO entry values if the send window cannot be found, and
print the CRB for debugging.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Haren Myneni <haren@linux.ibm.com>
---
 arch/powerpc/platforms/powernv/vas-fault.c | 41 ++++++++++++++++++++++++++++++
 1 file changed, 41 insertions(+)
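
Illustration only, not part of the patch: a user-space sketch of dumping
a single FIFO entry as rows of four 64-bit words, bounded by the entry
size in bytes. TOY_CRB_SIZE (128 bytes) and toy_dump_entry() are
assumptions for this sketch. The index printed is a byte offset and the
words are shown in host byte order.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define TOY_CRB_SIZE 128	/* assumed size of one CRB, in bytes */

/* Print one FIFO entry, four 64-bit words per row; returns the row count. */
static int toy_dump_entry(FILE *out, const void *entry)
{
	const unsigned char *p = entry;
	uint64_t w[4];
	size_t off;
	int rows = 0;

	for (off = 0; off < TOY_CRB_SIZE; off += sizeof(w)) {
		/* memcpy avoids assuming the entry is 8-byte aligned */
		memcpy(w, p + off, sizeof(w));
		fprintf(out,
			"[%.3zu]: 0x%.16llx 0x%.16llx 0x%.16llx 0x%.16llx\n",
			off,
			(unsigned long long)w[0], (unsigned long long)w[1],
			(unsigned long long)w[2], (unsigned long long)w[3]);
		rows++;
	}
	return rows;
}
```

With a 128-byte entry this prints four rows and never reads past the
end of the entry.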

diff --git a/arch/powerpc/platforms/powernv/vas-fault.c b/arch/powerpc/platforms/powernv/vas-fault.c
index 2cfab0c..6431240 100644
--- a/arch/powerpc/platforms/powernv/vas-fault.c
+++ b/arch/powerpc/platforms/powernv/vas-fault.c
@@ -26,6 +26,28 @@
  */
 #define VAS_FAULT_WIN_FIFO_SIZE	(4 << 20)
 
+static void dump_crb(struct coprocessor_request_block *crb)
+{
+	struct data_descriptor_entry *dde;
+	struct nx_fault_stamp *nx;
+
+	dde = &crb->source;
+	pr_devel("SrcDDE: addr 0x%llx, len %d, count %d, idx %d, flags %d\n",
+		be64_to_cpu(dde->address), be32_to_cpu(dde->length),
+		dde->count, dde->index, dde->flags);
+
+	dde = &crb->target;
+	pr_devel("TgtDDE: addr 0x%llx, len %d, count %d, idx %d, flags %d\n",
+		be64_to_cpu(dde->address), be32_to_cpu(dde->length),
+		dde->count, dde->index, dde->flags);
+
+	nx = &crb->stamp.nx;
+	pr_devel("NX Stamp: PSWID 0x%x, FSA 0x%llx, flags 0x%x, FS 0x%x\n",
+		be32_to_cpu(nx->pswid),
+		be64_to_cpu(crb->stamp.nx.fault_storage_addr),
+		nx->flags, be32_to_cpu(nx->fault_status));
+}
+
 /*
  * Update the CSB to indicate a translation error.
  *
@@ -140,6 +162,23 @@ static void update_csb(struct vas_window *window,
 			pid_vnr(pid), rc);
 }
 
+static void dump_fifo(struct vas_instance *vinst, void *entry)
+{
+	int i;
+	unsigned long *fifo = entry;
+
+	pr_err("Fault fifo size %d, max crbs %d, crb size %lu\n",
+			vinst->fault_fifo_size,
+			vinst->fault_fifo_size / CRB_SIZE,
+			sizeof(struct coprocessor_request_block));
+
+	pr_err("Fault FIFO Entry Dump:\n");
+	for (i = 0; i < (int)(CRB_SIZE / sizeof(*fifo)); i += 4, fifo += 4) {
+		pr_err("[%.3d, %p]: 0x%.16lx 0x%.16lx 0x%.16lx 0x%.16lx\n",
+			i, fifo, *fifo, *(fifo+1), *(fifo+2), *(fifo+3));
+	}
+}
+
 /*
  * Process CRBs that we receive on the fault window.
  */
@@ -200,6 +239,7 @@ irqreturn_t vas_fault_handler(int irq, void *data)
 				vinst->vas_id, vinst->fault_fifo, fifo,
 				vinst->fault_crbs);
 
+		dump_crb(crb);
 		window = vas_pswid_to_window(vinst,
 				be32_to_cpu(crb->stamp.nx.pswid));
 
@@ -210,6 +250,7 @@ irqreturn_t vas_fault_handler(int irq, void *data)
 			 * even clean it up (return credit).
 			 * But we should not get here.
 			 */
+			dump_fifo(vinst, (void *)crb);
 			pr_err("VAS[%d] fault_fifo %p, fifo %p, pswid 0x%x, fault_crbs %d bad CRB?\n",
 				vinst->vas_id, vinst->fault_fifo, fifo,
 				be32_to_cpu(crb->stamp.nx.pswid),
-- 
1.8.3.1





* [PATCH V5 11/14] powerpc/vas: Do not use default credits for receive window
  2020-01-22  7:56 [PATCH V5 00/14] powerpc/vas: Page fault handling for user space NX requests Haren Myneni
                   ` (9 preceding siblings ...)
  2020-01-22  8:18 ` [PATCH V5 10/14] powerpc/vas: Print CRB and FIFO values Haren Myneni
@ 2020-01-22  8:21 ` Haren Myneni
  2020-01-22  8:24 ` [PATCH V5 12/14] powerpc/VAS: Return credits after handling fault Haren Myneni
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 21+ messages in thread
From: Haren Myneni @ 2020-01-22  8:21 UTC (permalink / raw)
  To: mpe; +Cc: mikey, herbert, npiggin, hch, oohall, sukadev, linuxppc-dev


The system checkstops if the RxFIFO overruns with more requests than
the maximum number of CRBs the FIFO can hold at any time. So the
driver sets the max credits value (rxattr.wcreds_max) and passes it to
vas_rx_win_open().

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
---
 arch/powerpc/platforms/powernv/vas-window.c | 4 ++--
 arch/powerpc/platforms/powernv/vas.h        | 2 --
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
index 7587258..427a884 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -772,7 +772,7 @@ static bool rx_win_args_valid(enum vas_cop_type cop,
 	if (attr->rx_fifo_size > VAS_RX_FIFO_SIZE_MAX)
 		return false;
 
-	if (attr->wcreds_max > VAS_RX_WCREDS_MAX)
+	if (!attr->wcreds_max)
 		return false;
 
 	if (attr->nx_win) {
@@ -877,7 +877,7 @@ struct vas_window *vas_rx_win_open(int vasid, enum vas_cop_type cop,
 	rxwin->nx_win = rxattr->nx_win;
 	rxwin->user_win = rxattr->user_win;
 	rxwin->cop = cop;
-	rxwin->wcreds_max = rxattr->wcreds_max ?: VAS_WCREDS_DEFAULT;
+	rxwin->wcreds_max = rxattr->wcreds_max;
 
 	init_winctx_for_rxwin(rxwin, rxattr, &winctx);
 	init_winctx_regs(rxwin, &winctx);
diff --git a/arch/powerpc/platforms/powernv/vas.h b/arch/powerpc/platforms/powernv/vas.h
index af03aa0..f5f45ea 100644
--- a/arch/powerpc/platforms/powernv/vas.h
+++ b/arch/powerpc/platforms/powernv/vas.h
@@ -101,11 +101,9 @@
 /*
  * Initial per-process credits.
  * Max send window credits:    4K-1 (12-bits in VAS_TX_WCRED)
- * Max receive window credits: 64K-1 (16 bits in VAS_LRX_WCRED)
  *
  * TODO: Needs tuning for per-process credits
  */
-#define VAS_RX_WCREDS_MAX		((64 << 10) - 1)
 #define VAS_TX_WCREDS_MAX		((4 << 10) - 1)
 #define VAS_WCREDS_DEFAULT		(1 << 10)
 
-- 
1.8.3.1





* [PATCH V5 12/14] powerpc/VAS: Return credits after handling fault
  2020-01-22  7:56 [PATCH V5 00/14] powerpc/vas: Page fault handling for user space NX requests Haren Myneni
                   ` (10 preceding siblings ...)
  2020-01-22  8:21 ` [PATCH V5 11/14] powerpc/vas: Do not use default credits for receive window Haren Myneni
@ 2020-01-22  8:24 ` Haren Myneni
  2020-01-22  8:25 ` [PATCH V5 13/14] powerpc/vas: Display process stuck message Haren Myneni
  2020-01-22  8:26 ` [PATCH V5 14/14] powerpc/vas: Free send window in VAS instance after credits returned Haren Myneni
  13 siblings, 0 replies; 21+ messages in thread
From: Haren Myneni @ 2020-01-22  8:24 UTC (permalink / raw)
  To: mpe; +Cc: mikey, herbert, npiggin, hch, oohall, sukadev, linuxppc-dev


NX expects the OS to return a credit for the send window after
processing each fault. A credit also has to be returned for the fault
window.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Haren Myneni <haren@linux.ibm.com>
---
 arch/powerpc/platforms/powernv/vas-fault.c  | 10 ++++++++++
 arch/powerpc/platforms/powernv/vas-window.c | 17 +++++++++++++++++
 arch/powerpc/platforms/powernv/vas.h        |  1 +
 3 files changed, 28 insertions(+)
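
Illustration only, not part of the patch: a toy bookkeeping model of
window credits, assuming each paste consumes one credit and the fault
handler gives it back. In hardware the credits live in the window
context and are returned by writing the credit adder register; the
struct and the toy_* helpers here are hypothetical.

```c
#include <stdbool.h>

struct toy_window {
	int wcreds_max;	/* credits the window starts with */
	int creds;	/* credits currently available */
};

/* A paste consumes one credit and fails when none are left. */
static bool toy_paste(struct toy_window *w)
{
	if (w->creds == 0)
		return false;
	w->creds--;
	return true;
}

/* Models vas_return_credit(): give one credit back after a fault. */
static void toy_return_credit(struct toy_window *w)
{
	if (w->creds < w->wcreds_max)
		w->creds++;
}

/* Models the condition poll_window_credits() waits for before close. */
static bool toy_all_credits_returned(const struct toy_window *w)
{
	return w->creds >= w->wcreds_max;
}
```

In this model a window whose faulting requests never get their credits
returned can neither accept new pastes nor satisfy the close-time
check, which is why the handler returns a credit for every fault CRB.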

diff --git a/arch/powerpc/platforms/powernv/vas-fault.c b/arch/powerpc/platforms/powernv/vas-fault.c
index 6431240..a993c5b 100644
--- a/arch/powerpc/platforms/powernv/vas-fault.c
+++ b/arch/powerpc/platforms/powernv/vas-fault.c
@@ -235,6 +235,11 @@ irqreturn_t vas_fault_handler(int irq, void *data)
 		memset(fifo, 0, CRB_SIZE);
 		mutex_unlock(&vinst->mutex);
 
+		/*
+		 * Return credit for the fault window.
+		 */
+		vas_return_credit(vinst->fault_win, 0);
+
 		pr_devel("VAS[%d] fault_fifo %p, fifo %p, fault_crbs %d\n",
 				vinst->vas_id, vinst->fault_fifo, fifo,
 				vinst->fault_crbs);
@@ -261,6 +266,11 @@ irqreturn_t vas_fault_handler(int irq, void *data)
 		}
 
 		update_csb(window, crb);
+		/*
+		 * Return credit for send window after processing
+		 * fault CRB.
+		 */
+		vas_return_credit(window, 1);
 	} while (true);
 
 	return IRQ_HANDLED;
diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
index 427a884..1439a6f 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -1318,6 +1318,23 @@ int vas_win_close(struct vas_window *window)
 }
 EXPORT_SYMBOL_GPL(vas_win_close);
 
+/*
+ * Return credit for the given window.
+ */
+void vas_return_credit(struct vas_window *window, bool tx)
+{
+	uint64_t val;
+
+	val = 0ULL;
+	if (tx) { /* send window */
+		val = SET_FIELD(VAS_TX_WCRED, val, 1);
+		write_hvwc_reg(window, VREG(TX_WCRED_ADDER), val);
+	} else {
+		val = SET_FIELD(VAS_LRX_WCRED, val, 1);
+		write_hvwc_reg(window, VREG(LRX_WCRED_ADDER), val);
+	}
+}
+
 struct vas_window *vas_pswid_to_window(struct vas_instance *vinst,
 		uint32_t pswid)
 {
diff --git a/arch/powerpc/platforms/powernv/vas.h b/arch/powerpc/platforms/powernv/vas.h
index f5f45ea..495937a 100644
--- a/arch/powerpc/platforms/powernv/vas.h
+++ b/arch/powerpc/platforms/powernv/vas.h
@@ -415,6 +415,7 @@ struct vas_winctx {
 extern void vas_window_free_dbgdir(struct vas_window *win);
 extern int vas_setup_fault_window(struct vas_instance *vinst);
 extern irqreturn_t vas_fault_handler(int irq, void *data);
+extern void vas_return_credit(struct vas_window *window, bool tx);
 extern struct vas_window *vas_pswid_to_window(struct vas_instance *vinst,
 						uint32_t pswid);
 
-- 
1.8.3.1





* [PATCH V5 13/14] powerpc/vas: Display process stuck message
  2020-01-22  7:56 [PATCH V5 00/14] powerpc/vas: Page fault handling for user space NX requests Haren Myneni
                   ` (11 preceding siblings ...)
  2020-01-22  8:24 ` [PATCH V5 12/14] powerpc/VAS: Return credits after handling fault Haren Myneni
@ 2020-01-22  8:25 ` Haren Myneni
  2020-01-22  8:26 ` [PATCH V5 14/14] powerpc/vas: Free send window in VAS instance after credits returned Haren Myneni
  13 siblings, 0 replies; 21+ messages in thread
From: Haren Myneni @ 2020-01-22  8:25 UTC (permalink / raw)
  To: mpe; +Cc: mikey, herbert, npiggin, hch, oohall, sukadev, linuxppc-dev


A process cannot close a send window until all of its requests are
processed, which means waiting until the window state is not busy and
the send credits are returned. Display a debug message in case closing
the window takes longer than expected.

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
---
 arch/powerpc/platforms/powernv/vas-window.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)
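
Illustration only, not part of the patch: a user-space sketch of the
"retry and report every N attempts" pattern added to the two poll
loops. With a report every 10000 retries, a 10 ms sleep gives a message
roughly every 100 seconds and a 5 ms sleep roughly every 50 seconds, at
minimum. The toy_* names are hypothetical and the sleep is omitted.

```c
#include <stdbool.h>

struct toy_poll_stats {
	int retries;	/* failed checks before the condition held */
	int reports;	/* "stuck?" messages that would have been printed */
};

/* Poll until done(arg) holds, counting a report every report_every retries. */
static struct toy_poll_stats toy_poll_until(bool (*done)(void *), void *arg,
					    int report_every)
{
	struct toy_poll_stats st = { 0, 0 };

	while (!done(arg)) {
		st.retries++;
		/* the kernel loops sleep here before retrying */
		if (st.retries % report_every == 0)
			st.reports++;
	}
	return st;
}

/* Example condition: becomes true after *arg failed checks. */
static bool toy_done_after(void *arg)
{
	int *remaining = arg;

	if (*remaining == 0)
		return true;
	(*remaining)--;
	return false;
}
```

For example, a condition that holds after 25 failed checks with a
report interval of 10 yields 25 retries and 2 reports.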

diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
index 1439a6f..88cecff 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -1182,6 +1182,7 @@ static void poll_window_credits(struct vas_window *window)
 {
 	u64 val;
 	int creds, mode;
+	int count = 0;
 
 	val = read_hvwc_reg(window, VREG(WINCTL));
 	if (window->tx_win)
@@ -1200,10 +1201,26 @@ static void poll_window_credits(struct vas_window *window)
 		creds = GET_FIELD(VAS_LRX_WCRED, val);
 	}
 
+	/*
+	 * It takes a few microseconds to complete all pending requests
+	 * and return credits.
+	 * TODO: Scan fault FIFO and invalidate CRBs that point to this
+	 *       window and issue CRB Kill to stop all pending requests. Needed
+	 *       only if there is a bug in NX or fault handling in kernel.
+	 */
 	if (creds < window->wcreds_max) {
 		val = 0;
 		set_current_state(TASK_UNINTERRUPTIBLE);
 		schedule_timeout(msecs_to_jiffies(10));
+		count++;
+		/*
+		 * Process can not close send window until all credits are
+		 * returned.
+		 */
+		if (!(count % 10000))
+			pr_debug("%s() pid %d stuck? retries %d\n", __func__,
+				vas_window_pid(window), count);
+
 		goto retry;
 	}
 }
@@ -1217,6 +1234,7 @@ static void poll_window_busy_state(struct vas_window *window)
 {
 	int busy;
 	u64 val;
+	int count = 0;
 
 retry:
 	val = read_hvwc_reg(window, VREG(WIN_STATUS));
@@ -1225,6 +1243,15 @@ static void poll_window_busy_state(struct vas_window *window)
 		val = 0;
 		set_current_state(TASK_UNINTERRUPTIBLE);
 		schedule_timeout(msecs_to_jiffies(5));
+		count++;
+		/*
+		 * Takes around 5 microseconds to process all pending
+		 * requests.
+		 */
+		if (!(count % 10000))
+			pr_debug("%s() pid %d stuck? retries %d\n", __func__,
+				vas_window_pid(window), count);
+
 		goto retry;
 	}
 }
-- 
1.8.3.1





* [PATCH V5 14/14] powerpc/vas: Free send window in VAS instance after credits returned
  2020-01-22  7:56 [PATCH V5 00/14] powerpc/vas: Page fault handling for user space NX requests Haren Myneni
                   ` (12 preceding siblings ...)
  2020-01-22  8:25 ` [PATCH V5 13/14] powerpc/vas: Display process stuck message Haren Myneni
@ 2020-01-22  8:26 ` Haren Myneni
  13 siblings, 0 replies; 21+ messages in thread
From: Haren Myneni @ 2020-01-22  8:26 UTC (permalink / raw)
  To: mpe; +Cc: mikey, herbert, npiggin, hch, oohall, sukadev, linuxppc-dev


NX may still be processing requests while the window is being closed.
Wait until all credits are returned, and only then remove the send
window from the VAS instance.

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
---
 arch/powerpc/platforms/powernv/vas-window.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
index 88cecff..5f9c915 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -1316,14 +1316,14 @@ int vas_win_close(struct vas_window *window)
 
 	unmap_paste_region(window);
 
-	clear_vinst_win(window);
-
 	poll_window_busy_state(window);
 
 	unpin_close_window(window);
 
 	poll_window_credits(window);
 
+	clear_vinst_win(window);
+
 	poll_window_castout(window);
 
 	/* if send window, drop reference to matching receive window */
-- 
1.8.3.1





* Re: [PATCH V5 09/14] powerpc/vas: Update CSB and notify process for fault CRBs
  2020-01-22  8:17 ` [PATCH V5 09/14] powerpc/vas: Update CSB and notify process for fault CRBs Haren Myneni
@ 2020-02-07  5:46   ` Michael Neuling
  2020-02-10  5:12     ` Haren Myneni
  0 siblings, 1 reply; 21+ messages in thread
From: Michael Neuling @ 2020-02-07  5:46 UTC (permalink / raw)
  To: Haren Myneni, mpe; +Cc: herbert, npiggin, hch, oohall, sukadev, linuxppc-dev

On Wed, 2020-01-22 at 00:17 -0800, Haren Myneni wrote:
> For each fault CRB, update fault address in CRB (fault_storage_addr)
> and translation error status in CSB so that user space can touch the
> fault address and resend the request. If the user space passed invalid
> CSB address send signal to process with SIGSEGV.
> 
> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
> Signed-off-by: Haren Myneni <haren@linux.ibm.com>
> ---
>  arch/powerpc/platforms/powernv/vas-fault.c | 116
> +++++++++++++++++++++++++++++
>  1 file changed, 116 insertions(+)
> 
> diff --git a/arch/powerpc/platforms/powernv/vas-fault.c
> b/arch/powerpc/platforms/powernv/vas-fault.c
> index 5c2cada..2cfab0c 100644
> --- a/arch/powerpc/platforms/powernv/vas-fault.c
> +++ b/arch/powerpc/platforms/powernv/vas-fault.c
> @@ -11,6 +11,7 @@
>  #include <linux/slab.h>
>  #include <linux/uaccess.h>
>  #include <linux/kthread.h>
> +#include <linux/sched/signal.h>
>  #include <linux/mmu_context.h>
>  #include <asm/icswx.h>
>  
> @@ -26,6 +27,120 @@
>  #define VAS_FAULT_WIN_FIFO_SIZE	(4 << 20)
>  
>  /*
> + * Update the CSB to indicate a translation error.
> + *
> + * If the fault is in the CSB address itself or if we are unable to
> + * update the CSB, send a signal to the process, because we have no
> + * other way of notifying the user process.
> + *
> + * Remaining settings in the CSB are based on wait_for_csb() of
> + * NX-GZIP.
> + */
> +static void update_csb(struct vas_window *window,
> +			struct coprocessor_request_block *crb)
> +{
> +	int rc;
> +	struct pid *pid;
> +	void __user *csb_addr;
> +	struct task_struct *tsk;
> +	struct kernel_siginfo info;
> +	struct coprocessor_status_block csb;
> +
> +	/*
> +	 * NX user space windows can not be opened for task->mm=NULL
> +	 * and faults will not be generated for kernel requests.
> +	 */
> +	if (!window->mm || !window->user_win)
> +		return;
> +
> +	csb_addr = (void *)be64_to_cpu(crb->csb_addr);
> +
> +	csb.cc = CSB_CC_TRANSLATION;
> +	csb.ce = CSB_CE_TERMINATION;
> +	csb.cs = 0;
> +	csb.count = 0;
> +
> +	/*
> +	 * Returns the fault address in CPU format since it is passed with
> +	 * signal. But if the user space expects BE format, need changes.
> +	 * i.e either kernel (here) or user should convert to CPU format.
> +	 * Not both!
> +	 */
> +	csb.address = be64_to_cpu(crb->stamp.nx.fault_storage_addr);

This looks wrong and I don't understand the comment. You need to convert this
back to be64 to write it to csb.address. ie.

  csb.address = cpu_to_be64(be64_to_cpu(crb->stamp.nx.fault_storage_addr));

In which case I think you can just avoid the endian conversion altogether.

> +	csb.flags = 0;
> +
> +	pid = window->pid;
> +	tsk = get_pid_task(pid, PIDTYPE_PID);
> +	/*
> +	 * Send window will be closed after processing all NX requests
> +	 * and process exits after closing all windows. In multi-thread
> +	 * applications, thread may not exists, but does not close FD
> +	 * (means send window) upon exit. Parent thread (tgid) can use
> +	 * and close the window later.
> +	 * pid and mm references are taken when window is opened by
> +	 * process (pid). So tgid is used only when child thread opens
> +	 * a window and exits without closing it in multithread tasks.
> +	 */
> +	if (!tsk) {
> +		pid = window->tgid;
> +		tsk = get_pid_task(pid, PIDTYPE_PID);
> +		/*
> +		 * Parent thread will be closing window during its exit.
> +		 * So should not get here.
> +		 */
> +		if (!tsk)
> +			return;
> +	}
> +
> +	/* Return if the task is exiting. */
> +	if (tsk->flags & PF_EXITING) {
> +		put_task_struct(tsk);
> +		return;
> +	}
> +
> +	use_mm(window->mm);
> +	rc = copy_to_user(csb_addr, &csb, sizeof(csb));
> +	/*
> +	 * User space polls on csb.flags (first byte). So add barrier
> +	 * then copy first byte with csb flags update.
> +	 */
> +	smp_mb();
> +	if (!rc) {
> +		csb.flags = CSB_V;
> +		rc = copy_to_user(csb_addr, &csb, sizeof(u8));
> +	}
> +	unuse_mm(window->mm);
> +	put_task_struct(tsk);
> +
> +	/* Success */
> +	if (!rc)
> +		return;
> +
> +	pr_err("Invalid CSB address 0x%p signalling pid(%d)\n",
> +			csb_addr, pid_vnr(pid));

This is a userspace error, not a kernel error. This should not be a pr_err().

Userspace could spam the console with this.

> +
> +	clear_siginfo(&info);
> +	info.si_signo = SIGSEGV;
> +	info.si_errno = EFAULT;
> +	info.si_code = SEGV_MAPERR;
> +	info.si_addr = csb_addr;
> +
> +	/*
> +	 * process will be polling on csb.flags after request is sent to
> +	 * NX. So generally CSB update should not fail except when an
> +	 * application does not follow the process properly. So an error
> +	 * message will be displayed and leave it to user space whether
> +	 * to ignore or handle this signal.
> +	 */
> +	rcu_read_lock();
> +	rc = kill_pid_info(SIGSEGV, &info, pid);
> +	rcu_read_unlock();

Why the rcu_read_lock()/rcu_read_unlock() here?

> +
> +	pr_devel("%s(): pid %d kill_proc_info() rc %d\n", __func__,
> +			pid_vnr(pid), rc);
> +}
> +
> +/*
>   * Process CRBs that we receive on the fault window.
>   */
>  irqreturn_t vas_fault_handler(int irq, void *data)
> @@ -104,6 +219,7 @@ irqreturn_t vas_fault_handler(int irq, void *data)
>  			return IRQ_HANDLED;
>  		}
>  
> +		update_csb(window, crb);
>  	} while (true);
>  
>  	return IRQ_HANDLED;


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH V5 06/14] powerpc/vas: Setup thread IRQ handler per VAS instance
  2020-01-22  8:10 ` [PATCH V5 06/14] powerpc/vas: Setup thread IRQ handler " Haren Myneni
@ 2020-02-07  5:57   ` Michael Neuling
  2020-02-10  5:17     ` Haren Myneni
  0 siblings, 1 reply; 21+ messages in thread
From: Michael Neuling @ 2020-02-07  5:57 UTC (permalink / raw)
  To: Haren Myneni, mpe; +Cc: herbert, npiggin, hch, oohall, sukadev, linuxppc-dev


>  /*
> + * Process CRBs that we receive on the fault window.
> + */
> +irqreturn_t vas_fault_handler(int irq, void *data)
> +{
> +	struct vas_instance *vinst = data;
> +	struct coprocessor_request_block buf, *crb;
> +	struct vas_window *window;
> +	void *fifo;
> +
> +	/*
> +	 * VAS can interrupt with multiple page faults. So process all
> +	 * valid CRBs within fault FIFO until reaches invalid CRB.
> +	 * NX updates nx_fault_stamp in CRB and pastes in fault FIFO.
> +	 * kernel retrives send window from parition send window ID
> +	 * (pswid) in nx_fault_stamp. So pswid should be non-zero and
> +	 * use this to check whether CRB is valid.
> +	 * After reading CRB entry, it is reset with 0's in fault FIFO.
> +	 *
> +	 * In case kernel receives another interrupt with different page
> +	 * fault and CRBs are processed by the previous handling, will be
> +	 * returned from this function when it sees invalid CRB (means 0's).
> +	 */
> +	do {
> +		mutex_lock(&vinst->mutex);

This isn't going to work.

From Documentation/locking/mutex-design.rst

    - Mutexes may not be used in hardware or software interrupt
      contexts such as tasklets and timers.

Mikey


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH V5 09/14] powerpc/vas: Update CSB and notify process for fault CRBs
  2020-02-07  5:46   ` Michael Neuling
@ 2020-02-10  5:12     ` Haren Myneni
  2020-02-10  9:25       ` Michael Neuling
  0 siblings, 1 reply; 21+ messages in thread
From: Haren Myneni @ 2020-02-10  5:12 UTC (permalink / raw)
  To: Michael Neuling; +Cc: herbert, npiggin, hch, oohall, sukadev, linuxppc-dev

Mikey, Thanks for your review comments.

On Fri, 2020-02-07 at 16:46 +1100, Michael Neuling wrote:
> On Wed, 2020-01-22 at 00:17 -0800, Haren Myneni wrote:
> > For each fault CRB, update fault address in CRB (fault_storage_addr)
> > and translation error status in CSB so that user space can touch the
> > fault address and resend the request. If the user space passed invalid
> > CSB address send signal to process with SIGSEGV.
> > 
> > Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
> > Signed-off-by: Haren Myneni <haren@linux.ibm.com>
> > ---
> >  arch/powerpc/platforms/powernv/vas-fault.c | 116
> > +++++++++++++++++++++++++++++
> >  1 file changed, 116 insertions(+)
> > 
> > diff --git a/arch/powerpc/platforms/powernv/vas-fault.c
> > b/arch/powerpc/platforms/powernv/vas-fault.c
> > index 5c2cada..2cfab0c 100644
> > --- a/arch/powerpc/platforms/powernv/vas-fault.c
> > +++ b/arch/powerpc/platforms/powernv/vas-fault.c
> > @@ -11,6 +11,7 @@
> >  #include <linux/slab.h>
> >  #include <linux/uaccess.h>
> >  #include <linux/kthread.h>
> > +#include <linux/sched/signal.h>
> >  #include <linux/mmu_context.h>
> >  #include <asm/icswx.h>
> >  
> > @@ -26,6 +27,120 @@
> >  #define VAS_FAULT_WIN_FIFO_SIZE	(4 << 20)
> >  
> >  /*
> > + * Update the CSB to indicate a translation error.
> > + *
> > + * If the fault is in the CSB address itself or if we are unable to
> > + * update the CSB, send a signal to the process, because we have no
> > + * other way of notifying the user process.
> > + *
> > + * Remaining settings in the CSB are based on wait_for_csb() of
> > + * NX-GZIP.
> > + */
> > +static void update_csb(struct vas_window *window,
> > +			struct coprocessor_request_block *crb)
> > +{
> > +	int rc;
> > +	struct pid *pid;
> > +	void __user *csb_addr;
> > +	struct task_struct *tsk;
> > +	struct kernel_siginfo info;
> > +	struct coprocessor_status_block csb;
> > +
> > +	/*
> > +	 * NX user space windows can not be opened for task->mm=NULL
> > +	 * and faults will not be generated for kernel requests.
> > +	 */
> > +	if (!window->mm || !window->user_win)
> > +		return;
> > +
> > +	csb_addr = (void *)be64_to_cpu(crb->csb_addr);
> > +
> > +	csb.cc = CSB_CC_TRANSLATION;
> > +	csb.ce = CSB_CE_TERMINATION;
> > +	csb.cs = 0;
> > +	csb.count = 0;
> > +
> > +	/*
> > +	 * Returns the fault address in CPU format since it is passed with
> > +	 * signal. But if the user space expects BE format, need changes.
> > +	 * i.e either kernel (here) or user should convert to CPU format.
> > +	 * Not both!
> > +	 */
> > +	csb.address = be64_to_cpu(crb->stamp.nx.fault_storage_addr);
> 
> This looks wrong and I don't understand the comment. You need to convert this
> back to be64 to write it to csb.address. ie.
> 
>   csb.address = cpu_to_be64(be64_to_cpu(crb->stamp.nx.fault_storage_addr));
> 
> Which I think you can just avoid the endian conversion all together.

NX pastes the fault CRB in big-endian format, so the kernel converts this
address to CPU format before passing it to user space; otherwise the library
has to convert it.

What is the standard way of passing it to user space?

> 
> > +	csb.flags = 0;
> > +
> > +	pid = window->pid;
> > +	tsk = get_pid_task(pid, PIDTYPE_PID);
> > +	/*
> > +	 * Send window will be closed after processing all NX requests
> > +	 * and process exits after closing all windows. In multi-thread
> > +	 * applications, thread may not exists, but does not close FD
> > +	 * (means send window) upon exit. Parent thread (tgid) can use
> > +	 * and close the window later.
> > +	 * pid and mm references are taken when window is opened by
> > +	 * process (pid). So tgid is used only when child thread opens
> > +	 * a window and exits without closing it in multithread tasks.
> > +	 */
> > +	if (!tsk) {
> > +		pid = window->tgid;
> > +		tsk = get_pid_task(pid, PIDTYPE_PID);
> > +		/*
> > +		 * Parent thread will be closing window during its exit.
> > +		 * So should not get here.
> > +		 */
> > +		if (!tsk)
> > +			return;
> > +	}
> > +
> > +	/* Return if the task is exiting. */
> > +	if (tsk->flags & PF_EXITING) {
> > +		put_task_struct(tsk);
> > +		return;
> > +	}
> > +
> > +	use_mm(window->mm);
> > +	rc = copy_to_user(csb_addr, &csb, sizeof(csb));
> > +	/*
> > +	 * User space polls on csb.flags (first byte). So add barrier
> > +	 * then copy first byte with csb flags update.
> > +	 */
> > +	smp_mb();
> > +	if (!rc) {
> > +		csb.flags = CSB_V;
> > +		rc = copy_to_user(csb_addr, &csb, sizeof(u8));
> > +	}
> > +	unuse_mm(window->mm);
> > +	put_task_struct(tsk);
> > +
> > +	/* Success */
> > +	if (!rc)
> > +		return;
> > +
> > +	pr_err("Invalid CSB address 0x%p signalling pid(%d)\n",
> > +			csb_addr, pid_vnr(pid));
> 
> This is a userspace error, not a kernel error. This should not be a pr_err().
> 
> Userspace could spam the console with this.

Will change it to pr_debug()/pr_info(). I added pr_err() during development
and forgot to remove it.
> 
> > +
> > +	clear_siginfo(&info);
> > +	info.si_signo = SIGSEGV;
> > +	info.si_errno = EFAULT;
> > +	info.si_code = SEGV_MAPERR;
> > +	info.si_addr = csb_addr;
> > +
> > +	/*
> > +	 * process will be polling on csb.flags after request is sent to
> > +	 * NX. So generally CSB update should not fail except when an
> > +	 * application does not follow the process properly. So an error
> > +	 * message will be displayed and leave it to user space whether
> > +	 * to ignore or handle this signal.
> > +	 */
> > +	rcu_read_lock();
> > +	rc = kill_pid_info(SIGSEGV, &info, pid);
> > +	rcu_read_unlock();
> 
> why the rcu_read_un/lock() here?

It is used the same way as in kill_proc_info()/kill_something_info().
> 
> > +
> > +	pr_devel("%s(): pid %d kill_proc_info() rc %d\n", __func__,
> > +			pid_vnr(pid), rc);
> > +}
> > +
> > +/*
> >   * Process CRBs that we receive on the fault window.
> >   */
> >  irqreturn_t vas_fault_handler(int irq, void *data)
> > @@ -104,6 +219,7 @@ irqreturn_t vas_fault_handler(int irq, void *data)
> >  			return IRQ_HANDLED;
> >  		}
> >  
> > +		update_csb(window, crb);
> >  	} while (true);
> >  
> >  	return IRQ_HANDLED;
> 



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH V5 06/14] powerpc/vas: Setup thread IRQ handler per VAS instance
  2020-02-07  5:57   ` Michael Neuling
@ 2020-02-10  5:17     ` Haren Myneni
  2020-02-11  4:08       ` Michael Neuling
  0 siblings, 1 reply; 21+ messages in thread
From: Haren Myneni @ 2020-02-10  5:17 UTC (permalink / raw)
  To: Michael Neuling; +Cc: herbert, npiggin, hch, oohall, sukadev, linuxppc-dev

On Fri, 2020-02-07 at 16:57 +1100, Michael Neuling wrote:
> >  /*
> > + * Process CRBs that we receive on the fault window.
> > + */
> > +irqreturn_t vas_fault_handler(int irq, void *data)
> > +{
> > +	struct vas_instance *vinst = data;
> > +	struct coprocessor_request_block buf, *crb;
> > +	struct vas_window *window;
> > +	void *fifo;
> > +
> > +	/*
> > +	 * VAS can interrupt with multiple page faults. So process all
> > +	 * valid CRBs within fault FIFO until reaches invalid CRB.
> > +	 * NX updates nx_fault_stamp in CRB and pastes in fault FIFO.
> > +	 * kernel retrives send window from parition send window ID
> > +	 * (pswid) in nx_fault_stamp. So pswid should be non-zero and
> > +	 * use this to check whether CRB is valid.
> > +	 * After reading CRB entry, it is reset with 0's in fault FIFO.
> > +	 *
> > +	 * In case kernel receives another interrupt with different page
> > +	 * fault and CRBs are processed by the previous handling, will be
> > +	 * returned from this function when it sees invalid CRB (means 0's).
> > +	 */
> > +	do {
> > +		mutex_lock(&vinst->mutex);
> 
> This isn't going to work.
> 
> From Documentation/locking/mutex-design.rst
> 
>     - Mutexes may not be used in hardware or software interrupt
>       contexts such as tasklets and timers.

Initially I used a kernel thread per VAS instance and later switched to an
IRQ thread.

vas_fault_handler() is the IRQ thread function, not the hard IRQ handler. I
thought we can use mutex_lock() in a thread function.

> 
> Mikey
> 



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH V5 09/14] powerpc/vas: Update CSB and notify process for fault CRBs
  2020-02-10  5:12     ` Haren Myneni
@ 2020-02-10  9:25       ` Michael Neuling
  0 siblings, 0 replies; 21+ messages in thread
From: Michael Neuling @ 2020-02-10  9:25 UTC (permalink / raw)
  To: Haren Myneni; +Cc: herbert, npiggin, hch, oohall, sukadev, linuxppc-dev


> > > +
> > > +	csb.cc = CSB_CC_TRANSLATION;
> > > +	csb.ce = CSB_CE_TERMINATION;
> > > +	csb.cs = 0;
> > > +	csb.count = 0;
> > > +
> > > +	/*
> > > +	 * Returns the fault address in CPU format since it is passed with
> > > +	 * signal. But if the user space expects BE format, need changes.
> > > +	 * i.e either kernel (here) or user should convert to CPU format.
> > > +	 * Not both!
> > > +	 */
> > > +	csb.address = be64_to_cpu(crb->stamp.nx.fault_storage_addr);
> > 
> > This looks wrong and I don't understand the comment. You need to convert
> > this
> > back to be64 to write it to csb.address. ie.
> > 
> >   csb.address = cpu_to_be64(be64_to_cpu(crb->stamp.nx.fault_storage_addr));
> > 
> > Which I think you can just avoid the endian conversion all together.
> 
> NX pastes fault CRB in big-endian, so passing this address in CPU format
> to user space, otherwise the library has to convert. 

OK, then please change the definition in struct coprocessor_status_block to just
__u64.

struct coprocessor_status_block {
	u8 flags;
	u8 cs;
	u8 cc;
	u8 ce;
	__be32 count;
	__be64 address;
} __packed __aligned(CSB_ALIGN);

Big but....

I thought "struct coprocessor_status_block" was also written by hardware. If
that's the case then it needs to be __be64 and you need the kernel to synthesize
exactly what the hardware is doing. Hence the struct definition is correct and
the kernel needs to convert to __be64 on writing.

> What is the standard way for passing to user space? 

CPU endian.

> > > +	 * process will be polling on csb.flags after request is sent to
> > > +	 * NX. So generally CSB update should not fail except when an
> > > +	 * application does not follow the process properly. So an error
> > > +	 * message will be displayed and leave it to user space whether
> > > +	 * to ignore or handle this signal.
> > > +	 */
> > > +	rcu_read_lock();
> > > +	rc = kill_pid_info(SIGSEGV, &info, pid);
> > > +	rcu_read_unlock();
> > 
> > why the rcu_read_un/lock() here?
> 
> Used same as in kill_proc_info()/kill_something_info()

Please document.

Mikey

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH V5 06/14] powerpc/vas: Setup thread IRQ handler per VAS instance
  2020-02-10  5:17     ` Haren Myneni
@ 2020-02-11  4:08       ` Michael Neuling
  0 siblings, 0 replies; 21+ messages in thread
From: Michael Neuling @ 2020-02-11  4:08 UTC (permalink / raw)
  To: Haren Myneni; +Cc: herbert, npiggin, hch, oohall, sukadev, linuxppc-dev

On Sun, 2020-02-09 at 21:17 -0800, Haren Myneni wrote:
> On Fri, 2020-02-07 at 16:57 +1100, Michael Neuling wrote:
> > >  /*
> > > + * Process CRBs that we receive on the fault window.
> > > + */
> > > +irqreturn_t vas_fault_handler(int irq, void *data)
> > > +{
> > > +	struct vas_instance *vinst = data;
> > > +	struct coprocessor_request_block buf, *crb;
> > > +	struct vas_window *window;
> > > +	void *fifo;
> > > +
> > > +	/*
> > > +	 * VAS can interrupt with multiple page faults. So process all
> > > +	 * valid CRBs within fault FIFO until reaches invalid CRB.
> > > +	 * NX updates nx_fault_stamp in CRB and pastes in fault FIFO.
> > > +	 * kernel retrives send window from parition send window ID
> > > +	 * (pswid) in nx_fault_stamp. So pswid should be non-zero and
> > > +	 * use this to check whether CRB is valid.
> > > +	 * After reading CRB entry, it is reset with 0's in fault FIFO.
> > > +	 *
> > > +	 * In case kernel receives another interrupt with different page
> > > +	 * fault and CRBs are processed by the previous handling, will be
> > > +	 * returned from this function when it sees invalid CRB (means 0's).
> > > +	 */
> > > +	do {
> > > +		mutex_lock(&vinst->mutex);
> > 
> > This isn't going to work.
> > 
> > From Documentation/locking/mutex-design.rst
> > 
> >     - Mutexes may not be used in hardware or software interrupt
> >       contexts such as tasklets and timers.
> 
> Initially used kernel thread per VAS instance and later using IRQ
> thread. 
> 
> vas_fault_handler() is IRQ thread function, not IRQ handler. I thought
> we can use mutex_lock() in thread function.

Sorry, I missed that it was a threaded IRQ handler, so I think it is ok to use
mutex_lock() in there.

You should run with CONFIG_DEBUG_MUTEXES and CONFIG_LOCKDEP enabled to give you
some more confidence.

It would be good to document how this mutex is used, and to note at the start of
the function that it must remain a threaded handler, so it doesn't get changed
later to a non-threaded one.
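
For reference, the drain scheme described in the quoted comment (process valid
entries until an invalid one is seen, and zero each entry after reading it) can
be modelled in user space roughly as follows. This is only a sketch under
simplifying assumptions: struct fault_entry is a stand-in for the real CRB,
drain_fault_fifo() is a hypothetical name, and there is no locking here,
whereas the patch runs its loop under vinst->mutex in the threaded handler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for a fault CRB; only pswid matters here. */
struct fault_entry {
	uint32_t pswid;      /* non-zero marks a valid entry */
	uint64_t fault_addr;
};

/*
 * Drain valid entries starting at *pos, zeroing each slot after reading it.
 * Stops at the first invalid (all-zero) entry and returns the number handled.
 */
static size_t drain_fault_fifo(struct fault_entry *fifo, size_t nentries,
			       size_t *pos)
{
	size_t handled = 0;

	assert(nentries > 0 && *pos < nentries);
	while (handled < nentries && fifo[*pos].pswid != 0) {
		struct fault_entry copy = fifo[*pos];

		/* Reset the slot so a later pass sees it as invalid. */
		memset(&fifo[*pos], 0, sizeof(fifo[*pos]));
		*pos = (*pos + 1) % nentries;

		/* A real handler would look up the window and update the CSB. */
		(void)copy;
		handled++;
	}
	return handled;
}
```

A second call made right after the first one finds only zeroed entries and
returns 0, which matches the case of a later interrupt whose CRBs were already
consumed by the previous pass.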

Mikey

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2020-02-11  4:10 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-22  7:56 [PATCH V5 00/14] powerpc/vas: Page fault handling for user space NX requests Haren Myneni
2020-01-22  8:06 ` [PATCH V5 01/14] powerpc/xive: Define xive_native_alloc_irq_on_chip() Haren Myneni
2020-01-22  8:06 ` [PATCH V5 02/14] powerpc/xive: Define xive_native_alloc_get_irq_info() Haren Myneni
2020-01-22  8:07 ` [PATCH V5 03/14] powerpc/vas: Define nx_fault_stamp in coprocessor_request_block Haren Myneni
2020-01-22  8:08 ` [PATCH V5 04/14] powerpc/vas: Alloc and setup IRQ and trigger port address Haren Myneni
2020-01-22  8:08 ` [PATCH V5 05/14] powerpc/vas: Setup fault window per VAS instance Haren Myneni
2020-01-22  8:10 ` [PATCH V5 06/14] powerpc/vas: Setup thread IRQ handler " Haren Myneni
2020-02-07  5:57   ` Michael Neuling
2020-02-10  5:17     ` Haren Myneni
2020-02-11  4:08       ` Michael Neuling
2020-01-22  8:11 ` [PATCH V5 07/14] powerpc/vas: Register NX with fault window ID and IRQ port value Haren Myneni
2020-01-22  8:12 ` [PATCH V5 08/14] powerpc/vas: Take reference to PID and mm for user space windows Haren Myneni
2020-01-22  8:17 ` [PATCH V5 09/14] powerpc/vas: Update CSB and notify process for fault CRBs Haren Myneni
2020-02-07  5:46   ` Michael Neuling
2020-02-10  5:12     ` Haren Myneni
2020-02-10  9:25       ` Michael Neuling
2020-01-22  8:18 ` [PATCH V5 10/14] powerpc/vas: Print CRB and FIFO values Haren Myneni
2020-01-22  8:21 ` [PATCH V5 11/14] powerpc/vas: Do not use default credits for receive window Haren Myneni
2020-01-22  8:24 ` [PATCH V5 12/14] powerpc/VAS: Return credits after handling fault Haren Myneni
2020-01-22  8:25 ` [PATCH V5 13/14] powerpc/vas: Display process stuck message Haren Myneni
2020-01-22  8:26 ` [PATCH V5 14/14] powerpc/vas: Free send window in VAS instance after credits returned Haren Myneni
