All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/9] ARC do_page_fault rework
@ 2019-05-15  0:29 ` Vineet Gupta
  0 siblings, 0 replies; 35+ messages in thread
From: Vineet Gupta @ 2019-05-15  0:29 UTC (permalink / raw)
  To: linux-snps-arc; +Cc: paltsev, linux-kernel, Vineet Gupta

This was on my todo list and Eugeniy's patch in this area pestered me to
finish it finally.

The ideas it to cleanup/tidyup ancient do_page_fault() code and make
it more readable and hopefully better generated code. There are only
a few/subtle functional changes
 - Eugeniy's fix to prevent user space looping
 - reduction in mmap_sem hold times

Also this could have been 1 single patch but this is tricky part of mm
handling so better done as bite sized pieces to track down any regressions

Eugeniy Paltsev (1):
  ARC: mm: SIGSEGV userspace trying to access kernel virtual memory

Vineet Gupta (8):
  ARC: mm: do_page_fault refactor #1: remove label @good_area
  ARC: mm: do_page_fault refactor #2: remove short lived variable
  ARC: mm: do_page_fault refactor #3: tidyup vma access permission code
  ARC: mm: do_page_fault refactor #4: consolidate retry related logic
  ARC: mm: do_page_fault refactor #5: scoot no_context to end
  ARC: mm: do_page_fault refactor #6: error handlers to use same pattern
  ARC: mm: do_page_fault refactor #7: fold the various error handling
  ARC: mm: do_page_fault refactor #8: release mmap_sem sooner

 arch/arc/mm/fault.c | 192 +++++++++++++++++++++-------------------------------
 1 file changed, 79 insertions(+), 113 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH 0/9] ARC do_page_fault rework
@ 2019-05-15  0:29 ` Vineet Gupta
  0 siblings, 0 replies; 35+ messages in thread
From: Vineet Gupta @ 2019-05-15  0:29 UTC (permalink / raw)
  To: linux-snps-arc

This was on my todo list and Eugeniy's patch in this area pestered me to
finish it finally.

The ideas it to cleanup/tidyup ancient do_page_fault() code and make
it more readable and hopefully better generated code. There are only
a few/subtle functional changes
 - Eugeniy's fix to prevent user space looping
 - reduction in mmap_sem hold times

Also this could have been 1 single patch but this is tricky part of mm
handling so better done as bite sized pieces to track down any regressions

Eugeniy Paltsev (1):
  ARC: mm: SIGSEGV userspace trying to access kernel virtual memory

Vineet Gupta (8):
  ARC: mm: do_page_fault refactor #1: remove label @good_area
  ARC: mm: do_page_fault refactor #2: remove short lived variable
  ARC: mm: do_page_fault refactor #3: tidyup vma access permission code
  ARC: mm: do_page_fault refactor #4: consolidate retry related logic
  ARC: mm: do_page_fault refactor #5: scoot no_context to end
  ARC: mm: do_page_fault refactor #6: error handlers to use same pattern
  ARC: mm: do_page_fault refactor #7: fold the various error handling
  ARC: mm: do_page_fault refactor #8: release mmap_sem sooner

 arch/arc/mm/fault.c | 192 +++++++++++++++++++++-------------------------------
 1 file changed, 79 insertions(+), 113 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH 1/9] ARC: mm: SIGSEGV userspace trying to access kernel virtual memory
  2019-05-15  0:29 ` Vineet Gupta
@ 2019-05-15  0:29   ` Vineet Gupta
  -1 siblings, 0 replies; 35+ messages in thread
From: Vineet Gupta @ 2019-05-15  0:29 UTC (permalink / raw)
  To: linux-snps-arc
  Cc: paltsev, linux-kernel, Eugeniy Paltsev, stable, Vineet Gupta

From: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>

As of today if userspace process tries to access a kernel virtual addres
(0x7000_0000 to 0x7ffff_ffff) such that a legit kernel mapping already
exists, that process hangs instead of being killed with SIGSEGV

Fix that by ensuring that do_page_fault() handles kenrel vaddr only if
in kernel mode.

And given this, we can also simplify the code a bit. Now a vmalloc fault
implies kernel mode so its failure (for some reason) can reuse the
@no_context label and we can remove @bad_area_nosemaphore.

Reproduce user test for original problem:

------------------------>8-----------------
 #include <stdlib.h>
 #include <stdint.h>

 int main(int argc, char *argv[])
 {
 	volatile uint32_t temp;

 	temp = *(uint32_t *)(0x70000000);
 }
------------------------>8-----------------

Cc: <stable@vger.kernel.org>
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
---
 arch/arc/mm/fault.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
index 8df1638259f3..6836095251ed 100644
--- a/arch/arc/mm/fault.c
+++ b/arch/arc/mm/fault.c
@@ -66,7 +66,7 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 	struct vm_area_struct *vma = NULL;
 	struct task_struct *tsk = current;
 	struct mm_struct *mm = tsk->mm;
-	int si_code = 0;
+	int si_code = SEGV_MAPERR;
 	int ret;
 	vm_fault_t fault;
 	int write = regs->ecr_cause & ECR_C_PROTV_STORE;  /* ST/EX */
@@ -81,16 +81,14 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 	 * only copy the information from the master page table,
 	 * nothing more.
 	 */
-	if (address >= VMALLOC_START) {
+	if (address >= VMALLOC_START && !user_mode(regs)) {
 		ret = handle_kernel_vaddr_fault(address);
 		if (unlikely(ret))
-			goto bad_area_nosemaphore;
+			goto no_context;
 		else
 			return;
 	}
 
-	si_code = SEGV_MAPERR;
-
 	/*
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
@@ -198,7 +196,6 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 bad_area:
 	up_read(&mm->mmap_sem);
 
-bad_area_nosemaphore:
 	/* User mode accesses just cause a SIGSEGV */
 	if (user_mode(regs)) {
 		tsk->thread.fault_address = address;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 1/9] ARC: mm: SIGSEGV userspace trying to access kernel virtual memory
@ 2019-05-15  0:29   ` Vineet Gupta
  0 siblings, 0 replies; 35+ messages in thread
From: Vineet Gupta @ 2019-05-15  0:29 UTC (permalink / raw)
  To: linux-snps-arc

From: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>

As of today if userspace process tries to access a kernel virtual addres
(0x7000_0000 to 0x7ffff_ffff) such that a legit kernel mapping already
exists, that process hangs instead of being killed with SIGSEGV

Fix that by ensuring that do_page_fault() handles kenrel vaddr only if
in kernel mode.

And given this, we can also simplify the code a bit. Now a vmalloc fault
implies kernel mode so its failure (for some reason) can reuse the
@no_context label and we can remove @bad_area_nosemaphore.

Reproduce user test for original problem:

------------------------>8-----------------
 #include <stdlib.h>
 #include <stdint.h>

 int main(int argc, char *argv[])
 {
 	volatile uint32_t temp;

 	temp = *(uint32_t *)(0x70000000);
 }
------------------------>8-----------------

Cc: <stable at vger.kernel.org>
Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev at synopsys.com>
Signed-off-by: Vineet Gupta <vgupta at synopsys.com>
---
 arch/arc/mm/fault.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
index 8df1638259f3..6836095251ed 100644
--- a/arch/arc/mm/fault.c
+++ b/arch/arc/mm/fault.c
@@ -66,7 +66,7 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 	struct vm_area_struct *vma = NULL;
 	struct task_struct *tsk = current;
 	struct mm_struct *mm = tsk->mm;
-	int si_code = 0;
+	int si_code = SEGV_MAPERR;
 	int ret;
 	vm_fault_t fault;
 	int write = regs->ecr_cause & ECR_C_PROTV_STORE;  /* ST/EX */
@@ -81,16 +81,14 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 	 * only copy the information from the master page table,
 	 * nothing more.
 	 */
-	if (address >= VMALLOC_START) {
+	if (address >= VMALLOC_START && !user_mode(regs)) {
 		ret = handle_kernel_vaddr_fault(address);
 		if (unlikely(ret))
-			goto bad_area_nosemaphore;
+			goto no_context;
 		else
 			return;
 	}
 
-	si_code = SEGV_MAPERR;
-
 	/*
 	 * If we're in an interrupt or have no user
 	 * context, we must not take the fault..
@@ -198,7 +196,6 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 bad_area:
 	up_read(&mm->mmap_sem);
 
-bad_area_nosemaphore:
 	/* User mode accesses just cause a SIGSEGV */
 	if (user_mode(regs)) {
 		tsk->thread.fault_address = address;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 2/9] ARC: mm: do_page_fault refactor #1: remove label @good_area
  2019-05-15  0:29 ` Vineet Gupta
@ 2019-05-15  0:29   ` Vineet Gupta
  -1 siblings, 0 replies; 35+ messages in thread
From: Vineet Gupta @ 2019-05-15  0:29 UTC (permalink / raw)
  To: linux-snps-arc; +Cc: paltsev, linux-kernel, Vineet Gupta

Invert the condition for stack expansion.
No functional change

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
---
 arch/arc/mm/fault.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
index 6836095251ed..94d242740ac5 100644
--- a/arch/arc/mm/fault.c
+++ b/arch/arc/mm/fault.c
@@ -100,21 +100,19 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 		flags |= FAULT_FLAG_USER;
 retry:
 	down_read(&mm->mmap_sem);
+
 	vma = find_vma(mm, address);
 	if (!vma)
 		goto bad_area;
-	if (vma->vm_start <= address)
-		goto good_area;
-	if (!(vma->vm_flags & VM_GROWSDOWN))
-		goto bad_area;
-	if (expand_stack(vma, address))
-		goto bad_area;
+	if (unlikely(address < vma->vm_start)) {
+		if (!(vma->vm_flags & VM_GROWSDOWN) || expand_stack(vma, address))
+			goto bad_area;
+	}
 
 	/*
 	 * Ok, we have a good vm_area for this memory access, so
 	 * we can handle it..
 	 */
-good_area:
 	si_code = SEGV_ACCERR;
 
 	/* Handle protection violation, execute on heap or stack */
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 2/9] ARC: mm: do_page_fault refactor #1: remove label @good_area
@ 2019-05-15  0:29   ` Vineet Gupta
  0 siblings, 0 replies; 35+ messages in thread
From: Vineet Gupta @ 2019-05-15  0:29 UTC (permalink / raw)
  To: linux-snps-arc

Invert the condition for stack expansion.
No functional change

Signed-off-by: Vineet Gupta <vgupta at synopsys.com>
---
 arch/arc/mm/fault.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
index 6836095251ed..94d242740ac5 100644
--- a/arch/arc/mm/fault.c
+++ b/arch/arc/mm/fault.c
@@ -100,21 +100,19 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 		flags |= FAULT_FLAG_USER;
 retry:
 	down_read(&mm->mmap_sem);
+
 	vma = find_vma(mm, address);
 	if (!vma)
 		goto bad_area;
-	if (vma->vm_start <= address)
-		goto good_area;
-	if (!(vma->vm_flags & VM_GROWSDOWN))
-		goto bad_area;
-	if (expand_stack(vma, address))
-		goto bad_area;
+	if (unlikely(address < vma->vm_start)) {
+		if (!(vma->vm_flags & VM_GROWSDOWN) || expand_stack(vma, address))
+			goto bad_area;
+	}
 
 	/*
 	 * Ok, we have a good vm_area for this memory access, so
 	 * we can handle it..
 	 */
-good_area:
 	si_code = SEGV_ACCERR;
 
 	/* Handle protection violation, execute on heap or stack */
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 3/9] ARC: mm: do_page_fault refactor #2: remove short lived variable
  2019-05-15  0:29 ` Vineet Gupta
@ 2019-05-15  0:29   ` Vineet Gupta
  -1 siblings, 0 replies; 35+ messages in thread
From: Vineet Gupta @ 2019-05-15  0:29 UTC (permalink / raw)
  To: linux-snps-arc; +Cc: paltsev, linux-kernel, Vineet Gupta

Compiler will do this anyways, still..

No functional change.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
---
 arch/arc/mm/fault.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
index 94d242740ac5..f1175685d914 100644
--- a/arch/arc/mm/fault.c
+++ b/arch/arc/mm/fault.c
@@ -67,23 +67,18 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 	struct task_struct *tsk = current;
 	struct mm_struct *mm = tsk->mm;
 	int si_code = SEGV_MAPERR;
-	int ret;
 	vm_fault_t fault;
 	int write = regs->ecr_cause & ECR_C_PROTV_STORE;  /* ST/EX */
 	unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
 
 	/*
-	 * We fault-in kernel-space virtual memory on-demand. The
-	 * 'reference' page table is init_mm.pgd.
-	 *
 	 * NOTE! We MUST NOT take any locks for this case. We may
 	 * be in an interrupt or a critical region, and should
 	 * only copy the information from the master page table,
 	 * nothing more.
 	 */
 	if (address >= VMALLOC_START && !user_mode(regs)) {
-		ret = handle_kernel_vaddr_fault(address);
-		if (unlikely(ret))
+		if (unlikely(handle_kernel_vaddr_fault(address)))
 			goto no_context;
 		else
 			return;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 3/9] ARC: mm: do_page_fault refactor #2: remove short lived variable
@ 2019-05-15  0:29   ` Vineet Gupta
  0 siblings, 0 replies; 35+ messages in thread
From: Vineet Gupta @ 2019-05-15  0:29 UTC (permalink / raw)
  To: linux-snps-arc

Compiler will do this anyways, still..

No functional change.

Signed-off-by: Vineet Gupta <vgupta at synopsys.com>
---
 arch/arc/mm/fault.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
index 94d242740ac5..f1175685d914 100644
--- a/arch/arc/mm/fault.c
+++ b/arch/arc/mm/fault.c
@@ -67,23 +67,18 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 	struct task_struct *tsk = current;
 	struct mm_struct *mm = tsk->mm;
 	int si_code = SEGV_MAPERR;
-	int ret;
 	vm_fault_t fault;
 	int write = regs->ecr_cause & ECR_C_PROTV_STORE;  /* ST/EX */
 	unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
 
 	/*
-	 * We fault-in kernel-space virtual memory on-demand. The
-	 * 'reference' page table is init_mm.pgd.
-	 *
 	 * NOTE! We MUST NOT take any locks for this case. We may
 	 * be in an interrupt or a critical region, and should
 	 * only copy the information from the master page table,
 	 * nothing more.
 	 */
 	if (address >= VMALLOC_START && !user_mode(regs)) {
-		ret = handle_kernel_vaddr_fault(address);
-		if (unlikely(ret))
+		if (unlikely(handle_kernel_vaddr_fault(address)))
 			goto no_context;
 		else
 			return;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 4/9] ARC: mm: do_page_fault refactor #3: tidyup vma access permission code
  2019-05-15  0:29 ` Vineet Gupta
@ 2019-05-15  0:29   ` Vineet Gupta
  -1 siblings, 0 replies; 35+ messages in thread
From: Vineet Gupta @ 2019-05-15  0:29 UTC (permalink / raw)
  To: linux-snps-arc; +Cc: paltsev, linux-kernel, Vineet Gupta

The coding pattern to NOT intialize variables at declaration time but
rather near code which makes us eof them makes it much easier to grok
the overall logic, specially when the init is not simply 0 or 1

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
---
 arch/arc/mm/fault.c | 39 +++++++++++++++++++++------------------
 1 file changed, 21 insertions(+), 18 deletions(-)

diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
index f1175685d914..ae890a8d5ebf 100644
--- a/arch/arc/mm/fault.c
+++ b/arch/arc/mm/fault.c
@@ -67,9 +67,9 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 	struct task_struct *tsk = current;
 	struct mm_struct *mm = tsk->mm;
 	int si_code = SEGV_MAPERR;
+	unsigned int write = 0, exec = 0, mask;
 	vm_fault_t fault;
-	int write = regs->ecr_cause & ECR_C_PROTV_STORE;  /* ST/EX */
-	unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
+	unsigned int flags;
 
 	/*
 	 * NOTE! We MUST NOT take any locks for this case. We may
@@ -91,8 +91,18 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 	if (faulthandler_disabled() || !mm)
 		goto no_context;
 
+	if (regs->ecr_cause & ECR_C_PROTV_STORE)	/* ST/EX */
+		write = 1;
+	else if ((regs->ecr_vec == ECR_V_PROTV) &&
+	         (regs->ecr_cause == ECR_C_PROTV_INST_FETCH))
+		exec = 1;
+
+	flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
 	if (user_mode(regs))
 		flags |= FAULT_FLAG_USER;
+	if (write)
+		flags |= FAULT_FLAG_WRITE;
+
 retry:
 	down_read(&mm->mmap_sem);
 
@@ -105,24 +115,17 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 	}
 
 	/*
-	 * Ok, we have a good vm_area for this memory access, so
-	 * we can handle it..
+	 * vm_area is good, now check permissions for this memory access
 	 */
-	si_code = SEGV_ACCERR;
-
-	/* Handle protection violation, execute on heap or stack */
-
-	if ((regs->ecr_vec == ECR_V_PROTV) &&
-	    (regs->ecr_cause == ECR_C_PROTV_INST_FETCH))
+	mask = VM_READ;
+	if (write)
+		mask = VM_WRITE;
+	if (exec)
+		mask = VM_EXEC;
+
+	if (!(vma->vm_flags & mask)) {
+		si_code = SEGV_ACCERR;
 		goto bad_area;
-
-	if (write) {
-		if (!(vma->vm_flags & VM_WRITE))
-			goto bad_area;
-		flags |= FAULT_FLAG_WRITE;
-	} else {
-		if (!(vma->vm_flags & (VM_READ | VM_EXEC)))
-			goto bad_area;
 	}
 
 	/*
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 4/9] ARC: mm: do_page_fault refactor #3: tidyup vma access permission code
@ 2019-05-15  0:29   ` Vineet Gupta
  0 siblings, 0 replies; 35+ messages in thread
From: Vineet Gupta @ 2019-05-15  0:29 UTC (permalink / raw)
  To: linux-snps-arc

The coding pattern to NOT intialize variables at declaration time but
rather near code which makes us eof them makes it much easier to grok
the overall logic, specially when the init is not simply 0 or 1

Signed-off-by: Vineet Gupta <vgupta at synopsys.com>
---
 arch/arc/mm/fault.c | 39 +++++++++++++++++++++------------------
 1 file changed, 21 insertions(+), 18 deletions(-)

diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
index f1175685d914..ae890a8d5ebf 100644
--- a/arch/arc/mm/fault.c
+++ b/arch/arc/mm/fault.c
@@ -67,9 +67,9 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 	struct task_struct *tsk = current;
 	struct mm_struct *mm = tsk->mm;
 	int si_code = SEGV_MAPERR;
+	unsigned int write = 0, exec = 0, mask;
 	vm_fault_t fault;
-	int write = regs->ecr_cause & ECR_C_PROTV_STORE;  /* ST/EX */
-	unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
+	unsigned int flags;
 
 	/*
 	 * NOTE! We MUST NOT take any locks for this case. We may
@@ -91,8 +91,18 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 	if (faulthandler_disabled() || !mm)
 		goto no_context;
 
+	if (regs->ecr_cause & ECR_C_PROTV_STORE)	/* ST/EX */
+		write = 1;
+	else if ((regs->ecr_vec == ECR_V_PROTV) &&
+	         (regs->ecr_cause == ECR_C_PROTV_INST_FETCH))
+		exec = 1;
+
+	flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
 	if (user_mode(regs))
 		flags |= FAULT_FLAG_USER;
+	if (write)
+		flags |= FAULT_FLAG_WRITE;
+
 retry:
 	down_read(&mm->mmap_sem);
 
@@ -105,24 +115,17 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 	}
 
 	/*
-	 * Ok, we have a good vm_area for this memory access, so
-	 * we can handle it..
+	 * vm_area is good, now check permissions for this memory access
 	 */
-	si_code = SEGV_ACCERR;
-
-	/* Handle protection violation, execute on heap or stack */
-
-	if ((regs->ecr_vec == ECR_V_PROTV) &&
-	    (regs->ecr_cause == ECR_C_PROTV_INST_FETCH))
+	mask = VM_READ;
+	if (write)
+		mask = VM_WRITE;
+	if (exec)
+		mask = VM_EXEC;
+
+	if (!(vma->vm_flags & mask)) {
+		si_code = SEGV_ACCERR;
 		goto bad_area;
-
-	if (write) {
-		if (!(vma->vm_flags & VM_WRITE))
-			goto bad_area;
-		flags |= FAULT_FLAG_WRITE;
-	} else {
-		if (!(vma->vm_flags & (VM_READ | VM_EXEC)))
-			goto bad_area;
 	}
 
 	/*
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 5/9] ARC: mm: do_page_fault refactor #4: consolidate retry related logic
  2019-05-15  0:29 ` Vineet Gupta
@ 2019-05-15  0:29   ` Vineet Gupta
  -1 siblings, 0 replies; 35+ messages in thread
From: Vineet Gupta @ 2019-05-15  0:29 UTC (permalink / raw)
  To: linux-snps-arc; +Cc: paltsev, linux-kernel, Vineet Gupta

stats update code can now elide "retry" check and additional level of
indentation since all retry handling is done ahead of it already

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
---
 arch/arc/mm/fault.c | 60 +++++++++++++++++++++++++++--------------------------
 1 file changed, 31 insertions(+), 29 deletions(-)

diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
index ae890a8d5ebf..7f211b493170 100644
--- a/arch/arc/mm/fault.c
+++ b/arch/arc/mm/fault.c
@@ -68,8 +68,8 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 	struct mm_struct *mm = tsk->mm;
 	int si_code = SEGV_MAPERR;
 	unsigned int write = 0, exec = 0, mask;
-	vm_fault_t fault;
-	unsigned int flags;
+	vm_fault_t fault;			/* handle_mm_fault() output */
+	unsigned int flags;			/* handle_mm_fault() input */
 
 	/*
 	 * NOTE! We MUST NOT take any locks for this case. We may
@@ -128,49 +128,51 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 		goto bad_area;
 	}
 
-	/*
-	 * If for any reason at all we couldn't handle the fault,
-	 * make sure we exit gracefully rather than endlessly redo
-	 * the fault.
-	 */
 	fault = handle_mm_fault(vma, address, flags);
 
-	if (fatal_signal_pending(current)) {
+	/*
+	 * Fault retry nuances
+	 */
+	if (unlikely(fault & VM_FAULT_RETRY)) {
 
 		/*
-		 * if fault retry, mmap_sem already relinquished by core mm
-		 * so OK to return to user mode (with signal handled first)
+		 * If fault needs to be retried, handle any pending signals
+		 * first (by returning to user mode).
+		 * mmap_sem already relinquished by core mm for RETRY case
 		 */
-		if (fault & VM_FAULT_RETRY) {
+		if (fatal_signal_pending(current)) {
 			if (!user_mode(regs))
 				goto no_context;
 			return;
 		}
+		/*
+		 * retry state machine
+		 */
+		if (flags & FAULT_FLAG_ALLOW_RETRY) {
+			flags &= ~FAULT_FLAG_ALLOW_RETRY;
+			flags |= FAULT_FLAG_TRIED;
+			goto retry;
+		}
 	}
 
+	/*
+	 * Major/minor page fault accounting
+	 * (in case of retry we only land here once)
+	 */
 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
 
 	if (likely(!(fault & VM_FAULT_ERROR))) {
-		if (flags & FAULT_FLAG_ALLOW_RETRY) {
-			/* To avoid updating stats twice for retry case */
-			if (fault & VM_FAULT_MAJOR) {
-				tsk->maj_flt++;
-				perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1,
-					      regs, address);
-			} else {
-				tsk->min_flt++;
-				perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1,
-					      regs, address);
-			}
-
-			if (fault & VM_FAULT_RETRY) {
-				flags &= ~FAULT_FLAG_ALLOW_RETRY;
-				flags |= FAULT_FLAG_TRIED;
-				goto retry;
-			}
+		if (fault & VM_FAULT_MAJOR) {
+			tsk->maj_flt++;
+			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1,
+				      regs, address);
+		} else {
+			tsk->min_flt++;
+			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1,
+				      regs, address);
 		}
 
-		/* Fault Handled Gracefully */
+		/* Normal return path: fault Handled Gracefully */
 		up_read(&mm->mmap_sem);
 		return;
 	}
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 5/9] ARC: mm: do_page_fault refactor #4: consolidate retry related logic
@ 2019-05-15  0:29   ` Vineet Gupta
  0 siblings, 0 replies; 35+ messages in thread
From: Vineet Gupta @ 2019-05-15  0:29 UTC (permalink / raw)
  To: linux-snps-arc

stats update code can now elide "retry" check and additional level of
indentation since all retry handling is done ahead of it already

Signed-off-by: Vineet Gupta <vgupta at synopsys.com>
---
 arch/arc/mm/fault.c | 60 +++++++++++++++++++++++++++--------------------------
 1 file changed, 31 insertions(+), 29 deletions(-)

diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
index ae890a8d5ebf..7f211b493170 100644
--- a/arch/arc/mm/fault.c
+++ b/arch/arc/mm/fault.c
@@ -68,8 +68,8 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 	struct mm_struct *mm = tsk->mm;
 	int si_code = SEGV_MAPERR;
 	unsigned int write = 0, exec = 0, mask;
-	vm_fault_t fault;
-	unsigned int flags;
+	vm_fault_t fault;			/* handle_mm_fault() output */
+	unsigned int flags;			/* handle_mm_fault() input */
 
 	/*
 	 * NOTE! We MUST NOT take any locks for this case. We may
@@ -128,49 +128,51 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 		goto bad_area;
 	}
 
-	/*
-	 * If for any reason at all we couldn't handle the fault,
-	 * make sure we exit gracefully rather than endlessly redo
-	 * the fault.
-	 */
 	fault = handle_mm_fault(vma, address, flags);
 
-	if (fatal_signal_pending(current)) {
+	/*
+	 * Fault retry nuances
+	 */
+	if (unlikely(fault & VM_FAULT_RETRY)) {
 
 		/*
-		 * if fault retry, mmap_sem already relinquished by core mm
-		 * so OK to return to user mode (with signal handled first)
+		 * If fault needs to be retried, handle any pending signals
+		 * first (by returning to user mode).
+		 * mmap_sem already relinquished by core mm for RETRY case
 		 */
-		if (fault & VM_FAULT_RETRY) {
+		if (fatal_signal_pending(current)) {
 			if (!user_mode(regs))
 				goto no_context;
 			return;
 		}
+		/*
+		 * retry state machine
+		 */
+		if (flags & FAULT_FLAG_ALLOW_RETRY) {
+			flags &= ~FAULT_FLAG_ALLOW_RETRY;
+			flags |= FAULT_FLAG_TRIED;
+			goto retry;
+		}
 	}
 
+	/*
+	 * Major/minor page fault accounting
+	 * (in case of retry we only land here once)
+	 */
 	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
 
 	if (likely(!(fault & VM_FAULT_ERROR))) {
-		if (flags & FAULT_FLAG_ALLOW_RETRY) {
-			/* To avoid updating stats twice for retry case */
-			if (fault & VM_FAULT_MAJOR) {
-				tsk->maj_flt++;
-				perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1,
-					      regs, address);
-			} else {
-				tsk->min_flt++;
-				perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1,
-					      regs, address);
-			}
-
-			if (fault & VM_FAULT_RETRY) {
-				flags &= ~FAULT_FLAG_ALLOW_RETRY;
-				flags |= FAULT_FLAG_TRIED;
-				goto retry;
-			}
+		if (fault & VM_FAULT_MAJOR) {
+			tsk->maj_flt++;
+			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1,
+				      regs, address);
+		} else {
+			tsk->min_flt++;
+			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1,
+				      regs, address);
 		}
 
-		/* Fault Handled Gracefully */
+		/* Normal return path: fault Handled Gracefully */
 		up_read(&mm->mmap_sem);
 		return;
 	}
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 6/9] ARC: mm: do_page_fault refactor #5: scoot no_context to end
  2019-05-15  0:29 ` Vineet Gupta
@ 2019-05-15  0:29   ` Vineet Gupta
  -1 siblings, 0 replies; 35+ messages in thread
From: Vineet Gupta @ 2019-05-15  0:29 UTC (permalink / raw)
  To: linux-snps-arc; +Cc: paltsev, linux-kernel, Vineet Gupta

This is different than the rest of signal handling stuff

No functional change

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
---
 arch/arc/mm/fault.c | 21 +++++++--------------
 1 file changed, 7 insertions(+), 14 deletions(-)

diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
index 7f211b493170..c0a60aeb4abd 100644
--- a/arch/arc/mm/fault.c
+++ b/arch/arc/mm/fault.c
@@ -201,20 +201,6 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 		return;
 	}
 
-no_context:
-	/* Are we prepared to handle this kernel fault?
-	 *
-	 * (The kernel has valid exception-points in the source
-	 *  when it accesses user-memory. When it fails in one
-	 *  of those points, we find it in a table and do a jump
-	 *  to some fixup code that loads an appropriate error
-	 *  code)
-	 */
-	if (fixup_exception(regs))
-		return;
-
-	die("Oops", regs, address);
-
 out_of_memory:
 	up_read(&mm->mmap_sem);
 
@@ -233,4 +219,11 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 
 	tsk->thread.fault_address = address;
 	force_sig_fault(SIGBUS, BUS_ADRERR, (void __user *)address, tsk);
+	return;
+
+no_context:
+	if (fixup_exception(regs))
+		return;
+
+	die("Oops", regs, address);
 }
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 6/9] ARC: mm: do_page_fault refactor #5: scoot no_context to end
@ 2019-05-15  0:29   ` Vineet Gupta
  0 siblings, 0 replies; 35+ messages in thread
From: Vineet Gupta @ 2019-05-15  0:29 UTC (permalink / raw)
  To: linux-snps-arc

This is different than the rest of signal handling stuff

No functional change

Signed-off-by: Vineet Gupta <vgupta at synopsys.com>
---
 arch/arc/mm/fault.c | 21 +++++++--------------
 1 file changed, 7 insertions(+), 14 deletions(-)

diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
index 7f211b493170..c0a60aeb4abd 100644
--- a/arch/arc/mm/fault.c
+++ b/arch/arc/mm/fault.c
@@ -201,20 +201,6 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 		return;
 	}
 
-no_context:
-	/* Are we prepared to handle this kernel fault?
-	 *
-	 * (The kernel has valid exception-points in the source
-	 *  when it accesses user-memory. When it fails in one
-	 *  of those points, we find it in a table and do a jump
-	 *  to some fixup code that loads an appropriate error
-	 *  code)
-	 */
-	if (fixup_exception(regs))
-		return;
-
-	die("Oops", regs, address);
-
 out_of_memory:
 	up_read(&mm->mmap_sem);
 
@@ -233,4 +219,11 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 
 	tsk->thread.fault_address = address;
 	force_sig_fault(SIGBUS, BUS_ADRERR, (void __user *)address, tsk);
+	return;
+
+no_context:
+	if (fixup_exception(regs))
+		return;
+
+	die("Oops", regs, address);
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 7/9] ARC: mm: do_page_fault refactor #6: error handlers to use same pattern
  2019-05-15  0:29 ` Vineet Gupta
@ 2019-05-15  0:29   ` Vineet Gupta
  -1 siblings, 0 replies; 35+ messages in thread
From: Vineet Gupta @ 2019-05-15  0:29 UTC (permalink / raw)
  To: linux-snps-arc; +Cc: paltsev, linux-kernel, Vineet Gupta

 - up_read
 - if !user_mode
 - whatever error handling

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
---
 arch/arc/mm/fault.c | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
index c0a60aeb4abd..0e1a222a97ad 100644
--- a/arch/arc/mm/fault.c
+++ b/arch/arc/mm/fault.c
@@ -194,22 +194,21 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 bad_area:
 	up_read(&mm->mmap_sem);
 
-	/* User mode accesses just cause a SIGSEGV */
-	if (user_mode(regs)) {
-		tsk->thread.fault_address = address;
-		force_sig_fault(SIGSEGV, si_code, (void __user *)address, tsk);
-		return;
-	}
+	if (!user_mode(regs))
+		goto no_context;
+
+	tsk->thread.fault_address = address;
+	force_sig_fault(SIGSEGV, si_code, (void __user *)address, tsk);
+	return;
 
 out_of_memory:
 	up_read(&mm->mmap_sem);
 
-	if (user_mode(regs)) {
-		pagefault_out_of_memory();
-		return;
-	}
+	if (!user_mode(regs))
+		goto no_context;
 
-	goto no_context;
+	pagefault_out_of_memory();
+	return;
 
 do_sigbus:
 	up_read(&mm->mmap_sem);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 7/9] ARC: mm: do_page_fault refactor #6: error handlers to use same pattern
@ 2019-05-15  0:29   ` Vineet Gupta
  0 siblings, 0 replies; 35+ messages in thread
From: Vineet Gupta @ 2019-05-15  0:29 UTC (permalink / raw)
  To: linux-snps-arc

 - up_read
 - if !user_mode
 - whatever error handling

Signed-off-by: Vineet Gupta <vgupta at synopsys.com>
---
 arch/arc/mm/fault.c | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
index c0a60aeb4abd..0e1a222a97ad 100644
--- a/arch/arc/mm/fault.c
+++ b/arch/arc/mm/fault.c
@@ -194,22 +194,21 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 bad_area:
 	up_read(&mm->mmap_sem);
 
-	/* User mode accesses just cause a SIGSEGV */
-	if (user_mode(regs)) {
-		tsk->thread.fault_address = address;
-		force_sig_fault(SIGSEGV, si_code, (void __user *)address, tsk);
-		return;
-	}
+	if (!user_mode(regs))
+		goto no_context;
+
+	tsk->thread.fault_address = address;
+	force_sig_fault(SIGSEGV, si_code, (void __user *)address, tsk);
+	return;
 
 out_of_memory:
 	up_read(&mm->mmap_sem);
 
-	if (user_mode(regs)) {
-		pagefault_out_of_memory();
-		return;
-	}
+	if (!user_mode(regs))
+		goto no_context;
 
-	goto no_context;
+	pagefault_out_of_memory();
+	return;
 
 do_sigbus:
 	up_read(&mm->mmap_sem);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 8/9] ARC: mm: do_page_fault refactor #7: fold the various error handling
  2019-05-15  0:29 ` Vineet Gupta
@ 2019-05-15  0:29   ` Vineet Gupta
  -1 siblings, 0 replies; 35+ messages in thread
From: Vineet Gupta @ 2019-05-15  0:29 UTC (permalink / raw)
  To: linux-snps-arc; +Cc: paltsev, linux-kernel, Vineet Gupta

 - single up_read() call vs. 4
 - so much easier on eyes

Technically it seems like @bad_area label moved up, but even in old
regime, it was a special case of delivering SIGSEGV unconditionally
which we now do as well, although with checks.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
---
 arch/arc/mm/fault.c | 46 +++++++++++++---------------------------------
 1 file changed, 13 insertions(+), 33 deletions(-)

diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
index 0e1a222a97ad..20402217d9da 100644
--- a/arch/arc/mm/fault.c
+++ b/arch/arc/mm/fault.c
@@ -66,7 +66,7 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 	struct vm_area_struct *vma = NULL;
 	struct task_struct *tsk = current;
 	struct mm_struct *mm = tsk->mm;
-	int si_code = SEGV_MAPERR;
+	int sig, si_code = SEGV_MAPERR;
 	unsigned int write = 0, exec = 0, mask;
 	vm_fault_t fault;			/* handle_mm_fault() output */
 	unsigned int flags;			/* handle_mm_fault() input */
@@ -177,47 +177,27 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 		return;
 	}
 
-	if (fault & VM_FAULT_OOM)
-		goto out_of_memory;
-	else if (fault & VM_FAULT_SIGSEGV)
-		goto bad_area;
-	else if (fault & VM_FAULT_SIGBUS)
-		goto do_sigbus;
-
-	/* no man's land */
-	BUG();
-
-	/*
-	 * Something tried to access memory that isn't in our memory map..
-	 * Fix it, but check if it's kernel or user first..
-	 */
 bad_area:
 	up_read(&mm->mmap_sem);
 
 	if (!user_mode(regs))
 		goto no_context;
 
-	tsk->thread.fault_address = address;
-	force_sig_fault(SIGSEGV, si_code, (void __user *)address, tsk);
-	return;
-
-out_of_memory:
-	up_read(&mm->mmap_sem);
-
-	if (!user_mode(regs))
-		goto no_context;
-
-	pagefault_out_of_memory();
-	return;
-
-do_sigbus:
-	up_read(&mm->mmap_sem);
+	if (fault & VM_FAULT_OOM) {
+		pagefault_out_of_memory();
+		return;
+	}
 
-	if (!user_mode(regs))
-		goto no_context;
+	if (fault & VM_FAULT_SIGBUS) {
+		sig = SIGBUS;
+		si_code = BUS_ADRERR;
+	}
+	else {
+		sig = SIGSEGV;
+	}
 
 	tsk->thread.fault_address = address;
-	force_sig_fault(SIGBUS, BUS_ADRERR, (void __user *)address, tsk);
+	force_sig_fault(sig, si_code, (void __user *)address, tsk);
 	return;
 
 no_context:
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 8/9] ARC: mm: do_page_fault refactor #7: fold the various error handling
@ 2019-05-15  0:29   ` Vineet Gupta
  0 siblings, 0 replies; 35+ messages in thread
From: Vineet Gupta @ 2019-05-15  0:29 UTC (permalink / raw)
  To: linux-snps-arc

 - single up_read() call vs. 4
 - so much easier on eyes

Technically it seems like @bad_area label moved up, but even in old
regime, it was a special case of delivering SIGSEGV unconditionally
which we now do as well, although with checks.

Signed-off-by: Vineet Gupta <vgupta at synopsys.com>
---
 arch/arc/mm/fault.c | 46 +++++++++++++---------------------------------
 1 file changed, 13 insertions(+), 33 deletions(-)

diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
index 0e1a222a97ad..20402217d9da 100644
--- a/arch/arc/mm/fault.c
+++ b/arch/arc/mm/fault.c
@@ -66,7 +66,7 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 	struct vm_area_struct *vma = NULL;
 	struct task_struct *tsk = current;
 	struct mm_struct *mm = tsk->mm;
-	int si_code = SEGV_MAPERR;
+	int sig, si_code = SEGV_MAPERR;
 	unsigned int write = 0, exec = 0, mask;
 	vm_fault_t fault;			/* handle_mm_fault() output */
 	unsigned int flags;			/* handle_mm_fault() input */
@@ -177,47 +177,27 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 		return;
 	}
 
-	if (fault & VM_FAULT_OOM)
-		goto out_of_memory;
-	else if (fault & VM_FAULT_SIGSEGV)
-		goto bad_area;
-	else if (fault & VM_FAULT_SIGBUS)
-		goto do_sigbus;
-
-	/* no man's land */
-	BUG();
-
-	/*
-	 * Something tried to access memory that isn't in our memory map..
-	 * Fix it, but check if it's kernel or user first..
-	 */
 bad_area:
 	up_read(&mm->mmap_sem);
 
 	if (!user_mode(regs))
 		goto no_context;
 
-	tsk->thread.fault_address = address;
-	force_sig_fault(SIGSEGV, si_code, (void __user *)address, tsk);
-	return;
-
-out_of_memory:
-	up_read(&mm->mmap_sem);
-
-	if (!user_mode(regs))
-		goto no_context;
-
-	pagefault_out_of_memory();
-	return;
-
-do_sigbus:
-	up_read(&mm->mmap_sem);
+	if (fault & VM_FAULT_OOM) {
+		pagefault_out_of_memory();
+		return;
+	}
 
-	if (!user_mode(regs))
-		goto no_context;
+	if (fault & VM_FAULT_SIGBUS) {
+		sig = SIGBUS;
+		si_code = BUS_ADRERR;
+	}
+	else {
+		sig = SIGSEGV;
+	}
 
 	tsk->thread.fault_address = address;
-	force_sig_fault(SIGBUS, BUS_ADRERR, (void __user *)address, tsk);
+	force_sig_fault(sig, si_code, (void __user *)address, tsk);
 	return;
 
 no_context:
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 9/9] ARC: mm: do_page_fault refactor #8: release mmap_sem sooner
  2019-05-15  0:29 ` Vineet Gupta
@ 2019-05-15  0:29   ` Vineet Gupta
  -1 siblings, 0 replies; 35+ messages in thread
From: Vineet Gupta @ 2019-05-15  0:29 UTC (permalink / raw)
  To: linux-snps-arc; +Cc: paltsev, linux-kernel, Vineet Gupta, Peter Zijlstra

In case of successful page fault handling, this patch releases mmap_sem
before updating the perf stat event for major/minor faults. So even
though the contention reduction is NOT supe rhigh, it is still an
improvement.

There's an additional code size improvement as we only have 2 up_read()
calls now.

Note to myself:
--------------

1. Given the way it is done, we are forced to move @bad_area label earlier
   causing the various "goto bad_area" cases to hit perf stat code.

 - PERF_COUNT_SW_PAGE_FAULTS is NOW updated for access errors which is what
   arm/arm64 seem to be doing as well (with slightly different code)
 - PERF_COUNT_SW_PAGE_FAULTS_{MAJ,MIN} must NOT be updated for the
   error case which is guarded by now setting @fault initial value
   to VM_FAULT_ERROR which serves both cases when handle_mm_fault()
   returns error or is not called at all.

2. arm/arm64 use two homebrew fault flags VM_FAULT_BAD{MAP,MAPACCESS}
   which I was inclined to add too but seems not needed for ARC

 - given that we have everything is 1 function we cabn stil use goto
 - we setup si_code at the right place (arm* do that in the end)
 - we init fault already to error value which guards entry into perf
   stats event update

Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
---
 arch/arc/mm/fault.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
index 20402217d9da..e93ea06c214c 100644
--- a/arch/arc/mm/fault.c
+++ b/arch/arc/mm/fault.c
@@ -68,7 +68,7 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 	struct mm_struct *mm = tsk->mm;
 	int sig, si_code = SEGV_MAPERR;
 	unsigned int write = 0, exec = 0, mask;
-	vm_fault_t fault;			/* handle_mm_fault() output */
+	vm_fault_t fault = VM_FAULT_ERROR;	/* handle_mm_fault() output */
 	unsigned int flags;			/* handle_mm_fault() input */
 
 	/*
@@ -155,6 +155,9 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 		}
 	}
 
+bad_area:
+	up_read(&mm->mmap_sem);
+
 	/*
 	 * Major/minor page fault accounting
 	 * (in case of retry we only land here once)
@@ -173,13 +176,9 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 		}
 
 		/* Normal return path: fault Handled Gracefully */
-		up_read(&mm->mmap_sem);
 		return;
 	}
 
-bad_area:
-	up_read(&mm->mmap_sem);
-
 	if (!user_mode(regs))
 		goto no_context;
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 9/9] ARC: mm: do_page_fault refactor #8: release mmap_sem sooner
@ 2019-05-15  0:29   ` Vineet Gupta
  0 siblings, 0 replies; 35+ messages in thread
From: Vineet Gupta @ 2019-05-15  0:29 UTC (permalink / raw)
  To: linux-snps-arc

In case of successful page fault handling, this patch releases mmap_sem
before updating the perf stat event for major/minor faults. So even
though the contention reduction is NOT supe rhigh, it is still an
improvement.

There's an additional code size improvement as we only have 2 up_read()
calls now.

Note to myself:
--------------

1. Given the way it is done, we are forced to move @bad_area label earlier
   causing the various "goto bad_area" cases to hit perf stat code.

 - PERF_COUNT_SW_PAGE_FAULTS is NOW updated for access errors which is what
   arm/arm64 seem to be doing as well (with slightly different code)
 - PERF_COUNT_SW_PAGE_FAULTS_{MAJ,MIN} must NOT be updated for the
   error case which is guarded by now setting @fault initial value
   to VM_FAULT_ERROR which serves both cases when handle_mm_fault()
   returns error or is not called at all.

2. arm/arm64 use two homebrew fault flags VM_FAULT_BAD{MAP,MAPACCESS}
   which I was inclined to add too but seems not needed for ARC

 - given that we have everything is 1 function we cabn stil use goto
 - we setup si_code at the right place (arm* do that in the end)
 - we init fault already to error value which guards entry into perf
   stats event update

Cc: Peter Zijlstra <peterz at infradead.org>
Signed-off-by: Vineet Gupta <vgupta at synopsys.com>
---
 arch/arc/mm/fault.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
index 20402217d9da..e93ea06c214c 100644
--- a/arch/arc/mm/fault.c
+++ b/arch/arc/mm/fault.c
@@ -68,7 +68,7 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 	struct mm_struct *mm = tsk->mm;
 	int sig, si_code = SEGV_MAPERR;
 	unsigned int write = 0, exec = 0, mask;
-	vm_fault_t fault;			/* handle_mm_fault() output */
+	vm_fault_t fault = VM_FAULT_ERROR;	/* handle_mm_fault() output */
 	unsigned int flags;			/* handle_mm_fault() input */
 
 	/*
@@ -155,6 +155,9 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 		}
 	}
 
+bad_area:
+	up_read(&mm->mmap_sem);
+
 	/*
 	 * Major/minor page fault accounting
 	 * (in case of retry we only land here once)
@@ -173,13 +176,9 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
 		}
 
 		/* Normal return path: fault Handled Gracefully */
-		up_read(&mm->mmap_sem);
 		return;
 	}
 
-bad_area:
-	up_read(&mm->mmap_sem);
-
 	if (!user_mode(regs))
 		goto no_context;
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 35+ messages in thread

* [PATCH 1/9] ARC: mm: SIGSEGV userspace trying to access kernel virtual memory
  2019-05-15  0:29   ` Vineet Gupta
  (?)
@ 2019-05-15 10:54   ` Sasha Levin
  -1 siblings, 0 replies; 35+ messages in thread
From: Sasha Levin @ 2019-05-15 10:54 UTC (permalink / raw)
  To: linux-snps-arc

Hi,

[This is an automated email]

This commit has been processed because it contains a -stable tag.
The stable tag indicates that it's relevant for the following trees: all

The bot has tested the following trees: v5.1.1, v5.0.15, v4.19.42, v4.14.118, v4.9.175, v4.4.179, v3.18.139.

v5.1.1: Build OK!
v5.0.15: Build OK!
v4.19.42: Failed to apply! Possible dependencies:
    121e38e5acdc ("ARC: mm: fix uninitialised signal code in do_page_fault")
    15773ae938d8 ("signal/arc: Use force_sig_fault where appropriate")

v4.14.118: Failed to apply! Possible dependencies:
    121e38e5acdc ("ARC: mm: fix uninitialised signal code in do_page_fault")
    15773ae938d8 ("signal/arc: Use force_sig_fault where appropriate")
    1fc5dce78ad1 ("arm64/sve: Low-level SVE architectural state manipulation functions")
    2c9120f3a86a ("arm64: signal: Make force_signal_inject more robust")
    3eb0f5193b49 ("signal: Ensure every siginfo we send has all bits initialized")
    3f7c86b2382e ("arm64: Update fault_info table with new exception types")
    50a7ca3c6fc8 ("mm: convert return type of handle_mm_fault() caller to vm_fault_t")
    526c3ddb6aa2 ("signal/arm64: Document conflicts with SI_USER and SIGFPE,SIGTRAP,SIGBUS")
    532826f3712b ("arm64: Mirror arm for unimplemented compat syscalls")
    5f74972ce69f ("signal: Don't use structure initializers for struct siginfo")
    92ff0674f5d8 ("arm64: mm: Rework unhandled user pagefaults to call arm64_force_sig_info")
    94ef7ecbdf6f ("arm64: fpsimd: Correctly annotate exception helpers called from asm")
    af40ff687bc9 ("arm64: signal: Ensure si_code is valid for all fault signals")
    bc0ee4760364 ("arm64/sve: Core task context handling")
    f43a54a0d916 ("signal/mips: Use force_sig_fault where appropriate")

v4.9.175: Failed to apply! Possible dependencies:
    0e3a9026396c ("arm64: mm: Update perf accounting to handle poison faults")
    121e38e5acdc ("ARC: mm: fix uninitialised signal code in do_page_fault")
    15773ae938d8 ("signal/arc: Use force_sig_fault where appropriate")
    32015c235603 ("arm64: exception: handle Synchronous External Abort")
    3eb0f5193b49 ("signal: Ensure every siginfo we send has all bits initialized")
    3f7c86b2382e ("arm64: Update fault_info table with new exception types")
    50a7ca3c6fc8 ("mm: convert return type of handle_mm_fault() caller to vm_fault_t")
    526c3ddb6aa2 ("signal/arm64: Document conflicts with SI_USER and SIGFPE,SIGTRAP,SIGBUS")
    532826f3712b ("arm64: Mirror arm for unimplemented compat syscalls")
    5b53696a30d5 ("ACPI / APEI: Switch to use new generic UUID API")
    5f74972ce69f ("signal: Don't use structure initializers for struct siginfo")
    7edda0886bc3 ("acpi: apei: handle SEA notification type for ARMv8")
    83016b204225 ("arm64: mm: print file name of faulting vma")
    92ff0674f5d8 ("arm64: mm: Rework unhandled user pagefaults to call arm64_force_sig_info")
    a8ada146f517 ("arm64: Update the synchronous external abort fault description")
    af40ff687bc9 ("arm64: signal: Ensure si_code is valid for all fault signals")
    b123718b105b ("MIPS: signal: Remove unreachable code from force_fcr31_sig().")
    bbcc2e7b642e ("ras: acpi/apei: cper: add support for generic data v3 structure")
    c07ab957d9af ("arm64: Call __show_regs directly")
    e7c600f149b8 ("arm64: hwpoison: add VM_FAULT_HWPOISON[_LARGE] handling")
    f43a54a0d916 ("signal/mips: Use force_sig_fault where appropriate")

v4.4.179: Failed to apply! Possible dependencies:
    09a6adf53d42 ("arm64: mm: unaligned access by user-land should be received as SIGBUS")
    0a8ea52c3eb1 ("arm64: Add HAVE_REGS_AND_STACK_ACCESS_API feature")
    0e3a9026396c ("arm64: mm: Update perf accounting to handle poison faults")
    121e38e5acdc ("ARC: mm: fix uninitialised signal code in do_page_fault")
    15773ae938d8 ("signal/arc: Use force_sig_fault where appropriate")
    2dd0e8d2d2a1 ("arm64: Kprobes with single stepping support")
    3eca86e75ec7 ("arm64: Remove fixmap include fragility")
    50a7ca3c6fc8 ("mm: convert return type of handle_mm_fault() caller to vm_fault_t")
    67ce16ec15ce ("arm64: mm: print out correct page table entries")
    83016b204225 ("arm64: mm: print file name of faulting vma")
    92ff0674f5d8 ("arm64: mm: Rework unhandled user pagefaults to call arm64_force_sig_info")
    bbb1681ee365 ("arm64: mm: mark fault_info table const")
    c07ab957d9af ("arm64: Call __show_regs directly")
    cab15ce604e5 ("arm64: Introduce execute-only page access permissions")
    d59bee887231 ("arm64: Add more test functions to insn.c")
    e04a28d45ff3 ("arm64: debug: re-enable irqs before sending breakpoint SIGTRAP")
    e7c600f149b8 ("arm64: hwpoison: add VM_FAULT_HWPOISON[_LARGE] handling")

v3.18.139: Failed to apply! Possible dependencies:
    04597a65c5ef ("arm64: Track system support for mixed endian EL0")
    121e38e5acdc ("ARC: mm: fix uninitialised signal code in do_page_fault")
    15773ae938d8 ("signal/arc: Use force_sig_fault where appropriate")
    1b907f46db07 ("arm64: kconfig: move emulation option under kernel features")
    2d888f48e056 ("arm64: Emulate SETEND for AArch32 tasks")
    338d4f49d6f7 ("arm64: kernel: Add support for Privileged Access Never")
    359b706473b4 ("arm64: Extract feature parsing code from cpu_errata.c")
    50a7ca3c6fc8 ("mm: convert return type of handle_mm_fault() caller to vm_fault_t")
    587064b610c7 ("arm64: Add framework for legacy instruction emulation")
    7209c868600b ("arm64: mm: Set PSTATE.PAN from the cpu_enable_pan() call")
    736d474f0faf ("arm64: Consolidate hotplug notifier for instruction emulation")
    870828e57b14 ("arm64: kernel: Move config_sctlr_el1")
    92ff0674f5d8 ("arm64: mm: Rework unhandled user pagefaults to call arm64_force_sig_info")
    94a9e04aa16a ("arm64: alternative: Introduce feature for GICv3 CPU interface")
    9b79f52d1a70 ("arm64: Add support for hooks to handle undefined instructions")
    bd35a4adc413 ("arm64: Port SWP/SWPB emulation support from arm")
    c852f3205846 ("arm64: Emulate CP15 Barrier instructions")
    c9453a3ab1a3 ("arm64: alternatives: fix pr_fmt string for consistency")
    ceed97ab4ff7 ("ARC: perf: Enable generic software events")
    e7c600f149b8 ("arm64: hwpoison: add VM_FAULT_HWPOISON[_LARGE] handling")


How should we proceed with this patch?

--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 4/9] ARC: mm: do_page_fault refactor #3: tidyup vma access permission code
  2019-05-15  0:29   ` Vineet Gupta
@ 2019-05-16 17:24     ` Eugeniy Paltsev
  -1 siblings, 0 replies; 35+ messages in thread
From: Eugeniy Paltsev @ 2019-05-16 17:24 UTC (permalink / raw)
  To: Vineet.Gupta1; +Cc: paltsev, linux-kernel, Alexey Brodkin, linux-snps-arc

On Tue, 2019-05-14 at 17:29 -0700, Vineet Gupta wrote:
> The coding pattern to NOT intialize variables at declaration time but
> rather near code which makes us eof them makes it much easier to grok
> the overall logic, specially when the init is not simply 0 or 1
> 
> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
> ---
>  arch/arc/mm/fault.c | 39 +++++++++++++++++++++------------------
>  1 file changed, 21 insertions(+), 18 deletions(-)
> 
> diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
> index f1175685d914..ae890a8d5ebf 100644
> --- a/arch/arc/mm/fault.c
> +++ b/arch/arc/mm/fault.c
> @@ -67,9 +67,9 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
>  	struct task_struct *tsk = current;
>  	struct mm_struct *mm = tsk->mm;
>  	int si_code = SEGV_MAPERR;
> +	unsigned int write = 0, exec = 0, mask;

Probably it's better to use 'bool' type for 'write' and 'exec' as we really use them as a boolean variables.


>  	vm_fault_t fault;
> -	int write = regs->ecr_cause & ECR_C_PROTV_STORE;  /* ST/EX */
> -	unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
> +	unsigned int flags;
>  
>  	/*
>  	 * NOTE! We MUST NOT take any locks for this case. We may
> @@ -91,8 +91,18 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
>  	if (faulthandler_disabled() || !mm)
>  		goto no_context;
>  
> +	if (regs->ecr_cause & ECR_C_PROTV_STORE)	/* ST/EX */
> +		write = 1;
> +	else if ((regs->ecr_vec == ECR_V_PROTV) &&
> +	         (regs->ecr_cause == ECR_C_PROTV_INST_FETCH))
> +		exec = 1;
> +
> +	flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
>  	if (user_mode(regs))
>  		flags |= FAULT_FLAG_USER;
> +	if (write)
> +		flags |= FAULT_FLAG_WRITE;
> +
>  retry:
>  	down_read(&mm->mmap_sem);
>  
> @@ -105,24 +115,17 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
>  	}
>  
>  	/*
> -	 * Ok, we have a good vm_area for this memory access, so
> -	 * we can handle it..
> +	 * vm_area is good, now check permissions for this memory access
>  	 */
> -	si_code = SEGV_ACCERR;
> -
> -	/* Handle protection violation, execute on heap or stack */
> -
> -	if ((regs->ecr_vec == ECR_V_PROTV) &&
> -	    (regs->ecr_cause == ECR_C_PROTV_INST_FETCH))
> +	mask = VM_READ;
> +	if (write)
> +		mask = VM_WRITE;
> +	if (exec)
> +		mask = VM_EXEC;
> +
> +	if (!(vma->vm_flags & mask)) {
> +		si_code = SEGV_ACCERR;
>  		goto bad_area;
> -
> -	if (write) {
> -		if (!(vma->vm_flags & VM_WRITE))
> -			goto bad_area;
> -		flags |= FAULT_FLAG_WRITE;
> -	} else {
> -		if (!(vma->vm_flags & (VM_READ | VM_EXEC)))
> -			goto bad_area;
>  	}
>  
>  	/*
-- 
 Eugeniy Paltsev

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH 4/9] ARC: mm: do_page_fault refactor #3: tidyup vma access permission code
@ 2019-05-16 17:24     ` Eugeniy Paltsev
  0 siblings, 0 replies; 35+ messages in thread
From: Eugeniy Paltsev @ 2019-05-16 17:24 UTC (permalink / raw)
  To: linux-snps-arc

On Tue, 2019-05-14@17:29 -0700, Vineet Gupta wrote:
> The coding pattern to NOT intialize variables at declaration time but
> rather near code which makes us eof them makes it much easier to grok
> the overall logic, specially when the init is not simply 0 or 1
> 
> Signed-off-by: Vineet Gupta <vgupta at synopsys.com>
> ---
>  arch/arc/mm/fault.c | 39 +++++++++++++++++++++------------------
>  1 file changed, 21 insertions(+), 18 deletions(-)
> 
> diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
> index f1175685d914..ae890a8d5ebf 100644
> --- a/arch/arc/mm/fault.c
> +++ b/arch/arc/mm/fault.c
> @@ -67,9 +67,9 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
>  	struct task_struct *tsk = current;
>  	struct mm_struct *mm = tsk->mm;
>  	int si_code = SEGV_MAPERR;
> +	unsigned int write = 0, exec = 0, mask;

Probably it's better to use 'bool' type for 'write' and 'exec' as we really use them as a boolean variables.


>  	vm_fault_t fault;
> -	int write = regs->ecr_cause & ECR_C_PROTV_STORE;  /* ST/EX */
> -	unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
> +	unsigned int flags;
>  
>  	/*
>  	 * NOTE! We MUST NOT take any locks for this case. We may
> @@ -91,8 +91,18 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
>  	if (faulthandler_disabled() || !mm)
>  		goto no_context;
>  
> +	if (regs->ecr_cause & ECR_C_PROTV_STORE)	/* ST/EX */
> +		write = 1;
> +	else if ((regs->ecr_vec == ECR_V_PROTV) &&
> +	         (regs->ecr_cause == ECR_C_PROTV_INST_FETCH))
> +		exec = 1;
> +
> +	flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
>  	if (user_mode(regs))
>  		flags |= FAULT_FLAG_USER;
> +	if (write)
> +		flags |= FAULT_FLAG_WRITE;
> +
>  retry:
>  	down_read(&mm->mmap_sem);
>  
> @@ -105,24 +115,17 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
>  	}
>  
>  	/*
> -	 * Ok, we have a good vm_area for this memory access, so
> -	 * we can handle it..
> +	 * vm_area is good, now check permissions for this memory access
>  	 */
> -	si_code = SEGV_ACCERR;
> -
> -	/* Handle protection violation, execute on heap or stack */
> -
> -	if ((regs->ecr_vec == ECR_V_PROTV) &&
> -	    (regs->ecr_cause == ECR_C_PROTV_INST_FETCH))
> +	mask = VM_READ;
> +	if (write)
> +		mask = VM_WRITE;
> +	if (exec)
> +		mask = VM_EXEC;
> +
> +	if (!(vma->vm_flags & mask)) {
> +		si_code = SEGV_ACCERR;
>  		goto bad_area;
> -
> -	if (write) {
> -		if (!(vma->vm_flags & VM_WRITE))
> -			goto bad_area;
> -		flags |= FAULT_FLAG_WRITE;
> -	} else {
> -		if (!(vma->vm_flags & (VM_READ | VM_EXEC)))
> -			goto bad_area;
>  	}
>  
>  	/*
-- 
 Eugeniy Paltsev

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 4/9] ARC: mm: do_page_fault refactor #3: tidyup vma access permission code
  2019-05-16 17:24     ` Eugeniy Paltsev
@ 2019-05-16 17:37       ` Vineet Gupta
  -1 siblings, 0 replies; 35+ messages in thread
From: Vineet Gupta @ 2019-05-16 17:37 UTC (permalink / raw)
  To: Eugeniy Paltsev; +Cc: paltsev, linux-kernel, Alexey Brodkin, linux-snps-arc

On 5/16/19 10:24 AM, Eugeniy Paltsev wrote:
>> +	unsigned int write = 0, exec = 0, mask;
> Probably it's better to use 'bool' type for 'write' and 'exec' as we really use them as a boolean variables.

Right those are semantics, but the generated code for "bool" is not ideal - given
it is inherently a "char" it is promoted first to an int with an additional EXTB
which I really dislike.
Guess it is more of a style thing.

-Vineet

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH 4/9] ARC: mm: do_page_fault refactor #3: tidyup vma access permission code
@ 2019-05-16 17:37       ` Vineet Gupta
  0 siblings, 0 replies; 35+ messages in thread
From: Vineet Gupta @ 2019-05-16 17:37 UTC (permalink / raw)
  To: linux-snps-arc

On 5/16/19 10:24 AM, Eugeniy Paltsev wrote:
>> +	unsigned int write = 0, exec = 0, mask;
> Probably it's better to use 'bool' type for 'write' and 'exec' as we really use them as a boolean variables.

Right those are semantics, but the generated code for "bool" is not ideal - given
it is inherently a "char" it is promoted first to an int with an additional EXTB
which I really dislike.
Guess it is more of a style thing.

-Vineet

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH 4/9] ARC: mm: do_page_fault refactor #3: tidyup vma access permission code
  2019-05-16 17:37       ` Vineet Gupta
@ 2019-05-16 17:44         ` Alexey Brodkin
  -1 siblings, 0 replies; 35+ messages in thread
From: Alexey Brodkin @ 2019-05-16 17:44 UTC (permalink / raw)
  To: Vineet Gupta; +Cc: paltsev, linux-kernel, linux-snps-arc, Eugeniy Paltsev

Hi Vineet,

> -----Original Message-----
> From: Vineet Gupta <vgupta@synopsys.com>
> Sent: Thursday, May 16, 2019 8:38 PM
> To: Eugeniy Paltsev <paltsev@synopsys.com>
> Cc: paltsev@snyopsys.com; linux-kernel@vger.kernel.org; Alexey Brodkin <abrodkin@synopsys.com>; linux-
> snps-arc@lists.infradead.org
> Subject: Re: [PATCH 4/9] ARC: mm: do_page_fault refactor #3: tidyup vma access permission code
> 
> On 5/16/19 10:24 AM, Eugeniy Paltsev wrote:
> >> +	unsigned int write = 0, exec = 0, mask;
> > Probably it's better to use 'bool' type for 'write' and 'exec' as we really use them as a boolean
> variables.
> 
> Right those are semantics, but the generated code for "bool" is not ideal - given
> it is inherently a "char" it is promoted first to an int with an additional EXTB
> which I really dislike.
> Guess it is more of a style thing.

In that sense maybe think about re-definition of "bool" type to 32-bit one
for entire architecture and get that benefit across the entire source tree?

-Alexey

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH 4/9] ARC: mm: do_page_fault refactor #3: tidyup vma access permission code
@ 2019-05-16 17:44         ` Alexey Brodkin
  0 siblings, 0 replies; 35+ messages in thread
From: Alexey Brodkin @ 2019-05-16 17:44 UTC (permalink / raw)
  To: linux-snps-arc

Hi Vineet,

> -----Original Message-----
> From: Vineet Gupta <vgupta at synopsys.com>
> Sent: Thursday, May 16, 2019 8:38 PM
> To: Eugeniy Paltsev <paltsev at synopsys.com>
> Cc: paltsev at snyopsys.com; linux-kernel at vger.kernel.org; Alexey Brodkin <abrodkin at synopsys.com>; linux-
> snps-arc at lists.infradead.org
> Subject: Re: [PATCH 4/9] ARC: mm: do_page_fault refactor #3: tidyup vma access permission code
> 
> On 5/16/19 10:24 AM, Eugeniy Paltsev wrote:
> >> +	unsigned int write = 0, exec = 0, mask;
> > Probably it's better to use 'bool' type for 'write' and 'exec' as we really use them as a boolean
> variables.
> 
> Right those are semantics, but the generated code for "bool" is not ideal - given
> it is inherently a "char" it is promoted first to an int with an additional EXTB
> which I really dislike.
> Guess it is more of a style thing.

In that sense maybe think about re-definition of "bool" type to 32-bit one
for entire architecture and get that benefit across the entire source tree?

-Alexey

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 4/9] ARC: mm: do_page_fault refactor #3: tidyup vma access permission code
  2019-05-16 17:44         ` Alexey Brodkin
@ 2019-05-16 18:57           ` Vineet Gupta
  -1 siblings, 0 replies; 35+ messages in thread
From: Vineet Gupta @ 2019-05-16 18:57 UTC (permalink / raw)
  To: Alexey Brodkin
  Cc: Claudiu Zissulescu, paltsev, linux-kernel, linux-snps-arc,
	Eugeniy Paltsev

On 5/16/19 10:44 AM, Alexey Brodkin wrote:
>> On 5/16/19 10:24 AM, Eugeniy Paltsev wrote:
>>>> +	unsigned int write = 0, exec = 0, mask;
>>> Probably it's better to use 'bool' type for 'write' and 'exec' as we really use them as a boolean
>> variables.
>>
>> Right those are semantics, but the generated code for "bool" is not ideal - given
>> it is inherently a "char" it is promoted first to an int with an additional EXTB
>> which I really dislike.
>> Guess it is more of a style thing.
> 
> In that sense maybe think about re-definition of "bool" type to 32-bit one
> for entire architecture and get that benefit across the entire source tree?

what bool maps to is driven by the ABI and while not explicit in the ARCv2 ABI
doc, I guess it is byte and hence can't be changed just like that.

-Vineet

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH 4/9] ARC: mm: do_page_fault refactor #3: tidyup vma access permission code
@ 2019-05-16 18:57           ` Vineet Gupta
  0 siblings, 0 replies; 35+ messages in thread
From: Vineet Gupta @ 2019-05-16 18:57 UTC (permalink / raw)
  To: linux-snps-arc

On 5/16/19 10:44 AM, Alexey Brodkin wrote:
>> On 5/16/19 10:24 AM, Eugeniy Paltsev wrote:
>>>> +	unsigned int write = 0, exec = 0, mask;
>>> Probably it's better to use 'bool' type for 'write' and 'exec' as we really use them as a boolean
>> variables.
>>
>> Right those are semantics, but the generated code for "bool" is not ideal - given
>> it is inherently a "char" it is promoted first to an int with an additional EXTB
>> which I really dislike.
>> Guess it is more of a style thing.
> 
> In that sense maybe think about re-definition of "bool" type to 32-bit one
> for entire architecture and get that benefit across the entire source tree?

what bool maps to is driven by the ABI and while not explicit in the ARCv2 ABI
doc, I guess it is byte and hence can't be changed just like that.

-Vineet

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 4/9] ARC: mm: do_page_fault refactor #3: tidyup vma access permission code
  2019-05-16 17:37       ` Vineet Gupta
@ 2019-05-17 22:23         ` Eugeniy Paltsev
  -1 siblings, 0 replies; 35+ messages in thread
From: Eugeniy Paltsev @ 2019-05-17 22:23 UTC (permalink / raw)
  To: Vineet Gupta; +Cc: paltsev, linux-kernel, Alexey Brodkin, linux-snps-arc

Hmmm,

so load the bool variable from memory is converted to such asm code:

----------------->8------------------- 
ldb	r2,[some_bool_address]
extb_s	r2,r2
----------------->8-------------------

Could you please describe that the magic is going on there?

This extb_s instruction looks completely useless here, according on the LDB description from PRM:
----------------->8-------------------
LD LDH LDW LDB LDD:
The size of the requested data is specified by the data size field <.zz> and by default, data is zero
extended from the most-significant bit of the data to the most-significant bit of the destination
register.
----------------->8-------------------

Am I missing something?

On Thu, 2019-05-16 at 17:37 +0000, Vineet Gupta wrote:
> On 5/16/19 10:24 AM, Eugeniy Paltsev wrote:
> > > +    unsigned int write = 0, exec = 0, mask;
> > 
> > Probably it's better to use 'bool' type for 'write' and 'exec' as we really use them as a boolean variables.
> 
> Right those are semantics, but the generated code for "bool" is not ideal - given
> it is inherently a "char" it is promoted first to an int with an additional EXTB
> which I really dislike.
> Guess it is more of a style thing.
> 
> -Vineet
-- 
 Eugeniy Paltsev

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH 4/9] ARC: mm: do_page_fault refactor #3: tidyup vma access permission code
@ 2019-05-17 22:23         ` Eugeniy Paltsev
  0 siblings, 0 replies; 35+ messages in thread
From: Eugeniy Paltsev @ 2019-05-17 22:23 UTC (permalink / raw)
  To: linux-snps-arc

Hmmm,

so load the bool variable from memory is converted to such asm code:

----------------->8------------------- 
ldb	r2,[some_bool_address]
extb_s	r2,r2
----------------->8-------------------

Could you please describe that the magic is going on there?

This extb_s instruction looks completely useless here, according on the LDB description from PRM:
----------------->8-------------------
LD LDH LDW LDB LDD:
The size of the requested data is specified by the data size field <.zz> and by default, data is zero
extended from the most-significant bit of the data to the most-significant bit of the destination
register.
----------------->8-------------------

Am I missing something?

On Thu, 2019-05-16@17:37 +0000, Vineet Gupta wrote:
> On 5/16/19 10:24 AM, Eugeniy Paltsev wrote:
> > > +    unsigned int write = 0, exec = 0, mask;
> > 
> > Probably it's better to use 'bool' type for 'write' and 'exec' as we really use them as a boolean variables.
> 
> Right those are semantics, but the generated code for "bool" is not ideal - given
> it is inherently a "char" it is promoted first to an int with an additional EXTB
> which I really dislike.
> Guess it is more of a style thing.
> 
> -Vineet
-- 
 Eugeniy Paltsev

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 9/9] ARC: mm: do_page_fault refactor #8: release mmap_sem sooner
  2019-05-15  0:29   ` Vineet Gupta
@ 2019-05-30 16:48     ` Vineet Gupta
  -1 siblings, 0 replies; 35+ messages in thread
From: Vineet Gupta @ 2019-05-30 16:48 UTC (permalink / raw)
  To: linux-snps-arc; +Cc: Peter Zijlstra, paltsev, linux-kernel, linux-mm

On 5/14/19 5:29 PM, Vineet Gupta wrote:
> In case of successful page fault handling, this patch releases mmap_sem
> before updating the perf stat event for major/minor faults. So even
> though the contention reduction is NOT supe rhigh, it is still an
> improvement.

This patch causes regression: LMBench lat_sig triggers a bogus oom.

The issue is not due to code movement of up_read() but as artifact of @fault
initialized to VM_FAULT_ERROR. The lat_sig invalid access doesn't take
handle_mm_fault() instead it hits !VM_WRITE case for write access leading to use
of "initialized" value of @fault VM_FAULT_ERROR which includes VM_FAULT_OOM hence
the bogus oom handling.


> There's an additional code size improvement as we only have 2 up_read()
> calls now.
> 
> Note to myself:
> --------------
> 
> 1. Given the way it is done, we are forced to move @bad_area label earlier
>    causing the various "goto bad_area" cases to hit perf stat code.
> 
>  - PERF_COUNT_SW_PAGE_FAULTS is NOW updated for access errors which is what
>    arm/arm64 seem to be doing as well (with slightly different code)
>  - PERF_COUNT_SW_PAGE_FAULTS_{MAJ,MIN} must NOT be updated for the
>    error case which is guarded by now setting @fault initial value
>    to VM_FAULT_ERROR which serves both cases when handle_mm_fault()
>    returns error or is not called at all.
> 
> 2. arm/arm64 use two homebrew fault flags VM_FAULT_BAD{MAP,MAPACCESS}
>    which I was inclined to add too but seems not needed for ARC
> 
>  - given that we have everything is 1 function we cabn stil use goto
>  - we setup si_code at the right place (arm* do that in the end)
>  - we init fault already to error value which guards entry into perf
>    stats event update
> 
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
> ---
>  arch/arc/mm/fault.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
> index 20402217d9da..e93ea06c214c 100644
> --- a/arch/arc/mm/fault.c
> +++ b/arch/arc/mm/fault.c
> @@ -68,7 +68,7 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
>  	struct mm_struct *mm = tsk->mm;
>  	int sig, si_code = SEGV_MAPERR;
>  	unsigned int write = 0, exec = 0, mask;
> -	vm_fault_t fault;			/* handle_mm_fault() output */
> +	vm_fault_t fault = VM_FAULT_ERROR;	/* handle_mm_fault() output */
>  	unsigned int flags;			/* handle_mm_fault() input */
>  
>  	/*
> @@ -155,6 +155,9 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
>  		}
>  	}
>  
> +bad_area:
> +	up_read(&mm->mmap_sem);
> +
>  	/*
>  	 * Major/minor page fault accounting
>  	 * (in case of retry we only land here once)
> @@ -173,13 +176,9 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
>  		}
>  
>  		/* Normal return path: fault Handled Gracefully */
> -		up_read(&mm->mmap_sem);
>  		return;
>  	}
>  
> -bad_area:
> -	up_read(&mm->mmap_sem);
> -
>  	if (!user_mode(regs))
>  		goto no_context;
>  
> 


^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH 9/9] ARC: mm: do_page_fault refactor #8: release mmap_sem sooner
@ 2019-05-30 16:48     ` Vineet Gupta
  0 siblings, 0 replies; 35+ messages in thread
From: Vineet Gupta @ 2019-05-30 16:48 UTC (permalink / raw)
  To: linux-snps-arc

On 5/14/19 5:29 PM, Vineet Gupta wrote:
> In case of successful page fault handling, this patch releases mmap_sem
> before updating the perf stat event for major/minor faults. So even
> though the contention reduction is NOT supe rhigh, it is still an
> improvement.

This patch causes regression: LMBench lat_sig triggers a bogus oom.

The issue is not due to code movement of up_read() but as artifact of @fault
initialized to VM_FAULT_ERROR. The lat_sig invalid access doesn't take
handle_mm_fault() instead it hits !VM_WRITE case for write access leading to use
of "initialized" value of @fault VM_FAULT_ERROR which includes VM_FAULT_OOM hence
the bogus oom handling.


> There's an additional code size improvement as we only have 2 up_read()
> calls now.
> 
> Note to myself:
> --------------
> 
> 1. Given the way it is done, we are forced to move @bad_area label earlier
>    causing the various "goto bad_area" cases to hit perf stat code.
> 
>  - PERF_COUNT_SW_PAGE_FAULTS is NOW updated for access errors which is what
>    arm/arm64 seem to be doing as well (with slightly different code)
>  - PERF_COUNT_SW_PAGE_FAULTS_{MAJ,MIN} must NOT be updated for the
>    error case which is guarded by now setting @fault initial value
>    to VM_FAULT_ERROR which serves both cases when handle_mm_fault()
>    returns error or is not called at all.
> 
> 2. arm/arm64 use two homebrew fault flags VM_FAULT_BAD{MAP,MAPACCESS}
>    which I was inclined to add too but seems not needed for ARC
> 
>  - given that we have everything is 1 function we cabn stil use goto
>  - we setup si_code at the right place (arm* do that in the end)
>  - we init fault already to error value which guards entry into perf
>    stats event update
> 
> Cc: Peter Zijlstra <peterz at infradead.org>
> Signed-off-by: Vineet Gupta <vgupta at synopsys.com>
> ---
>  arch/arc/mm/fault.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arc/mm/fault.c b/arch/arc/mm/fault.c
> index 20402217d9da..e93ea06c214c 100644
> --- a/arch/arc/mm/fault.c
> +++ b/arch/arc/mm/fault.c
> @@ -68,7 +68,7 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
>  	struct mm_struct *mm = tsk->mm;
>  	int sig, si_code = SEGV_MAPERR;
>  	unsigned int write = 0, exec = 0, mask;
> -	vm_fault_t fault;			/* handle_mm_fault() output */
> +	vm_fault_t fault = VM_FAULT_ERROR;	/* handle_mm_fault() output */
>  	unsigned int flags;			/* handle_mm_fault() input */
>  
>  	/*
> @@ -155,6 +155,9 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
>  		}
>  	}
>  
> +bad_area:
> +	up_read(&mm->mmap_sem);
> +
>  	/*
>  	 * Major/minor page fault accounting
>  	 * (in case of retry we only land here once)
> @@ -173,13 +176,9 @@ void do_page_fault(unsigned long address, struct pt_regs *regs)
>  		}
>  
>  		/* Normal return path: fault Handled Gracefully */
> -		up_read(&mm->mmap_sem);
>  		return;
>  	}
>  
> -bad_area:
> -	up_read(&mm->mmap_sem);
> -
>  	if (!user_mode(regs))
>  		goto no_context;
>  
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* extraneous generated EXTB (was Re: [PATCH 4/9] ARC: mm: do_page_fault refactor #3: tidyup vma access permission code)
  2019-05-17 22:23         ` Eugeniy Paltsev
@ 2019-05-30 17:58           ` Vineet Gupta
  -1 siblings, 0 replies; 35+ messages in thread
From: Vineet Gupta @ 2019-05-30 17:58 UTC (permalink / raw)
  To: Claudiu Zissulescu
  Cc: Eugeniy Paltsev, paltsev, linux-kernel, Alexey Brodkin, linux-snps-arc

On 5/17/19 3:23 PM, Eugeniy Paltsev wrote:
> Hmmm,
> 
> so load the bool variable from memory is converted to such asm code:
> 
> ----------------->8------------------- 
> ldb	r2,[some_bool_address]
> extb_s	r2,r2
> ----------------->8-------------------
> 
> Could you please describe that the magic is going on there?
> 
> This extb_s instruction looks completely useless here, according on the LDB description from PRM:
> ----------------->8-------------------
> LD LDH LDW LDB LDD:
> The size of the requested data is specified by the data size field <.zz> and by default, data is zero
> extended from the most-significant bit of the data to the most-significant bit of the destination
> register.
> ----------------->8-------------------
> 
> Am I missing something?


@Claudiu is that a target specific optimization/tuning in ARC backend ?


> 
> On Thu, 2019-05-16 at 17:37 +0000, Vineet Gupta wrote:
>> On 5/16/19 10:24 AM, Eugeniy Paltsev wrote:
>>>> +    unsigned int write = 0, exec = 0, mask;
>>>
>>> Probably it's better to use 'bool' type for 'write' and 'exec' as we really use them as a boolean variables.
>>
>> Right those are semantics, but the generated code for "bool" is not ideal - given
>> it is inherently a "char" it is promoted first to an int with an additional EXTB
>> which I really dislike.
>> Guess it is more of a style thing.
>>
>> -Vineet


^ permalink raw reply	[flat|nested] 35+ messages in thread

* extraneous generated EXTB (was Re: [PATCH 4/9] ARC: mm: do_page_fault refactor #3: tidyup vma access permission code)
@ 2019-05-30 17:58           ` Vineet Gupta
  0 siblings, 0 replies; 35+ messages in thread
From: Vineet Gupta @ 2019-05-30 17:58 UTC (permalink / raw)
  To: linux-snps-arc

On 5/17/19 3:23 PM, Eugeniy Paltsev wrote:
> Hmmm,
> 
> so load the bool variable from memory is converted to such asm code:
> 
> ----------------->8------------------- 
> ldb	r2,[some_bool_address]
> extb_s	r2,r2
> ----------------->8-------------------
> 
> Could you please describe that the magic is going on there?
> 
> This extb_s instruction looks completely useless here, according on the LDB description from PRM:
> ----------------->8-------------------
> LD LDH LDW LDB LDD:
> The size of the requested data is specified by the data size field <.zz> and by default, data is zero
> extended from the most-significant bit of the data to the most-significant bit of the destination
> register.
> ----------------->8-------------------
> 
> Am I missing something?


@Claudiu is that a target specific optimization/tuning in ARC backend ?


> 
> On Thu, 2019-05-16@17:37 +0000, Vineet Gupta wrote:
>> On 5/16/19 10:24 AM, Eugeniy Paltsev wrote:
>>>> +    unsigned int write = 0, exec = 0, mask;
>>>
>>> Probably it's better to use 'bool' type for 'write' and 'exec' as we really use them as a boolean variables.
>>
>> Right those are semantics, but the generated code for "bool" is not ideal - given
>> it is inherently a "char" it is promoted first to an int with an additional EXTB
>> which I really dislike.
>> Guess it is more of a style thing.
>>
>> -Vineet

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2019-05-30 17:58 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-15  0:29 [PATCH 0/9] ARC do_page_fault rework Vineet Gupta
2019-05-15  0:29 ` Vineet Gupta
2019-05-15  0:29 ` [PATCH 1/9] ARC: mm: SIGSEGV userspace trying to access kernel virtual memory Vineet Gupta
2019-05-15  0:29   ` Vineet Gupta
2019-05-15 10:54   ` Sasha Levin
2019-05-15  0:29 ` [PATCH 2/9] ARC: mm: do_page_fault refactor #1: remove label @good_area Vineet Gupta
2019-05-15  0:29   ` Vineet Gupta
2019-05-15  0:29 ` [PATCH 3/9] ARC: mm: do_page_fault refactor #2: remove short lived variable Vineet Gupta
2019-05-15  0:29   ` Vineet Gupta
2019-05-15  0:29 ` [PATCH 4/9] ARC: mm: do_page_fault refactor #3: tidyup vma access permission code Vineet Gupta
2019-05-15  0:29   ` Vineet Gupta
2019-05-16 17:24   ` Eugeniy Paltsev
2019-05-16 17:24     ` Eugeniy Paltsev
2019-05-16 17:37     ` Vineet Gupta
2019-05-16 17:37       ` Vineet Gupta
2019-05-16 17:44       ` Alexey Brodkin
2019-05-16 17:44         ` Alexey Brodkin
2019-05-16 18:57         ` Vineet Gupta
2019-05-16 18:57           ` Vineet Gupta
2019-05-17 22:23       ` Eugeniy Paltsev
2019-05-17 22:23         ` Eugeniy Paltsev
2019-05-30 17:58         ` extraneous generated EXTB (was Re: [PATCH 4/9] ARC: mm: do_page_fault refactor #3: tidyup vma access permission code) Vineet Gupta
2019-05-30 17:58           ` Vineet Gupta
2019-05-15  0:29 ` [PATCH 5/9] ARC: mm: do_page_fault refactor #4: consolidate retry related logic Vineet Gupta
2019-05-15  0:29   ` Vineet Gupta
2019-05-15  0:29 ` [PATCH 6/9] ARC: mm: do_page_fault refactor #5: scoot no_context to end Vineet Gupta
2019-05-15  0:29   ` Vineet Gupta
2019-05-15  0:29 ` [PATCH 7/9] ARC: mm: do_page_fault refactor #6: error handlers to use same pattern Vineet Gupta
2019-05-15  0:29   ` Vineet Gupta
2019-05-15  0:29 ` [PATCH 8/9] ARC: mm: do_page_fault refactor #7: fold the various error handling Vineet Gupta
2019-05-15  0:29   ` Vineet Gupta
2019-05-15  0:29 ` [PATCH 9/9] ARC: mm: do_page_fault refactor #8: release mmap_sem sooner Vineet Gupta
2019-05-15  0:29   ` Vineet Gupta
2019-05-30 16:48   ` Vineet Gupta
2019-05-30 16:48     ` Vineet Gupta

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.